Is there a bug in tshark pdml output?

asked 2021-03-24 10:26:51 +0000

TalH

updated 2021-03-24 15:35:31 +0000

I capture traffic on my network interface (which looks correct and valid in the Wireshark GUI) using tshark with PDML output. The command is "tshark.exe -i 3 -T pdml".

I notice a consistent issue in one of the PDML fields. The field is in the TCP layer:

<field name="tcp.flags.str" showname="TCP Flags: ┬╖┬╖┬╖┬╖┬╖┬╖┬╖AP┬╖┬╖┬╖" size="2" pos="46" show="

The issue seems to be that part of the XML line is missing.

Is there a known issue with that? Should this line just be ignored?

Thank you in advance.

Edit: Wireshark version 3.4.3, OS Windows 10 Enterprise.

As for the console: I was consuming the PDML output in my own dotnet program, so I ran tshark as a child process and redirected its output to a dotnet stream. I tried it with UDP data and it was fine. The problems started when I read TCP and TLS layer data.

Thanks to your question, I rechecked and ran tshark in PowerShell, and now I do see the end of the fields that was missing before, for some reason.

So now I know tshark does output the PDML correctly (and it is likely that I am dropping part of the tshark fields somewhere in my dotnet code).

I will recheck my work and share anything that may interest the Wireshark community if it comes up. Thank you for the comment.
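One way a consumer can lose or corrupt text when reading a redirected stream is to decode fixed-size byte chunks independently, which breaks when a multi-byte UTF-8 character straddles a chunk boundary. Whether that is what happened in the dotnet program above is not known; the following Python sketch only illustrates the general technique of decoding a child process's output incrementally. The helper names `decode_utf8_chunks` and `read_pdml` are hypothetical, and `read_pdml` assumes tshark is on the PATH and that interface "3" exists.

```python
import codecs
import subprocess
from typing import Iterable, Iterator


def decode_utf8_chunks(chunks: Iterable[bytes]) -> Iterator[str]:
    """Decode byte chunks as UTF-8, keeping characters split across chunks."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    for chunk in chunks:
        text = decoder.decode(chunk)  # buffers an incomplete trailing character
        if text:
            yield text
    # Final flush: raises if the stream ended in the middle of a character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


def read_pdml(interface: str = "3", packet_count: int = 5) -> str:
    """Run tshark (assumed to be on PATH) and return its PDML output as text."""
    proc = subprocess.Popen(
        ["tshark", "-i", interface, "-T", "pdml", "-c", str(packet_count)],
        stdout=subprocess.PIPE,  # raw bytes; decoding is done explicitly below
    )
    assert proc.stdout is not None
    chunks = iter(lambda: proc.stdout.read(4096), b"")
    text = "".join(decode_utf8_chunks(chunks))
    proc.wait()
    return text
```

The point of the design is that the pipe is read as bytes and decoded in exactly one place with a known encoding, rather than relying on whatever default the runtime picks for redirected output.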

What are your OS and Wireshark versions, and are you viewing the output in a shell or an editor?

grahamb ( 2021-03-24 12:08:31 +0000 )

answered 2021-03-24 15:06:09 +0000

Chuckc

updated 2021-03-24 15:08:42 +0000

Issue 16649: tcp.flags.str doesn't display the string properly in either a powershell or cmd command prompt.

In recent versions of Windows you can switch to code page 65001 (UTF-8) by doing the following:

Go to "Settings → Language → Administrative language settings"
Press "Change system locale"
Check "Beta: use Unicode UTF-8 for worldwide language support"

(related past issues: 12393, 12763)
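The garbled flags string quoted in the question is consistent with UTF-8 bytes being interpreted under a legacy console code page. The Python sketch below reproduces that effect, assuming the console used code page 437; this is inferred from the characters shown in the question, not confirmed by the asker, and other code pages such as 850 produce a slightly different pattern. The function names are hypothetical.

```python
def as_seen_in_console(text: str, console_encoding: str = "cp437") -> str:
    """Return how UTF-8 encoded text looks when read with a legacy code page."""
    return text.encode("utf-8").decode(console_encoding)


def undo_console_garbling(garbled: str, console_encoding: str = "cp437") -> str:
    """Reverse the mis-decoding, assuming the same legacy code page was used."""
    return garbled.encode(console_encoding).decode("utf-8")


# tcp.flags.str value for a segment with ACK and PSH set; each "·" is U+00B7.
flags = "·······AP···"
garbled = as_seen_in_console(flags)      # assumption: console uses cp437
restored = undo_console_garbling(garbled)
print(garbled)
print(restored == flags)
```

Each middle dot is two bytes in UTF-8 (0xC2 0xB7), and each of those bytes is shown as a separate box-drawing character under code page 437, which is why the string appears twice as long and unreadable.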

I'd be a little wary of changing that setting; it's marked as "Beta" for a reason.

I don't have that checked. My current code page is 850 (MS-DOS Latin 1) in PowerShell and Windows Terminal, which I think is standard for the UK (maybe Western Europe). Using a 3.5.x build, the tcp.flags.str string is output as expected, both with regular tshark output and when forcing PDML output (which explicitly states that the output is UTF-8 in the document encoding declaration at the start).

grahamb ( 2021-03-24 15:31:30 +0000 )
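Because PDML declares its own encoding, a consumer can hand the raw bytes to an XML parser and let it honour that declaration instead of decoding the text itself. The sketch below does this with Python's standard `xml.etree.ElementTree`. `SAMPLE_PDML` is a hand-written, heavily trimmed fragment for illustration only, not real tshark output, and `tcp_flag_strings` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical, trimmed PDML fragment; real tshark output has many more
# attributes and fields per packet.
SAMPLE_PDML = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    "<pdml>\n"
    "  <packet>\n"
    '    <proto name="tcp">\n'
    '      <field name="tcp.flags.str" showname="TCP Flags: ·······AP···"'
    ' size="2" pos="46" show="·······AP···"/>\n'
    "    </proto>\n"
    "  </packet>\n"
    "</pdml>\n"
).encode("utf-8")


def tcp_flag_strings(pdml_bytes: bytes) -> List[str]:
    """Return the 'show' value of every tcp.flags.str field in a PDML document."""
    # Passing bytes lets the parser use the encoding from the XML declaration.
    root = ET.fromstring(pdml_bytes)
    return [
        field.get("show", "")
        for field in root.iter("field")
        if field.get("name") == "tcp.flags.str"
    ]


print(tcp_flag_strings(SAMPLE_PDML))
```

Parsing bytes this way sidesteps the console code page entirely; the code page only matters when the text is printed to a terminal.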

Change 37560, 'Windows: Set our locale to ".UTF-8".', in issue 16649 updated the OUTPUT section of the tshark man page with information on code pages and UTF-8.

Chuckc ( 2021-03-24 18:57:29 +0000 )
