Why do JSON and PDML exports have different data from the same session?

asked 2020-03-25 15:54:08 +0000

alohawireshark

I am new to Wireshark. I filtered my captured packet traffic to an IP address associated with an HTTPS site I control. When exporting this data, I tried exporting it to .csv, .json, .pdml, etc.

Why do the contents of these files differ despite exporting the same data? For example, the PDML file contains information from the "info" column, which seems like a user-friendly column that summarizes the purpose of each packet, such as "Client Hello" or "Application Data." This same information is absent from the .json file. Why is that the case? What other information is included in the .pdml but not in the .json, and vice versa?

I could not find any documentation for these differences and it is difficult to manually parse any patterns.

Comments

Have you tried doing the exports with tshark?
There is some information in the man page.

Chuckc ( 2020-03-25 17:29:21 +0000 )

1 Answer

answered 2020-04-12 21:26:24 +0000

Guy Harris

> For example, the PDML file contains information from the "info" column

I don't see that in the PDML output I just got from TShark. PDML is generated from the packet detail information (what Wireshark shows in the packet detail pane and TShark prints with -V), not from the packet summary information (what Wireshark shows in the packet summary pane and TShark prints by default).

The packet detail information does include all layers, including the top-level layers, which usually carry summary information such as the type of packet.

With XML, every item in a hierarchical representation of data can have arbitrary attributes, such as the "showname" attribute in PDML, so the top-level layers can carry that display text.
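To make that concrete, here is a minimal Python sketch that reads a PDML-style fragment. The fragment is hand-written for illustration (the ports and values are made up, and real TShark output carries more attributes), so treat it as an assumption about the general shape rather than a verbatim export.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment in the style of PDML; the values are invented
# for illustration and are not copied from a real capture.
PDML_FRAGMENT = """
<proto name="tcp" showname="Transmission Control Protocol, Src Port: 443, Dst Port: 51000">
  <field name="tcp.srcport" showname="Source Port: 443" show="443" value="01bb"/>
  <field name="tcp.dstport" showname="Destination Port: 51000" show="51000" value="c738"/>
</proto>
"""

proto = ET.fromstring(PDML_FRAGMENT)

# The protocol element has child fields AND its own display text,
# because an XML element can hold attributes alongside its children.
proto_summary = proto.get("showname")
field_labels = [field.get("showname") for field in proto.findall("field")]

print(proto_summary)
for label in field_labels:
    print("  " + label)
```

The point is only that the top-level element holds both its summary text and its children at the same time.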

> This same information is absent from the .json file. Why is that the case?

JSON objects are "an unordered set of name/value pairs", and "A value can be a string in double quotes, or a number, or true or false or null, or an object or an array". A value cannot simultaneously be a string and an object, and a name cannot have multiple values associated with it. So, for example, the top-level item for a protocol is an object, with entries for all the items below it. Only the "simple" items at the bottom of the tree can have string, numerical, or Boolean values; everything else has an object as its value, containing the items below it. That leaves no place in JSON to put that information.
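A small Python sketch of that constraint follows. The key names and layout are hypothetical, loosely modeled on a nested dissection rather than copied from TShark's JSON output. The value stored under "tcp" is either an object holding the fields or a summary string, never both.

```python
import json

# Hypothetical layouts; names and values are illustrative only.
tcp_with_fields = {"tcp": {"tcp.srcport": "443", "tcp.dstport": "51000"}}
tcp_with_summary = {"tcp": "Transmission Control Protocol, Src Port: 443, Dst Port: 51000"}


def kind_of(value):
    """Name the JSON kind of a decoded value (only the two cases used here)."""
    return "object" if isinstance(value, dict) else "string"


# Round-trip through JSON text to show this holds for the format itself,
# not just for Python dictionaries.
fields_kind = kind_of(json.loads(json.dumps(tcp_with_fields))["tcp"])
summary_kind = kind_of(json.loads(json.dumps(tcp_with_summary))["tcp"])

print(fields_kind)   # the protocol entry holds its child fields
print(summary_kind)  # or it holds display text, but then has no children
```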

(JSON is not the best fit for a Wireshark-style packet dissection; it also has trouble handling a representation such as

Foo_protocol = {
    foo_address = 127.0.0.1;
    foo_address = 192.168.42.85;
    foo_comment = "This is the first foo";
    foo_comment = "This foo has two addresses";
}

where a given protocol may allow more than one instance of the same field. XML has fewer restrictions, so PDML could be defined as an XML document type and represent the dissection.)
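As a rough illustration of the repeated-field problem, the Python sketch below feeds the two foo_address values from the example above to Python's standard json and xml.etree.ElementTree modules. The element names are the hypothetical ones from that example. What it shows is how these particular parsers behave, not what TShark emits.

```python
import json
import xml.etree.ElementTree as ET

# JSON text with the same name twice, mirroring the foo_address example.
JSON_TEXT = '{"foo_address": "127.0.0.1", "foo_address": "192.168.42.85"}'

# Python's json module accepts the text but keeps only the last value.
as_dict = json.loads(JSON_TEXT)

# Both pairs survive only if the caller asks for the raw pair list.
as_pairs = json.loads(JSON_TEXT, object_pairs_hook=list)

# XML allows sibling elements with the same name, so nothing is lost.
XML_TEXT = (
    "<foo_protocol>"
    "<foo_address>127.0.0.1</foo_address>"
    "<foo_address>192.168.42.85</foo_address>"
    "</foo_protocol>"
)
xml_addresses = [el.text for el in ET.fromstring(XML_TEXT).findall("foo_address")]

print(as_dict)
print(as_pairs)
print(xml_addresses)
```

A consumer that decodes JSON into an ordinary dictionary would silently see one address, while the XML form keeps both.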
