1) What exactly are "layers" in this context?

It looks like an attempt to make the JSON syntax of -T json without any -e options and that of -T json with -e options more like each other.

For -T json without any -e options, "layers" is an object containing multiple protocol layers; each protocol layer is an object containing the fields in that protocol layer.

For -T json with -e options, it's an object containing all of the fields being reported. Perhaps it should have been called "fields" in that case, but it wasn't.

2) Why is each field an array of one value (as opposed to not being an array)?

Because, without -E occurrence= or with -E occurrence=a, all occurrences of a given field in a packet are reported, and the code doesn't treat "there's only one occurrence" as a special case. -E occurrence=f and -E occurrence=l cause only the first or last occurrence, respectively, to be reported, but the code doesn't treat those specially, either, so the value is still an array.
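For a reader of that output, the practical consequence is to always treat a field as a list. Here's a minimal Python sketch of that; the sample is hand-written to resemble -T json output with -e ip.src -e tcp.port (addresses and ports are made up, and the "_source" wrapper is what I'd expect around "layers", so check it against your own output), and field_values is a hypothetical helper name.

```python
import json

# Hand-written sample shaped like `tshark -T json -e ip.src -e tcp.port` output.
# The values are invented for illustration; real output has more members.
SAMPLE = '''
[
  {"_source": {"layers": {"ip.src": ["192.0.2.1"], "tcp.port": ["51000", "443"]}}}
]
'''

def field_values(packet, name):
    """Return every reported occurrence of a field as a list (possibly empty)."""
    return packet["_source"]["layers"].get(name, [])

packets = json.loads(SAMPLE)
for packet in packets:
    print(field_values(packet, "ip.src"))    # one occurrence, still a list
    print(field_values(packet, "tcp.port"))  # source and destination port
    print(field_values(packet, "udp.port"))  # field absent from the packet
```

Code written this way doesn't care whether a field shows up once, twice, or not at all.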

JSON, as specified by the second edition of the ECMA-404 spec, doesn't require that the names of members of an object be unique. If you're trying to write out JSON that faithfully reproduces the structure of a dissected packet, this is good, because:

  • there are no guarantees that there is, for example, at most one IPv4 header in a packet, because you can tunnel IP inside another IP packet with various mechanisms such as L2TP, so there are no guarantees that, for example, there's only one "ip.src" field in a packet;
  • there are no guarantees that there is, for example, only one instance of a given IPv4 option in a packet, so, even within one protocol layer, there may be more than one instance of a given field.

JSON, as specified by RFC 8259, says

...The names within an object SHOULD be unique.

...

An object whose names are all unique is interoperable in the sense that all software implementations receiving that object will agree on the name-value mappings. When the names within an object are not unique, the behavior of software that receives such an object is unpredictable. Many implementations report the last name/value pair only. Other implementations report an error or fail to parse the object, and some implementations report all of the name/value pairs, including duplicates.

so, while it permits object member names that aren't unique, it recommends that JSON writers not do that, and notes that not all JSON readers will handle that in a fashion that allows correct processing of the output of Wireshark/TShark.

The output of -T json without any -e options ignores "The names within an object SHOULD be unique."; its output faithfully reflects the way the packet is dissected, complete with, for example, multiple "ip" members of "layers" if there's more than one IPv4 header.
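As an illustration of what a reader can run into, Python's standard json module is one of the "report the last name/value pair only" implementations by default, but it can be told to keep duplicates with object_pairs_hook. A minimal sketch, using a hand-written fragment rather than real TShark output; keep_duplicates is a hypothetical helper name.

```python
import json

# Hand-written fragment with two "ip" members, as could appear for a tunneled
# packet. Real output contains many more fields per layer.
TEXT = '{"layers": {"ip": {"ip.src": "198.51.100.1"}, "ip": {"ip.src": "10.0.0.1"}}}'

def keep_duplicates(pairs):
    """Build a dict in which every name maps to a list of all its values."""
    merged = {}
    for name, value in pairs:
        merged.setdefault(name, []).append(value)
    return merged

# Default behavior: the second "ip" member silently replaces the first.
default = json.loads(TEXT)
print(default["layers"]["ip"])

# With the hook, every object is built by keep_duplicates, so both "ip"
# members survive, at the cost of every value being wrapped in a list.
complete = json.loads(TEXT, object_pairs_hook=keep_duplicates)
print(complete["layers"][0]["ip"])
```

The hook is applied to every object in the document, which is why even "layers" ends up as a one-element list here.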

The output of -T json with -e options might do as "The names within an object SHOULD be unique." suggests, but if, for example, you have two layers of IPv4 headers as a result of tunneling, the "ip." fields will be arrays with two members, one from each IPv4 header, with what may not be enough information for whatever software is reading the JSON output to figure out that this is the result of tunneling.
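A reader that wants to cope with this could pair up the values by position. The sketch below assumes, and this is only an assumption, that the i-th "ip.src" and the i-th "ip.dst" come from the same IPv4 header; the input is a hand-written "layers" object with made-up addresses, and ip_headers is a hypothetical helper name.

```python
def ip_headers(layers):
    """Pair ip.src and ip.dst values by position, one pair per IPv4 header.

    Assumes the i-th ip.src and the i-th ip.dst belong to the same header.
    """
    sources = layers.get("ip.src", [])
    destinations = layers.get("ip.dst", [])
    return list(zip(sources, destinations))

# Hand-written "layers" object for a packet with two IPv4 headers.
tunneled = {
    "ip.src": ["198.51.100.1", "10.0.0.1"],
    "ip.dst": ["203.0.113.7", "10.0.0.2"],
}

for source, destination in ip_headers(tunneled):
    print(source, "->", destination)
```

This still can't tell you why there are two headers; as noted above, the output may not carry enough information for that.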

(TL;DR: if whatever is reading the JSON gets confused if, for example, there's more than one "ip.src" or "ip.dst" field value, because it assumes, explicitly or implicitly, that there's always only one IPv4 header in a packet, it's assuming something that's not true in the real world, and will thus not be able to deal with all packets that are captured in the real world.)