1) What exactly are "layers" in this context?
An attempt to make the JSON syntaxes for -T json without any -e options, and -T json with -e options, more like each other?
For -T json without any -e options, "layers" is an object containing multiple protocol layers; each protocol layer is an object containing the fields in that protocol layer.
For -T json with -e options, it's an object containing all of the fields being reported. Perhaps it should have been called "fields" in that case, but it wasn't.
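The two shapes described above can be sketched in Python. This is only an illustration: the sample values are hand-written rather than real TShark output, and lookup is a hypothetical helper, not part of Wireshark. It also assumes that, without -e, a field sits directly under its protocol layer, which won't hold for fields nested in subtrees.

```python
# Hypothetical, hand-written samples of the two "layers" shapes.
layers_without_e = {
    "eth": {"eth.src": "00:11:22:33:44:55"},
    "ip": {"ip.src": "192.0.2.1", "ip.dst": "192.0.2.2"},
}
layers_with_e = {
    "eth.src": ["00:11:22:33:44:55"],
    "ip.src": ["192.0.2.1"],
    "ip.dst": ["192.0.2.2"],
}


def lookup(layers, field):
    """Return a list of values for `field` from either shape (hypothetical helper)."""
    if field in layers:
        # -e shape: flat object, each value is already an array.
        return list(layers[field])
    # No -e shape: fields are grouped under their protocol layer.
    proto = field.split(".", 1)[0]
    value = layers.get(proto, {}).get(field)
    return [] if value is None else [value]
```

Returning a list in both cases means the calling code doesn't have to care which form of -T json produced the data.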
2) Why is each field an array of one value (as opposed to not being an array)?
Because, without -E occurrence= or with -E occurrence=a, all occurrences of a given field in a packet are reported, and the code doesn't treat "there's only one occurrence" as a special case. -E occurrence=f and -E occurrence=l cause only the first or last occurrence, respectively, to be reported, but the code doesn't treat that specially, either.
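On the reading side, that means code consuming -T json output with -e options should expect an array for every field, whatever -E occurrence= setting was used. A minimal sketch, assuming "layers" has already been parsed into a Python dict; the helper names and sample values are made up for illustration:

```python
def first_occurrence(layers, field, default=None):
    """Return the first reported value of `field`, or `default` if it is absent."""
    values = layers.get(field, [])
    return values[0] if values else default


def all_occurrences(layers, field):
    """Return every reported value of `field` as a list (possibly empty)."""
    return list(layers.get(field, []))


# Hand-written sample: two "ip.src" values, as with a tunnelled packet.
sample_layers = {
    "frame.number": ["1"],
    "ip.src": ["192.0.2.1", "198.51.100.7"],
}
```

With this, first_occurrence(sample_layers, "frame.number") gives the single value as a plain string, and all_occurrences(sample_layers, "ip.src") keeps both addresses.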
JSON, as specified by the second edition of the ECMA-404 spec, doesn't require that the names of members of an object be unique. If you're trying to write out JSON that faithfully reproduces the structure of a dissected packet, this is good, because:
- there are no guarantees that there is, for example, at most one IPv4 header in a packet, because you can tunnel IP inside another IP packet with various mechanisms such as L2TP, so there are no guarantees that, for example, there's only one "ip.src" field in a packet;
- there are no guarantees that there is, for example, only one instance of a given IPv4 option in a packet, so, even within one protocol layer, there may be more than one instance of a given field.
JSON, as specified by RFC 8259, says
...The names within an object SHOULD be unique.
...
An object whose names are all unique is interoperable in the sense that all software implementations receiving that object will agree on the name-value mappings. When the names within an object are not unique, the behavior of software that receives such an object is unpredictable. Many implementations report the last name/value pair only. Other implementations report an error or fail to parse the object, and some implementations report all of the name/value pairs, including duplicates.
so, while it permits object member names that aren't unique, it recommends that JSON writers not do that, and notes that not all JSON readers will handle that in a fashion that allows correct processing of the output of Wireshark/TShark.
The output of -T json without any -e options ignores "The names within an object SHOULD be unique."; its output faithfully reflects the way the packet is dissected, complete with, for example, multiple "ip" members of "layers" if there's more than one IPv4 header.
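Python's standard json module is one of the "report the last name/value pair only" implementations by default, but its object_pairs_hook parameter lets a reader keep the duplicates. A minimal sketch; the JSON text is a hand-written stand-in for a tunnelled packet, not real TShark output, and keep_duplicates is a hypothetical helper:

```python
import json

# Hand-written stand-in: "layers" has two "ip" members.
text = '{"layers": {"ip": {"ip.src": "192.0.2.1"}, "ip": {"ip.src": "10.0.0.1"}}}'

# Default behaviour: objects become dicts, so the later "ip" replaces the earlier one.
default = json.loads(text)


def keep_duplicates(pairs):
    """Collect the values of repeated names into lists instead of overwriting."""
    result = {}
    for name, value in pairs:
        result.setdefault(name, []).append(value)
    return result


# The hook is applied to every object, so every member value ends up in a list.
preserved = json.loads(text, object_pairs_hook=keep_duplicates)
```

Here default["layers"]["ip"] holds only the second header, while preserved["layers"][0]["ip"] is a list with both. The cost is that every member becomes a list, including ones that occur only once.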
The output of -T json with -e options might do as "The names within an object SHOULD be unique ...
What version of Wireshark/TShark is this, and what were all the command-line arguments to tshark?
TShark (Wireshark) 4.0.4 (v4.0.4-0-gea14d468d9ca) on macOS (AARCH64)
tshark -r [some pcap] -T json -e frame.number -e frame.time -e eth.src -e eth.dst -e ip.src -e ip.dst -e ip.proto -E occurrence=f