In Tshark's JSON output, why are fields single-value arrays?

tshark
JSON

asked 2023-03-31 18:06:45 +0000

I noticed that in the JSON output formats (-T json or -T ek), every field is an array containing a single value:

  {
    "_index": "packets-[redacted]",
    "_type": "doc",
    "_score": null,
    "_source": {
      "layers": {
        "frame.number": [
          "1"
        ],
        "frame.time": [
          "[redacted]"
        ],
        "eth.src": [
          "[redacted]"
        ],
        "eth.dst": [
          "[redacted]"
        ],
        "ip.src": [
          "[redacted]"
        ],
        "ip.dst": [
          "[redacted]"
        ],
        "ip.proto": [
          "6"
        ]
      }
    }
  },

and so on for each packet.

So my questions are:
1) What exactly are "layers" in this context?
2) Why is each field an array of one value (as opposed to not being an array)?


Comments

What version of Wireshark/TShark is this, and what were all the command-line arguments to tshark?

Guy Harris ( 2023-04-02 22:51:36 +0000 )

TShark (Wireshark) 4.0.4 (v4.0.4-0-gea14d468d9ca) on macOS (AARCH64)

tshark -r [some pcap] -T json -e frame.number -e frame.time -e eth.src -e eth.dst -e ip.src -e ip.dst -e ip.proto -E occurrence=f

person_with_account ( 2023-04-03 11:43:17 +0000 )

0

answered 2023-04-06 06:05:51 +0000

Guy Harris

1) What exactly are "layers" in this context?

An attempt to make the JSON syntaxes for -T json without any -e options, and -T json with -e options, more like each other?

For -T json without any -e options, "layers" is an object containing multiple protocol layers; each protocol layer is an object containing the fields in that protocol layer.

For -T json with -e options, it's an object containing all of the fields being reported. Perhaps it should have been called "fields" in that case, but it wasn't.

2) Why is each field an array of one value (as opposed to not being an array)?

Because, without -E occurrence= or with -E occurrence=a, all occurrences of a given field in a packet are reported, and the code doesn't create "there's only one occurrence" as a special case. -E occurrence=f and -E occurrence=l cause only the first or last occurrence, respectively, to be reported, but the code doesn't treat that specially, either.
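For consumers that would rather have plain values, the arrays can be collapsed after parsing. The following is a minimal Python sketch, not anything TShark provides: the helper name `flatten_fields` and the sample packet (including its addresses) are made up for illustration, and only the `_source` / `layers` nesting is taken from the output shown in the question.

```python
import json

def flatten_fields(packet):
    """Collapse one-element lists from `tshark -T json -e ...` into scalars.

    Fields with several occurrences are left as lists so nothing is lost.
    """
    layers = packet["_source"]["layers"]
    return {
        name: values[0] if len(values) == 1 else values
        for name, values in layers.items()
    }

# Hypothetical sample shaped like the output in the question; the second
# "ip.src" value stands in for a tunnelled packet seen with occurrence=a.
sample = json.loads("""
[{"_index": "packets-example", "_source": {"layers": {
    "frame.number": ["1"],
    "ip.src": ["10.0.0.1", "192.168.0.1"],
    "ip.proto": ["6"]
}}}]
""")

flat = [flatten_fields(packet) for packet in sample]
```

With -E occurrence=f or -E occurrence=l every list should have exactly one element, so the helper would return scalars throughout; with all occurrences reported, the caller still has to be prepared for a list.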

JSON, as specified by the second edition of the ECMA-404 spec, doesn't require that the names of members of an object be unique. If you're trying to write out JSON that faithfully reproduces the structure of a dissected packet, this is good, because:

there are no guarantees that there is, for example, at most one IPv4 header in a packet, because you can tunnel IP inside another IP packet with various mechanisms such as L2TP, so there are no guarantees that, for example, there's only one "ip.src" field in a packet;
there are no guarantees that there is, for example, only one instance of a given IPv4 option in a packet, so, even within one protocol layer, there may be more than one instance of a given field.

JSON, as specified by RFC 8259, says

...The names within an object SHOULD be unique.

...

An object whose names are all unique is interoperable in the sense that all software implementations receiving that object will agree on the name-value mappings. When the names within an object are not unique, the behavior of software that receives such an object is unpredictable. Many implementations report the last name/value pair only. Other implementations report an error or fail to parse the object, and some implementations report all of the name/value pairs, including duplicates.

so, while it permits object member names that aren't unique, it recommends that JSON writers not do that, and notes that not all JSON readers will handle that in a fashion that allows correct processing of the output of Wireshark/TShark.
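As one concrete reader, Python's standard `json` module falls into the "report the last name/value pair only" group by default. The sketch below uses a made-up object with a repeated "ip.src" name (not real TShark output) and a hypothetical `reject_duplicates` hook to show how the same reader can instead be made to fail on duplicates.

```python
import json

# Hypothetical input: one object with a repeated "ip.src" name.
text = '{"ip.src": "10.0.0.1", "ip.src": "192.168.0.1"}'

# Default behaviour: earlier pairs with the same name are silently dropped.
last_only = json.loads(text)

def reject_duplicates(pairs):
    """object_pairs_hook that raises if a member name is repeated."""
    seen = {}
    for name, value in pairs:
        if name in seen:
            raise ValueError(f"duplicate member name: {name!r}")
        seen[name] = value
    return seen

try:
    json.loads(text, object_pairs_hook=reject_duplicates)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)
```

Here `last_only` keeps only the second address, while the strict variant refuses the document; neither lets the reader see both values.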

The output of -T json without any -e options ignores "The names within an object SHOULD be unique."; its output faithfully reflects the way the packet is dissected, complete with, for example, multiple "ip" members of "layers" if there's more than one IPv4 header.
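A reader that wants every duplicate has to ask its parser for the raw name/value pairs. The following Python sketch assumes a heavily trimmed, made-up "layers" object with two "ip" members (real output has many more fields and outer wrappers); `keep_all` is a hypothetical helper, not part of any library.

```python
import json

# Hypothetical, heavily trimmed stand-in for an IP-in-IP packet.
text = """
{"layers": {
    "ip": {"ip.src": "10.0.0.1"},
    "ip": {"ip.src": "192.168.0.1"}
}}
"""

def keep_all(pairs):
    """object_pairs_hook that stores every value for a name in a list."""
    merged = {}
    for name, value in pairs:
        merged.setdefault(name, []).append(value)
    return merged

# The hook runs for every object, so every member becomes a list.
packet = json.loads(text, object_pairs_hook=keep_all)

ip_sources = [
    src
    for ip_layer in packet["layers"][0]["ip"]
    for src in ip_layer["ip.src"]
]
```

Note that after this transformation every member, duplicated or not, is a list, which is the same shape the -e output has.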

The output of -T json with -e options might do as "The names within an object SHOULD be unique ... (more)


Comments

Thank you, very complete answer

person_with_account ( 2023-04-13 18:02:21 +0000 )

0

answered 2023-04-05 13:18:43 +0000

pac122

updated 2023-04-05 13:19:54 +0000

Parameters "eth.dst, ip.src, ip.dst and ip.proto" are captured at IP packet level. You have one single IP packet captured in JSON, see "frame.number=1". One packet, so single value for those parameters.

