duplicate fields -T ek

asked 2020-04-09 06:24:39 +0000 by Elk

updated 2020-04-09 08:46:01 +0000 by grahamb

Hello,

I'm seeing multiple duplicate fields in the JSON file that tshark produces when I use it to convert a pcap to JSON.

I know that this question has been asked before, but the issue still persists:

https://ask.wireshark.org/question/50...
https://bugs.wireshark.org/bugzilla/s...

I'm running tshark on Windows 10. My tshark version is:

tshark -version
TShark (Wireshark) 3.2.1 (v3.2.1-0-gbf38a67724d0)

I'm using the command:

tshark -r capture.pcap -T ek > packets.json

to generate my JSON file. The approach I'm following is described here: https://www.elastic.co/blog/analyzing...
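
For illustration, one offending document line looks roughly like this (abridged, all values are placeholders): the same eth_eth_addr key is emitted once for the destination address and once for the source address within the eth layer:

{"timestamp": "1586413479000", "layers": {"eth": {"eth_eth_dst": "11:22:33:44:55:66", "eth_eth_addr": "11:22:33:44:55:66", "eth_eth_src": "aa:bb:cc:dd:ee:ff", "eth_eth_addr": "aa:bb:cc:dd:ee:ff"}}}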

But when I try to push the JSON file to Elasticsearch, I get a duplicate field error.

Command:

curl -s -H "Content-Type: application/x-ndjson" -XPOST localhost:9200/test/_bulk --data-binary "@test.json" | jq

Error:

{   "took": 3,   "errors": true,   "items": [
    {
      "index": {
        "_index": "packets-2020-04-07",
        "_type": "doc",
        "_id": "4YrWWHEBv6GDe8EVEwkp",
        "status": 400,
        "error": {
          "type": "mapper_parsing_exception",
          "reason": "failed to parse",
          "caused_by": {
            "type": "json_parse_exception",
            "reason": "Duplicate field 'eth_eth_addr'\n at [Source: org.elasticsearch.common.bytes.AbstractBytesReference$MarkSupportingStreamInputWrapper@587b250c; line: 1, column: 1150]"
          }
        }
      }
    },
    {
      "index": {
        "_index": "packets-2020-04-07",
        "_type": "doc",
        "_id": "4orWWHEBv6GDe8EVEwkp",
        "status": 400,
        "error": {
          "type": "mapper_parsing_exception",
          "reason": "failed to parse",
          "caused_by": {
            "type": "json_parse_exception",
            "reason": "Duplicate field 'eth_eth_addr'\n at [Source: org.elasticsearch.common.bytes.AbstractBytesReference$MarkSupportingStreamInputWrapper@4f25e307; line: 1, column: 1130]"
          }
        }
      }
    }
  ]
}

I need to handle fairly large files as quickly as possible so that I can detect and locate errors fast. So the idea of writing a program that walks every field to check whether it duplicates an earlier one is not a realistic solution for me.
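
For reference, the kind of dedup pass I'd like to avoid would look something like this minimal Python sketch (the output file name and the merge-into-list behaviour are my assumptions about the desired semantics):

import json

def merge_dupes(pairs):
    # object_pairs_hook: instead of letting the last duplicate key
    # silently win, collapse repeated keys into a list of their values.
    out = {}
    for key, value in pairs:
        if key in out:
            if isinstance(out[key], list):
                out[key].append(value)
            else:
                out[key] = [out[key], value]
        else:
            out[key] = value
    return out

with open("packets.json") as src, open("packets_dedup.json", "w") as dst:
    for line in src:
        line = line.strip()
        if not line:
            continue
        # The hook is applied recursively, so duplicates nested inside
        # "layers" get merged too; the {"index": ...} action lines that
        # -T ek interleaves pass through unchanged.
        doc = json.loads(line, object_pairs_hook=merge_dupes)
        dst.write(json.dumps(doc) + "\n")

Even for this simple approach, parsing and re-serializing every line is exactly the overhead I'd rather avoid on large captures.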


Comments

Unfortunately this won't be fixed until someone steps up to resolve the issue, as noted in your linked Bug 15719.

grahamb ( 2020-04-09 08:50:20 +0000 )

@Elk may want to follow this over on the Wireshark GitLab: Bug 15719 was split into 15759 (now closed) and 15760.

There was also merge request 144 - Elasticsearch: support for version >= 5.

Chuckc ( 2020-10-22 17:21:43 +0000 )