Tshark performance problem

Question

I execute tshark command. Process is too slow. Sometimes take more than 6-7 seconds. Pcap file is really small (~500bytes). Actually strange thing is that re-execute same command consequently, process duration is decreasing dramatically (about 1 second). After wait for a while (without execution command), re-run same command and process completion duration increase again. Why process behave like that? How to solve this problem?

Command :

tshark -ta -r test.pcap

OS Details :

SunOS er 5.10 Generic_147441-01 i86pc i386 i86pc

Thanks

Accepted Answer

1

Actually strange thing is that re-execute same command consequently, process duration is decreasing dramatically (about 1 second)

That's most certainly due to DNS resolving. The seconds time it runs faster due to DNS caching. Please use option '-n' and tshark should run faster.

tshark -ta -n -r test.pcap

If that does not help try this:

tshark -ta -r test.pcap -o name_resolve:FALSE

Regards
Kurt

answered 13 Mar '13, 04:30

Kurt Knochner ♦
24.8k●10●39●237
accept rate: 15%

edited 13 Mar '13, 04:31

Both of options does not solve my problem. Tshark behaviour is not changed.

(13 Mar '13, 04:56) erdinc

what is you tshark version (tshark -v)?

Can you please add the output of the following command for the first call (long run time) and the second call (short run time)?

Case #1:

truss -a -d -e -f -o /var/tmp/tshark_case1.out tshark -ta -r test.pcap

Case #2:

truss -a -d -e -f -o /var/tmp/tshark_case2.out tshark -ta -r test.pcap

Please post the content of the files tshark_case1.out and tshark_case2.out somewhere.

(13 Mar '13, 16:49) Kurt Knochner ♦

I did your test, Biggest differences is at brk function. brk function consume lots of time. Output files link is below. http://www.speedyshare.com/AwzRh/tshark.rar What is wrong about memory (brk), why brk need to time, how to resolve it?

(29 Mar '13, 01:55) erdinc

As far as I know, brk() is used my malloc() to expand the programs heap. However, I have no idea why it takes that much longer in case #1 (first load) than in case #3 (second load).

Can you please post the output of the command 'limit' or 'ulimit -a' (whatever works)?

(02 Apr '13, 10:08) Kurt Knochner ♦

ulimit -a

core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
open files (-n) 256
pipe size (512 bytes, -p) 10
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 27493
virtual memory (kbytes, -v) unlimited

(03 Apr '13, 03:02) erdinc

can you please try to increase the stack size (ulimit -s) and see if that changes anything?

(03 Apr '13, 03:28) Kurt Knochner ♦

I set stack size to 100.000 (increase 10 times), result is same.

(03 Apr '13, 06:27) erdinc

Hm.. looks like there is 'something' on your system that makes memory reallocation slow for the first time (brk calls) and faster for consecutive calls. Could be a specific behavior of the solaris memory management or just a specific behavior of your installation, like a large memory consumption on the system and thus swapping. Are there any processes that consume large amounts of memory (output of 'top' and 'swapon -s')?

I googled for slow brk() calls, but was unable to find something interesting. Can you please check my assumption about memory usage (above).

(03 Apr '13, 07:02) Kurt Knochner ♦

Yes, system use most of memory. 12 GB memory is installed. Top command says %63 percentage is used. Only one process is about half of total memory consuming. I observe similar behaviour at windows machine, I guess differences is not as big as at solaris

(03 Apr '13, 10:11) erdinc

showing 5 of 9 show 4 more comments

Answer 2

How much time a process takes is dependent on a lot of things. It looks like your initial tshark execution is taking more time because of disk IO. The second time you run tshark it will be faster since it will have cached things in memory. When you run tshark for the 3rd time after waiting a while, the cache will have flushed your data from cache and will need to re-read it from disk (or the network in case of NFS, iSCSI, etc).

You can check this by using the "time" utility, although I'm not sure what options you have with it on solaris. But on my OSX I get the following:

$ /usr/bin/time -l tshark -ta -r bug4716.cap 
  1 14:41:28.843078     10.0.0.1 -> 10.0.0.1     TCP 62 swrmi > EtherNet-IP-1 [SYN] Seq=0 Win=65535 [TCP CHECKSUM INCORRECT] Len=0 MSS=1460 SACK_PERM=1
  2 14:41:29.169626     10.0.0.1 -> 10.0.0.1     TCP 60 swrmi > EtherNet-IP-1 [ACK] Seq=1 Ack=1 Win=65535 [TCP CHECKSUM INCORRECT] Len=0
  3 14:41:29.712539     10.0.0.1 -> 10.0.0.1     TCP 93 swrmi > EtherNet-IP-1 [PSH, ACK] Seq=1 Ack=1 Win=65535 [TCP CHECKSUM INCORRECT] Len=39
  4 14:41:30.203991     10.0.0.1 -> 10.0.0.1     TCP 119 swrmi > EtherNet-IP-1 [PSH, ACK] Seq=40 Ack=1 Win=65535 [TCP CHECKSUM INCORRECT] Len=65
  5 14:41:32.235725     10.0.0.1 -> 10.0.0.1     TCP 60 swrmi > EtherNet-IP-1 [PSH, ACK] Seq=105 Ack=2 Win=65534 [TCP CHECKSUM INCORRECT] Len=1
        1.21 real         0.22 user         0.20 sys
  88346624  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
     22420  page reclaims
         0  page faults
         0  swaps
        55  block input operations
        34  block output operations
         0  messages sent
         0  messages received
         0  signals received
       161  voluntary context switches
        21  involuntary context switches
$ /usr/bin/time -l tshark -ta -r bug4716.cap 
  1 14:41:28.843078     10.0.0.1 -> 10.0.0.1     TCP 62 swrmi > EtherNet-IP-1 [SYN] Seq=0 Win=65535 [TCP CHECKSUM INCORRECT] Len=0 MSS=1460 SACK_PERM=1
  2 14:41:29.169626     10.0.0.1 -> 10.0.0.1     TCP 60 swrmi > EtherNet-IP-1 [ACK] Seq=1 Ack=1 Win=65535 [TCP CHECKSUM INCORRECT] Len=0
  3 14:41:29.712539     10.0.0.1 -> 10.0.0.1     TCP 93 swrmi > EtherNet-IP-1 [PSH, ACK] Seq=1 Ack=1 Win=65535 [TCP CHECKSUM INCORRECT] Len=39
  4 14:41:30.203991     10.0.0.1 -> 10.0.0.1     TCP 119 swrmi > EtherNet-IP-1 [PSH, ACK] Seq=40 Ack=1 Win=65535 [TCP CHECKSUM INCORRECT] Len=65
  5 14:41:32.235725     10.0.0.1 -> 10.0.0.1     TCP 60 swrmi > EtherNet-IP-1 [PSH, ACK] Seq=105 Ack=2 Win=65534 [TCP CHECKSUM INCORRECT] Len=1
        0.56 real         0.22 user         0.19 sys
  88346624  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
     22420  page reclaims
         0  page faults
         0  swaps
         0  block input operations
         2  block output operations
         0  messages sent
         0  messages received
         0  signals received
       106  voluntary context switches
       338  involuntary context switches
$

Please note the block input/output operations and also the "involuntary context switches" which are caused by other processes (and which could contribute to extra load time if the system is heavily loaded).

As you can see, the sys and user time are about the same for both executions, but the real time differs because of time spent waiting on the disk and/or other processes.