throughput issue dropped packet slow start

asked 2018-07-10 19:08:26 +0000

anonymous user

Peeps,

I have a 1 Gbps link and I am using iperf to test throughput. I am only getting around 20 Mbps. If I use 10 parallel streams I can push 100 Mbps on the link, but still only 20 Mbps per stream. I have to work remotely, so I can only capture on the device. I am guessing that because of the dropped packets (it looks like they are dropped at the receiving NIC, but I am not sure; the NIC reports no dropped packets and the interface also reports none), slow start can't reach its full potential.

Thanks for any insight you may have.

https://drive.google.com/open?id=165v...

https://drive.google.com/open?id=1ERS...

2 Answers

answered 2018-07-11 05:19:53 +0000

NJL

updated 2018-07-11 05:27:52 +0000

The TCP Receive Window is way too low for you to fill that 1G circuit with such a high round trip time (RTT). You should be able to tune the TCP behaviour on the receiver to allow it to scale the Receive Window much higher than the 212992 bytes that the receiver sets.

The optimum TCP Receive Window for a 1 Gbps link with an RTT of 73 ms (as in your capture) is about 9.125 MB (bandwidth in bits per second * RTT in seconds = window size in bits; window size in bits / 8 = window size in bytes).
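As a quick check of that arithmetic, here is a minimal Python sketch of the bandwidth-delay product. The function name `bdp_bytes` is only illustrative; the 1 Gbps and 73 ms figures are the ones quoted above.

```python
def bdp_bytes(bandwidth_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    window_bits = bandwidth_bps * rtt_seconds  # bits in flight during one RTT
    return window_bits / 8                     # convert bits to bytes


# 1 Gbps link, 73 ms round trip time (values taken from the capture discussed above)
window = bdp_bytes(1_000_000_000, 0.073)
print(f"{window:,.0f} bytes (~{window / 1e6:.3f} MB)")
```

This comes out at roughly 9,125,000 bytes, far above the 212992 bytes the receiver advertises.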

For more information one place to start is this presentation by Kary Rogers.

Comments

Hi Kary,

Are you the packetbomb.com guy?

If I do the same test but send from a different subnet (on the same router), the receive window is larger (same receiver as in the test I sent).

From my understanding, the receive window is determined by the server's ability to process packets, i.e. if the buffer is full, the window is lower.

Thanks!

Helpless ( 2018-07-12 20:26:23 +0000 )

Hi, No sorry, I'm not Kary - I'm NJL :-)

I'm not sure I follow what you mean.

Can you upload a new capture?

NJL ( 2018-07-13 07:33:02 +0000 )

All,

I did two iperf tests today. The first test was sent from 172.21.2.150 to 172.21.1.161. These servers are in two different data centers. I used the basic -c option and the throughput was around 20 Mbps over a gigabit link.

The second test was sent from 172.21.1.80 to 172.21.1.161. These servers are in the same subnet at the same location. I used the -c and -b 200m options. The throughput was 200 Mbps.

The calculated window size is the same for both, but the throughput is very different.

https://drive.google.com/open?id=16Ag...

https://drive.google.com/open?id=1IDd...

https://drive.google.com/open?id=1pTf...

https://drive.google.com/open?id=1_Cw...

Helpless ( 2018-07-13 15:46:58 +0000 )

First off, please don't post follow-ups to your original question as an answer; that will only confuse other readers. You should use a comment instead.

The problem is the size of the receive window coupled with the latency between your servers. With a TCP Receive Window that is too small, the sender will never put enough data on the wire to "fill the pipe" towards the receiver, and the sender must wait for the receiver to ACK the sent data before it can send another chunk of data. This means that the sender is idle most of the time, waiting for ACKs from the receiver. Solution: Increase the TCP Receive Window on the receiver.
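To put a rough number on this, a sender can have at most one receive window of data in flight per round trip, so the ceiling is about window / RTT. The sketch below (function name illustrative) uses the 212992-byte window and 73 ms RTT mentioned in this thread, and ignores packet loss and protocol overhead.

```python
def max_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on TCP throughput when limited only by the receive window."""
    return window_bytes * 8 / rtt_seconds  # one full window delivered per RTT


# 212992-byte receive window, 73 ms RTT (figures from this thread)
limit = max_throughput_bps(212_992, 0.073)
print(f"{limit / 1e6:.1f} Mbps")
```

That ceiling is a little over 23 Mbps, which is in line with the roughly 20 Mbps per stream reported in the question.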

Regarding your capture between servers locally:

If you also have 1Gbps of bandwidth to/from both of the local servers, then you should see higher throughput and in the captures you've ...(more)

NJL ( 2018-07-13 16:24:02 +0000 )

NJL,

Thanks for taking the time to help me.

So correct me if I am wrong, but the window size value is normal and the calculated window size is low because the window size scaling factor is low. The reason the window size scaling factor is low is the receiver. If all that sounds correct, why would my server running Windows Server 2016 have such a small scaling factor? Does that mean that everyone running Windows and doing backups across the WAN MUST modify the default settings?

Thanks, Helpless

Helpless ( 2018-07-13 18:38:20 +0000 )

answered 2018-07-11 20:36:03 +0000

mrEEde

NJL is correct that the reason for your performance problem is the limited receive window size of 212992 bytes offered by the server.

As the receiver is a Windows 2008 system, the following command can grow the RWIN, per https://www.speedguide.net/articles/windows-7-vista-2008-tweaks-2574

netsh int tcp set global autotuninglevel=experimental

experimental: allows the receive window to grow to accommodate extreme scenarios (not recommended, it can degrade performance in common scenarios, only intended for research purposes. It enables RWIN values of over 16 MB)

As stated, it might speed up your iperf throughput on this high-latency speed test, but it may be detrimental to your other traffic.

Regards, Matthias

Comments

Thanks for your input. That same server advertises a larger window size if I send from a different server in a different subnet on the same router. FYI, nothing else is running on the receiving server, just iperf, so I know the server is capable of a larger window size.

Helpless ( 2018-07-12 20:30:48 +0000 )

In your traces the receiver offered a window scaling shift factor of 2, which means the window size is limited to 64K * 4, so it cannot grow beyond 256K. If you see higher RWIN offerings than this coming from the server, then the 3-way handshake must be different. Did you trace the 3-way handshake? What do the window_scale options look like?
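The limit can be worked out directly: the TCP header's window field is 16 bits, and the window scale option negotiated in the handshake shifts it left by the advertised count. A minimal sketch (names illustrative); shift 2 is the value from these traces, while 7 and 12 are included only for comparison, as the shift counts that correspond to multipliers of 128 and 4096.

```python
MAX_UNSCALED_WINDOW = 65_535  # largest value of the 16-bit TCP window field
MAX_SHIFT = 14                # largest shift count the window scale option allows


def max_window_bytes(shift: int) -> int:
    """Largest receive window that can be advertised with a given shift count."""
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError(f"shift count must be between 0 and {MAX_SHIFT}")
    return MAX_UNSCALED_WINDOW << shift


for shift in (2, 7, 12):
    multiplier = 1 << shift
    print(f"shift {shift:2d}: multiplier {multiplier:4d}, "
          f"max window {max_window_bytes(shift):,} bytes")
```

With a shift of 2 the ceiling is 262,140 bytes, so the 212992-byte window in the capture is already close to the most this connection could ever advertise.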

mrEEde ( 2018-07-13 04:37:19 +0000 )

I have added more captures. You are right, the scale in the handshake is 2. When I did an iperf test to a public server, their window scale was 4096 and mine was only 2.

Here is where it gets funny... I did a packet capture on the server while doing an online speed test, and my window scale was 128.

Is my 78 ms delay really what is causing the scale to be 2?

Helpless ( 2018-07-13 18:48:07 +0000 )

Stats

Asked: 2018-07-10 19:08:26 +0000

Seen: 846 times

Last updated: Jul 11 '18