
Duplicate ACK and TCP Retransmission

asked 2020-03-03 20:09:35 +0000

ubu3389

I'm running scp transfers between an AIX host and a Linux host on two separate LANs (there is a VPN between them).

Sometimes the transfer goes well and I get 50 MB/s. In many other cases, however, the transfer starts fast and then slows down, finishing with a low throughput of around 4-5 MB/s.

These are the tests I ran (file size: 1 GB):

Scp transfer

As you can see, there are some cases where the throughput is very low.

I took some tcpdump captures on the AIX side and found lots of Dup ACKs and TCP retransmissions. This is the capture as seen from the AIX host: Dump AIX to Linux

Could you help me understand what is wrong?

I also took a capture on the firewall. Let me know if that would help.

Thank you :)


Comments

That looks like the same capture file with the large packets in it.
@Jaap suggested doing another capture that shows the smaller packets on the wire.
Does the firewall capture have the small IP packets?

Chuckc ( 2020-03-03 20:55:56 +0000 )

Hi, here are the firewall captures: one with the MTU set to 1460 (MTU_1460_pcap) and another with the MTU set to 1500 (MTU_1500_pcap), both seen from the firewall. As an extra, I also ran an scp from Linux to Linux: Linux_to_Linux.pcap.

ubu3389 ( 2020-03-04 10:04:54 +0000 )

Great. All three captures look good. Wireshark calculates the throughput at about 41 Mbit/s.
Can you make the same type of capture for a slow transfer?

Chuckc ( 2020-03-04 22:15:49 +0000 )

1 Answer


answered 2020-03-05 06:28:51 +0000

updated 2020-03-05 06:46:28 +0000

Your original "AIX-to-Cloud" capture file is harder to analyse because many packets were not captured. That is, they are missing from the trace file but were on the wire in real life.

However, the reason for the slow transfer is quite basic and common: relatively small packet losses at various times, each of which triggers the TCP "Congestion Avoidance" mechanism.

When a sender detects packet loss, it is supposed to "halve" its transmit window and then ramp it back up very slowly.

A halving of bytes-per-round-trip or packets-per-RT results in a halving of throughput. The subsequent "gentle" ramp up (usually by one packet per RT) means that overall throughput suffers severely. When there are multiple packet loss events, a chart of the "Bytes-in-Flight" value forms a sawtooth pattern with fast decreases and slow increases. This is clearly visible, several times, in blue on the Wireshark chart (Statistics - TCP Stream Graphs - Window Scaling).
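To make the sawtooth concrete, here is a minimal sketch of the textbook "halve on loss, add one packet per round trip" behaviour described above. It is a toy model only: the function name, the starting window and the loss rounds are made up for illustration, and real TCP stacks (including whatever AIX uses) apply more elaborate rules.

```python
def simulate_aimd(initial_window, rounds, loss_rounds):
    """Return the transmit window (in packets) at the start of each round trip.

    Toy model (an assumption, not any real stack's algorithm):
    halve the window on a loss, otherwise grow it by one packet per round trip.
    """
    window = initial_window
    history = []
    for rtt in range(rounds):
        history.append(window)
        if rtt in loss_rounds:
            window = max(1, window // 2)  # multiplicative decrease on loss
        else:
            window += 1                   # additive increase: one packet per RTT
    return history


# Hypothetical run: start at 100 packets, losses in round trips 3 and 8.
history = simulate_aimd(initial_window=100, rounds=12, loss_rounds={3, 8})
print(history)
# [100, 101, 102, 103, 51, 52, 53, 54, 55, 27, 28, 29]
```

Each loss wipes out far more than the following round trips can win back, which is why a handful of lost packets is enough to drag the average throughput down.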

Window Scaling

An excellent presentation about Congestion Avoidance algorithms was made by Vladimir Gerasimov at SharkFest 2019 (@Packet_vlad). He did a lot of valuable research and preparation. See Tuesday Class #7 at https://sharkfestus.wireshark.org/sf19. There's a video and a PDF.

In this case you have to find the Selective ACKs - which are shown as "Dup-Acks" in Wireshark. If you delve into the packet detail of the Dup-ACKs, you'll see the "left edge" and "right edge" values that identify them as SACKs. There can be multiple left and right edges, indicating multiple "gaps" in the flow.
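If it helps to see how the edges translate into "gaps", the sketch below takes the cumulative ACK number and the SACK left/right edges and works out which byte ranges are still missing. The function name and the sequence numbers are hypothetical (they are not taken from the capture), and it assumes relative sequence numbers with no wraparound.

```python
def sack_gaps(ack, sack_blocks):
    """Return (start, end, length) for each byte range the receiver is missing.

    ack         -- cumulative ACK number (the next byte the receiver expects)
    sack_blocks -- list of (left_edge, right_edge) pairs from the SACK option

    Assumes relative sequence numbers and ignores 32-bit wraparound.
    """
    gaps = []
    expected = ack
    for left, right in sorted(sack_blocks):
        if left > expected:
            # Bytes between what was expected and the next SACKed block are missing.
            gaps.append((expected, left, left - expected))
        expected = max(expected, right)
    return gaps


# Hypothetical Dup ACK: ACK number 10000 with two SACK blocks.
print(sack_gaps(10000, [(12776, 15552), (16940, 20000)]))
# [(10000, 12776, 2776), (15552, 16940, 1388)]
```

Two SACK blocks here mean two gaps: one in front of the first block and one between the blocks.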

You will then see the retransmissions of the data that the SACKs reported as lost. The retransmitted packets are the correct size in this capture (i.e., not the very large ones that result from capturing before the IP layer "packetised" them).

The first set of SACKs starts at #3458 and the first set of retransmitted packets is #3569, #3570, #3571 & #3572.

In your case, the sender's transmit window initially ramps up to 2MB (1482 packets x 1400 MSS) per RT. However, the loss of those 4 small packets causes the sender to reduce the transmit window to just 850 KB (620 packets x 1400 bytes) per round trip. The result is that your throughput more than halves at that point.
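The arithmetic behind those figures can be checked with a few lines. The packet counts and the 1400-byte MSS are the ones quoted above; the round-trip time is not stated here, so `rtt_seconds` below is a purely hypothetical value used only to show how window size maps to throughput.

```python
MSS = 1400  # bytes per packet, as quoted above


def window_bytes(packets, mss=MSS):
    """Bytes in flight per round trip for a given number of full-size packets."""
    return packets * mss


def throughput_mbit(window, rtt_seconds):
    """Throughput in Mbit/s if one full window is delivered per round trip."""
    return window * 8 / rtt_seconds / 1e6


before = window_bytes(1482)  # 2074800 bytes, roughly 2 MB
after = window_bytes(620)    # 868000 bytes, roughly 850 KB
ratio = after / before       # a little under 0.42, i.e. less than half

print(before, after, round(ratio, 2))

# Hypothetical RTT of 100 ms, for illustration only.
rtt_seconds = 0.1
print(throughput_mbit(before, rtt_seconds), throughput_mbit(after, rtt_seconds))
```

Whatever the real RTT is, the ratio between the two throughput figures is the same as the ratio between the two windows, which is why the window drop shows up directly as a throughput drop.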

A short time later, there is another loss event. We see SACK #5040 report that 2776 bytes are missing, #5042 is a retransmission of 1388 bytes. One RTT after that, SACK #5082 reports that #5042 was received and now only 1388 bytes are missing. The sender immediately retransmits that data in #5083.

Interestingly, this packet loss event does not cause a "halving" of the transmit window. Instead, the sender reduces the transmit window to about 780 KB.

The transmit window (proportional to throughput) again gently ramps up to around 810 KB until there's another packet loss. SACK #10904 reports two gaps of size 6940 bytes (5 x 1388) and 5552 bytes (4 x 1388). Packet #10906 is an immediate ... (more)


Comments

Hi Phil, these are two captures I made simultaneously (sender+receiver). Test number 1 got lost on the street :)

Test 2 - AIX: https://drive.google.com/open?id=1-2D... Test 2 - Linux: https://drive.google.com/open?id=1tp9...

Test 3 - AIX: https://drive.google.com/open?id=1PPE... Test 3 - Linux: https://drive.google.com/open?id=1G2d...

ubu3389 ( 2020-03-09 12:09:59 +0000 )

