What causes retransmissions?

I don't have enough points to post a picture, so I'll describe what's happening.

I have two servers, both running Linux Mint 19.3. I have tried Mellanox 10Gbps cards (with a Mellanox DAC) and Intel 10Gbps NICs (with an Intel-branded DAC). There is no switch: a 5-meter DAC connects the two servers directly. Both servers also have a 1Gbps NIC that was active the entire time. I edited the 'hosts' file on each server and entered the host name and 10Gbps IP address of the other box.

When I copy a 15, 30, or 50 GB file between the two servers, I get about 450-500MB/s in one direction, but copying the same file back in the other direction starts at around 350-400MB/s and quickly falls to a little over 150MB/s. I've tested the I/O subsystem on both servers, and the SSDs inside them can read and write at about 550MB/s.

I used Wireshark on one of the boxes and saw this:

Reassembly error, protocol TCP: New fragment overlaps old data (retransmission?)

I see that error repeated non-stop while a file copy is in progress. I'm not a Linux or networking expert, but I'm thinking this might call for setting up a proper route on the Linux boxes so that all traffic between them is forced to stay on the 10Gbps NICs. That is where I'm stuck.
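To cross-check whether retransmissions are really happening, independently of the Wireshark dissector message, the kernel's own TCP counters can be compared before and after a copy. The following is a minimal sketch, assuming a Linux host where `/proc/net/snmp` exposes the `OutSegs` and `RetransSegs` fields on its `Tcp:` lines; the function names `tcp_counters`, `read_tcp_counters`, and `retransmit_ratio` are made up for illustration.

```python
from typing import Dict


def tcp_counters(snmp_text: str) -> Dict[str, int]:
    """Parse the two 'Tcp:' lines of /proc/net/snmp into a name -> value dict.

    The first 'Tcp:' line holds the field names, the second holds the values.
    """
    tcp_lines = [line.split()[1:] for line in snmp_text.splitlines()
                 if line.startswith("Tcp:")]
    if len(tcp_lines) != 2:
        raise ValueError("expected one header line and one value line for Tcp")
    names, values = tcp_lines
    return {name: int(value) for name, value in zip(names, values)}


def read_tcp_counters(path: str = "/proc/net/snmp") -> Dict[str, int]:
    """Read the live counters (Linux only; the path is an assumption)."""
    with open(path) as f:
        return tcp_counters(f.read())


def retransmit_ratio(before: Dict[str, int], after: Dict[str, int]) -> float:
    """Fraction of segments sent between two snapshots that were retransmits."""
    sent = after["OutSegs"] - before["OutSegs"]
    retrans = after["RetransSegs"] - before["RetransSegs"]
    return retrans / sent if sent else 0.0
```

A possible way to use it: call `read_tcp_counters()` on the sending box, run the file copy, call it again, and pass both snapshots to `retransmit_ratio`. The counters are system-wide, so other TCP traffic on the box during the copy would be included.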

iPerf shows 9.6Gbps in both directions.
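To put the numbers above in the same units, here is a small conversion sketch. It assumes the copy tool reports decimal megabytes per second (1 MB = 10^6 bytes) and ignores protocol overhead; the helper names are made up for illustration.

```python
def gbps_to_mb_per_s(gbps: float) -> float:
    """Convert a line rate in gigabits/s to decimal megabytes/s."""
    return gbps * 1000 / 8


def mb_per_s_to_gbps(mb_per_s: float) -> float:
    """Convert a throughput in decimal megabytes/s to gigabits/s."""
    return mb_per_s * 8 / 1000


# Figures quoted in the post, printed side by side in both units.
for label, mb_s in [("fast direction (upper)", 500),
                    ("slow direction (settled)", 150),
                    ("SSD read/write", 550)]:
    print(f"{label}: {mb_s} MB/s = {mb_per_s_to_gbps(mb_s):.2f} Gbps")

print(f"iPerf result: 9.6 Gbps = {gbps_to_mb_per_s(9.6):.0f} MB/s")
print(f"1Gbps NIC ceiling: {gbps_to_mb_per_s(1.0):.0f} MB/s")
```

Under these assumptions the iPerf rate is well above what the SSDs can deliver, and even the slow direction at about 150MB/s is slightly more than a 1Gbps link can carry, which suggests the copy is not running entirely over the 1Gbps NIC.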

If I can't figure out the Linux routing, should I just get a switch with 10Gbps ports and have the servers talk through that (and unplug their 1Gbps CAT5 cables)?
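Before changing routes or buying a switch, it may help to confirm which local address the kernel already selects for traffic to the other box. The sketch below uses a common trick: connecting a UDP socket sends no packets, but it makes the kernel choose a route, and `getsockname()` then reports the source address it picked. The function name `source_address_for` and the default port are made up for illustration.

```python
import socket


def source_address_for(dest_ip: str, port: int = 9) -> str:
    """Return the local IPv4 address the kernel would use to reach dest_ip.

    connect() on a UDP socket only records the destination and performs a
    route lookup; no packet is sent. The port number is arbitrary.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect((dest_ip, port))
        return s.getsockname()[0]
```

For example, calling `source_address_for` with the other server's 10Gbps IP address (the one entered in the 'hosts' file) should return this server's own 10Gbps address if traffic to that destination leaves through the 10Gbps NIC. If it returns the 1Gbps NIC's address instead, the routing is worth looking at.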