Low throughput between VMware hosts in VXLAN topology - spurious retransmissions.

asked 2019-04-26 17:30:52 +0000

naskop

Got this weird case of low throughput between hosts. It seems to me that the problem starts when there are 3 outstanding packets in flight that are not full size. The receiver seems to acknowledge those first, then the rest of the packets. By that time the receiver has generated 3 duplicate ACKs, which triggers the sender to retransmit packets that were already acknowledged. I took a capture on both sides while running iperf. Throughput from A to B is low (around 1 Gbps); B to A is OK. If we remove one leg of the port-channel on the B side, throughput improves significantly. download packet captures
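To make the mechanism above concrete, here is a minimal Python sketch of how reordering alone can produce three duplicate ACKs and a spurious fast retransmit. It is a simplified model, not derived from the attached captures: segments are numbered 0, 1, 2, ... instead of using byte sequence numbers, SACK and timers are ignored, and the function names `receiver_acks` and `fast_retransmits` are hypothetical helpers made up for this illustration.

```python
def receiver_acks(arrival_order):
    """Return the cumulative ACK (next expected segment number)
    that a simplified receiver sends after each arriving segment."""
    received = set()
    next_expected = 0
    acks = []
    for seq in arrival_order:
        received.add(seq)
        # Advance past every segment that is now contiguous.
        while next_expected in received:
            next_expected += 1
        acks.append(next_expected)
    return acks


def fast_retransmits(acks, dup_threshold=3):
    """Return the segments a simplified sender would fast-retransmit:
    one retransmit when the same ACK is repeated dup_threshold times."""
    retransmitted = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == dup_threshold:
                retransmitted.append(ack)
        else:
            last_ack = ack
            dup_count = 0
    return retransmitted


# Segment 1 is overtaken by three later segments but is never lost.
reordered = [0, 2, 3, 4, 1]
acks = receiver_acks(reordered)
print(acks)                    # [1, 1, 1, 1, 5]
print(fast_retransmits(acks))  # [1]  (segment 1 is resent although it arrived)
```

In this model a segment that is overtaken by only two later segments (for example the arrival order `[0, 2, 3, 1, 4]`) produces two duplicate ACKs and no retransmit, so the depth of the reordering matters, not just its presence.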


Have you ever tried to use the port-channel on B-side as an active-backup configuration instead of a load balancing configuration?

Christian_R ( 2019-04-27 21:37:33 +0000 )

BTW you are facing real packet loss here, too!

Christian_R ( 2019-04-27 21:38:08 +0000 )

Thanks Christian, we are not using active-standby port channels. Where do you see the packet loss? I see exactly the same number of sent and received packets at both sides:

naskop ( 2019-04-28 01:31:57 +0000 )

Yes, you are right, no loss. I was misled by my Wireshark. -> Different story

But then I think you should try an active-standby setup, as the out-of-order arrivals are slowing down your session. Or you can post the trace where you disconnected one leg, so I can prove my assumption.

Christian_R ( 2019-04-28 12:52:41 +0000 )

Captures with one of the port channel legs down.

naskop ( 2019-04-29 13:42:07 +0000 )

answered 2019-04-29 19:17:27 +0000

Christian_R

Nice question. In the one-leg trace we cannot spot any retransmissions, and the out-of-order symptom seems slightly improved compared to the slow trace. So my recommendations:

  • Check if active-standby config for the VMware access could solve the issue
  • Find out the root cause of the out-of-order arrivals. Maybe it is normal.
  • Check if you really need the connection between the leaf switches

Christian, we've run in this configuration (active-active) for years, and it will be a hard sell to have many 10 gig links sitting idle. I think the out-of-order behavior is normal due to the multi-path. The leaf switches run in an MLAG configuration; the connection between them is a peer link for heartbeat.

naskop ( 2019-05-02 13:17:47 +0000 )

