The packets flow from the client to the server in a very regular way. One pattern repeats again and again - at roughly 7-second intervals - with about 500 KB (roughly 500,000 bytes) transferred per interval. I'll define the large burst of packets as the start of the pattern.
Here are my key observations (with some supporting TCP Trace charts below):
1) The client sends a large burst of 250 KB, but large portions are lost after the first 100 KB. In the first TCP-Trace chart below, we see the 100 KB successful burst, the yellow area with no packets, a few subsequent packets that made it through, then one RTT later the second 100 KB burst.
2) The server's receive window is close to 1 MB, but the sender appears to use its own RWIN of 261,288 bytes as its own transmit "limit". The sender manages to maintain close to this "in flight" value throughout the whole period.
3) One RTT later, the sender receives ACKs for the 100 KB that wasn't lost and manages to transmit a further 100 KB without any errors. The large number of packets lost from the original burst triggers many Dup-ACKs and, in response, the sender retransmits a single packet to begin to fill the gap. Following the horizontal "Ack line" on the chart, we see the single retransmitted packet and the step up of the Ack line.
4) One RTT after that, there's another single packet retransmission.
5) The large initial gap in the received data is then filled in at the rate of just one packet per RTT. Also, after several RTTs (perhaps as the sending congestion window is opened), the sender begins to send small bursts of new data so that the in-flight value of 250 KB is maintained.
6) On the second chart below, we've zoomed out to encompass a full pattern and the start of the next one. The dark blue circle is around the initial two large bursts, the red circle is around all the single packet retransmissions and the light blue circle is around all the small bursts of new packets. It looks like the sender eventually waits until every sixth round trip so that it can send a full 6-packet application "block" (there's a Push flag at the end of these 6-packet bursts).
7) Eventually, the original large gap has been completely filled in and we see the Ack line jump all the way up as the original two large bursts and all the smaller "new" bursts are fully acknowledged. At this point, in-flight data is zero and the sender is free to begin the whole pattern all over again.
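To see why the one-packet-per-RTT recovery dominates each pattern, here is a rough sketch of the arithmetic in Python. The 150 KB gap is simply the 250 KB burst minus the roughly 100 KB that got through. The MSS of 1,460 bytes and the 50 ms RTT are assumptions for illustration only, so substitute the values from your own capture.

```python
import math


def recovery_time_s(lost_bytes, mss_bytes, rtt_s, segments_per_rtt=1):
    """Time to refill a gap when only `segments_per_rtt` lost segments
    are retransmitted in each round trip."""
    lost_segments = math.ceil(lost_bytes / mss_bytes)
    round_trips = math.ceil(lost_segments / segments_per_rtt)
    return round_trips * rtt_s


LOST_BYTES = 150_000  # 250 KB burst minus the ~100 KB that arrived
MSS_BYTES = 1_460     # assumption: typical Ethernet MSS
RTT_S = 0.05          # assumption: placeholder RTT, not measured

slow = recovery_time_s(LOST_BYTES, MSS_BYTES, RTT_S)
# Hypothetical comparison: a stack that repairs 20 segments per round trip.
fast = recovery_time_s(LOST_BYTES, MSS_BYTES, RTT_S, segments_per_rtt=20)

print(f"one segment per RTT: {slow:.2f} s")
print(f"20 segments per RTT: {fast:.2f} s")
```

The point of the comparison is only that recovery time scales with the number of round trips needed, which is why the retransmission rate matters more than the size of the gap.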
So, what are the things that need to be investigated further?
A) The bulk packet loss, always after a large burst of 100 KB, points to a device in the path that only has a 100 KB buffer space. The most likely candidate will be the router where the path bandwidth drops (the initial queue modelling suggests that your minimum path speed is 100 Mbps - because the modelled queue waiting times closely match the measured RTTs only at that speed).
B) Can something be done to make the AIX TCP stack ramp up from the single-packet-per-round-trip retransmission rate? Enabling SACKs may help here. Perhaps also installing a newer version of the AIX TCP stack and/or tuning TCP parameters.
C) Why does the AIX sender seem to use its own RWIN value as its transmit (or send buffer) "limit"? Perhaps its TCP stack has some setting that limits both values? However, if the 100 KB buffer space limit isn't fixed, we don't want to send larger bursts.
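As a rough check on point A, the sketch below uses a simple fluid model of a queue that fills at the ingress rate and drains at the egress rate. It estimates how much of a back-to-back burst gets through before the first drop, and how much buffer a burst of a given size would need. The 1 Gbps ingress speed is an assumption (the sender's link speed isn't stated above); the 100 Mbps egress and 100 KB buffer are the figures suggested by the analysis. The model ignores packetisation and any other traffic.

```python
def bytes_before_first_drop(buffer_bytes, ingress_bps, egress_bps):
    """Bytes of a back-to-back burst accepted before the queue overflows."""
    if ingress_bps <= egress_bps:
        return float("inf")  # the queue never builds up
    # The queue grows at (ingress - egress) and is full after
    # buffer / (ingress - egress); ingress * that time has arrived by then.
    return buffer_bytes * ingress_bps / (ingress_bps - egress_bps)


def buffer_needed(burst_bytes, ingress_bps, egress_bps):
    """Buffer required to absorb a whole burst without loss."""
    if ingress_bps <= egress_bps:
        return 0.0
    return burst_bytes * (ingress_bps - egress_bps) / ingress_bps


INGRESS_BPS = 1_000_000_000  # assumption: sender side runs at 1 Gbps
EGRESS_BPS = 100_000_000     # minimum path speed suggested by the modelling

print(bytes_before_first_drop(100_000, INGRESS_BPS, EGRESS_BPS))
print(buffer_needed(250_000, INGRESS_BPS, EGRESS_BPS))
```

If the real ingress speed is different, the numbers change, but the shape of the result stays the same: the bigger the speed mismatch, the closer the loss-free burst size gets to the raw buffer size.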
Suggestions.
In my view, the best fix would be to find the network device with the 100 KB buffer and upgrade it so that it has more space. This fixes the source of the packet loss. It may be difficult though, especially if your network is outsourced.
The biggest time killer here is the single packet per round trip retransmission rate. So another option is to enable SACKs and investigate other AIX TCP options that could improve this behaviour.
A not particularly desirable "last resort" option could be to reduce the size of the bursts that AIX sends. For example, reduce its send buffer somehow - either via TCP settings or SFTP settings. The problem with this is that it may negatively impact other AIX applications and/or users at other locations.
Now that the behaviours are known, you can think of other options too.
Please let us know what you do and how it affects your results.
Update: 11th Dec 2018
Further analysis of your capture points to shaping/policing settings - most probably in your WAN router.
It appears that you have a PIR of 100 Mbps, CIR of 50 Mbps and an initial token bucket size of 20 KB. You seem to be allowed an ongoing buffer size of 55 KB with an initial one of 75 KB.
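For anyone unfamiliar with how a policer produces this kind of loss, here is a minimal, generic token-bucket sketch in Python. It is not the router's actual algorithm or configuration: it models a single bucket refilled at one rate, assumes the bucket starts full, and assumes 1,460-byte packets arriving back to back at 1 Gbps. Only the 50 Mbps rate, the 20 KB bucket and the 250 KB burst come from the figures above.

```python
def police_burst(burst_bytes, packet_bytes, arrival_bps, token_rate_bps, bucket_bytes):
    """Return how many bytes of a back-to-back burst conform to a
    single token bucket (generic sketch, not a vendor implementation)."""
    tokens = float(bucket_bytes)             # assumption: bucket starts full
    gap_s = packet_bytes * 8 / arrival_bps   # time between packet arrivals
    refill = token_rate_bps / 8 * gap_s      # bytes of credit earned per gap
    passed = 0
    for i in range(burst_bytes // packet_bytes):
        if i > 0:
            tokens = min(float(bucket_bytes), tokens + refill)
        if tokens >= packet_bytes:
            tokens -= packet_bytes            # packet conforms
            passed += packet_bytes
        # otherwise the packet is dropped and no credit is consumed
    return passed


passed = police_burst(
    burst_bytes=250_000,           # burst size seen in the capture
    packet_bytes=1_460,            # assumption: typical MSS
    arrival_bps=1_000_000_000,     # assumption: sender link speed
    token_rate_bps=50_000_000,     # CIR estimated above
    bucket_bytes=20_000,           # bucket size estimated above (taken as 20,000 bytes)
)
print(f"{passed} of 250000 bytes conform")
```

Changing `bucket_bytes` or `token_rate_bps` shows how sensitive the loss-free burst size is to the policer settings, which is why smaller or paced bursts from the sender would avoid most of the loss.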
Some more possible options to improve this particular throughput are therefore:
I'm very keen to hear what you do to make this work - especially whether or not enabling SACK improves things.