Slow download from server
Hello all,
Many clients download data from a server farm. Occasionally the download is slow. The server farm is located behind an F5 load balancer (LB). The F5 terminates the HTTPS connection from the client and then starts its own HTTPS session to the real server.
Two traces are attached: one of a slow download and one of a fast download. My goal is to find out which device is the root cause of this issue. The traces were captured with tcpdump on the LB.
https://drive.google.com/open?id=1Zrv...
https://drive.google.com/open?id=1buz...
I can see that the server does not send data to the LB whenever the LB advertises a window size of one MSS or less. In the slow trace I measured around 6 seconds lost to this behavior. Can I conclude from this that the server is at fault?
Why does the server not send data when the window size is <= 1 MSS? Is it due to some congestion avoidance algorithm implemented on the server? Is there any known OS that behaves in this manner?
Please assist. Regards
It looks like 10.41.196.49 is the client - correct? In the 13 sec trace the iRTT in the TCP handshake is 2 msec, but in the 5 sec trace the iRTT is 407 microseconds - why is there such a huge difference?
To me it looks like the client is simply too slow to process incoming data, and that, coupled with the rather small TCP Receive Window advertised by the client, is the cause of the slow transfer rate. The small TCP Receive Window is also an issue in the 5 sec trace and I'm sure increasing it would help. However, it's also clear that something is not right on the client, i.e. a lack of resources. This can be seen in the high(er) delta times between consecutive ACKs from the client and in the fact that it takes the client forever to increase its TCP Receive Window once it has filled.
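To put a number on why a small receive window matters: a sender can have at most one receive window of unacknowledged data in flight per round trip, so throughput is capped at roughly window / RTT. The sketch below is only an illustration. The function name is made up, the 65535-byte window is an assumption (the largest value that can be advertised without Window Scaling, not a value read from the traces), and the iRTT measured at the LB is used as a stand-in for the real round-trip time.

```python
def window_limited_throughput(window_bytes, rtt_seconds):
    """Upper bound in bytes/second when at most one window is in flight per RTT."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes / rtt_seconds


# Assumed window: 65535 bytes (maximum without Window Scaling).
# RTT values: the two iRTTs quoted above (2 msec and 407 microseconds).
for label, rtt in (("iRTT 2 msec", 2e-3), ("iRTT 407 usec", 407e-6)):
    bits_per_second = window_limited_throughput(65535, rtt) * 8
    print(f"{label}: ceiling of about {bits_per_second / 1e6:.0f} Mbit/s")
```

This is a ceiling only. If the receiver is slow to drain its buffer, the advertised window shrinks and the achievable rate drops well below it.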
No, this is probably because of the "Silly Window Syndrome avoidance" mechanism (RFC 1122, section 4.2.3.4).
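For reference, the sender-side rule from that RFC section can be sketched roughly as follows. This is a simplified illustration, not the code of any particular OS: the function and parameter names are made up, `fs` is the fraction the RFC recommends setting to 1/2, and the `all_acked` flag stands for the Nagle condition (SND.NXT = SND.UNA). Whether the server in these traces follows exactly this rule is an assumption.

```python
def should_send(queued, usable_window, mss, max_window_seen,
                pushed=False, all_acked=True, fs=0.5,
                override_timer_expired=False):
    """Sender-side SWS avoidance decision, loosely after RFC 1122 sec. 4.2.3.4.

    queued          -- bytes waiting in the send buffer (D)
    usable_window   -- SND.UNA + SND.WND - SND.NXT (U)
    mss             -- effective send MSS
    max_window_seen -- largest window the peer has ever advertised
    """
    sendable = min(queued, usable_window)
    # (1) A full-sized segment fits into the usable window.
    if sendable >= mss:
        return True
    # (2) Data is pushed and everything queued fits right now.
    if all_acked and pushed and queued <= usable_window:
        return True
    # (3) At least a fraction fs of the largest window seen can be sent.
    if all_acked and sendable >= fs * max_window_seen:
        return True
    # (4) Data is pushed and the override timeout has expired.
    if pushed and override_timer_expired:
        return True
    return False


# Plenty of data queued, but the peer offers less than one MSS: hold back.
print(should_send(queued=100_000, usable_window=1000, mss=1460,
                  max_window_seen=65535))
```

With a large backlog and a usable window below one MSS, none of the conditions hold until the receiver opens its window or the override timer fires, which would look like the sender going quiet in a trace.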
Could you please also provide detailed info on IP addresses and connections seen in the traces?
The traces were captured on the F5, which offloads TCP to the NIC.
Each trace contains two sessions: one from the client to LB:1443 and the other from the LB to real server:443.
Real client IP: 192.168.244.129
LB virtual IP: 192.168.231.150
LB IP towards real server: 10.41.196.49
Real server IP: 192.168.40.79
Have you tried enabling the "Window Scaling" option on 10.41.196.49?
Ahh! Good point Christian_R. The load balancer is actually removing the Window Scale option: the real client (192.168.244.129) announces it, but once the connection has crossed the LB the option is no longer present. Window Scaling is, however, not announced by the server 192.168.40.79, so it should be enabled there, and the LB should be configured to allow Window Scaling.
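To illustrate what to look for in the SYN packets, here is a minimal sketch that scans raw TCP option bytes for the Window Scale option (kind 3) and shows that scaling only takes effect when both SYNs carried it. The helper names and the example option bytes are made up for illustration and are not taken from the traces.

```python
def window_scale_shift(options: bytes):
    """Return the shift count of a TCP Window Scale option, or None if absent."""
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:            # End of Option List
            break
        if kind == 1:            # No-Operation, a single padding byte
            i += 1
            continue
        if i + 1 >= len(options):
            break                # truncated option, stop parsing
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break                # malformed length, stop parsing
        if kind == 3 and length == 3:
            return min(options[i + 2], 14)   # shift counts above 14 are capped
        i += length
    return None


def effective_window(raw_window, own_shift, peer_shift):
    """Receive window of a non-SYN segment after applying the scale factor."""
    # Scaling is used only if both SYN segments carried the option.
    if own_shift is None or peer_shift is None:
        return raw_window
    return raw_window << own_shift


# Hypothetical SYN options: MSS 1460, NOP, Window Scale 8, NOP, NOP, SACK permitted
with_ws = bytes.fromhex("020405b40103030801010402")
# The same SYN with the Window Scale option stripped
without_ws = bytes.fromhex("020405b401010402")

print(window_scale_shift(with_ws), window_scale_shift(without_ws))
print(effective_window(65535, 8, None), effective_window(65535, 8, 7))
```

If either side's SYN lacks the option, the window stays limited to the raw 16-bit value (65535 bytes at most), no matter what the other side offered.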