TCP session ended early - missing client ACK?

asked 2021-08-20 22:45:29 +0000

CAiello

updated 2021-08-24 14:11:33 +0000

pcap file

I'm troubleshooting a connection between a client PC and an HTTP server. In this example, the client is requesting a file and only receives a few KB before the connection is reset.

The server sends four packets, then the client ACKs the first packet only, which appears to trigger the server to retransmit the un-ACKed packets. After several retransmissions the connection times out.

There is an IPsec tunnel between the client and the server. Any thoughts on this behaviour?

Edits:

There is a site-to-site IPsec tunnel between the client (192.168.120.105) and server (10.21.8.54). The captures were taken from the client (Win10).

extended pcap


3 Answers

0

answered 2021-08-24 07:37:56 +0000

SYN-bit

Although at first glance this looks like an MTU problem, it isn't. The full-sized frames are indeed received at the client's end; one of them gets acknowledged, but the others do not.

First, an analysis of the MSS used by the server. The client advertised 1460, but the server used an MSS of 1352 for the data packets. I assume the MSS in the SYN packet was altered to 1352 on the way to the server. However, the MSS in the SYN/ACK is 1382, which suggests that the MSS is altered differently in each direction. This could be due to different configurations, or the traffic might be routed asymmetrically. Interesting to investigate further, but not part of the problem, I think.
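The MSS values above can be turned into the IP packet sizes they imply. This is only a rough sketch: it assumes plain 20-byte IPv4 and TCP headers with no options, and the helper name `implied_ip_packet_size` is made up for illustration.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options (e.g. no timestamps)


def implied_ip_packet_size(mss: int) -> int:
    """Size of a full IPv4 packet carrying one MSS-sized TCP segment."""
    return mss + IPV4_HEADER + TCP_HEADER


CLIENT_MSS = 1460  # MSS advertised by the client in its SYN

observations = [
    ("client SYN", 1460),
    ("server data segments", 1352),
    ("SYN/ACK as seen by the client", 1382),
]

for label, mss in observations:
    size = implied_ip_packet_size(mss)
    reduction = CLIENT_MSS - mss
    print(f"{label}: MSS {mss} -> {size}-byte IP packet, "
          f"{reduction} bytes below the client's MSS")
```

The differing reductions (108 bytes in one direction, 78 in the other) are what hints at the MSS being rewritten differently per direction.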

The server sent 4 full-sized segments and then stopped sending, which is in line with TCP slow start. After receiving the ACK for the first segment, it sent only two more segments, which is also in line with TCP slow start (where each ACKed segment increases the congestion window by 1 MSS). So the server's behavior seems normal.
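The slow start arithmetic can be checked with a small model. This is a simplified sketch that counts in whole segments, assumes an initial window of 4 segments (inferred from the trace, not from the server's configuration), and ignores everything except the "one MSS per ACKed segment" rule; `slow_start_step` is a hypothetical helper.

```python
def slow_start_step(cwnd: int, in_flight: int, newly_acked: int):
    """Apply one ACK during slow start, counting in segments.

    Each newly ACKed segment grows the congestion window by one segment
    and removes one segment from the data in flight.
    Returns (new_cwnd, new_in_flight, segments_the_sender_may_now_send).
    """
    cwnd += newly_acked
    in_flight -= newly_acked
    can_send = max(0, cwnd - in_flight)
    return cwnd, in_flight, can_send


# The server starts by sending 4 segments, so 4 are in flight.
cwnd, in_flight = 4, 4

# The client ACKs only the first segment.
cwnd, in_flight, can_send = slow_start_step(cwnd, in_flight, newly_acked=1)
print(cwnd, in_flight, can_send)  # 5 3 2
```

With a window of 5 and 3 segments still unacknowledged, the sender has room for exactly two new segments, which matches what the server did.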

On the client, however, I would have expected different behavior. Under normal circumstances, the client would send an ACK immediately after receiving 2 full segments. However, it takes ~40 ms to send an ACK, and that ACK only acknowledges the first received segment, not the first 2. This looks as if the TCP stack on the client received only 1 segment, waited for a second segment until the delayed ACK timer expired, and then sent an ACK for the first segment only. None of the other segments are ACKed, which triggers the server to retransmit the second segment and finally give up when no ACKs are received for it.
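The expected versus observed ACK pattern can be illustrated with a toy delayed ACK model. It is a sketch under stated assumptions: the receiver ACKs as soon as two segments are pending, otherwise when a timer expires, and the 40 ms default is simply the delay seen in this trace, not a claim about the Windows default. The function name and arrival times are invented for the example.

```python
def delayed_ack_schedule(arrivals_ms, delay_ms=40, ack_every=2):
    """Return [(ack_time_ms, segments_acked_so_far)] for a toy receiver.

    arrivals_ms: sorted times at which segments reach the TCP stack.
    The receiver ACKs immediately once `ack_every` segments are pending,
    and otherwise when the delayed ACK timer (`delay_ms`) expires.
    """
    acks = []
    pending = 0            # segments received but not yet ACKed
    first_pending_at = None
    delivered = 0          # cumulative segments seen by the stack
    for t in arrivals_ms:
        # Did the timer fire before this segment arrived?
        if pending and t - first_pending_at >= delay_ms:
            acks.append((first_pending_at + delay_ms, delivered))
            pending = 0
        delivered += 1
        if pending == 0:
            first_pending_at = t
        pending += 1
        if pending >= ack_every:
            acks.append((t, delivered))
            pending = 0
    if pending:
        # Nothing else arrived, so the timer eventually fires.
        acks.append((first_pending_at + delay_ms, delivered))
    return acks


# Expected: all four segments reach the stack, so every second one is ACKed.
print(delayed_ack_schedule([0.0, 0.1, 0.2, 0.3]))
# Observed pattern: as if only the first segment reached the stack.
print(delayed_ack_schedule([0.0]))
```

The second call yields a single ACK for one segment after the timer expires, which is the pattern seen in the capture.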

It looks like the segments, even though they are seen on the client at the capture point, are not seen by the TCP stack. You said there is a VPN connection between the client and the server. Is this VPN terminated on the Win10 client making the request, or is it a site-to-site VPN terminated outside the Win10 client? If there is a VPN client on the Win10 system, I suspect it is not functioning correctly. If the VPN tunnel was already terminated before the traffic reached the client, there must be something else on the Win10 client that is interfering. In either case, I'm sure the problem is somewhere inside the Win10 client.


Comments

Another thing: this trace shows only the transfer of "/NetMedicalLogin.png". I assume this picture was referenced in an HTML file. Was that file received OK? And the other objects? In other words, does this problem happen all the time or only on specific objects? And is there a pattern in which objects cause problems? Could you post a capture file of the transfer of all the objects on this webpage?

SYN-bit ( 2021-08-24 07:54:18 +0000 )

Thank you! I was using the NetMedicalLogin.png file as an example of a failing transfer. On the web page, only the first few rows of that image load before timing out. There is some basic text on the page that loads without issue. I have uploaded an extended pcap file that includes more data, repeated tests and some pings. The VPN is site-to-site, but as a workaround, a temporary VPN client has been installed on the workstation, which enables communication to the same server without issue. The problem is reproducible on all PCs on the network, and only when using the site-to-site connection.

CAiello ( 2021-08-24 14:15:11 +0000 )

Thanks for the second pcap file. Since you mentioned that it does not work over the site-to-site vpn and that that behavior is the same for all clients, I was a bit puzzled as I was sure something was wrong on the client-side. However, it turns out that the TCP checksum of some of the TCP packets is wrong. This is best visible in the first pcap (the PowerShell test). The first part of the response has a proper TCP checksum, but the other parts have a bad TCP checksum.

It would be interesting to see with a TAP whether the Ethernet FCSs are correct, but I assume they are, as the NIC would have dropped the frames if the FCS were incorrect.

So, somewhere between the server and the client, the TCP checksums get messed up. You would need traces between the server and the remote VPN device ...(more)
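For reference, a TCP checksum can be verified by hand along these lines. This is a minimal sketch for IPv4 only; the function names, addresses, ports and payload in the demo are made up, and in practice you would feed it the raw TCP bytes exported from the capture. Note that the IP addresses are part of the checksum input, so any device that rewrites them also has to update the checksum.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum over data (zero-padded to even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src_ip: str, dst_ip: str, tcp_length: int) -> bytes:
    """IPv4 pseudo-header: source, destination, zero, protocol 6, TCP length."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 6, tcp_length))


def tcp_checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """True if the checksum stored in the segment matches its contents."""
    data = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return ones_complement_sum(data) == 0xFFFF


def with_checksum(src_ip: str, dst_ip: str, segment: bytes) -> bytes:
    """Return the segment with a freshly computed checksum (bytes 16-17)."""
    zeroed = segment[:16] + b"\x00\x00" + segment[18:]
    data = pseudo_header(src_ip, dst_ip, len(zeroed)) + zeroed
    checksum = (~ones_complement_sum(data)) & 0xFFFF
    return zeroed[:16] + struct.pack("!H", checksum) + zeroed[18:]


# Demo with a made-up 20-byte TCP header plus payload.
header = struct.pack("!HHIIBBHHH", 80, 50000, 1, 1, 5 << 4, 0x18, 65535, 0, 0)
good = with_checksum("10.21.8.54", "192.168.120.105", header + b"example payload")
bad = good[:-1] + bytes([good[-1] ^ 0x01])  # flip one bit in the payload

print(tcp_checksum_ok("10.21.8.54", "192.168.120.105", good))  # True
print(tcp_checksum_ok("10.21.8.54", "192.168.120.105", bad))   # False
```

The same segment also fails verification if the addresses differ from the ones used when the checksum was computed, which is why address translation and checksum handling go together.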

SYN-bit ( 2021-08-24 16:27:51 +0000 )

That is interesting. I found this post referencing a similar issue with the same setup (IPsec VPN with source NAT on Ubiquiti EdgeRouter hardware): https://community.ui.com/questions/ER...

I'm going to try a firmware downgrade as was suggested there to see if that fixes it.

CAiello ( 2021-08-24 16:55:14 +0000 )

Good find! Were you able to do the downgrade? And did it help? I have an EdgeRouter X, so I might try to reproduce the issue to get some nice captures for my classes. Which firmware are you using? Anything special in the config apart from IPsec VPN with source NAT?

If reproduction fails, can I use your captures in my classes/presentations? Do you want me to anonymize them any further (like IP addresses and a bit of the L7 info)? You can reach me at [email protected] if you'd like to discuss this privately.

SYN-bit ( 2021-08-25 21:12:15 +0000 )
0

answered 2021-08-23 07:27:25 +0000

hugo.vanderkooij

This has all the telltale signs of an MTU issue where the tunnel requires a smaller MTU but the sender fails to learn that. A capture on the sending end should show the ICMP traffic giving this away.

0

answered 2021-08-21 11:01:22 +0000

BigFatCat

It doesn't look like a client inactivity timer, because the client is only sending a GET for a PNG. Most likely, it is 10.21.8.54 sending a TCP RESET instead of retransmitting the data segment, because it has reached the "maximum number of times it will retransmit an individual data segment". Active TCP sessions use resources, so it makes sense to abort any broken TCP sessions.

The real issue is to determine why the client, 192.168.120.105, is not responding to the larger TCP segment(s). Try to capture at 192.168.120.105 to verify whether the server's larger TCP segments made it. The alternative is to ping with a packet of the same size as the larger TCP segment and the do-not-fragment option set.


Comments

Thank you - that's essentially the same conclusion I came to: why is the client not responding to those later segments? This capture was taken from 192.168.120.105 (Windows 10), so the client is clearly receiving the packets. An MTU-related issue was my initial suspicion, but troubleshooting with that in mind hasn't led anywhere so far.

CAiello ( 2021-08-22 01:30:16 +0000 )

Try troubleshooting MTU with the ping do-not-fragment option. I would start with the standard ping size to verify that the firewall allows pings. Next, I would try to ping with the same packet size as the segments that are failing. Pinging from a Cisco router, the ping size is 1392 (the full IP packet). Pinging from Windows, Linux or a Juniper router, the ping size is 1364 (the ICMP payload only, i.e. 1392 minus 28 bytes of IP and ICMP headers). If that fails, then ping with different sizes and see what works and what fails.
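The two ping sizes follow from the 1352-byte segments seen in the trace. The sketch below shows the arithmetic and builds the matching do-not-fragment commands for Windows and Linux; it assumes 20-byte IPv4 and TCP headers without options, and the helper names are made up for illustration.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options
ICMP_HEADER = 8   # bytes, ICMP echo header


def ping_sizes_for_mss(mss: int):
    """Return (ip_packet_size, icmp_payload_size) matching a full TCP segment."""
    ip_packet = mss + IPV4_HEADER + TCP_HEADER
    icmp_payload = ip_packet - IPV4_HEADER - ICMP_HEADER
    return ip_packet, icmp_payload


def df_ping_command(platform: str, payload: int, target: str):
    """Build a ping command with the don't-fragment bit set."""
    if platform == "windows":
        # -f sets don't-fragment, -l sets the payload size
        return ["ping", "-f", "-l", str(payload), target]
    if platform == "linux":
        # -M do prohibits fragmentation, -s sets the payload size
        return ["ping", "-M", "do", "-s", str(payload), target]
    raise ValueError(f"unsupported platform: {platform}")


ip_packet, payload = ping_sizes_for_mss(1352)
print(ip_packet, payload)  # 1392 1364
print(" ".join(df_ping_command("windows", payload, "10.21.8.54")))
print(" ".join(df_ping_command("linux", payload, "10.21.8.54")))
```

If the ping at this size fails while a smaller one succeeds, stepping the payload size up and down narrows in on the largest packet the path will carry.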

BigFatCat ( 2021-08-22 17:30:00 +0000 )
