Can you explain this TCP sequence?
Hello, Could anyone explain the behavior I observe below?
I have the following capture extract: a packet with seq=3020828 is sent out but never acknowledged. The packet is retransmitted multiple times, but the receiver keeps sending Ack=3020828, i.e. it only acknowledges the data before that segment:
57788 2018-07-16 15:36:20.552618000 10.245.40.74 10.245.54.13 TCP 2974 64613 -> 14004 [ACK] Seq=3020828 Ack=73535403 Win=65536 Len=2920
...
58376 2018-07-16 15:36:20.851770000 10.245.40.74 10.245.54.13 TCP 1514 [TCP Retransmission] 64613 -> 14004 [ACK] Seq=3020828 Ack=74313583 Win=1296 Len=1460
58378 2018-07-16 15:36:21.101721000 10.245.54.13 10.245.40.74 TCP 1350 14004 -> 64613 [PSH, ACK] Seq=74313583 Ack=3020828 Win=4096 Len=1296 [TCP segment of a reassembled PDU]
...
60992 2018-07-16 15:36:22.652682000 10.245.40.74 10.245.54.13 TCP 1514 [TCP Retransmission] 64613 -> 14004 [ACK] Seq=3020828 Ack=77762103 Win=1296 Len=1460
60994 2018-07-16 15:36:22.658427000 10.245.54.13 10.245.40.74 TCP 1514 14004 -> 64613 [ACK] Seq=77762103 Ack=3020828 Win=4096 Len=1460 [TCP segment of a reassembled PDU]
On the receiver side, I do see the packet coming in, but it is somehow ignored and the receiver keeps acknowledging the previous seq:
13878 2018-07-16 15:36:20.825430000 10.245.40.74 10.245.54.13 TCP 1514 [TCP Retransmission] 64613 -> 14004 [ACK] Seq=3020828 Ack=74313583 Win=1296 Len=1460
18810 2018-07-16 15:36:36.047313000 10.245.54.13 10.245.40.74 TCP 29254 14004 -> 64613 [ACK] Seq=114189103 Ack=3020828 Win=20480 Len=29200 [TCP segment of a reassembled PDU]
...
13991 2018-07-16 15:36:21.425465000 10.245.40.74 10.245.54.13 TCP 1514 [TCP Retransmission] 64613 -> 14004 [ACK] Seq=3020828 Ack=74834803 Win=1296 Len=1460
13993 2018-07-16 15:36:21.433178000 10.245.54.13 10.245.40.74 TCP 27794 14004 -> 64613 [ACK] Seq=74834803 Ack=3020828 Win=4096 Len=27740 [TCP segment of a reassembled PDU]
According to the ACK packet, there is enough room in the recv window, but somehow the packet is ignored. After 5 unsuccessful retransmissions, the sender eventually drops the connection.
Thanks.
Without a trace it is hard to say. Can you share a trace with us? See https://blog.packet-foo.com/2016/11/t.... Also, the receiver-side capture was taken with Segmentation Offloading in effect, so it does not reflect the situation on the wire.
Hello, Christian. Thanks, I uploaded the receiver capture on Google Drive: https://drive.google.com/open?id=1V1m...
The capture shows the connection established at 15:35:47 from port 64613 to 14004. The TCP conversation goes on for almost a minute, transferring a large amount of data, with occasional congestion in both directions.
The issue seems to start at 15:36:20.526157000: from this time on, the receiver seems to ACK seq=3020828 for each of the 5 packet retransmissions, until the connection RESET, received at 15:36:39.431508000 (when the Windows sender gives up and closes the connection).
The capture file appears to have been taken on the machine with IP 192.168.50.157. Is there a corresponding capture file available from the other side, namely at 192.168.195.45?
Yes, cmaynard. Note the captures were sanitized via TraceWrangler, so the IPs will appear randomized. I do have the corresponding capture from the connection peer: I will upload it and make it available tomorrow, when I can access the capture file again. Thanks!
The problem is very strange. The missed 1460 byte segment w/seq # 3020828 is retransmitted with the next seq # correctly indicating 3022288; yet 192.168.195.45 continues to only ack 3020828. It would appear that 192.168.195.45 never receives it, but the capture file from the other side would confirm or deny that.
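To make the arithmetic behind that observation explicit, here is a minimal sketch. It assumes the numbers shown in the capture summaries are TCP sequence numbers (relative or absolute, the arithmetic is the same); the helper name `next_seq` is made up for illustration.

```python
SEQ_SPACE = 2**32  # TCP sequence numbers are 32-bit and wrap around

def next_seq(seq: int, length: int) -> int:
    """Return the sequence number that follows a segment of `length` bytes."""
    return (seq + length) % SEQ_SPACE

retransmitted_seq = 3020828   # Seq of the segment that keeps being retransmitted
segment_len = 1460            # Len of each retransmission in the capture

# If the receiver had accepted the segment, its next ACK should carry this value.
expected_ack = next_seq(retransmitted_seq, segment_len)
print(expected_ack)  # 3022288
```

An ACK that stays at 3020828 instead of moving to 3022288 therefore means the receiver's TCP layer has not accepted any byte of that segment.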
I expect the other peer's capture to be the exact mirror image: it contains the retransmissions of that packet, each followed by the receipt of the same stale ACK for #3020828. This happens 5 times, then the RESET is sent out. We cannot explain why the receiver appears to silently discard that packet: others have suggested the recv buffer is full so the packet is discarded, but the ACK message clearly states that the recv window has enough room for it, and there is no "Zero Window" notification back to the sender, which would have allowed flow control to kick in.
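For reference, the window argument can be checked against the segment acceptability test from the TCP specification (RFC 9293, section 3.10.7.4). The sketch below is only an illustration, not what any particular stack runs. It assumes the Win=4096 shown in the receiver's ACKs is the effective (already scaled) receive window at the moment the retransmission arrives, and the function names are hypothetical.

```python
SEQ_SPACE = 2**32  # sequence numbers wrap at 32 bits

def in_window(seq: int, rcv_nxt: int, rcv_wnd: int) -> bool:
    """True if rcv_nxt <= seq < rcv_nxt + rcv_wnd, using modular arithmetic."""
    return (seq - rcv_nxt) % SEQ_SPACE < rcv_wnd

def segment_acceptable(seg_seq: int, seg_len: int, rcv_nxt: int, rcv_wnd: int) -> bool:
    """Acceptability test for an incoming segment (the four cases in RFC 9293)."""
    if seg_len == 0:
        if rcv_wnd == 0:
            return seg_seq == rcv_nxt
        return in_window(seg_seq, rcv_nxt, rcv_wnd)
    if rcv_wnd == 0:
        return False  # data is never acceptable against a zero window
    last_byte = (seg_seq + seg_len - 1) % SEQ_SPACE
    # Acceptable if either the first or the last byte falls inside the window.
    return in_window(seg_seq, rcv_nxt, rcv_wnd) or in_window(last_byte, rcv_nxt, rcv_wnd)

# Values taken from the capture: the receiver ACKs 3020828 with Win=4096,
# and the retransmission carries Seq=3020828, Len=1460.
print(segment_acceptable(seg_seq=3020828, seg_len=1460, rcv_nxt=3020828, rcv_wnd=4096))
```

Under those assumptions the segment passes the test, which is consistent with the point above that a full receive window does not explain the behavior. The check says nothing about drops that happen before TCP sees the segment (for example a bad checksum or a filter lower in the stack).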
Hi, I uploaded the peer capture: https://drive.google.com/open?id=1F57...
Notes:
1) TraceWrangler randomized the IPs, so this now shows 10.247.166.16 / 172.28.12.164, but it is a TCP dump for the exact same session.
2) This is the initiator of the TCP connection, so you see the SYN message (TTL=128) going out at 15:35:47.684812000 (there is a little clock skew between the two hosts).
3) The suspicious packet seq=3020828 goes out at 15:36:20.552618000 and it is 2920 bytes long: we think the sender has jumbo frames off, so the packet is being segmented down in the network stack. This is not the first split happening during the conversation, so we are not sure it matters.
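As a rough illustration of the split mentioned in note 3, the sketch below cuts the 2920-byte send into MSS-sized segments. The MSS of 1460 is an assumption inferred from the 1514-byte frames carrying Len=1460 in the capture (54 bytes of Ethernet, IP and TCP headers); `split_segments` is a hypothetical helper, not part of any stack or tool.

```python
def split_segments(seq: int, length: int, mss: int = 1460) -> list[tuple[int, int]]:
    """Split one large send into (seq, len) pairs of at most `mss` bytes each."""
    segments = []
    offset = 0
    while offset < length:
        chunk = min(mss, length - offset)
        segments.append((seq + offset, chunk))
        offset += chunk
    return segments

# The 2920-byte packet seen in the sender-side capture.
wire_segments = split_segments(3020828, 2920)
print(wire_segments)  # [(3020828, 1460), (3022288, 1460)]
```

This matches the retransmissions in the capture, which carry Seq=3020828 with Len=1460, i.e. the first of the two on-the-wire segments.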
Could you please tell us whether there is some node in between? I ask because the packets arrive with TTL 127.
Yes, the conversation is across different networks, so as far as we know there is a Cisco device being traversed.
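The TTL reasoning can be sketched as below. It assumes the sender starts from one of the usual default TTLs (64, 128 or 255) and that every routing hop decrements the TTL by exactly one; the sender-side capture does show TTL=128 on the SYN, which fits. The helper `estimate_hops` is made up for this example.

```python
COMMON_INITIAL_TTLS = (64, 128, 255)  # typical OS defaults (assumption)

def estimate_hops(observed_ttl: int) -> tuple[int, int]:
    """Guess (initial_ttl, hop_count) from the TTL seen at the receiver."""
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    # Pick the smallest common initial TTL that is not below the observed one.
    initial = min(t for t in COMMON_INITIAL_TTLS if t >= observed_ttl)
    return initial, initial - observed_ttl

print(estimate_hops(127))  # (128, 1)
```

So a TTL of 127 at the receiver suggests a single routing hop between the two hosts. Devices that forward without decrementing the TTL (switches, transparent firewalls) would not show up in this count.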