If a packet is sent but not received, could the cause be something other than the network?

Our application is getting slow responses from our memcached servers. A capture on the client shows a TCP retransmission from our application to the memcached server, and a capture on the server machine shows that the original packet was never received. Filtering both captures by the sequence number, we see something like this:

Client

Time                            Length          Info
10:09:41.496303                     66          39126 -> 11211 [ACK] Seq=7040425, Ack=122270281 Win=182272 Len=0
10:09:41.497324                    306          39126 -> 11211 [PSH, ACK] Seq=7040425, Ack=122270281 Win=182272 Len=240
10:09:41.697515                    306          [TCP Retransmission] 39111 -> 11211 [PSH, ACK] Seq=7040425 Ack=122270281

Server

Time                            Length          Info
10:09:41.511636                     66          39126 -> 11211 [ACK] Seq=7040425, Ack=122270281 Win=182272 Len=0
10:09:41.706877                    306          39126 -> 11211 [PSH, ACK] Seq=7040425 Ack=122270281 Win=182272 Len=240
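To make the timing easier to read, here is a small sketch that computes the gaps between the timestamps above. It assumes the 306-byte packet in the server capture is the retransmission (the server capture does not label it) and that the two machines' clocks are not exactly synchronized, so the client-to-server gaps include clock offset as well as latency. The function name `gap_ms` is just for illustration.

```python
from datetime import datetime

FMT = "%H:%M:%S.%f"

def gap_ms(earlier: str, later: str) -> float:
    """Milliseconds between two capture timestamps taken on the same day."""
    delta = datetime.strptime(later, FMT) - datetime.strptime(earlier, FMT)
    return delta.total_seconds() * 1000.0

# Timestamps copied from the client and server captures above.
client_ack = "10:09:41.496303"
client_data = "10:09:41.497324"
client_retx = "10:09:41.697515"
server_ack = "10:09:41.511636"
server_data = "10:09:41.706877"

# Time the client waited before retransmitting the 240-byte segment.
retx_delay = gap_ms(client_data, client_retx)

# Apparent client-to-server gap (latency plus clock offset) for the bare ACK.
ack_gap = gap_ms(client_ack, server_ack)

# Apparent gap for the data packet, assuming the server saw the retransmission.
retx_gap = gap_ms(client_retx, server_data)

print(f"retransmission delay: {retx_delay:.3f} ms")
print(f"ACK client->server gap: {ack_gap:.3f} ms")
print(f"retransmit client->server gap: {retx_gap:.3f} ms")
```

The retransmission goes out roughly 200 ms after the original, while the client-to-server gap is only on the order of 10 to 15 ms for the packets that do arrive. That fits the server's data packet being the retransmission rather than a delayed copy of the original.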

The puzzling thing is that netstat reports no dropped packets. Also, if we run memaslap (an optimized load-testing tool) with a similar configuration, we see no dropped packets and performance is consistently excellent.

Our ops team thinks the packet is being dropped on the client machine: the application hands the packet to the kernel, but the kernel drops it before it reaches the NIC. However, they don't know of a good way to confirm this. Is this a plausible explanation, and if so, is there some way to confirm it? I would expect there to be some kind of logging that could be enabled to show this.
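One thing we could try on the client, sketched below under the assumption that it is a Linux machine: snapshot the per-interface transmit counters that the kernel exposes under /sys/class/net/<iface>/statistics before and after a test run, and compare them. The interface name `eth0` and the helper names are placeholders, and these counters only cover drops that the interface reports, so an unchanged `tx_dropped` would not rule out a drop elsewhere in the stack.

```python
from pathlib import Path

def read_tx_counters(iface, base="/sys/class/net",
                     names=("tx_packets", "tx_dropped", "tx_errors")):
    """Read transmit counters for one interface; missing files are skipped."""
    stats_dir = Path(base) / iface / "statistics"
    counters = {}
    for name in names:
        path = stats_dir / name
        if path.is_file():
            counters[name] = int(path.read_text().strip())
    return counters

def counter_delta(before, after):
    """Per-counter increase between two snapshots of the same interface."""
    return {name: after[name] - before[name] for name in after if name in before}

if __name__ == "__main__":
    # "eth0" is a placeholder; substitute the interface that carries memcached traffic.
    print(read_tx_counters("eth0"))
```

Taking one snapshot before the slow period and one after, then passing both to `counter_delta`, would show whether `tx_dropped` or `tx_errors` moved while the retransmissions were happening.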