We have two Windows XP PCs, 10.10.10.1 and 10.10.10.2, with a direct connection to each other (so not via a router). Both run a user application, and the two instances continuously exchange information via TCP/IP. When the communication between the two PCs is captured with Wireshark, it shows that every 10 minutes an ARP request is broadcast to refresh the ARP cache. However, once in a while the ARP message exchange seems to upset the whole network communication. In Wireshark it is followed by a "TCP Previous segment lost" message. Here is an example:
Does anyone have an idea why this is happening?
asked 12 Jul '12, 05:18
Let's sum it up in an answer.
Based on the information gathered during analyzing the problem, I conclude:
Windows: ARP Behavior
Look at the end.
Conclusion: It looks like Windows dropped one (or more) of your TCP packets while the ARP request was outstanding, ALTHOUGH the information above seems to be related only to UDP. Maybe the information in the first link is wrong, or the behavior is the same for TCP and other protocols.
HOWEVER, this behavior does not fully comply with the explanation above. At least one packet should have been queued and sent after ARP finished. But maybe that is just the packet Wireshark marks with "TCP Previous segment was not captured", and another packet with payload data was dropped. Without insight into the application, we will never know.
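To make the suspected mechanism concrete, here is a minimal sketch of a per-destination ARP pending queue. It is only a model, not the Windows implementation: the function name `send_during_arp`, the `queue_len` parameter, and the choice to keep the newest packet (rather than the oldest) are assumptions made for illustration.

```python
def send_during_arp(packets, queue_len=1):
    """Model packets handed to the IP layer while ARP resolution is pending.

    Assumption: the stack holds at most `queue_len` packets per unresolved
    destination and silently drops the oldest one when a new packet arrives.
    Returns (packets sent after the ARP reply, packets dropped).
    """
    pending = []
    dropped = []
    for pkt in packets:
        pending.append(pkt)
        if len(pending) > queue_len:
            # Queue is full: the oldest waiting packet is discarded.
            dropped.append(pending.pop(0))
    return pending, dropped


# Three TCP segments are written while the ARP request is still unanswered.
sent, dropped = send_during_arp(["seg1", "seg2", "seg3"], queue_len=1)
print("sent after ARP reply:", sent)
print("dropped:", dropped)
```

With a queue length of 1, only the last segment goes out once the ARP reply arrives, so the receiver sees a gap in the sequence numbers. That would match a "TCP Previous segment" warning directly after the ARP exchange. It would also explain why the problem shows up only when the application happens to send several segments during the short ARP window.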
Question: Why does it happen only once in a while?
Solution for your problem, in the order I would recommend:
It's apparently the same for other operating systems; however, the "ARP queue length" is different.
Linux: You can configure the ARP queue length with the parameter
AIX: Same for AIX with
answered 27 Jul '12, 02:15
Not a real idea, just some guessing for now.
Does that "once in a while" fall on the regular 10-minute schedule (the Windows XP ARP cache renewal time for used entries)?
If NO: You say the computers are connected directly to each other. I assume a CAT5/6 cable. Maybe the RJ45 connector is not plugged in fully (at either end), and due to small movements (vibrations) the connector loses contact with some pins, which could cause packet loss. Maybe even the link state is lost. However, the time difference between the last TCP packet and the ARP request is probably too short to reestablish the link. Anyway, can you please check the physical connection (unplug/plug) at both ends?
If YES: I have no idea (yet) what causes the loss of at least one packet (10.10.10.2 -> 10.10.10.1).
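One way to answer the YES/NO question from the capture itself is to compare the timestamps of the ARP requests with the timestamps of the frames Wireshark flags as lost segments. The sketch below assumes both lists of timestamps (in seconds) have already been exported from the capture by some means. The function name `losses_near_arp` and the 1-second window are hypothetical choices, not anything taken from the trace.

```python
from bisect import bisect_right


def losses_near_arp(arp_times, loss_times, window=1.0):
    """Return the loss timestamps that fall within `window` seconds
    after the most recent ARP request."""
    arp_sorted = sorted(arp_times)
    hits = []
    for t in loss_times:
        # Count of ARP requests sent at or before time t.
        i = bisect_right(arp_sorted, t)
        if i > 0 and t - arp_sorted[i - 1] <= window:
            hits.append(t)
    return hits


# Made-up example data: ARP refreshes every 600 s, two loss events.
arp_times = [0.0, 600.0, 1200.0]
loss_times = [600.02, 900.0]
print(losses_near_arp(arp_times, loss_times))
```

If every loss event lands in the returned list, the losses follow the ARP schedule (the YES case). Loss events that are missing from the list point toward something unrelated to ARP, such as the physical connection.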