retransmissions in our network
Good morning wiresharkers,
I am new to some of this and am currently taking my CCNA so please educate me.
Long story short, we have some delays in AD and printer scripts, which has led me to dig into other areas of our system. It's complicated, as it has grown: from VMware to blades to Flex Fabric. So I am kind of in the middle of a long troubleshooting experience.
We have a trunk line that I found with no LACP active, and we turned that on yesterday. We are getting roughly 100k drops on this trunk per day (according to SolarWinds and a sho int bri on the switch). Today I pulled a pcap, and the retransmissions are more than 10 times what we saw before we turned on LACP. I was hoping there would be fewer. Drops seem to be the same.
I have a pcap from the line coming into the switch from another segment of the network, and it is clean. I have pcaps from the trunks going into our blade chassis, and these are clean. The pcap in question is from a trunk line from a 5412zl switch to a Palo Alto 3020, using aggregate links.
I want to reduce the drops and the retransmissions on this trunk. Perhaps this will speed things up.
Trunk 2: 2x 2 1g eth lines from 5412zl to Palo Alto 1
Trunk 3: 2x 2 1g eth lines from 5412zl to Palo Alto 2 (standby)
Trunk 5: 2x 40gig links active\active from 5412zl to Flex Fabric A
Trunk 6: 2x 40gig links active\active from 5412zl to Flex Fabric B
Spanning tree is turned on on the 5412zl
Spanning tree is not supported by the Flex Fabric units
--------------------------------------------------- DATA -------------------------------
82 0.006557 10.10.80.19 10.100.100.239 TCP 143 [TCP Retransmission] 60989 → 3389 [PSH, ACK] Seq=1 Ack=1 Win=253 Len=85
83 0.006597 10.100.100.245 10.100.110.44 TCP 382 [TCP Retransmission] 445 → 53706 [PSH, ACK] Seq=357 Ack=809 Win=508 Len=324
92 0.007847 10.100.140.3 10.100.100.250 TCP 58 [TCP Dup ACK 91#1] 44818 → 3487 [ACK] Seq=449 Ack=55 Win=8138 Len=0
I can't post files yet.
If you are indeed dropping packets then it is impacting your traffic.
There are lots of reasons why a switch will drop a packet.
I don't see how LACP could increase drops and retransmissions itself.
Maybe enabling LACP has changed the load balancing hashing on the links and you now have more packets traversing a "faulty" link than before. (long shot)
This would be a hardware or cabling issue then.
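To illustrate the hashing idea, here is a minimal sketch of how an aggregated link typically pins each conversation to one member link. The CRC32-of-address-pair hash and the name pick_member_link are assumptions made for illustration only; the actual algorithm used by the 5412zl or the Palo Alto is not shown here and may differ.

```python
import zlib

def pick_member_link(src_ip: str, dst_ip: str, n_links: int) -> int:
    """Map a source/destination pair to a member link index.

    Illustrative only: real switches use their own (vendor-specific) hash.
    """
    key = f"{src_ip}>{dst_ip}".encode("ascii")
    return zlib.crc32(key) % n_links

# Address pairs taken from the DATA section of the question.
conversations = [
    ("10.10.80.19", "10.100.100.239"),
    ("10.100.100.245", "10.100.110.44"),
    ("10.100.140.3", "10.100.100.250"),
]

for src, dst in conversations:
    link = pick_member_link(src, dst, n_links=2)
    print(f"{src} -> {dst}: member link {link}")
```

The point is that a given conversation always lands on the same member link, so if one member (port, cable, or transceiver) is bad, only the conversations hashed onto it would suffer. Changing the aggregation mode can change which conversations those are.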
You also need to capture traffic at the same spot and at the same time of day so you can hope to compare your trace files.
You won't be able to analyze one trace file taken during peak usage and another taken during low usage properly.
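One way to make two captures more comparable is to look at the retransmission rate rather than the raw count. Below is a minimal sketch; the packet counts are made-up placeholders, not values from your traces, and in practice they could come from counting packets matching the tcp.analysis.retransmission display filter against all TCP packets in each capture.

```python
def retransmission_rate(retransmitted: int, total_tcp: int) -> float:
    """Return retransmissions as a percentage of all TCP packets in a capture."""
    if total_tcp <= 0:
        raise ValueError("total_tcp must be positive")
    return 100.0 * retransmitted / total_tcp

# Hypothetical counts, for illustration only.
before = retransmission_rate(retransmitted=50, total_tcp=100_000)
after = retransmission_rate(retransmitted=500, total_tcp=1_000_000)

print(f"before: {before:.3f}%  after: {after:.3f}%")
```

With these placeholder numbers the second capture has ten times as many retransmissions but the same rate, because it simply contains ten times as much traffic.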
The DATA part of your post shows two retransmissions and one duplicate ACK, each from a different conversation.
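For a longer export, a small script can do this tally for you. The sketch below parses the three summary lines pasted in the question; the regular expression assumes Wireshark's default column order (No., Time, Source, Destination, Protocol, Length, Info) and is only a rough illustration, not a general-purpose parser.

```python
import re
from collections import Counter

SUMMARY = """\
82 0.006557 10.10.80.19 10.100.100.239 TCP 143 [TCP Retransmission] 60989 → 3389 [PSH, ACK] Seq=1 Ack=1 Win=253 Len=85
83 0.006597 10.100.100.245 10.100.110.44 TCP 382 [TCP Retransmission] 445 → 53706 [PSH, ACK] Seq=357 Ack=809 Win=508 Len=324
92 0.007847 10.100.140.3 10.100.100.250 TCP 58 [TCP Dup ACK 91#1] 44818 → 3487 [ACK] Seq=449 Ack=55 Win=8138 Len=0
"""

# No., Time, Source, Destination, "TCP", Length, [TCP <flag>], sport → dport
LINE_RE = re.compile(
    r"^\d+\s+[\d.]+\s+(\S+)\s+(\S+)\s+TCP\s+\d+\s+\[TCP ([^\]]+)\]\s+(\d+)\s+→\s+(\d+)"
)

def tally(summary: str):
    """Count flagged packets by type and collect the distinct conversations."""
    kinds = Counter()
    conversations = set()
    for line in summary.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that are not flagged TCP packets
        src, dst, flag, sport, dport = m.groups()
        kind = "Dup ACK" if flag.startswith("Dup ACK") else flag
        kinds[kind] += 1
        # Unordered endpoint pair, so both directions count as one conversation.
        conversations.add(frozenset([(src, sport), (dst, dport)]))
    return kinds, conversations

kinds, conversations = tally(SUMMARY)
print(dict(kinds), "in", len(conversations), "conversations")
```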
Are you able to confirm that you are dropping packets relating to your "AD and printer scripts" issue?
Thanks for the comment.
I just quickly posted via copy and paste. Indeed, it's 100k - 140k packets a day, so it does not matter when I grab the pcap. It's roughly 4000+ packets an hour dropped. And it seems to come from all over the place, ALMOST every LAN I have (except printers, wireless local, wireless guest). I have put a ticket in with HP today to get some additional support on it.
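The hourly figure is just the daily count divided by 24, which assumes the drops are spread evenly across the day (an assumption, since the counters only give a daily total):

```python
def drops_per_hour(drops_per_day: int) -> float:
    """Average hourly drops, assuming drops are spread evenly over 24 hours."""
    return drops_per_day / 24

low = drops_per_hour(100_000)
high = drops_per_hour(140_000)
print(f"{low:.0f} to {high:.0f} drops per hour")
```

That works out to roughly 4,200 to 5,800 drops per hour on average.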
When we fixed LACP, the volume of packets dropped did not change very much. We also did tcp-push-preserve, which did not help.
It's possible that it's faulty. I could change the cables (they are just 3 foot prefabs from the Palo Alto to the HP 5412). For sure a long shot though. Although, this still happens when we fail over to our second Palo Alto (trunk3).
Our Palo Alto guy had us do some stuff on that device as well ...