This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

Printing and TCP Receive Windows

0

Hi all !!

So I have the following scenario: a print server in Sydney, Australia and Ricoh MP 5054 printers at a number of sites in Singapore. At one site, for all four printers, users were complaining that a large print job (26M) was taking up to 3 minutes to be sent from Sydney before it started printing. The same model of printer at other sites took on the order of 20-30 seconds.

Latency is about 100ms between Sydney and SG. I have a NetShark in Sydney, so I ran captures to all the 5054 printers. According to both Wireshark and Riverbed Packet Analyser, all four printers at the slow site were advertising a maximum TCP window size of 16KB. The other sites showed a TCP RWIN of 196KB, which would explain the dramatic difference in job transfer times. These values are completely different from the printer I/O buffer settings on the device itself, which seem to bear no resemblance to what I am actually seeing on the wire ;-)
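
As a rough sanity check on those figures, here is a minimal sketch of the window-limited throughput bound (at most one receive window in flight per round trip). It assumes 1 KB = 1024 bytes, that "26M" means 26 MiB, and that the 100ms figure is the round-trip time; it ignores slow start, loss and protocol overhead, so real transfers will be slower than this bound. The function names are made up for illustration.

```python
KIB = 1024
MIB = 1024 * 1024


def window_limited_throughput(rwin_bytes: float, rtt_seconds: float) -> float:
    """Upper bound on throughput in bytes/s: one full window per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return rwin_bytes / rtt_seconds


def transfer_time(job_bytes: float, rwin_bytes: float, rtt_seconds: float) -> float:
    """Best-case seconds needed to move job_bytes through a window-limited path."""
    return job_bytes / window_limited_throughput(rwin_bytes, rtt_seconds)


if __name__ == "__main__":
    job = 26 * MIB   # assumed size of the print job
    rtt = 0.100      # assumed round-trip time in seconds
    for label, rwin in (("slow site", 16 * KIB), ("fast site", 196 * KIB)):
        seconds = transfer_time(job, rwin, rtt)
        print(f"{label}: RWIN {rwin // KIB} KB -> at least {seconds:.1f} s")
```

Under these assumptions the 16KB window gives a best case of roughly 166 seconds, which is in line with the "up to 3 minutes" reported at the slow site.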

Before I go asking the remote team to start shifting a huge printer from one site to another to prove my point, and looking stupid if it doesn't fix the problem, one basic question: can anything in the network change the advertised TCP window size that I am seeing from the printer in the captures?

I captured the 3-way handshake, so scaling is not the issue; WS on the printers is 1. Or is this really a bad coincidence and a case of a whole bad batch of printers at this site? All firmware and settings are identical across all sites.
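
For reference, this is a small sketch of how the window scale option turns the 16-bit window field into an effective receive window, following the rules in RFC 7323: scaling applies only if both SYN segments carried the option, and a shift count above 14 is treated as 14. The function name and the sample values are hypothetical and are not taken from the capture.

```python
from typing import Optional

MAX_SHIFT = 14  # RFC 7323 caps the window scale shift count at 14


def effective_window(raw_window: int, shift: Optional[int], peer_sent_option: bool) -> int:
    """Return the receive window in bytes for a non-SYN segment.

    raw_window: the 16-bit value from the TCP header.
    shift: the sender's window scale shift count from its SYN, or None if absent.
    peer_sent_option: whether the other side's SYN also carried the option.
    """
    if not 0 <= raw_window <= 0xFFFF:
        raise ValueError("raw_window must fit in 16 bits")
    if shift is None or not peer_sent_option:
        return raw_window  # scaling was not negotiated, so the field is literal
    return raw_window << min(shift, MAX_SHIFT)


if __name__ == "__main__":
    # Hypothetical values: a raw window of 8192 with a shift count of 1.
    print(effective_window(8192, 1, True))
```

If the handshake is missing from the capture, the shift count is unknown and the raw field cannot be interpreted reliably, which is why capturing the SYN and SYN/ACK matters here.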

The TCP receive window size in a packet capture, provided scaling is catered for, doesn't lie, right? ;-)

Thanks all.

asked 08 Oct '17, 14:17

Seandavies
6113
accept rate: 0%

edited 09 Oct '17, 03:27

grahamb ♦
19.8k330206


2 Answers:

0

I agree that the TCP window size seems to be the problem here. Do you have captures close to the printers, or only at the print server site? A capture near the printer would be required to prove that some device in the path is changing the receive window size. Or, to answer your question: yes, there may be devices that change the window size, e.g. load balancers, traffic shapers and other black boxes.

My recommendation would be to try to get a capture on the printer side, as close as possible to the printer itself (e.g. TAP on the cable to the printer, or SPAN on the same switch) to see if the device itself sends such a low window size.

answered 08 Oct '17, 15:14

Jasper ♦♦
23.8k551284
accept rate: 18%

Thanks for the quick response, Jasper! Yes, that was the next thing I requested last Friday (I posted this before getting the results). I mirrored the printer port to another port on the same switch and got a capture from there. It shows the same result as the print server side (16KB TCP RWIN). Short of the switch itself doing something weird (very unlikely), I can only assume that eliminates any possible window size change by a random device. (UPDATE - the ACK from the printer showing the 16KB RWIN actually comes from the Ricoh MAC address itself, so surely nothing would be changing the TCP RWIN size and not the source MAC address? ;-) ) The next step, today or tomorrow, is to do exactly the same at another 'fast' site with the same printer and capture at the printer side again. If that result is as expected (i.e. 196KB TCP RWIN), then I am going to have to bite the bullet and get one of these working printers shipped over to the 'slow' site. Thank you for the sanity check and for clarifying my thought process on this ;-) I am just watching your SF '13 Trace File Sanitisation preso in case I need to post a capture file ;-) I will let you know what happens. Of course I hope it's the printers and we have a bug, but I can see that the RWIN surely must be the issue here: RWIN divided by latency almost exactly matches the throughput and transfer time experienced by the users. Bandwidth is not the problem ;-)

(09 Oct '17, 14:45) Seandavies

0

So I discovered the answer ;-) After performing packet captures on a mirrored port of a printer at the 'fast' site, I saw something interesting in the delta between the SYN/ACK from the printer and the ACK from the 'print server'. The delta was 0.07ms!! With an RTT between Sydney and Singapore of 150ms, that's impossible, right?! ;-) So something on the LAN at the fast site was answering on behalf of the print server. That something happened to be WANx!! I thought of WANx initially but was looking at things the wrong way round: I eliminated WANx from the equation because the slow site didn't have WANx (I know, my bad). ALL the printers in fact have a TCP receive window of only 16KB, but WANx was effectively masking/fixing this inherent problem. Short of rolling out WANx to every single site just to fix print times, which is not cost effective for me, I have to go back to the vendor to persuade them to fix their issue. It looks like these printers are designed for local printing only and are extremely inefficient over high-speed, high-latency links.
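
The handshake check described above can be sketched as a simple comparison: at a capture point next to the printer, the gap between the printer's SYN/ACK and the returning ACK should be close to the full WAN round trip, so a much smaller gap suggests a local device (such as a WAN optimiser) is answering. This is only an illustrative sketch; the function name and the 0.5 threshold fraction are assumptions, not anything taken from Wireshark.

```python
def answered_locally(synack_time_s: float, ack_time_s: float,
                     path_rtt_s: float, fraction: float = 0.5) -> bool:
    """Return True if the ACK came back far sooner than the WAN round trip allows.

    synack_time_s / ack_time_s: capture timestamps in seconds, taken near the printer.
    path_rtt_s: the known end-to-end round-trip time in seconds.
    fraction: assumed cut-off; deltas below fraction * RTT are flagged as local.
    """
    delta = ack_time_s - synack_time_s
    if delta < 0:
        raise ValueError("ACK timestamp precedes SYN/ACK timestamp")
    return delta < fraction * path_rtt_s


if __name__ == "__main__":
    # Figures from the post: a 0.07ms delta against a 150ms Sydney-Singapore RTT.
    print(answered_locally(0.0, 0.00007, 0.150))
```

A genuine end-to-end handshake would show a delta near 150ms and would not be flagged.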

answered 12 Oct '17, 20:58

Seandavies
6113
accept rate: 0%