I was privileged to take a look at the trace at the SMB level. Here are a few observations from the trace:
- In general, the server is somewhat sluggish when responding to write operations: while the network RTT is only 312 microseconds, the service response time for writes is usually 5.x milliseconds.
- The steady pattern of 5.x msec response times is rarely broken; the exceptions are responses as quick as 2 msec and a longest response time of 25 seconds. The latter is caused by the famous retransmissions.
- The client creates the target file with the access options "read, write, append" on the file server, while at the same time permitting sharing for read, write, and delete operations. As a result, the server has to prepare for lock management, even if the file is never actually shared.
- The client uploads the file without specifying the size of the target file, so the server has to expand the file on the fly. This opens the door to fragmentation on the file system.
- The client is not using SMB pipelining: it sends the next data block only after the previous operation has completed.
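To illustrate the point about the missing target size: a client that knows the final size can preallocate it in one step, letting the file system reserve contiguous space instead of growing the file write by write. This is just a local POSIX sketch of the idea (over SMB the client would set the end-of-file/allocation size on the handle); the file name and size are illustrative.

```python
import os
import tempfile

def create_preallocated(path, size):
    """Create a file and reserve its full size up front."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
    try:
        if hasattr(os, "posix_fallocate"):   # Linux and some Unixes
            os.posix_fallocate(fd, 0, size)  # actually reserves blocks
        else:
            os.ftruncate(fd, size)           # sets the logical size only
    finally:
        os.close(fd)

path = os.path.join(tempfile.mkdtemp(), "upload.bin")
create_preallocated(path, 1024 * 1024)       # assume a 1 MiB upload
print(os.path.getsize(path))                 # 1048576
```

The same upload then turns into plain in-place writes, and the server no longer has to extend the file for every block.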
Unless the transfer is stalled by packet loss (as analyzed by Sake, Christian and Phil), data is transmitted at approx. 750 MBit/sec. This does not leave a lot of room for improvement on a 1 GBit link.
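A back-of-the-envelope model shows why the serialized request pattern caps throughput: with one outstanding write, every block costs a full service time plus an RTT before the next block can go out. The block size and the linear scaling with outstanding requests below are assumptions for illustration, not values measured from the trace.

```python
# Assumed figures: 512 KiB per SMB write; the ~5.5 ms service time and
# 312 us RTT are the ballpark values observed above.
block_bits = 512 * 1024 * 8
service_s = 0.0055
rtt_s = 0.000312

# One request in flight: the next block waits for the previous response.
sequential = block_bits / (service_s + rtt_s)

def pipelined(n, link_bps=1e9):
    """Idealized throughput with n requests in flight, capped by the link."""
    return min(n * sequential, link_bps)

print(f"sequential: {sequential / 1e6:.0f} Mbit/s")
print(f"4 in flight: {pipelined(4) / 1e6:.0f} Mbit/s")
```

With these assumed numbers the single-request ceiling lands in the same ballpark as the observed ~750 MBit/sec, and a handful of pipelined requests would already saturate the 1 GBit link.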
A few things to check on the server side (if possible, or still necessary):
- Check whether the variations in response times are related to operations from another client (more Wireshark)
- Check disk fragmentation
- Check the hard disk health status
- Run onboard diagnostics to identify other bottlenecks on the server side (CPU, memory etc.)
- Check the server's RAID architecture. RAID-5 is probably not the best choice for applications involving lots of write operations.
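For context on the RAID-5 remark: each small random write on RAID-5 typically costs four back-end I/Os (read old data, read old parity, write new data, write new parity). A rough comparison, with made-up disk counts and per-disk IOPS:

```python
def effective_write_iops(disks, iops_per_disk, penalty):
    """Front-end write IOPS after the per-write back-end penalty."""
    return disks * iops_per_disk // penalty

# Hypothetical array: 4 spindles at 150 IOPS each.
raid5 = effective_write_iops(4, 150, penalty=4)   # parity: 4 I/Os per write
raid10 = effective_write_iops(4, 150, penalty=2)  # mirror: 2 I/Os per write

print(raid5, raid10)  # 150 300
```

Even in this toy model a mirrored layout sustains twice the write rate of RAID-5 on the same spindles, which is why write-heavy workloads usually favor it.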
Thank you, Tom, for the interesting case!