macOS SMB uploads to Windows Server share hang for dozens of seconds

Update #1 : The server is not a Windows Server but a Dell/EMC PowerScale NAS running OneFS v8.2.2.

Update #2 : A trace sliced at a fixed 78 bytes was available here; it has been deleted.

Update #3 : New trace, sliced dynamically after the TCP header by @SYN-bit (the previous fixed slice at 78 bytes cut the TCP header when it carried several SACK blocks, which led to confusion): https://iwaxx.com/stuff/Transfert-Mac-ko-anon.gz.pcap

Good old TCP analysis, guys!

Context :

This was a first quick troubleshooting session at a customer site: not all traces are good and I don't have all the information from them yet, but I'll go back for it.

The problem :

  • Only on some specific sites (private optical fiber between sites in the same city: iRTT is 0.2 ms in this example, and always below 1 ms)
  • macOS 10.15.7 uploads of a 40 GB folder to a PowerScale NAS share hang for dozens of seconds, or even stop with an error
  • Downloads seem to be fine on macOS
  • Windows 10 clients don't have any problem
  • Server is a Dell/EMC PowerScale NAS (ex-Isilon) running OneFS v8.2.2

First analysis :

  • Upload to the server; the trace was captured directly on the client
  • iRTT is 0.2 ms
  • SACK permitted on both sides
  • Timestamps enabled on both sides
  • Client window scale factor: x128 (max window: 8 MB)
  • Server window scale factor: x64 (max window: 4 MB)
  • The network is definitely losing packets, apparently in bursts at a switch -> I will investigate that separately
  • My question here is about the server's TCP behavior, and how it differs between Windows and macOS clients.
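As a quick sanity check on the two maximum window sizes listed above, here is a minimal sketch of the window-scaling arithmetic. It assumes the usual rule that the 16-bit window field is multiplied by the scale factor; the function names are illustrative only.

```python
# Largest value that fits in the 16-bit TCP window field.
MAX_WINDOW_FIELD = 0xFFFF  # 65535


def max_window_bytes(scale_factor: int) -> int:
    """Largest advertisable window in bytes for a given multiplier (e.g. 128)."""
    return MAX_WINDOW_FIELD * scale_factor


def shift_count(scale_factor: int) -> int:
    """Window scale shift count for a power-of-two multiplier (128 -> 7)."""
    return scale_factor.bit_length() - 1


client_max = max_window_bytes(128)  # client: x128
server_max = max_window_bytes(64)   # server: x64

print(f"client: shift {shift_count(128)}, max window {client_max} bytes")
print(f"server: shift {shift_count(64)}, max window {server_max} bytes")
```

The results are just under 8 MiB and 4 MiB, which matches the "max" values read from the trace.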

Just before the problem, status is :

  • SACK is working
  • NAS server window size: 1 MB (since the beginning)
  • Client advertises a 5 MB window
  • The window is not full (bytes in flight: 725 kB)
  • At 82.41 s, macOS starts to retransmit the missing packets one by one. (Before that, it retransmitted several at a time.)
  • The server ACKs them within iRTT
  • At 82.522 s, after a few one-by-one retransmissions that were ACKed immediately, the server behaves as if it were applying this rule: "Now you are going to resend every byte currently in flight, but I impose a static 105 ms delay on each ACK."
  • The long stall then consists of retransmit + ACK after 105 ms + retransmit + ACK after 105 ms ... until all bytes in flight are ACKed.
  • At 113.548 s, the upload resumes at full speed
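To put numbers on the stall described above, here is a rough back-of-the-envelope sketch. The timestamps and the 105 ms delay come from the trace. The MSS of 1448 bytes is an assumption (1500-byte MTU with TCP timestamps), and so is the simplification that each cycle moves exactly one segment and lasts 105 ms (the 0.2 ms iRTT is ignored).

```python
STALL_START_S = 82.522  # first ACK delayed by ~105 ms (from the trace)
STALL_END_S = 113.548   # upload resumes at full speed (from the trace)
ACK_DELAY_S = 0.105     # observed delay before each ACK (from the trace)
ASSUMED_MSS = 1448      # ASSUMPTION: 1500-byte MTU, TCP timestamps enabled

stall_s = STALL_END_S - STALL_START_S

# One retransmission per ACK delay: how many cycles fit into the stall?
cycles = int(stall_s / ACK_DELAY_S)

# Data recovered if every cycle retransmits one full-size segment.
recovered_bytes = cycles * ASSUMED_MSS

# Effective rate during the stall, in bits per second.
rate_bps = ASSUMED_MSS * 8 / ACK_DELAY_S

print(f"stall duration : {stall_s:.3f} s")
print(f"cycles         : {cycles}")
print(f"recovered data : {recovered_bytes} bytes")
print(f"effective rate : {rate_bps / 1000:.0f} kbit/s")
```

Under these assumptions the stall covers a few hundred retransmissions at roughly 110 kbit/s, which is why a few hundred kilobytes take about 31 seconds to recover.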

My question about :

  • Sending packets one by one toward the server makes sense if the congestion window has been reduced to 1 MSS
  • But the 105 ms delay before the server ACKs: is it a known "congestion" behavior? Or some kind of slow start?
  • It looks like a consequence of macOS sending packets one by one (an assumption, to frame the next question)
  • Why would this behavior happen only with macOS and not with Windows clients, given that macOS clients and Windows clients (and Windows servers) experience the same packet loss?
  • Could it be due to SACK?
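For anyone who wants to quantify the ACK delay rather than eyeball it, here is a minimal sketch of how it could be measured from records exported from the trace. The record layout and the sample values are hypothetical (they only mimic the 0.2 ms and 105 ms pattern described above), and the sketch ignores sequence-number wraparound and SACK blocks.

```python
from typing import List, Optional, Tuple

# HYPOTHETICAL record layout, not a Wireshark export format:
# client data segments as (time_s, end_seq), server ACKs as (time_s, ack_no).
Segment = Tuple[float, int]
Ack = Tuple[float, int]


def ack_delays(segments: List[Segment], acks: List[Ack]) -> List[Optional[float]]:
    """For each segment, time until the first later ACK covering its last byte.

    `acks` must be sorted by time. Returns None for a segment never ACKed.
    """
    delays: List[Optional[float]] = []
    for sent_at, end_seq in segments:
        delay = None
        for ack_at, ack_no in acks:
            if ack_at >= sent_at and ack_no >= end_seq:
                delay = ack_at - sent_at
                break
        delays.append(delay)
    return delays


# Made-up sample: one segment ACKed within iRTT, then two ACKed after 105 ms.
segments = [(0.0000, 1448), (0.0010, 2896), (0.1070, 4344)]
acks = [(0.0002, 1448), (0.1060, 2896), (0.2120, 4344)]

delays = ack_delays(segments, acks)
for (sent_at, end_seq), delay in zip(segments, delays):
    label = "never ACKed" if delay is None else f"{delay * 1000:.1f} ms"
    print(f"segment ending at seq {end_seq}: {label}")
```

Running the same pairing over the macOS and Windows 10 traces would show whether the fixed delay appears only after one-by-one retransmissions.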

[Screenshots: tcptrace_full; tcptrace close-up before the problem; first 1-by-1 retransmits, ACKed quickly; beginning of 105 ms ACKs; end of retransmits: no more bytes in flight]

Screenshots are attached in case the inline pictures do not display: