TCP Slow Start Graph

answered 2018-09-16 05:47:39 +0000

updated 2018-09-16 05:56:57 +0000

The quick answer: your trace contains both Slow Start and Congestion Avoidance. Let's dig a little deeper.

1) Slow Start. Think of it as a rule set, not a shape. Yes, sometimes it looks like a "nice and shiny" exponential growth graph (if the link is of the LFN type, a long fat network). In your case it doesn't, because your link is not LFN-like. Slow Start just says: "for every ACKed full MSS, increase cwnd by one full MSS". Nothing more.

Look at the screenshot from your trace: [image]

Open a calculator app and follow along.

I won't start from the very beginning, so we can skip some application chatter and focus on the bulk transfer itself.

Packet 135, ACK number 56461. Before it we have Bytes in flight (BIF) = 67680 = 47 full MSS. I suspect this is the cwnd limit.

So, packet 135, ACK number 56461. It ACKed 56461 - 53581 = 2880 bytes, or 2 full MSS. The sender's reaction is to release 4 full-MSS packets into the network. Looks familiar? :) After packet 135 and those 4 MSS were sent, BIF became 70560, or 49 full MSS.

Packet 140, ACK number 59341. It ACKed 59341 - 56461 = 2880 bytes, again 2 full MSS. What does the sender do? Correct, it releases 4 full MSS. BIF becomes 73440, or 51 full MSS.

The main point is that this is what Slow Start is. Why doesn't it look exponential? Because the ACKs arrive evenly paced rather than in bursts: each ACK comes separately, about 20 ms after the previous one. That's it!
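The calculator steps above can be replayed in a few lines. This is only a sketch: the MSS of 1440 bytes is inferred from "2880 bytes = 2 full MSS", the function name `slow_start_step` is made up for illustration, and the rule is the simplified one quoted above (refill what was ACKed, plus one extra MSS per ACKed MSS).

```python
MSS = 1440  # assumption: inferred from "2880 bytes = 2 full MSS"

def slow_start_step(bif_before, prev_ack, new_ack, mss=MSS):
    """Replay one ACK during Slow Start.

    Returns (acked_mss, released_mss, bif_after).
    """
    acked_bytes = new_ack - prev_ack
    acked_mss = acked_bytes // mss
    # The ACK frees acked_mss of window, and Slow Start grows cwnd by
    # one MSS per ACKed MSS, so the sender may release twice as much.
    released_mss = 2 * acked_mss
    bif_after = bif_before - acked_bytes + released_mss * mss
    return acked_mss, released_mss, bif_after

bif = 67680  # bytes in flight before packet 135 (47 MSS)
steps = [(53581, 56461), (56461, 59341)]  # ACKs in packets 135 and 140
for prev_ack, new_ack in steps:
    acked, released, bif = slow_start_step(bif, prev_ack, new_ack)
    print(f"ACKed {acked} MSS, released {released} MSS, BIF = {bif} ({bif // MSS} MSS)")
```

Each ACK covers 2 MSS and lets 4 MSS out, so BIF grows by 2 MSS per ACK. With ACKs spaced about 20 ms apart, that growth is spread out over time instead of arriving in one burst per round trip.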

2) The Slow Start phase continues up to the point where packet loss and retransmissions occur. After that the sender switches to Congestion Avoidance mode. This can be clearly seen in the next graph:

[image]

The Congestion Avoidance pattern is very consistent. It looks somewhat like CUBIC, as Christian said before.

3) Your additional questions.

what is causing so many DUP ACKs starting at packet #345

This isn't a problem. One or more packets were lost in transit. But consider that you have a lot of other packets in flight when you send the Fast Retransmission. Every one of those packets, already flying somewhere in the network, arrives out of sequence and causes the receiver to issue one more DUP ACK. The more data you had in flight before the Fast Retransmission, the more identical DUP ACKs you'll see when a packet is lost.
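To make the relationship concrete, here is a toy model of the receiver's behaviour. It is a sketch under simplifying assumptions (one lost segment, no delayed ACKs, no reordering), and `count_dup_acks` is a hypothetical helper, not anything Wireshark provides.

```python
def count_dup_acks(segments_sent, lost_index):
    """Count duplicate ACKs a receiver emits after one lost segment.

    Simplified model: every segment arriving after the hole triggers
    one duplicate ACK for the missing sequence number; delayed ACKs,
    reordering, and further losses are ignored.
    """
    dup_acks = 0
    hole_open = False
    for i in range(segments_sent):
        if i == lost_index:
            hole_open = True  # this segment never arrives
            continue
        if hole_open:
            dup_acks += 1  # out-of-order arrival -> one more DUP ACK
    return dup_acks

for in_flight in (10, 50):
    print(in_flight, "segments in flight ->", count_dup_acks(in_flight, 0), "DUP ACKs")
```

In this model the number of DUP ACKs is simply the number of segments that were already in flight behind the lost one, which is why a large BIF produces a long run of identical DUP ACKs.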

how to know whether a packet is a Fast Retransmission or was retransmitted because of RTO

This is tough. You don't know the current RTO because it's internal to the sender. You can guess it (to some degree!) if you know the current RTT, but different TCP stacks calculate it differently. If the sender has received 3 DUP ACKs and you see a retransmission almost instantly, it is probably a Fast Retransmission.
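The rule of thumb above can be written down as a small heuristic. This is a sketch, not a definitive classifier: the function name, its arguments, and especially the "less than half an RTT" threshold for "almost instantly" are assumptions chosen for illustration, and the inputs are values you would read off the trace by hand.

```python
def classify_retransmission(dup_acks_seen, delay_after_third_dup_ack, rtt_estimate):
    """Guess why a segment was retransmitted, from the capture alone.

    dup_acks_seen             -- DUP ACKs for this SEQ seen before the retransmission
    delay_after_third_dup_ack -- seconds between the third DUP ACK and the retransmission
    rtt_estimate              -- a recent RTT sample, in seconds
    """
    # Assumption: "almost instantly" means well under one RTT.
    if dup_acks_seen >= 3 and delay_after_third_dup_ack < rtt_estimate / 2:
        return "probably fast retransmission"
    return "possibly RTO (or another recovery mechanism)"

# Hypothetical values, not taken from the trace:
print(classify_retransmission(3, 0.001, 0.060))
print(classify_retransmission(3, 0.900, 0.060))
print(classify_retransmission(1, 0.001, 0.060))
```

The result is only a guess, because the real RTO lives inside the sender and cannot be read from the capture.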

Back to packet 446. Your suspicion is correct. This is probably not a Fast Retransmission. Why? Because packet 348 had the same SEQ, and that one actually WAS a Fast Retransmission: it was issued almost instantly after 3 ... (more)


Comments

@Packet_vlad, thank you very much.

Now, I have three more questions:

1) In the graph you posted, in Congestion Avoidance mode there is a linear increase of BIF, but then there is something exponential, like Slow Start. Why is that? As far as I know, in Congestion Avoidance there is just a linear increase of packets (bytes).

2) I noticed in Wireshark that the ACK RTT is increasing constantly, starting from 60 ms (not counting the SYN/ACK packet, where I see an iRTT of 55 ms) and going up to around 785 ms before the Window Full packet is received.

Why is this happening? Is this a normal condition, where the receiver needs time to process the packets and so the ACK times increase, or is it something else?

I see that, as you say, the ACKs arrive about 20 ms apart, but the "The RTT to ACK this segment was" field in Wireshark is increasing constantly.

3 ...(more)

ille (2018-09-17 12:49:10 +0000)

Hi @ille,

Cool questions!

1) "As I know in congestion avoidance there is just linear increasing of packets(bytes)"

This is a very, very oversimplified view. There are dozens of different Congestion Avoidance algorithms and, funnily enough, not many of them have a linear growth pattern. Your algorithm looks like CUBIC, although that's not 100% certain.

Check it out - it doesn't look linear at all.
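For a feel of the shape, here is a sketch of the CUBIC window function as given in RFC 8312, W(t) = C(t - K)^3 + W_max. The constants are the RFC's defaults; `w_max = 100` is a hypothetical value, not one read from the trace, and a real stack adds more on top of this (a TCP-friendly region, fast convergence, and so on).

```python
C = 0.4      # CUBIC scaling constant from RFC 8312
BETA = 0.7   # multiplicative decrease factor from RFC 8312

def cubic_window(t, w_max):
    """Window (in MSS) t seconds after a loss event, per the RFC 8312 formula."""
    k = (w_max * (1 - BETA) / C) ** (1 / 3)  # time needed to climb back to w_max
    return C * (t - k) ** 3 + w_max

w_max = 100.0  # hypothetical window at the moment of loss, in MSS
for t in range(0, 9):
    print(t, round(cubic_window(t, w_max), 1))
```

The window starts at 0.7 * w_max, grows quickly, flattens out as it approaches the old maximum, then accelerates again while probing beyond it. On a bytes-in-flight graph, that second part can look like "something exponential" after a flatter stretch.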

2) This is happening because a buffer is filling up somewhere on the path. Try to understand this graph as deeply as possible, down to an intuitive level.

When the connection begins, the upper funnel is empty, so your packets don't have to sit there waiting to be transmitted. If the incoming rate is faster than the bottleneck's rate, the upper funnel will eventually fill up. The "sitting time" is exactly what causes the RTT to grow. At the same time the lower link will not be so busy because it transfers small ...(more)

Packet_vlad (2018-09-17 13:35:00 +0000)

@Packet_vlad, Thank you again,

Since you mentioned SACK, I remembered one more question: in the first 8 DUP ACKs there is a SACK block for the packet that starts with SEQ 179161, but after DUP ACK #8 the SACK option is full and this packet is no longer shown in it. So, does this mean that the sender should retransmit this packet, or does it "remember" the old SACK and not retransmit?

ille (2018-09-17 20:56:48 +0000)

@ille, I'd suggest you refer to the SACK RFC, RFC 2018. Section 5 describes it in detail. As I understand it, a sender may keep these segments flagged as "SACKed" even if they no longer appear in the SACK option of later DUP ACKs. The mandatory condition for clearing the SACKed flags is an RTO for the segment.
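A toy version of that sender-side bookkeeping might look like the sketch below. It is an illustration of the idea only, not how any real TCP stack stores its scoreboard: the class and method names are invented, the right edge 180601 (179161 plus an assumed 1440-byte segment) is hypothetical, and so are the other block values.

```python
class SackScoreboard:
    """Toy sender-side record of byte ranges the receiver has SACKed."""

    def __init__(self):
        self.sacked = set()  # (left_edge, right_edge) tuples

    def on_ack(self, sack_blocks):
        # Remember every block ever reported. A block that drops out of
        # later SACK options (only a few fit per ACK) stays remembered.
        self.sacked.update(sack_blocks)

    def on_rto(self):
        # After a retransmission timeout, RFC 2018 says the sender
        # SHOULD turn off all SACKed bits.
        self.sacked.clear()

    def is_sacked(self, left, right):
        return (left, right) in self.sacked

sb = SackScoreboard()
sb.on_ack([(179161, 180601)])                    # early DUP ACK reports this block
sb.on_ack([(182041, 183481), (184921, 186361)])  # later DUP ACK no longer lists it
print(sb.is_sacked(179161, 180601))  # still remembered
sb.on_rto()
print(sb.is_sacked(179161, 180601))  # forgotten after the timeout
```

In this model the segment starting at SEQ 179161 would not be retransmitted while it is still marked as SACKed, and only becomes a retransmission candidate again after an RTO clears the scoreboard.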

Packet_vlad (2018-09-18 06:46:56 +0000)

@Packet_vlad I'm still working on this trace, so a couple more questions: 1) Around 0.8 seconds from the start of the communication, the server (a Dropbox server) stops updating its window; this can be seen in the Window Scaling graph. Why is this?

2) When TCP retransmits packets, sometimes a packet is divided into two parts. For example, packet number 190 is sent as 1494 bytes, but when it is retransmitted in packet number 369, it is divided into two packets: the first of 918 bytes and the second (packet number 371) of 630 bytes. Why does this happen?

3) The most confusing question for me: I noticed that after packet number 424 the sender stops retransmitting packets, even though there are other packets that should be retransmitted due to timeout. Here is a link to a picture of the TCP Sequence (tcptrace) graph with ...(more)

ille (2018-11-09 09:03:12 +0000)

This is a great example of a downstream bottleneck (1 Mbps) with a limited buffer queue (roughly 100 KB). As the packets fill the queue, the later ones have to spend more time waiting in that queue, and so we see the apparent RTTs get longer and longer. In fact, the measured times aren't just "round trip" times but "round trip" (~55 ms) plus "queue" (up to 800 ms) times.
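The "up to 800 ms" figure follows directly from the queue size and the link rate. A quick sketch of the arithmetic, assuming 100 KB means 100,000 bytes and using a hypothetical helper name:

```python
def queuing_delay(queue_bytes, link_bps):
    """Seconds needed to drain queue_bytes through a link of link_bps."""
    return queue_bytes * 8 / link_bps

BASE_RTT = 0.055        # ~55 ms path RTT seen in the trace
LINK_BPS = 1_000_000    # 1 Mbps bottleneck
for queued in (0, 25_000, 50_000, 100_000):
    rtt = BASE_RTT + queuing_delay(queued, LINK_BPS)
    print(f"{queued:>7} bytes queued -> apparent RTT ~{rtt * 1000:.0f} ms")
```

A full 100,000-byte queue takes 0.8 s to drain at 1 Mbps, so a packet that arrives at the back of a nearly full queue sees roughly the 55 ms path RTT plus most of that 800 ms.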

Every time the queue fills up (usually ~100KB) we get packets being dropped. Congestion Avoidance mode slows our output flow, reducing the queue size. As we ramp up to larger bursts per round trip, the whole process repeats. Sometimes the packet drops happen at less than 100 KB, presumably when the queue is being shared with traffic not in this trace.

@ille, do you have an Internet upload speed limit of 1 Mbps? Can you tell if your Internet router has ...(more)

Philst (2018-12-06 07:58:19 +0000)

@Philst Yes, my Internet upload speed at the time of capturing this trace was 1 Mbps, and I'm not sure about the router's buffer size. I agree that @Packet_vlad did a great job.

So, @Philst, you think that my router has a buffer size of 100 KB, that the router processes the packets slowly, and that this causes the packet drops and retransmissions?

ille (2018-12-06 08:22:17 +0000)

Hi @ille, regarding your latest Q1: I think the server didn't increase its window because it didn't observe incoming traffic volume high enough to do so. Keep our perspective in mind. We see that we've reached the RWIN boundary from our (the sender's) point of view, but almost all of these packets got stuck in the buffer for some time waiting to be transmitted, while only a small part of them (at 1 Mbps) had reached the server by that moment.

As for Q2 and Q3, they are quite tough, I'd say; I need some time to think about them. I like your approach of trying to understand all the tiny details.

Packet_vlad (2018-12-06 08:53:31 +0000)

You are sending packets into your router at local LAN speed but the router can only send them to the Internet at 1 Mbps. The router therefore has to store them all while it waits for the early packets to "dribble" out. It can only store 100 KB, so all packets after that have nowhere to go and must get dropped.

This is a common behaviour when a device has a fast input on one side and a slower output on the other side.

All the packets that were dropped didn't get to the other end and so they all have to be retransmitted.
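The fast-in, slow-out behaviour described above can be sketched with a tiny drop-tail queue model. Everything except the 1 Mbps uplink and the roughly 100 KB buffer is an assumption for illustration: the 100 Mbps input rate, the 1500-byte packets, the burst of 200 packets, and the function itself.

```python
def tail_drop(packets, packet_bytes, in_bps, out_bps, buffer_bytes):
    """Count packets dropped by a drop-tail queue fed faster than it drains.

    Simplified fluid model: packets arrive back to back at in_bps, the
    queue drains continuously at out_bps, and a packet is dropped when
    it does not fit in the remaining buffer space.
    """
    queued = 0.0
    dropped = 0
    arrival_gap = packet_bytes * 8 / in_bps      # seconds between arrivals
    drained_per_gap = arrival_gap * out_bps / 8  # bytes leaving per gap
    for _ in range(packets):
        queued = max(0.0, queued - drained_per_gap)
        if queued + packet_bytes > buffer_bytes:
            dropped += 1  # buffer full: this packet has nowhere to go
        else:
            queued += packet_bytes
    return dropped

# Assumed 100 Mbps LAN feeding the 1 Mbps uplink through a 100,000-byte buffer.
print(tail_drop(200, 1500, 100_000_000, 1_000_000, 100_000), "of 200 packets dropped")
```

Once the burst exceeds what the buffer can hold, almost every further packet is dropped until the queue has had time to drain, which is the pattern of clustered losses seen in the trace.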

Many of them are retransmitted twice. There seem to be two independent mechanisms:

1) Retransmissions because SACKs say they didn't make it.

2) Retransmissions possibly due to an RTO (even though mechanism 1 already retransmitted them).

As an extra funny behaviour, packet #498 contains ...(more)

Philst (2018-12-07 06:44:17 +0000)


We need more information to be able to help you. It's better to share the actual trace, not a screenshot. Also, please tell us the following: capture point location, network environment details, and the software used to generate the traffic.

Packet_vlad (2018-09-13 11:20:22 +0000)

@Packet_vlad I've changed the question and added a link to the actual trace. This was done for traffic analysis; it's an upload of a file to the Dropbox cloud, so I am the sender and Dropbox is the receiver. I use WiFi Internet. Also, now that you have the actual trace, can you tell me what is causing so many DUP ACKs starting at packet #345, and how to know whether a packet is a Fast Retransmission or was retransmitted because of RTO? (This situation is at packet #446: it says Fast Retransmission, but I'm not sure about it.)

ille (2018-09-13 19:01:22 +0000)

Well, your trace is not the easiest... I recommend starting with the traces here: https://sharkfesteurope.wireshark.org...

and then going back to your trace. But keep in mind that Win7/8 mostly uses CTCP, while Linux and Win10 mostly use CUBIC, I guess.

It can be clearly seen that the packet loss has an impact on performance, because congestion avoidance mode is entered.

Christian_R (2018-09-15 18:41:35 +0000)
