This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

[RST, ACK] Sent to both Server and client, but neither is the sender

0
  1. On the client we see the following sequence:

    455 752.488904  47.29.0.122     24.114.118.166  TCP  76  50776 > http-alt [SYN] Seq=0 Win=5840 Len=0 MSS=1460 SACK_PERM=1 TSval=3355706 TSecr=0 WS=32
    466 755.488538  47.29.0.122     24.114.118.166  TCP  76  50776 > http-alt [SYN] Seq=0 Win=5840 Len=0 MSS=1460 SACK_PERM=1 TSval=3358706 TSecr=0 WS=32
    467 755.507154  24.114.118.166  47.29.0.122     TCP  56  http-alt > 50776 [ACK] Seq=1 Ack=1 Win=49680 Len=0
    468 758.317805  24.114.118.166  47.29.0.122     TCP  56  http-alt > 50776 [RST, ACK] Seq=1 Ack=1 Win=5840 Len=0

  2. On the server at the same time we see the following:

    132 775.318404  47.29.0.122     192.168.140.26  TCP  74  50776 > http-alt [SYN] Seq=0 Win=5840 Len=0 MSS=1380
    133 778.312651  47.29.0.122     192.168.140.26  TCP  74  50776 > http-alt [SYN] Seq=0 Win=5840 Len=0 MSS=1380
    134 778.312828  192.168.140.26  47.29.0.122     TCP  54  http-alt > 50776 [ACK] Seq=1 Ack=1 Win=49680 Len=0
    135 778.701649  192.168.140.26  47.29.0.122     TCP  58  http-alt > 50776 [SYN, ACK] Seq=0 Ack=1 Win=49680 Len=0 MSS=1460
    136 781.123193  47.29.0.122     192.168.140.26  TCP  60  50776 > http-alt [RST, ACK] Seq=1 Ack=1 Win=49680 Len=0

Note that the clocks are not synchronized between client and server, so timestamps cannot be compared across the two captures.
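Because the two clocks are not synchronized, only offsets within each capture are comparable. A minimal Python sketch, using nothing but the timestamps from the two listings above (the function name `relative_times` is made up for illustration), that prints each frame's offset from the first SYN and from the previous frame:

```python
# Timestamps (seconds) hand-copied from the client and server listings.
client = [
    (455, 752.488904, "SYN"),
    (466, 755.488538, "SYN"),
    (467, 755.507154, "ACK"),
    (468, 758.317805, "RST, ACK"),
]
server = [
    (132, 775.318404, "SYN"),
    (133, 778.312651, "SYN"),
    (134, 778.312828, "ACK"),
    (135, 778.701649, "SYN, ACK"),
    (136, 781.123193, "RST, ACK"),
]


def relative_times(frames):
    """Return (frame, flags, secs since first frame, secs since previous frame)."""
    rows = []
    first = frames[0][1]
    prev = first
    for number, ts, flags in frames:
        rows.append((number, flags, round(ts - first, 6), round(ts - prev, 6)))
        prev = ts
    return rows


for name, frames in (("client", client), ("server", server)):
    print(name)
    for number, flags, since_first, since_prev in relative_times(frames):
        print(f"  {number:>4} {flags:<9} +{since_first:.6f}s (delta {since_prev:.6f}s)")
```

With these numbers, the second SYN follows the first by about 3.0 s in both captures, the server's SYN, ACK leaves about 3.38 s after the first SYN arrived, and the RST, ACK shows up roughly 2.8 s after the second SYN on both sides.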

Observations:

  1. Two SYN packets are sent from the client because the first one did not get a SYN-ACK reply; after 3 seconds another SYN is sent. This is strange because BOTH SYN packets reached the server (I don't have the latency, unfortunately).

  2. Then the client received an ACK (and not a SYN-ACK as expected) from the server, which the server sent in reply to a SYN.

  3. Both client and server get a RST, ACK although neither sent a RST to the other. This happens about 3 seconds after the second SYN. As quoted below, this could be caused by something like what this article describes:

    “While this connection stays open in the connection table, if a SYN comes from the client that matches the same IP port combination then ACE closes that connection with a RST as a connection for that IP and port is already open in the connection table. The timeout value in our environment was setup to very high (8 hours), which made the issue worse as more and more orphaned connections accumulated in the connection table. The issue at ACE level was resolved by reducing the timeout to 20 minutes.”

Any help would be appreciated.

[EDIT]

Added capture files here:

Client: http://www.cloudshark.org/captures/5407bc01958e

Server: http://www.cloudshark.org/captures/3bf196dc6b3f

Note that there are several successful handshakes and ensuing traffic, but at some point something breaks as described above.

This question is marked “community wiki”.

asked 28 May ‘13, 14:53

RomanM
16115
accept rate: 0%

edited 29 May ‘13, 18:25

krishnayeddula
629354148

It would really help to see these packets in full, for example on www.cloudshark.org. I suspect the MAC addresses, IP IDs and IP TTLs can tell a lot more and help pinpoint the problem.

(29 May ‘13, 09:53) SYN-bit ♦♦

done, added

(29 May ‘13, 10:06) RomanM


3 Answers:

1

The same issue was discussed in an Oracle forum thread:

"The Solaris response, with just ACK instead of the typical SYN-ACK, is good according to RFC 793. The FW obviously doesn't agree with this behaviour and RSTs the connection towards both ends."

Note that the correct (= expected) SYN-ACK from the server is sent out delayed. So it might be that the HTTP server is having problems accepting the new connection in a timely manner, causing an unexpected ACK to flow out before the SYN-ACK.

answered 29 May '13, 23:29

mrEEde2
3364614
accept rate: 20%
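For reference, the RFC 793 argument rests on the segment acceptability test in section 3.9 ("SEGMENT ARRIVES"). The following Python sketch is one reading of that test, not a statement about what Solaris actually does. It assumes the server had already moved to SYN-RECEIVED after the first SYN, uses Wireshark-style relative sequence numbers, and ignores sequence-number wraparound; the function names are illustrative only.

```python
def segment_acceptable(seg_seq, seg_len, rcv_nxt, rcv_wnd):
    """RFC 793 acceptability test (no wraparound handling in this sketch).

    seg_len counts SYN and FIN, so a bare SYN has seg_len == 1.
    """
    if seg_len == 0:
        if rcv_wnd == 0:
            return seg_seq == rcv_nxt
        return rcv_nxt <= seg_seq < rcv_nxt + rcv_wnd
    if rcv_wnd == 0:
        return False
    last = seg_seq + seg_len - 1
    first_in_window = rcv_nxt <= seg_seq < rcv_nxt + rcv_wnd
    last_in_window = rcv_nxt <= last < rcv_nxt + rcv_wnd
    return first_in_window or last_in_window


def reply_to_unacceptable(snd_nxt, rcv_nxt):
    """Reply RFC 793 prescribes for an unacceptable segment without RST."""
    return {"flags": "ACK", "seq": snd_nxt, "ack": rcv_nxt}


# Assumed state after the first SYN: relative RCV.NXT = 1, SND.NXT = 1,
# receive window 49680 (the value advertised in the capture).
retransmitted_syn_ok = segment_acceptable(seg_seq=0, seg_len=1,
                                          rcv_nxt=1, rcv_wnd=49680)
if not retransmitted_syn_ok:
    print(reply_to_unacceptable(snd_nxt=1, rcv_nxt=1))
```

Under these assumptions the retransmitted SYN fails the test and the reply is a bare ACK with Seq=1 Ack=1, which has the same shape as frame 134. Whether emitting that ACK before the SYN-ACK is acceptable or a bug is the point debated in this thread.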

I think this article is the most insightful and relevant to the issue.

I guess the double SYN packet is the root cause and is what makes things cascade. We will need to investigate it.

(30 May '13, 11:38) RomanM

"The same issue was discussed in an Oracle forum thread"

Good find!

"is good according to RFC 793."

I doubt that! That's just a comment from one forum user, without any explanation of why that would be O.K.

It's rather a bug, as also mentioned in the forum article. Unfortunately the link is dead, but if you search for the bug ID, you'll find some information:

6972966 SYN-ACK-ACK is not handled properly when accepting connection from Linux client using HTTP benchmark

Anyway, is there any Solaris system involved?

(30 May '13, 12:11) Kurt Knochner ♦

The command used to start the trace on the server was snoop, so this must be a Solaris system.

(30 May '13, 22:57) mrEEde2

The double SYN packet is not the root cause; it is a retransmission because the Linux client didn't see a SYN-ACK within 3 secs. In total the HTTP server didn't accept the new connection for 3.3 secs. This is the problem that needs to be investigated.

(30 May '13, 23:12) mrEEde2

So you are saying that the double SYN is normal and follows from the implementation of the Linux stack? It seemed odd to me because I know that a lot of network equipment will be suspicious of multiple SYN packets 'flooding', hence the RST...

(31 May '13, 07:56) RomanM

"I know that a lot of network equipment will be suspicious of multiple SYN packets 'flooding'"

No device I know of will block a second SYN (after a few seconds) as 'flooding', as that's just the regular TCP retry mechanism.

(31 May '13, 07:59) Kurt Knochner ♦

That's standard TCP behaviour: retransmit when you don't get an ACK before your retransmission timer expires. Later in the flow the RTT measurements can be used to adjust the timer, but initially we have to take a guess. Linux uses 3 secs, which is far away from 'flooding'.

(31 May '13, 08:03) mrEEde2
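To make the retransmission timer discussion concrete, here is a rough Python sketch of the RTO update rules from RFC 6298. The 3 s initial value is taken from the comment above and from the capture (RFC 6298 itself recommends 1 s; the 3 s figure comes from earlier recommendations). The class name, the `min_rto` default and the zero clock granularity are assumptions for illustration, not what any particular kernel does.

```python
class RtoEstimator:
    """Retransmission timeout estimator following the RFC 6298 update rules."""

    ALPHA = 1 / 8  # gain for the smoothed RTT
    BETA = 1 / 4   # gain for the RTT variance
    K = 4          # variance multiplier

    def __init__(self, initial_rto=3.0, min_rto=1.0, clock_granularity=0.0):
        self.rto = initial_rto        # the initial guess, before any RTT sample
        self.min_rto = min_rto
        self.granularity = clock_granularity
        self.srtt = None
        self.rttvar = None

    def on_rtt_sample(self, rtt):
        """Update SRTT/RTTVAR from a measured round-trip time and return the RTO."""
        if self.srtt is None:
            # First measurement.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # RTTVAR is updated first, using the old SRTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        candidate = self.srtt + max(self.granularity, self.K * self.rttvar)
        self.rto = max(self.min_rto, candidate)
        return self.rto

    def on_timeout(self):
        """Back off exponentially after a retransmission timeout."""
        self.rto *= 2
        return self.rto


syn_timer = RtoEstimator()
print("first SYN retransmit after", syn_timer.rto, "s")
print("next retransmit after another", syn_timer.on_timeout(), "s")
```

With no RTT sample available during the handshake, the only thing the sender can do is wait for the initial guess and then back off, which is why a single retransmitted SYN after about 3 s is expected behaviour rather than a flood.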

0

It looks like the load balancer is not translating the server IP address to the virtual IP address when replying to the client. Assuming 47.29.0.122 is the client, the RST, ACK packet in frame 136 has source IP 47.29.0.122 and destination IP 192.168.140.26 (normally the destination should be the load balancer IP, i.e. the VIP 24.114.118.166).

The RST, ACK is generated because the client did not like the previous packet it received and failed to process it. One possibility is that it opened the connection to the LB IP but got a response directly from the server instead of from the LB IP. Is an asymmetric flow involved (forward traffic hits LB-A, reverse traffic hits LB-B, and LB-B does plain routing instead of translation, which breaks the session)?

"Then the client received a ACK (and not a SYN ACK as expected) from the server, which the server sent in replay to a SYN" I didn't get this part.

Why is a SYN-ACK not expected?

SYN, SYN-ACK and ACK are mandatory for any TCP-based communication, right? How can the client send an ACK without seeing a SYN-ACK from the server?

Better to wait for some expert analysis here. I am sure you will get one.

answered 28 May '13, 15:08

krishnayeddula
629354148
accept rate: 6%

edited 28 May '13, 15:45

0

It looks as if the server ignores the first SYN packet and then answers with ACK (Frame #134).

Possible reason: asymmetric routing. You see only one half of the communication and the other half is routed through a different interface/path. This usually happens in cluster environments (firewalls, load balancers, etc.), hence the different IP addresses. They are either NATed (firewall) or balanced (load balancer).

Questions:

  • Is there any cluster tool involved?
  • Where exactly did you take the server capture?

If the capture was taken on the server itself, then there must be two interfaces in the server (possibly with IP addresses in the same subnet) and the OS does not send the replies out of the same interface on which the requests came into the system. You don't see the SYN-ACK for the SYN in Frame #132, as it may have taken a route you did not monitor.

The strange thing here is Frame #134, which should not exist in this conversation at all, as it is an ACK from the server to the client.

If the capture was taken on a TAP or switch, there is most certainly a cluster tool involved. You see the requests coming from one cluster node and you don't see the answers as they are sent to the second cluster node. The second node possibly drops those packets, and that's why you don't see the SYN-ACK at the client.

Suggestion: Check your environment for misconfigured clustered devices (firewalls, load balancers) and/or a misconfigured server with dual interfaces (possibly in the same subnet; some versions of Windows do allow that!).

Regards
Kurt

answered 28 May '13, 23:35

Kurt Knochner ♦
24.8k1039237
accept rate: 15%

edited 28 May '13, 23:45
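The check suggested in this answer (does every SYN have a SYN-ACK in the same capture?) can be sketched in a few lines of Python. This is a hypothetical helper working on tuples hand-copied from the server listing in the question, not on a real capture file, and it ignores ports because only one connection is involved.

```python
CLIENT = "47.29.0.122"
SERVER = "192.168.140.26"

# (frame number, source, destination, TCP flags) from the server listing.
server_frames = [
    (132, CLIENT, SERVER, {"SYN"}),
    (133, CLIENT, SERVER, {"SYN"}),
    (134, SERVER, CLIENT, {"ACK"}),
    (135, SERVER, CLIENT, {"SYN", "ACK"}),
    (136, CLIENT, SERVER, {"RST", "ACK"}),
]


def unanswered_syns(frames):
    """Return frame numbers of SYNs that never got their own SYN-ACK.

    A SYN-ACK is credited to the most recent pending SYN in the opposite
    direction; earlier pending SYNs count as unanswered.
    """
    pending = {}      # (src, dst) -> frame numbers of SYNs awaiting a SYN-ACK
    unanswered = []
    for number, src, dst, flags in frames:
        if flags == {"SYN"}:
            pending.setdefault((src, dst), []).append(number)
        elif flags == {"SYN", "ACK"}:
            waiting = pending.pop((dst, src), [])
            unanswered.extend(waiting[:-1])
    for waiting in pending.values():
        unanswered.extend(waiting)
    return sorted(unanswered)


print(unanswered_syns(server_frames))
```

For the server listing this reports frame 132 as the SYN with no SYN-ACK visible before the retransmission, which is the gap this answer attributes to a path or interface that was not monitored.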

I think I ought to describe the environment better.

I have a client application sitting behind a FW and a LB. The LB has a VIP (24.114.118.166) and it balances the client traffic to either of two clustered servers. 192.168.140.26 is the payload IP on which a particular server receives from and transmits to the LB.

I run the capture on both servers monitoring the payload IP on which it communicates with the LB and on the client.

One more comment: the problem does not occur 100% of the time. About 1 in 30 attempts to establish the 3-way handshake fails.

Like you, I am puzzled by frame 134. I am not sure why this ACK is sent (I can see that it has Seq=1, which is different, but I'm not sure what it means).

The other thing I find weird is that the server does receive the first SYN packet, but since I didn't start the captures on the client and server at synchronized times, I don't know how long after it was sent.

After receiving the second SYN we see that odd ACK, then a SYN-ACK, and 3 seconds later both client and server get a RST...

(29 May '13, 09:00) RomanM

"I run the capture on both servers monitoring the payload IP on which it communicates with the LB and on the client."

Was there any capture filter in place? If so, what was it?

(29 May '13, 09:54) Kurt Knochner ♦

On the server: snoop -d e1000g1 port 8080 src 47.29.0.122 or dst 47.29.0.122 (anything leaving or coming to the payload IP on port 8080)

On the client: ./tcpdump -i dav0 tcp port 8080 (just anything on port 8080, since traffic is light)

(29 May '13, 09:59) RomanM

Are there several interfaces in the server, possibly with interface bonding?

If so, can you please capture the traffic on all interfaces in parallel? I still suspect that the SYN-ACK for the first SYN was sent through another interface and was then blocked/dropped somewhere (firewall, LB, etc.).

(31 May '13, 08:11) Kurt Knochner ♦