TCP FIN with Data causing RST

asked 2020-10-20 18:00:17 +0000

Wes Evernden
1 ●3 ●1

Hi,

I am working on a problem where we get TCP resets from a cloud service provider every time the TCP FIN is sent with data. I have done quite a bit of searching and haven't come across anything saying that data can't be combined with a FIN in one segment, but at the same time I haven't come across any examples of a TCP close where the FIN packet carries data.

Here is the tcp close sequence with my comments:

1. Server: sends FIN with 165 bytes of data // ACK & PSH flags present; expect the client to ack 165 + 1 (data + 1 for the FIN)
2. Client: sends ACK ack'ing 166 // client has ack'd the data and the FIN, nothing outstanding to ack, no data in this packet
3. Client: sends 149 bytes // ACK flag present; expect the server to ack these 149 bytes
4. Client: sends FIN // ACK flag present, no data in this packet; now expecting an ack of the above 149 bytes + 1 for the FIN flag
5. Server: sends RST // ACK flag present; acks the 149 bytes, client FIN not ack'd
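
The ack arithmetic in the comments above can be sketched as follows. This is a minimal illustration, not output from the capture: the function names are hypothetical and the starting sequence numbers of 0 are placeholders chosen so the results line up with the relative numbers quoted in the sequence.

```python
def segment_length(payload_len: int, syn: bool = False, fin: bool = False) -> int:
    """Sequence space consumed by a segment: payload bytes plus 1 each for SYN and FIN."""
    return payload_len + int(syn) + int(fin)


def expected_ack(seq: int, payload_len: int, syn: bool = False, fin: bool = False) -> int:
    """Ack number that fully acknowledges the segment (sequence numbers are 32-bit)."""
    return (seq + segment_length(payload_len, syn, fin)) % 2**32


# Placeholder starting offsets of 0 (an assumption, not taken from the capture).
server_fin_with_data = expected_ack(seq=0, payload_len=165, fin=True)  # data + FIN
client_data = expected_ack(seq=0, payload_len=149)                     # data only
client_fin = expected_ack(seq=149, payload_len=0, fin=True)            # bare FIN

print(server_fin_with_data, client_data, client_fin)
```

With these placeholder offsets, an ack number of 149 on the final RST covers the client data but not the client FIN, which would need 150.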

The data flow is from client to server. The problem is that when the client app is notified that a RST was received from the server, it assumes the data wasn't received and needs to be resent. So that RST is causing us quite a bit of grief.

It seems to me the problem is on the server side: it shouldn't be sending a RST.

Thanks,

-Wes


Comments

Wireshark Wiki: tcp-ecn-sample.pcap
There are a few in Ultimate pcap running over IPv6.

Do you control the client? Why not combine #3 and #4 or hold #4 until #3 has been ACK'ed?

Chuckc ( 2020-10-20 21:02:57 +0000 )

2 Answers


answered 2020-10-21 12:08:10 +0000

SYN-bit

18585 ●9 ●359 ●255 https://SYN-b.it

Setting the TCP FIN flag just means you are done sending data. That is usually done in a separate packet with no data, but it is also allowed to set the FIN flag on the last segment of data that is being sent. After the FIN from side A, side B is still allowed to send data until it sends a FIN itself.

So there is nothing out of the ordinary in your packet flow. The only thing I can imagine is that there is a session-based device between the client and the server. Think of firewalls, load balancers (application delivery controllers), etc. These devices usually have a long timeout for an established session and a short timeout once one of the sides has sent a FIN. So in your case, how much time is there between the packets? Also, please check the IP TTL of the RST and compare it to the other packets coming from the server. If they differ, that could be an indication that an intermediate device was responsible for the RST and not the server itself.
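
The TTL comparison suggested above could be sketched like this once the TTL values have been exported from the capture. The function name and the sample TTL values are hypothetical, and a matching TTL does not prove the server sent the RST, since a middlebox can spoof it.

```python
from collections import Counter


def rst_ttl_differs(server_ttls: list[int], rst_ttl: int) -> bool:
    """Return True if the RST's IP TTL differs from the most common TTL
    seen on the other packets coming from the server's address."""
    if not server_ttls:
        raise ValueError("need at least one non-RST packet from the server")
    baseline, _count = Counter(server_ttls).most_common(1)[0]
    return rst_ttl != baseline


# Made-up example values: the server's packets arrive with TTL 52.
print(rst_ttl_differs([52, 52, 52, 52], rst_ttl=52))  # same path, likely same sender
print(rst_ttl_differs([52, 52, 52, 52], rst_ttl=64))  # different TTL, worth a closer look
```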

If these tips don't help, please provide the pcap files if possible (use Tracewrangler to remove sensitive information and translate IP addresses if there is sensitive data in the pcap files). You can use any public file sharing service (like OneDrive, Dropbox, Google Drive, etc.).


Comments

Thanks for the responses. Yes, I thought some device in the middle might be sending the reset, but that doesn't seem to be the case. The timing is right, the TTL is right and the IP ID is consecutive. The entire close I described happens within 1 RTT plus 1 ms for the client sends. I am capturing right at the edge of our network, so the resets are not coming from our gear.

Of course the cloud provider, the ISP, and even us are all saying the same thing, that is, we have thousands of computers and no one is experiencing this problem so it's not us. Well, at least we can all agree on that.

Wes Evernden ( 2020-10-21 17:00:18 +0000 )

Just to verify, does this session closure actually cause problems, given that all data does get acked? The only thing that does not get acked is the FIN from the client.

Also, is this a TLS based protocol? If so, are the last segments of the connection "Application data" or "(Encrypted) Alerts"?

How often do you see these RST packets? Every stream? Or just once in a while?

Since this is a cloud service provider, is their service publicly reachable?

SYN-bit ( 2020-10-21 22:51:47 +0000 )

Yes, the reset from the server is logged as a Socket Error by the client application. The client app, which is sending email to the cloud service for scrubbing, marks the emails as not sent. The result is that email stops flowing and the send queue grows. About 6 out of 10 connections end in a reset, so some email does get through. The workaround is a script that keeps restarting the send queue, which has the effect of trying to send more frequently, which in turn means more emails get through. This keeps the queue at a reasonable level.

Yes, this is SMTP inside SSL. The last FIN with payload is not parsed as an SSL Alert, so presumably it is app data.

Yes, it is a publicly accessible service. That has me thinking of extreme hardening on the cloud service side, and this may be a by-product of that.

Wes Evernden ( 2020-10-22 00:56:04 +0000 )

Interesting, so THEY are sending data in the FIN to you, and only when THEY do that is the session not properly closed. In a session that is closed normally, do the last data and the FIN from their side come right after one another, or does the FIN from their side come after the last data from YOUR side? If so, then it looks like something on their side loses its session straight after the (first) FIN was seen.

If it is a publicly accessible cloud provider, can I test it myself? If you don't want to mention their servername here, you could send it by email (my address is in my profile).

SYN-bit ( 2020-10-22 13:46:03 +0000 )

Yes, the FIN THEY send is right after a data packet (< 1 ms). This packet's payload is always 135 bytes. For a connection that is reset, the data in the FIN packet is always 165 bytes. That's interesting, and there are other interesting observations: we have one server that doesn't have this problem, and the problem comes and goes at times that don't line up with any of our changes. All interesting, but that still isn't enough to convince the cloud service provider to get a packet capture on their side.

Wes Evernden ( 2020-10-22 18:48:24 +0000 )

answered 2020-10-20 21:09:41 +0000

Jaap

13782 ●717 ●115

It's a rather coarse way for the server to break off the TCP connection completely, allowing it to free its resources. Otherwise it would have to keep them for the time it takes for the connection to really time out. You said in item 5 that the ack for the 149 bytes is sent by the server, so the client should be able to conclude that this data was transferred.
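
As a capture-analysis aid, the reasoning about the ack in item 5 can be sketched as below. The function name is hypothetical, relative sequence numbers without wraparound are assumed, and this is not something an application normally sees through its socket API, which is presumably why the client only reports a socket error.

```python
def ack_coverage(ack: int, data_start: int, data_len: int) -> tuple[bool, bool]:
    """Given the ack number carried by a segment (here, the RST), report whether
    it covers the client's data and whether it also covers the client's FIN.

    Assumes relative sequence numbers with no 32-bit wraparound, and a FIN
    sent immediately after the data.
    """
    data_end = data_start + data_len   # first sequence number after the data
    data_acked = ack >= data_end
    fin_acked = ack >= data_end + 1    # the FIN consumes one more sequence number
    return data_acked, fin_acked


# 149 bytes of client data starting at a placeholder relative offset of 0.
print(ack_coverage(ack=149, data_start=0, data_len=149))  # (True, False)
```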

Maybe if you could put a capture file on a publicly accessible share and post the link here, this might prove more insightful.
