This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

Router appears to block proper TCP Session Teardown (FIN, ACK), Ever See This?


We have several Cisco (Linksys) RV082 routers in our networks. I have been trying to run down a problem with a new application that appears to be network related. After 3 weeks of investigation, it appears our RV082 router is terminating some TCP conversations before the full teardown handshake can take place. I have done simultaneous packet capture on both sides of the router and compared the conversations/sessions. (I can provide full packet captures if it comes to that).

Anyway, the client is INSIDE the network and the server is OUTSIDE. The client establishes an HTTP session with the server. This all looks OK. Data is requested and received. The problem starts when the server then requests that the session be terminated.

[Start of session...] Server: [FIN, ACK] Client: [ACK]

The client then sends 10 FIN, ACK packets (with exponentially increasing delays between them), none of which make it through the router. It appears that the router has slammed the door after the client ACKs the server's FIN, ACK...

This is not an isolated problem and I can see exactly the same behavior on many occasions. Although it appears that the conversation payload is being delivered, I suspect the application in question is throwing away the data if it thinks the port was not closed properly. I've been trying to educate myself on whether this is the root cause of our issue, and everything I read seems to indicate that this is not proper behavior.

I've posted this info on the Cisco SMB support forum, but they don't seem to have any interest in commenting on this behavior. (They reply with some boilerplate about making sure I don't have any viruses on my network and that all my NICs are working properly!)

Has anyone ever seen this before? Is this likely a bona fide bug in our router's firmware?

asked 25 Feb '13, 11:52

evanevery


2 Answers:


Replacing the routers may solve your problem, but I think it's premature to put all the blame on the routers. It seems that you also have an application that's not as well behaved as it could be. Let's start with the image you posted of captured traffic at http://homecommunity.cisco.com/t5/Wired-Routers/RV082-Dropping-Packets-captures-included/td-p/612638 .

It appears that the time display format was set to "Seconds since beginning of capture." In packet 8, the server sends its FIN/ACK. In packet 9, 39 ms later, the client responds with an ACK. In packet 10, over 39 seconds later, the client sends its own FIN/ACK.

So, the first question for the application developers is, "Why does the application wait 39 seconds to issue its own FIN/ACK when it has no data to send?"

On the Cisco site, you said the TCP timeout is set to 1800 seconds. That's a lot more than 39 seconds, but that's probably the timeout for an idle connection, that is, one on which no data has moved. Your router likely has a different, shorter timeout for a connection on which a FIN/ACK has been seen. I'd guess that this timer is 30 seconds or less. So, is the router "slamming the door prematurely"? Well, it's slamming the door before the client responds, but 39 seconds is a long time in networking. If this other timer is user configurable, you might be able to fix your problem by changing the timeout value to, say, a minute.
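To make the two-timer idea concrete, here is a toy model in Python. It is only a sketch under stated assumptions: the 30-second FIN timer is the guess made above, not a documented RV082 value, and `NatTable`, `Session`, and `handle` are made-up names, not anything from the router's firmware.

```python
from dataclasses import dataclass


@dataclass
class Session:
    last_seen: float        # time of the last forwarded packet, in seconds
    fin_seen: bool = False  # True once either side has sent a FIN


class NatTable:
    """Toy NAT/firewall session table with an idle timer and a FIN timer."""

    def __init__(self, idle_timeout=1800.0, fin_timeout=30.0):
        self.idle_timeout = idle_timeout  # value reported for the router
        self.fin_timeout = fin_timeout    # assumed value, a guess only
        self.sessions = {}

    def handle(self, key, now, syn=False, fin=False):
        """Return True if the packet is forwarded, False if it is dropped."""
        session = self.sessions.get(key)
        if session is not None:
            # A session that has seen a FIN uses the shorter timer.
            limit = self.fin_timeout if session.fin_seen else self.idle_timeout
            if now - session.last_seen > limit:
                del self.sessions[key]  # entry expired
                session = None
        if session is None:
            if not syn:
                return False  # no table entry and not a new connection
            session = Session(last_seen=now)
            self.sessions[key] = session
        session.last_seen = now
        session.fin_seen = session.fin_seen or fin
        return True


# Replay the timing from the capture (addresses are placeholders).
conn = ("192.0.2.10", 51000, "203.0.113.5", 80)
table = NatTable()
table.handle(conn, 0.0, syn=True)                # connection opens
table.handle(conn, 1.0, fin=True)                # server's FIN/ACK
table.handle(conn, 1.039)                        # client's ACK, 39 ms later
late_fin_forwarded = table.handle(conn, 40.0, fin=True)  # client's FIN/ACK
```

With the assumed 30-second FIN timer, the client's late FIN/ACK finds no table entry and is dropped, so `late_fin_forwarded` is `False`. Constructing the table with `fin_timeout=60.0` lets the same packet through, which is the effect of lengthening the timer.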

I see this behavior on my home network. My anti-virus software contacts the update server. After the data exchange is complete, the server sends a FIN/ACK. Sometimes the client responds with its own FIN/ACK within a second or two. In that case, the connection closes normally. Other times, the client waits a full minute or more before sending its FIN/ACK. By this time, the router has cleared the session information, so the router returns an ICMP "Destination unreachable, network unreachable" packet. The client sends six FIN/ACKs, and when it doesn't get a response to any of them, it finally sends a RST and then gives up. However, if I wasn't using Wireshark, I would never know anything was wrong. In fact, from the user perspective nothing is wrong.

However, in your case, as you posted on the Cisco site, "This causes a lot of problems with a particular application which simply throws away the whole conversation since the port/conversation 'wasn't closed properly.'"

So the second question for the application developers is, "Why does the application throw away an entire conversation that is known to be good simply because there was a technical problem with the last packet of the tear down procedure?"

The client knows that the server has no more data to send because it has received the server's FIN packet. The client can tell from the sequence number in the FIN packet that it has successfully received all the data that the server sent. The client can tell from the ACK number in the server's FIN packet that the server successfully received all the data that the client sent. So, regardless of whether the connection terminated correctly or not, the client can tell that all the data was transferred successfully.
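The check described above can be sketched in a few lines of Python. This is an illustration only: the function and parameter names are made up, and the numbers would normally come from a packet capture or from inside the TCP stack, since the ordinary sockets API does not expose sequence numbers to an application. It relies on the fact that the SYN consumes one sequence number, so the first data byte is ISN + 1.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers are 32 bits and wrap around


def next_seq(isn, payload_bytes):
    """Sequence number that follows `payload_bytes` of data after a SYN."""
    return (isn + 1 + payload_bytes) % SEQ_MOD


def transfer_complete(fin_seq, fin_ack, server_isn, bytes_received,
                      client_isn, bytes_sent):
    """Judge completeness from the server's FIN segment (assumed to carry no data).

    fin_seq, fin_ack: sequence and acknowledgment numbers in the server's FIN.
    server_isn, client_isn: initial sequence numbers from the handshake.
    bytes_received: contiguous payload bytes the client got from the server.
    bytes_sent: payload bytes the client sent to the server.
    """
    # The FIN sits right after the last data byte, so there is no gap.
    received_all = fin_seq == next_seq(server_isn, bytes_received)
    # The server has acknowledged every byte the client sent.
    sent_all_acked = fin_ack == next_seq(client_isn, bytes_sent)
    return received_all and sent_all_acked
```

For example, with a server ISN of 1000 and 5000 bytes received, the FIN should carry sequence number 6001; any other value means data is missing, whether or not the teardown later completes.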

It seems to me that the developers have been a bit lazy, and they're just assuming that the data may be incomplete or bad if the connection didn't close properly. If they would reprogram the app to actually determine from the sequence and acknowledgment numbers if all the data has been received properly, the app would be much more robust under real-world network conditions, like my anti-virus app.
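An application usually cannot see sequence numbers, but it can get the same assurance at its own layer. The sketch below assumes the protocol announces the body length up front (as HTTP does with Content-Length); `read_body` and `fetch_and_close` are hypothetical names, not code from the application discussed here. It treats a short read as a real failure and a problem during teardown as harmless.

```python
import socket


def read_body(sock, content_length):
    """Read exactly `content_length` bytes; raise if the peer closes early."""
    chunks = []
    remaining = content_length
    while remaining > 0:
        chunk = sock.recv(min(remaining, 65536))
        if not chunk:
            # recv() returned EOF: the peer's FIN arrived before all the data.
            raise ConnectionError(f"body truncated, {remaining} bytes missing")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def fetch_and_close(sock, content_length):
    """Return the body; errors while closing do not invalidate it."""
    body = read_body(sock, content_length)
    try:
        sock.shutdown(socket.SHUT_RDWR)
    except OSError:
        pass  # teardown problem only; the body is already known to be complete
    finally:
        sock.close()
    return body
```

The design choice is simply where the error boundary sits: completeness is decided by the byte count, not by whether the final FIN/ACK exchange succeeded.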

The fact that you're not having a similar problem with all your applications indicates that it's at least partly an application problem. The problem will go away when you replace your routers if the new routers have a FIN/ACK teardown timer that is longer than your application's response time; otherwise, you will still have the problem.

answered 25 Feb '13, 15:25

Jim Aragon

edited 25 Feb '13, 18:08

Thanks Jim! I'm in 100% agreement with your analysis. I've asked the developer about this on several occasions: why do they appear to ignore a perfectly good payload just because the teardown (port closure) may have returned an error? I still have not had any response from them... It's somewhat comforting to see that I was/am on the correct path here.

(26 Feb '13, 07:22) evanevery

It sounds as if this application was developed and tested on a LAN where it did not have to pass through any NAT or firewall devices, but it's not actually ready for service under real-world WAN network conditions. Any NAT router you buy will have a FIN teardown timer, and on a SOHO-class router like the RV082, it will not generally be user-configurable. As mentioned above, if the teardown timer is longer than the application's response time, you won't see this problem any more; however, the problem won't be gone, it will just be masked by the long timer. They really need to tune the app.

(26 Feb '13, 09:07) Jim Aragon

Is the application related to Commtouch Antispam (see the last comment in my answer)? If so, is your developer using the API of Commtouch?

(26 Feb '13, 11:25) Kurt Knochner ♦


As you've added much more information in the Cisco forum, I'll post the links here. If you don't like that, please add a comment and I'll delete the links.

https://supportforums.cisco.com/thread/2199005
http://homecommunity.cisco.com/t5/Wired-Routers/RV082-Dropping-Packets-captures-included/td-p/612638

So, to me it looks like the router's firewall clears the session table entry very soon after it receives the first FIN packet from the server. Any further packet from the client (FIN, ACK) will then be dropped and logged as "Connection Refused - Policy violation". So, this seems to be a firmware issue with the RV082 and should be solved by Cisco.

As an alternative, you could try to install OpenWRT, dd-wrt or NSLU2 on the RV082 (not sure if any of those properly support the RV082 - please google yourself ;-)).

Regards
Kurt

answered 25 Feb '13, 12:41

Kurt Knochner ♦

No problem posting those links... Thanks for asking though!

Thank you for confirming my suspicion. I could not get a straight answer anywhere else! Based on the lack of support from the Cisco SMB group, we are already in the process of getting the routers replaced. We are replacing them with Netgear UTM150s, which have some additional security features...

It really is a shame when someone works hard to isolate a pretty major problem and then the vendor does not even spend the time to read the findings...

Thanks so much for your information. I feel better knowing that replacing the RV082s is the correct thing to do!

(25 Feb '13, 13:19) evanevery

Before you replace the router, please consider what @Jim Aragon said about the behavior of your application (see his answer).

BTW: Is that an application you developed, or simply a web application accessed by a standard browser? If the latter, what happens if you use a different browser? Is any client code involved (Java, JavaScript, Flash)?

The mentioned IP address belongs to c1iprep1.ctmail.com, which is part of a mail security system (Antispam) of Commtouch Software.

(25 Feb '13, 17:30) Kurt Knochner ♦