This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

Server 2008 send RST packet

0

Hi

We have the problem that all our Win7 x64 computers lose their connection to mapped drives on the 2008 R2 servers. We don't have this problem with WinXP, only with Win7, and not just with one server. After login, the script maps all the drives, but after a while the mapped drives show a red X in Explorer on the Win7 clients. WinXP never has any disconnected drives.

After a Wireshark trace on both sides, we could see that after the login script mapped the drives, the Server 2008 sends keep-alive packets on TCP 445 to the client every 120 seconds and the client responds with an ACK. But after a while the server doesn't send any more keep-alive packets; instead it sends a TCP 445 RST, ACK packet and resets the connection. Immediately the clients show a red X in Explorer. Sometimes it takes longer when the client keeps Explorer open instead of closing it, but it still happens.

What we did already:

  • disabled autodisconnect on the servers (KB297684)
  • set keepconn to 65535 on each Win7 client
  • disabled Chimney Offload State
  • completely disabled IPv6 on both sides
  • updated all NIC drivers
  • disabled power saving on all NICs
  • played with the spanning-tree and portfast settings on the Cisco switches
  • set the keep-alive interval to 120 seconds
  • disabled NetBIOS over TCP/IP on some test computers

Here is the trace from the computer. Keep-alive packets were sent for a few minutes, sometimes hours, and then, boom, the RST packet is sent from the server.

9876    15:57:30.826511000  Note    172.17.37.15    172.17.37.130   TCP 60  [TCP Keep-Alive] 445 > 49348 [ACK] Seq=2444 Ack=5180 Win=508 Len=1
9879    15:57:30.826662000  Note    172.17.37.130   172.17.37.15    TCP 66  [TCP Keep-Alive ACK] 49348 > 445 [ACK] Seq=5180 Ack=2445 Win=251 Len=0 SLE=2444 SRE=2445
10742   15:59:05.711556000  Chat    172.17.37.15    172.17.37.130   TCP 60  445 > 49348 [RST, ACK] Seq=2445 Ack=5180 Win=0 Len=0

Frame 10742: 60 bytes on wire (480 bits), 60 bytes captured (480 bits) on interface 0
    Interface id: 0
    WTAP_ENCAP: 1
    Arrival Time: Dec 11, 2012 15:59:05.711556000 Central European Time
    [Time shift for this packet: 0.000000000 seconds]
    Epoch Time: 1355237945.711556000 seconds
    [Time delta from previous captured frame: 0.000112000 seconds]
    [Time delta from previous displayed frame: 94.884894000 seconds]
    [Time since reference or first frame: 993.086588000 seconds]
    Frame Number: 10742
    Frame Length: 60 bytes (480 bits)
    Capture Length: 60 bytes (480 bits)
    [Frame is marked: True]
    [Frame is ignored: False]
    [Protocols in frame: eth:ip:tcp]
    [Coloring Rule Name: TCP RST]
    [Coloring Rule String: tcp.flags.reset eq 1]
Ethernet II, Src: Vmware_99:00:09 (00:50:56:99:00:09), Dst: Hewlett-_42:2b:83 (00:1f:29:42:2b:83)
    Destination: Hewlett-_42:2b:83 (00:1f:29:42:2b:83)
        Address: Hewlett-_42:2b:83 (00:1f:29:42:2b:83)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Source: Vmware_99:00:09 (00:50:56:99:00:09)
        Address: Vmware_99:00:09 (00:50:56:99:00:09)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Type: IP (0x0800)
    Padding: 000000000000
Internet Protocol Version 4, Src: 172.17.37.15 (172.17.37.15), Dst: 172.17.37.130 (172.17.37.130)
    Version: 4
    Header length: 20 bytes
    Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00: Not-ECT (Not ECN-Capable Transport))
        0000 00.. = Differentiated Services Codepoint: Default (0x00)
        .... ..00 = Explicit Congestion Notification: Not-ECT (Not ECN-Capable Transport) (0x00)
    Total Length: 40
    Identification: 0x4914 (18708)
    Flags: 0x02 (Don't Fragment)
        0... .... = Reserved bit: Not set
        .1.. .... = Don't fragment: Set
        ..0. .... = More fragments: Not set
    Fragment offset: 0
    Time to live: 128
    Protocol: TCP (6)
    Header checksum: 0x0f08 [correct]
        [Good: True]
        [Bad: False]
    Source: 172.17.37.15 (172.17.37.15)
    Destination: 172.17.37.130 (172.17.37.130)
    [Source GeoIP: Unknown]
    [Destination GeoIP: Unknown]
Transmission Control Protocol, Src Port: 445 (445), Dst Port: 49348 (49348), Seq: 2445, Ack: 5180, Len: 0
    Source port: 445 (445)
    Destination port: 49348 (49348)
    [Stream index: 2]
    Sequence number: 2445    (relative sequence number)
    Acknowledgment number: 5180    (relative ack number)
    Header length: 20 bytes
    Flags: 0x014 (RST, ACK)
        000. .... .... = Reserved: Not set
        ...0 .... .... = Nonce: Not set
        .... 0... .... = Congestion Window Reduced (CWR): Not set
        .... .0.. .... = ECN-Echo: Not set
        .... ..0. .... = Urgent: Not set
        .... ...1 .... = Acknowledgment: Set
        .... .... 0... = Push: Not set
        .... .... .1.. = Reset: Set
            [Expert Info (Chat/Sequence): Connection reset (RST)]
        .... .... ..0. = Syn: Not set
        .... .... ...0 = Fin: Not set
    Window size value: 0
    [Calculated window size: 0]
    [Window size scaling factor: -1 (unknown)]
    Checksum: 0x649c [validation disabled]
        [Good Checksum: False]
        [Bad Checksum: False]
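
As a cross-check on the timing in the trace above, here is a minimal Python sketch that computes the gap between the last Keep-Alive ACK (frame 9879) and the RST (frame 10742). The timestamps are copied from the trace lines; the trailing nanosecond zeros are dropped because Python's `%f` only accepts up to six fractional digits. The variable names are just for illustration.

```python
from datetime import datetime

FMT = "%H:%M:%S.%f"

# Timestamps from the trace (frames 9879 and 10742), truncated to microseconds.
last_keepalive_ack = datetime.strptime("15:57:30.826662", FMT)
rst = datetime.strptime("15:59:05.711556", FMT)

# Keep-alive interval mentioned in the question (seconds).
KEEPALIVE_INTERVAL_S = 120

gap = rst - last_keepalive_ack
print("seconds between last keep-alive ACK and RST:", gap.total_seconds())
print("RST arrived before the next keep-alive was due:",
      gap.total_seconds() < KEEPALIVE_INTERVAL_S)
```

The result agrees with the "Time delta from previous displayed frame" field in the frame details, and it shows that the RST arrived before the next 120-second keep-alive would have been due.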

Any suggestions on how we can find out more details about the RST packet sent from the server? Is it possible to see the process or something similar inside the data packet?

Many thanks, Wayne

asked 11 Dec '12, 08:08

wayne7215
1112
accept rate: 0%

edited 11 Dec '12, 10:39

SYN-bit ♦♦
17.1k957245

Can you provide a trace file of the disconnect? The trace should start with the 3-way handshake between client and server. For comparison, a trace file showing the handshake between your XP systems and the server would be perfect.

(11 Dec '12, 12:47) packethunter

Hi Packethunter

Thanks a lot for your investigation, I really appreciate it :)

The idle session time is set to -1 to turn it off, as mentioned in Microsoft KB297684. Should I change it to 65535 for a while instead? Could it be that this KB is wrong and -1 does not really turn it off? And yes, we always rebooted the server after we changed any settings.

If you say Session 1 ended because the client issued a tree disconnect... the only thing we did on this client was open Windows Explorer to check whether the drives were still connected and then close Explorer again. Nothing else, no logoff!

Session 3 is our DC server. Here again, never logged off, only closed the Explorer. But just closing the Explorer is not disconnecting the drives.

So if the session is closed from the server after 15 minutes and the net config server /autodisconnect:-1 is creating a value of -1 for the "Idle session time (min)", then it looks like this KB article from MS is really wrong!?!?!

I will change this to 65535 and see what happens.

@Landi

I have no idea why we don't have NBNS keepalive, but we want to disable NetBIOS anyway as soon as possible. We don't use any NetBIOS-related apps anymore and I would like to get rid of these broadcasts in our network. We enabled TCP keepalive on all clients as a first step because of these red X drive problems. I haven't tried disabling SMB2 yet, but will try that as well.

Thank you all for your suggestions!

(13 Dec '12, 02:21) wayne7215

4 Answers:

2

Hi Wayne!

Short answer:

What is the idle session timer on the Windows server? This timer can be identified with the command "net config server" on a command line (requires admin privileges).

You mentioned in your post that the parameter has already been changed. Did you reboot the server (or restart the server process) after changing the parameter?


Analysis and long answer:

Thank you for uploading and sharing the file. First of all, we see three connections to three different SMB file servers. The client handshakes (negotiate protocol request) match a Windows 7 system. All connections use SMB2. This indicates Windows Server 2008 or a filer licensed and configured for SMB2.

  1. Client 172.17.37.130 to server 172.17.37.15 (Ended by server with RST in frame 10742)
  2. Client 172.17.37.130 to server 172.17.37.70 (still running at end of trace)
  3. Client 172.17.37.130 to server 172.17.36.3 (Ended by client with RST in frame 1980)

Session 1 ended after the client issued a tree disconnect (in other words, disconnected the share) and a session logoff (in other words, a user logout). That's due either to user intervention or to a script issuing a "net use /delete" command.

Session 3 is disconnected by the server. In frame 802 the client disconnects the share (tree disconnect).

The tree disconnect in frame 802 refers to the IPC connection (the connection to named pipes, used for example for remote registry access and several other features). In this trace the IPC connection accesses the named pipe MsFteWds, which MSDN relates to the "Windows Search Protocol".

The last activity in the file system can be observed in frame 646 (Close response for file "dg"). Session 3 is terminated by the file server 948 seconds after the last file system activity or 935 seconds after the last SMB-activity. So the server resets the connection after 15 minutes plus change.

The idle session timer has a default value of 15 minutes.

A similar issue is described in http://social.technet.microsoft.com/Forums/en/winserverManagement/thread/dd476b45-59a6-49f8-9c1a-35e846c5587b.

A similar article related to older Windows versions is found in http://support.microsoft.com/kb/297684

answered 12 Dec '12, 07:49

packethunter
2.1k71548
accept rate: 8%

edited 12 Dec '12, 08:06

1

935 seconds could relate to the default 15-minute idle disconnect value plus 35 seconds, which I know well as the time it takes after a FIN until the RST closes the connection if no other FIN was received.

(12 Dec '12, 07:52) Landi
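
The timing figures quoted in the answer and in the comment above can be checked with a few lines of arithmetic. This is only a sketch in Python using the numbers stated in this thread (935 s, 948 s, and the 15-minute default); the variable names are made up for illustration.

```python
# Figures quoted in the answer and comment above.
IDLE_TIMEOUT_MIN = 15      # default idle session timer, in minutes
since_last_smb_s = 935     # seconds from last SMB activity to the RST
since_last_file_s = 948    # seconds from last file system activity to the RST

idle_timeout_s = IDLE_TIMEOUT_MIN * 60
extra_after_timeout_s = since_last_smb_s - idle_timeout_s

print("idle timeout in seconds:", idle_timeout_s)
print("seconds beyond the idle timeout:", extra_after_timeout_s)
print("gap between last file activity and last SMB activity:",
      since_last_file_s - since_last_smb_s)
```

So the 935 seconds break down into the 900-second default timer plus the 35 seconds mentioned in the comment.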

Oh, I noticed something strange! I just checked the "Amount of idle time required before suspending a session" setting in the local GPO on all our 2008 servers, and on all of them the value was 0! But the default should be 15 minutes. After I changed this value to 99999, it automatically changed the "Idle session time (min)" to 34463! So I guess when we ran the command "net config server /autodisconnect:-1" on all our servers, it also touched the "Amount of idle time required before suspending a session" policy!?

So now it looks like this: Amount of idle time required before suspending a session = 99999, Idle session time (min) = -1

(13 Dec '12, 03:07) wayne7215
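
One possible reading of the 99999 to 34463 change in the comment above is that the value is kept in an unsigned 16-bit field and wraps around. That is an assumption, not something confirmed anywhere in this thread, but the arithmetic fits. A small Python sketch (the helper name `wrap_u16` is made up for illustration):

```python
def wrap_u16(value: int) -> int:
    """Keep only the low 16 bits, as an unsigned 16-bit field would."""
    return value & 0xFFFF

# Assumption: the setting is stored in 16 bits. 99999 - 65536 = 34463.
print(wrap_u16(99999))
# Under the same assumption, -1 and 65535 share one bit pattern.
print(wrap_u16(-1) == wrap_u16(65535))
```

If this assumption holds, any value above 65535 entered in the policy would be silently reduced modulo 65536.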

Argh, the server is still sending RST packets. It looks like it doesn't care about the "Idle session time" at all. With -1 and with 65535 as the value, after a while the server sends a RST packet to the client. Should we really disable SMB2 now and give up the benefits of SMB2? I'll give it a try.

Current capture http://cloudshark.org/captures/1e1bfb6a3b44

With the filter "ip.addr==172.17.37.15": until frame 705 I was browsing through the mapped drive folder DG, and at frame 797 I closed Explorer. Then the keep-alives started every 2 minutes, and at frame 10488 there is an RST again.

(13 Dec '12, 05:37) wayne7215

18 hours now with the SMB1 protocol and without a single RST packet! So the problem occurs only with SMB2! But in our case SMB2 is more than twice as fast as SMB1. So we have to decide now whether to continue analyzing the SMB2 protocol or to use the slower SMB1 protocol.

(14 Dec '12, 00:08) wayne7215

I am just browsing through your file. As I am checking a few bits in the SMB2 headers this will take some time. Please be patient. (Or someone in the community might have the right idea in the meantime)


First of all: Congratulations on isolating the symptoms to SMB2. Yes. SMB2 is a lot faster, and pipelining helps in a few situations, too.

As Landi already recommended, get rid of NetBIOS over TCP. The trace shows a lot of background noise and users staring at hour glass cursors.

On top of that, you want to get rid of LLMNR (configured in the name resolution policy).

(14 Dec '12, 11:04) packethunter

1

Just a few more parameters that control server behavior:

  • SessionKeepAlive in HKLM\SYSTEM\CurrentControlSet\Services\Netbt\Parameters
  • ConnectionNoSessionsTimeout in HKLM\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters

Several other NetBIOS parameters control the number of users that can be logged on simultaneously. I assume that these limits were not changed.

answered 11 Dec '12, 13:16

packethunter
2.1k71548
accept rate: 8%

SessionKeepAlive = 3600000 and the ConnectionNoSessionsTimeout doesn't exist.

I will try to get the 3-way handshake packets filtered tomorrow. Can I do that with a capture filter like this: tcp[13] & 2!=0

Thx

(11 Dec '12, 14:56) wayne7215

Please publish the whole TCP session from 3-way handshake to RST. Don't use a capture filter. The interesting part includes NetBIOS protocol negotiation, session setup, tree connect etc.

A look at http://technet.microsoft.com/en-us/library/cc957471.aspx will show that the server disconnects the client if no share is mounted (i. e. the client is authenticated, but has not mapped a drive). The trace file will show, if this is the case.

(11 Dec '12, 15:13) packethunter

Hmmm, I'm a little bit afraid to upload a whole server-side capture file with all the NetBIOS names and information. Wouldn't it be a security leak to spread around a server-side trace without filtering it down to one computer? The other thing is, it can take hours until the first disconnect, so the file could be huge without filtering. I already have 2 captures, one from the client (1.2MB) http://cloudshark.org/captures/f8b85369c2c3 and one from the server (13MB). Would the trace from the client be enough, or do you need both?

Win7=172.17.37.130 server=172.17.37.15

Thx for your help, I really appreciate.

(12 Dec '12, 02:27) wayne7215

Are those IPs all real, or is there ANY device in between both machines, is the firewall on Server2008 enabled? If so, did you try disabling it?

Can you confirm that the actual RST packet was really sent from the server with the exact same timing by looking up a trace taken on the server?

(12 Dec '12, 03:12) Landi

The IPs are all real, with no NAT or routers in between, just 1 Cisco switch. And yes, the RST packet was sent from the server; I can see the same packet at the same time in the server trace. There is no firewall running on the server or on the Win7 client, and nothing in between at all.

(12 Dec '12, 03:26) wayne7215

I can't spot an indicator for this behaviour from looking at the packets. Can you check the windows event logs if there is something maybe related?

(12 Dec '12, 04:49) Landi

No, nothing in the event logs, neither on the server nor on the clients.

The strange thing is, I don't believe it's a server problem, because it's not sending an RST packet at regular intervals. If it were a timeout setting, it would always happen after the same interval, but it can happen after 40 minutes or after 4 hours. That's why we started to analyze the traffic with Wireshark. And these servers really don't have any high latency or utilization at all, so that can't be the reason either. We see it with brand new Win7 computers, without any third-party apps installed.

Could it be that a server sends a RST packet because of lost packets between client and server? Then the only cause could be our Cisco core switch and we would have to replace it, but our Cisco engineers told me everything looks normal on the switches.

I am rather at a loss.

(12 Dec '12, 06:00) wayne7215
1

I just did some research and got stuck on this: Few things you might want to take a look at:

  • Why is there no NBNS keepalive? (It should trigger before the TCP keepalive.)
  • Why IS there a TCP keepalive? I found a statement that, unless the application asks for it, TCP in Server 2008 and Vista+ doesn't use it.
  • What happens when you disable SMB2 and use SMB(1)?

I see different techniques that "sound" related to that:

  • TCP Keepalive
  • NBNS Keepalive
  • SMB2 Keepalive
  • SMB Echo Request/Response

I don't know when each of those applies, but in SMB(1) I see Echo and NBNS Keepalive!?

(12 Dec '12, 06:18) Landi

0

Just a question:

There is no 'additional' security software on that server, right? Something like Symantec Endpoint Protection or similar software?

Regards
Kurt

answered 14 Dec '12, 11:53

Kurt Knochner ♦
24.8k1039237
accept rate: 15%

There is, Symantec Endpoint Protection 12.1, but only AV without Firewall. Should I give it a try and uninstall SEP?

(14 Dec '12, 12:12) wayne7215

There are reports about similar problems in conjunction with SEP.

http://www.networksteve.com/forum/topic.php/Keep_losing_Win_2008_Server_share_in_XP_My_Network_Places/?TopicId=13241&Posts=4

Uninstall?

If you have a current backup of that server: Yes.

(14 Dec '12, 12:21) Kurt Knochner ♦

Sorry for the long delay. Yes, we uninstalled SEP 12.1 on the server and the problem still occurs. What we know for sure now is that it's definitely an SMB2 problem! If we disable SMB2 on the server so that all Win7 clients use the SMB1 protocol, we don't have any disconnects anymore! That's also the reason why it happens only with Win7 clients. But disabling SMB2 is the worst way of solving this disconnect problem, because of the performance we would lose.

@Landi The TCP keep-alive packets were the first thing we tried in order to handle this problem; we enabled them on the 2008 R2 server.

(11 Jan '13, 03:44) wayne7215

Is there any other software on the server that does not come with the default MS installation?

If no: this sounds like a support case for Microsoft !?!
If yes: what kind of software (non 'standard' network drivers, etc.)?

(11 Jan '13, 12:32) Kurt Knochner ♦

0

Wayne,

It's been a while since this thread was last updated, but I wanted to report that I ran into a similar-sounding problem. In my case, I was seeing TCP connections to a postgres database (from one Win 2008 server to another) terminating after 5 minutes of inactivity, despite the client application's database connection pool (org.apache.commons.dbcp) wanting to keep idle connections pooled for 10 minutes. The terminated connections were causing havoc on the server, which thought it had open connections sitting in the connection pool.

The solution was to force the "client" Windows 2008 box to generate KeepAlive packets. To do this, I had to add the key "KeepAliveTime" to HKLM\System\CurrentControlSet\Services\Tcpip\Parameters and set it to a value of something less than 5 minutes. This has the desired effect of generating KeepAlive packets on the TCP connections between the client and server, thus preventing whatever bit of the network stack was generating the RST packets from doing so until the connection was actually released.
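
A minimal sketch of the registry change described above, written as a small Python helper that only builds the `reg add` command line rather than running it. It assumes KeepAliveTime is a REG_DWORD expressed in milliseconds; the 4-minute value and the function name are just illustrations, not values taken from this thread.

```python
# Registry path as given in the answer above.
KEY = r"HKLM\System\CurrentControlSet\Services\Tcpip\Parameters"

def keepalive_reg_command(minutes: float) -> str:
    """Build a reg.exe command that sets KeepAliveTime (assumed to be in ms)."""
    milliseconds = int(minutes * 60 * 1000)
    return (f'reg add "{KEY}" /v KeepAliveTime '
            f'/t REG_DWORD /d {milliseconds} /f')

# Hypothetical example: 4 minutes, i.e. below the 5-minute idle cutoff.
print(keepalive_reg_command(4))
```

The printed command would need to be run from an elevated command prompt on the machine that should send the keep-alives.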

Dean

answered 21 Feb '14, 18:18

Dean
111
accept rate: 0%