
SSH Connection randomly drops (Palo Alto FW in between)

asked 2019-05-27 10:21:02 +0000

an.schall

An SSH connection to a particular server drops randomly (usually 20-60 seconds after login). Between the client and the server is a Palo Alto firewall with SSH decryption disabled.

What I tried so far

  • regenerated ssh keys on the server
  • added to server config: ClientAliveInterval 30 ClientAliveCountMax 5
  • added ServerAliveInterval=10 to ssh command
  • added ServerKeepAlive=true to ssh command
  • tried various ssh clients
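For reference, a rough sketch of how long each side would wait before giving up under the keepalive settings listed above. It assumes OpenSSH's documented behaviour of disconnecting once the configured number of consecutive probes goes unanswered, so the budget is roughly interval times count. The helper name `keepalive_timeout` is hypothetical, and `ServerAliveCountMax` is not given in the post, so OpenSSH's default of 3 is assumed.

```python
def keepalive_timeout(interval_s: int, count_max: int) -> int:
    """Approximate seconds of silence after which OpenSSH gives up on the peer."""
    return interval_s * count_max


# Server side, values from the post: ClientAliveInterval 30, ClientAliveCountMax 5
server_gives_up_after = keepalive_timeout(30, 5)

# Client side: ServerAliveInterval=10 is from the post; ServerAliveCountMax is
# NOT stated there, so the OpenSSH default of 3 is an assumption.
client_gives_up_after = keepalive_timeout(10, 3)

print(server_gives_up_after, client_gives_up_after)
```

Keepalives of this kind only detect an unresponsive path; they do not repair one, so they cannot help if packets are being dropped in transit.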

Nothing has worked so far. Notice the debug3: send packet: type 80 and debug3: send packet: type 1 messages just before and after the moment the connection is dropped. The firewall logs the SSH session, and the termination reason is "tcp-rst-from-client".

I took a packet capture on the firewall itself. Palo Alto allows capturing four different flows:

  • drop: When packet processing encounters an error and the packet is dropped.
  • firewall: When the packet has a session match or a first packet with a session is successfully created.
  • receive: When the packet is received on the dataplane processor.
  • transmit: When the packet is transmitted on the dataplane processor (from here)

It seems like the client sends a TCP RST to the server. I am not an expert at analyzing such traces and would appreciate any support from you experts. I would like to attach the capture to this thread, but it seems my karma is pretty bad ;)

Thanks in advance.


Comments

Post your capture on a public file share, e.g. Google Drive, DropBox etc. and post a link to it back here as a comment.

grahamb ( 2019-05-27 10:29:04 +0000 )

Hi Graham, thanks for the advice. Here you go: https://we.tl/t-zNf4VJ45YU

an.schall ( 2019-05-27 12:12:35 +0000 )

A network diagram would be of great help here, because an asymmetric path is involved together with an FHRP (check the MAC addresses).

The client probably resets the connection because of a timeout. There is a gap in SEQ# between the client's last ACK and its RST, which means some of the client's packets were lost before the capture point.

Packet_vlad ( 2019-05-27 13:16:37 +0000 )

1 Answer


answered 2019-05-27 16:29:18 +0000

SYN-bit

There is a gap of 220 bytes in the sequence numbers between frames 770 and 1074. The RX trace of the PaloAlto contains no packets for this TCP session between those two frames. This indicates that the client sent a 220-byte TCP segment that did not arrive at the PaloAlto. If this were random packet loss, the retransmission of that segment would have been seen at the PaloAlto.

As the RST comes after a while, my guess would be that the packet from the client with sequence number 4082 and TCP segment length 220 is retransmitted a couple of times but systematically dropped somewhere between the client and the PaloAlto. The client then sends the RST because it is not receiving an ACK for any of the retransmissions and so concludes that the connection is lost. Is there an intrusion prevention system in place between the client and the PaloAlto?
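The sequence-number reasoning above can be sketched as a small check over one direction of a TCP stream. This is only an illustration: the function name `find_gaps` and the tuple layout are hypothetical, the payload length of 0 for frame 770 is an assumption (it is described in the thread as the client's last ACK), the RST's relative SEQ of 4302 is the value reported in a later comment, and SYN/FIN sequence consumption is ignored.

```python
def find_gaps(segments):
    """Return (frame_no, missing_bytes) for every forward jump in SEQ.

    segments: iterable of (frame_no, relative_seq, payload_len) tuples for
    ONE direction of a TCP stream, in capture order.
    """
    gaps = []
    expected = None  # next sequence number we expect from the sender
    for frame_no, seq, length in segments:
        if expected is not None and seq > expected:
            # The sender has moved past data we never captured.
            gaps.append((frame_no, seq - expected))
        end = seq + length
        expected = end if expected is None else max(expected, end)
    return gaps


# Client -> server direction as seen at the firewall (lengths are assumed).
client_to_server = [
    (770, 4082, 0),   # last client ACK seen at the firewall
    (1074, 4302, 0),  # RST, carrying a SEQ beyond anything captured
]

print(find_gaps(client_to_server))
```

A non-empty result means the sender believes it has transmitted bytes that never reached the capture point, which is the situation described for the 220-byte segment.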

One other note, the packets from the client are routed over the Standby router of the HSRP cluster with cluster IP address 10.5.226.254. Is this asymmetrical path by design or might there be something wrong with the routing?
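As a small aid for the "check MAC addresses" step, the sketch below separates well-known FHRP virtual MAC addresses from other addresses. The prefixes used are the standard HSRPv1, HSRPv2 (IPv4) and VRRP (IPv4) virtual MAC ranges; the function name `classify_mac` is hypothetical, and the sample address `00:00:0c:07:ac:01` is illustrative rather than taken from the capture.

```python
# Well-known virtual MAC prefixes (first five octets) for IPv4 FHRPs.
FHRP_PREFIXES = {
    "00:00:0c:07:ac": "HSRPv1",
    "00:00:5e:00:01": "VRRP (IPv4)",
}


def classify_mac(mac: str) -> str:
    """Label a MAC address as an FHRP virtual MAC or as something else."""
    mac = mac.lower().replace("-", ":")
    prefix = mac[:14]  # "xx:xx:xx:xx:xx" is 14 characters
    if prefix in FHRP_PREFIXES:
        return FHRP_PREFIXES[prefix]
    if mac.startswith("00:00:0c:9f:f"):
        # HSRPv2 IPv4 range: 00:00:0c:9f:f0:00 through 00:00:0c:9f:ff:ff
        return "HSRPv2"
    return "physical or other"


print(classify_mac("00:00:0C:07:AC:01"))  # illustrative HSRPv1 group 1 address
print(classify_mac("52:54:00:12:35:02"))  # MAC mentioned later in this thread
```

This only tells virtual addresses apart from the rest; deciding which physical router actually forwarded a frame still requires matching the frame's MAC against the HSRP hellos in the capture.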


Comments

It is late and I cannot see the reason for the following observation at the moment. Does anybody have an idea why, in the Transmit trace, the RST in frame 747 comes with a SEQ of 4302 instead of 4082 (relative SEQ)?

Christian_R ( 2019-05-27 22:42:15 +0000 )

That's because the client emitted some data packets that we don't see in the capture. For the rest, a network diagram is needed.

BTW, as I understand it, Wireshark doesn't pay attention to a SEQ# jump in an RST. Would it be useful to fix this?

Packet_vlad ( 2019-05-28 03:58:24 +0000 )

It would be helpful to see a capture at the client's end as well as a capture at the PaloAlto, taken at the same time, and preferably with a couple of SSH sessions in it to see whether the behavior is consistent. Would that be possible, @an.schall?

SYN-bit ( 2019-05-28 06:34:23 +0000 )

@SYN-bit: Yes, I will try to find some time today to create the client-side capture.

an.schall ( 2019-05-28 07:54:36 +0000 )

Great, please also make the PaloAlto traces at the same time again :-)

SYN-bit ( 2019-05-28 08:05:26 +0000 )

I uploaded the four firewall traces as well as the client-side trace, which can be found here:

https://we.tl/t-Ie6QeCrQnD

an.schall ( 2019-05-28 10:23:32 +0000 )

There seems to be a proxy between the client and the server. It has MAC address "RealtekU_12:35:02 (52:54:00:12:35:02)" and terminates the session at least at the TCP layer (I see different absolute sequence numbers, different ip.id numbers and an ip.ttl of 64). It also aggregates some TCP messages into one.

This proxy ACKs the last 3 data packets from the client, but they never arrive at the PaloAlto. Then after a while it sends a RST to both sides, to the client side with the proper SEQ and ACK, and to the server side with a SEQ number that includes the missing data. So the problem must be between this proxying device and the PaloAlto.
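The kind of comparison used to spot such a device can be sketched as follows: the same connection is observed on both sides of the suspected proxy, and fields that a plain router would leave untouched are compared. The class `SideView`, the function `proxy_indicators` and all numeric values are hypothetical placeholders, not values from the uploaded traces. TTL is left out of the check because interpreting it requires knowing the hop count.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SideView:
    """How the client's packets look at one capture point."""
    isn: int                  # absolute initial sequence number of the SYN
    ip_ids: Tuple[int, ...]   # a few ip.id values from the client's packets


def proxy_indicators(client_side: SideView, firewall_side: SideView) -> List[str]:
    """List the hints that the TCP session was terminated and re-originated."""
    hints = []
    if client_side.isn != firewall_side.isn:
        hints.append("absolute sequence numbers differ")
    if set(client_side.ip_ids).isdisjoint(firewall_side.ip_ids):
        hints.append("ip.id values differ")
    return hints


# Hypothetical values for illustration only.
at_client = SideView(isn=1_000_000, ip_ids=(101, 102, 103))
at_firewall = SideView(isn=77_000_000, ip_ids=(40_001, 40_002))

print(proxy_indicators(at_client, at_firewall))
```

An empty list does not prove the path is clean, and some middleboxes rewrite sequence numbers without fully proxying the session, so the hints are a starting point rather than a verdict.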

It would be helpful if an overview could be given of all systems involved (endpoints, proxies, routers and switches).

And could you make a trace ...

SYN-bit ( 2019-05-28 13:01:26 +0000 )

@SYN-bit good spot

Christian_R ( 2019-05-28 18:41:37 +0000 )


Stats

Asked: 2019-05-27 10:21:02 +0000

Seen: 2,622 times

Last updated: May 27 '19