[AusNOG] Detecting "hung" ssh sessions.

Nick Stallman nick at agentpoint.com
Mon Feb 22 14:14:39 EST 2016


You might be looking for the ServerAliveInterval (client side, ssh_config) 
and/or ClientAliveInterval (server side, sshd_config) settings, so that dead 
connections are detected and dropped faster.
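
As a rough sketch of the client side (not a tested recipe for your setup), the snippet below builds an ssh command line for a loopback-bound reverse tunnel with keepalive probing turned on. The host name, the port numbers and the helper name build_tunnel_command are made up for illustration; the options themselves (ServerAliveInterval, ServerAliveCountMax, ExitOnForwardFailure) are standard OpenSSH client options.

```python
def build_tunnel_command(server, remote_port, local_port=22,
                         alive_interval=30, alive_count_max=3):
    """Return an ssh argv for a reverse tunnel that exits when the link dies.

    All argument values here are illustrative assumptions, not taken
    from the original setup.
    """
    return [
        "ssh",
        "-N",  # no remote command, port forwarding only
        # Send an encrypted keepalive probe after this many idle seconds.
        "-o", f"ServerAliveInterval={alive_interval}",
        # Give up (and exit) after this many unanswered probes.
        "-o", f"ServerAliveCountMax={alive_count_max}",
        # Exit instead of lingering if the remote listener cannot be bound.
        "-o", "ExitOnForwardFailure=yes",
        # Bind the server-side listener to loopback only.
        "-R", f"127.0.0.1:{remote_port}:127.0.0.1:{local_port}",
        server,
    ]


if __name__ == "__main__":
    # Hypothetical host and port, shown only to print the resulting command.
    cmd = build_tunnel_command("tunnel@server.example.net", 2201)
    print(" ".join(cmd))
```

With these values the client should give up roughly 90 seconds after the server stops answering, at which point the ssh process exits and an existing "is it still running?" cron check can restart it. Setting ClientAliveInterval and ClientAliveCountMax in sshd_config on the terminating host does the same from the other end, which helps free the stale listening port.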

On 22/02/16 14:07, Ross Wheeler wrote:
>
> Hi Noggers.
>
> Looking for a "bright idea" or a point in the right direction.
>
> I have a bunch of remote devices that live behind NAT and firewalls, 
> in uncontrolled environments. It's not always (or even frequently) 
> possible to get those in charge of said NAT boxes to do PAT to my 
> devices, so instead I have each device ssh to one of my hosts and 
> create a reverse tunnel. (The tunnels are bound only to the loopback 
> interface on my server, so the end devices are not significantly 
> exposed to the outside world).
>
> As and when I need to access remote boxes, I ssh to the terminating 
> host, ssh to the appropriate port and have immediate shell access on 
> the remote box.
>
> Each remote box also periodically (cron) checks that the ssh session 
> is (still) running (simple ps) and (re)starts it if not.
>
> This generally works well.
>
> Alas, this morning, my provider had a brief oopsee (no explanation 
> forthcoming) where 100% of my external connectivity dropped for a few 
> minutes.
>
> This resulted in every last one of these tunnels breaking, but they've 
> broken in such a way that they didn't restart. The terminating host 
> shows no connections from any of the remote devices, yet all of the 
> remote devices still have their ssh session running. They simply are 
> not passing any traffic. Yes, I have keepalives enabled.
>
> Does anyone know of a simple, effective, reliable way to detect (from 
> the client end) the loss of end-to-end function of a tunnel like this 
> without going completely overboard? Installing replacement versions 
> of ssh isn't going to work for me, and neither is running autossh.
>
> Things I've looked at but had no luck with include adding a static 
> route to my server and looking for either byte counters or last-used 
> timers with netstat, looking for per-process traffic or TCP counters, 
> and a number of other failed avenues.
>
> I could add ipfw and pass traffic through a rule to observe if it's 
> passing traffic or not, but that has lots of other negative impacts, 
> especially on a few machines that are already balls-to-the-wall on 
> their network interfaces.
>
> I considered tcpdump to see when a packet was last received, but it 
> too has lots of other overheads.
>
> I'm sure I'm not the only person to have ever faced this, lots of 
> people will have overcome it, but I can't find any information on it. 
> (Any amount of help for unresponsive/stuck *interactive* sessions, but 
> that doesn't help me!).
>
> Fingers crossed someone here can throw me a line....
>
> RossW

-- 
Nick Stallman
Technical Director
Agentpoint Pty Ltd
The Real Estate Web Developers
Melbourne | Sydney | Miami
nick at agentpoint.com
www.agentpoint.com.au | www.zooproperty.com | www.ginga.com.au | 
www.business2.com.au

Business2.com.au is a real estate agent information website that helps 
you understand Portals, Technology and comes with FREE tools to help 
your Agency become an online success!