Spikes in connection time

Open chrypnotoad opened this issue 6 years ago • 3 comments

We are seeing intermittent spikes in the time it takes for a user to establish a connection with our SSH server. The delay falls between the PublicKeyHandler returning and the Handler starting: the customer authenticates successfully, then nothing visible happens for anywhere from a couple of seconds to several minutes (9 minutes was the longest this week), and then the Handler kicks in and they are connected as usual. The time seems to be spent inside the gliderlabs/ssh code.

We have not been able to reproduce the issue ourselves. It does not happen to any particular customers or with any particular client, but we see some customer connections being delayed like this every day.

We only have questions right now:

  • Is it possible that there are some timeout values that we could configure around this?
  • Is it possible there is a prompt or some other customer interaction that it is waiting on?

chrypnotoad avatar Sep 30 '19 21:09 chrypnotoad

I don't believe there are timeouts, but adding them would require knowing what needs to be timed out. How are you "seeing" their connections delayed? Logs? Have you tried writing a test case that tries it a bunch to see if it happens?

progrium avatar Sep 30 '19 21:09 progrium

We are seeing these in our logs and monitoring. We log the time taken between several steps, including one measurement specifically for the gap between our authentication completing and the handler starting. We also run a load test, and it has never hit this issue.

chrypnotoad avatar Oct 01 '19 20:10 chrypnotoad

When I was maintaining the server for Bitbucket, we had a number of users who would open a connection and only use it later. We added a timeout to kill connections that took too long to open a session.

Would one of the timeouts mentioned in https://github.com/gliderlabs/ssh/issues/11 work? It seems like this issue might be a duplicate of that.

belak avatar May 10 '22 05:05 belak