Disconnects
I have hosts that I have used with Guacamole and the session stays open for ages.
With sshwifty, I now get disconnects in some cases after a few mins.
Is there any reason why the heartbeat might not be getting through? Or some other reason? It seems the server thinks the client is dead. Running in docker on a bridge network.
Try running Sshwifty with the environment variable SSHWIFTY_DEBUG set to 1.
$ SSHWIFTY_CONFIG=./sshwifty.conf.json SSHWIFTY_DEBUG=1 ./sshwifty
This enables Sshwifty to print more detailed logs, including every request it received on the server end. Maybe that will provide more clues on what's going on.
I can't swear this is the same issue, but the symptom is at least the same.
Debug log isn't very helpful, saying [WRN] Wed, 19 Nov 2925 08:57:14 CET Sshwifty > Server (127.0.0.1:3000) > Client (127.0.0.1:51592): Request ended with error: "/sshwifty/socket": websocket: Close 1006 (abnormal closure): unexpected EOF (5m1.676125988s)
Nginx logs are empty, but I guess the error (for me, can't talk for @andymarden) is in the reverse proxy settings, since I can't trigger it on LAN.
Hi @ech0corday, not really sure what's happened based on the log. But I feel the 5m1.676125988s runtime (the time Sshwifty backend spent on the request) is interesting since it's almost exactly 5 minutes.
If you're willing to dig deeper, see if it's always around 5 minutes when the disconnect happens.
Sshwifty does have mechanisms to keep its own websocket connection alive, but the intermediaries might impose their own limits.
I'll keep digging!
Is it possible to set log level? This is my first go (pun intended) with bug chasing in go, so I don't really know it by heart.
After testing a multitude of things it seems it has something to do with running it as a systemd service - at least at home. I can get it to disconnect when running it as a service, but not when running it through screen or tmux.
I'll do more tests with it at work tomorrow.
@ech0corday You can set environment variable SSHWIFTY_DEBUG to anything but empty to enable more detailed logs. But it's mainly helpful for indicating internal procedures, I don't think it'll provide much useful info to help debugging the connectivity problem.
Though, I have no clue as to why it worked fine under screen/tmux but became problematic under systemd. Sshwifty should act the same if the same settings are provided.
I've tested it for a couple of days now, and in screen it works flawlessly.
Only thing I can think about with the service is type. I'm running it as simple right now (because I'm lazy, don't check things and that's my default) but I guess this might be a forking thing, yes? I'll test that next.
@ech0corday I don't really believe screen vs. systemd would be the cause, assuming everything else is configured the same.
The log Close 1006 (abnormal closure): unexpected EOF indicates that the (websocket) client was disconnected without a proper sign off handshake. This could happen for many reasons, for example, the sign off handshake was sent but not received by Sshwifty, or the sign off was never sent (say client just unplugged).
Though, if the client failed to send the sign off, it is unlikely that the client would successfully send an EOF either. And if Sshwifty didn't receive an EOF, it'll wait for a configured timeout to expire, then drop the connection by itself. It is possible to trigger an unexpected EOF error when this happens, but it's rare; more often the error would just be something like read tcp ...: i/o timeout.
For comparison, if the connection was perfectly signed off, the log would instead be showing something like: websocket: close 1001 (going away).
In this case, unexpected EOF indicates that Sshwifty did receive an EOF and dropped its side of the connection, but the sign off handshake was missing.
Since there's an Nginx in the middle, meaning it can drop the link too (sending an EOF to both parties), my bet is on the Nginx side, not systemd, unless systemd applied some magic that I'm not aware of.
So my two cents is that something, maybe a timeout, triggered Nginx to shut off the connection.
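The three log signatures mentioned above can be told apart mechanically when going through a longer log. This is only a sketch in Go that does plain substring matching on the messages quoted in this thread; classifyDisconnect and its labels are invented for illustration and are not anything Sshwifty provides.

```go
package main

import (
	"fmt"
	"strings"
)

// classifyDisconnect (hypothetical helper) maps a log line to one of the
// disconnect kinds discussed in this thread. It relies only on the message
// fragments quoted above, compared case-insensitively.
func classifyDisconnect(logLine string) string {
	l := strings.ToLower(logLine)
	switch {
	case strings.Contains(l, "close 1006") && strings.Contains(l, "unexpected eof"):
		// EOF arrived, but no websocket close handshake was seen.
		return "abrupt"
	case strings.Contains(l, "i/o timeout"):
		// Nothing was read before the server-side read timeout expired.
		return "timeout"
	case strings.Contains(l, "close 1001"):
		// The client signed off properly ("going away").
		return "clean"
	default:
		return "unknown"
	}
}

func main() {
	lines := []string{
		"websocket: Close 1006 (abnormal closure): unexpected EOF",
		"read tcp ...: i/o timeout",
		"websocket: close 1001 (going away)",
	}
	for _, line := range lines {
		fmt.Printf("%-8s <- %s\n", classifyDisconnect(line), line)
	}
}
```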
There's a server setting in Sshwifty, ReadTimeout (or SSHWIFTY_READTIMEOUT). Try setting it to a value matching or lower than the timeout limit imposed by Nginx, and see if it improves things.
By default, Nginx sets its proxy_read_timeout to 60 seconds, so maybe try setting ReadTimeout to 30 on Sshwifty.
Yeah, using a different type didn't make any difference. Trying the ReadTimeout next - maybe I've just been lucky with running it from screen.
Setting "ReadTimeout": 3600 resolved the problem on my end.
For reference, here is the Nginx configuration I’m using:
location / {
    proxy_http_version 1.1;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_pass http://10.0.9.19:8182;
}
Longer ReadTimeout (I settled on 300) worked fine for me as well!
Odd. In my theory, a longer timeout should actually make this problem worse. Maybe I guessed wrong. I need to check something on my end to dig into this further.
@rebase-and-cry ReadTimeout is kinda security related: it limits how long a client is considered connected without sending any data. A timeout that's too long may expose the server to a DoS attack. Maybe try the value @ech0corday provided, which is 5 minutes instead of 1 hour.
Update: My theory above was indeed wrong.
Sshwifty will work fine under Firefox, but under Chrome, the browser slows down all timers created under a tab if the tab is considered inactive. That includes the timer which is responsible for sending signals to the Sshwifty backend to keep the connection alive.
In my own test, a ReadTimeout of 120 seconds should solve this problem, as Chrome still ticks the timer once every 60 seconds under idle conditions. So in the newer version of Sshwifty, I've set the default ReadTimeout to 120 seconds.
If you've already set your own ReadTimeout to something greater than 60 seconds, then it should already be good for you. Though during my test, when I set the timeout to 90 seconds, Chrome still got disconnected; that's why I chose 120 seconds as the default.
The updated default ReadTimeout is released with version 0.4.2.
Thank you all for the effort :)