bug: SSH Connection not Working

Open KimiyaMorozova opened this issue 11 months ago • 12 comments

Description

SSH connections disconnect immediately after authentication completes.

Image

Note: the error "nsenter: reassociate to namespace 'ns/time' failed: Operation not permitted" only happens when using Docker as an agent.

Steps to Reproduce

Connect to any machine.

Expected Behavior

The terminal connection should be established successfully.

Edition

Community

Version

0.18.0 or 0.18.3; the same thing happened with both.

Related Logs

ssh-1      | {"cols":80,"device":"d4995bc4de6651e62064aec8ef1c347169755f4708620771d72f4f015d3e81f6","error":"failed to read the message from socket\nread tcp 172.18.0.7:8080-\u003e172.18.0.8:53374: use of closed network connection","ip":"10.10.12.92","level":"error","msg":"failed to read the message from the client","rows":24,"time":"2025-03-11T14:46:07Z","user":"root"}

Related Code

No response

Additional Information

No response

KimiyaMorozova avatar Mar 11 '25 14:03 KimiyaMorozova

Can you please run the following command and share the output?

docker info

This will help us determine if Docker is running in rootless mode which could be related to the issue.

gustavosbarreto avatar Mar 12 '25 15:03 gustavosbarreto

ShellHub is running under Proxmox in an unprivileged LXC container, in case that matters.

 Version:    27.5.1
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.20.0
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.32.4
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 7
  Running: 5
  Paused: 0
  Stopped: 2
 Images: 13
 Server Version: 27.5.1
 Storage Driver: overlay2
  Backing Filesystem: zfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: true
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: bcc810d6b9066471b0b6fa75f557a15a1cbf31bb
 runc version: v1.2.4-0-g6c52b3f
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.8.12-8-pve
 Operating System: Debian GNU/Linux 12 (bookworm)
 OSType: linux
 Architecture: x86_64
 CPUs: 2
 Total Memory: 512MiB
 Name: ANK-CTShellHub
 ID: 87b926f5-3618-4b3a-b49c-47368dc1cd04
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
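
For reference, rootless Docker normally shows up as a `rootless` entry under "Security Options" in this output; the listing above contains only `seccomp` and `cgroupns`. A minimal Go sketch of that check, assuming the plain-text `docker info` layout shown above (the function names `securityOptions` and `isRootless` are hypothetical helpers, not part of Docker or ShellHub):

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// securityOptions returns the entries listed directly under the
// "Security Options:" key in plain-text `docker info` output.
func securityOptions(info string) []string {
	var opts []string
	header := -1 // indentation of the "Security Options:" line; -1 while outside the block
	sc := bufio.NewScanner(strings.NewReader(info))
	for sc.Scan() {
		line := sc.Text()
		trimmed := strings.TrimSpace(line)
		if trimmed == "" {
			continue
		}
		indent := len(line) - len(strings.TrimLeft(line, " "))
		switch {
		case trimmed == "Security Options:":
			header = indent
		case header >= 0 && indent <= header:
			header = -1 // a sibling key ends the block
		case header >= 0 && indent == header+1:
			opts = append(opts, trimmed) // deeper lines (e.g. "Profile:") are skipped
		}
	}
	return opts
}

// isRootless reports whether "rootless" is among the security options.
func isRootless(info string) bool {
	for _, o := range securityOptions(info) {
		if o == "rootless" {
			return true
		}
	}
	return false
}

func main() {
	// Excerpt copied from the output above.
	sample := " Security Options:\n  seccomp\n   Profile: builtin\n  cgroupns\n Kernel Version: 6.8.12-8-pve\n"
	fmt.Println(securityOptions(sample), isRootless(sample))
}
```

A negative result only rules out rootless mode; it says nothing about restrictions imposed by the surrounding LXC container.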

KimiyaMorozova avatar Mar 12 '25 15:03 KimiyaMorozova

Thanks for the additional information. Since ShellHub is running inside an unprivileged LXC container, this is likely the root cause of the issue.

gustavosbarreto avatar Mar 12 '25 17:03 gustavosbarreto

Running as a privileged LXC container results in the same error (same logs).

KimiyaMorozova avatar Mar 12 '25 17:03 KimiyaMorozova

Running as a privileged LXC container results in the same error (same logs).

Since the issue persists, it indicates that the restriction might be coming from other factors within the LXC environment on Proxmox. Even in a privileged LXC container, Proxmox may be restricting the use of time namespaces by enforcing additional security restrictions through AppArmor or seccomp.

I'm not a Proxmox expert, but it's clear that something is blocking access to the time namespace in your environment. This restriction needs to be addressed for nsenter to work properly.

If modifying the LXC configuration or host settings is not an option, you might need to use the ShellHub binary directly.
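
As a first diagnostic, it can help to see which namespace handles the environment exposes at all. The Go sketch below reads the links under `/proc/self/ns`; `namespaceLink` is a hypothetical helper, and a readable link does not prove that `nsenter` is allowed to join that namespace, since the "Operation not permitted" error comes from the join itself.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// namespaceLink returns the target of <procDir>/ns/<name>,
// for example "time:[4026531834]" on kernels with time namespaces.
func namespaceLink(procDir, name string) (string, error) {
	return os.Readlink(filepath.Join(procDir, "ns", name))
}

func main() {
	for _, name := range []string{"mnt", "pid", "net", "time"} {
		target, err := namespaceLink("/proc/self", name)
		if err != nil {
			// Missing entry: the kernel or the container does not expose it.
			fmt.Printf("%-4s unavailable: %v\n", name, err)
			continue
		}
		fmt.Printf("%-4s %s\n", name, target)
	}
}
```

Running it both on the Proxmox host and inside the LXC container, then comparing the `time` entries, shows whether the container sees a time namespace handle at all.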

gustavosbarreto avatar Mar 12 '25 18:03 gustavosbarreto

The nsenter error only occurs when connecting to another Docker agent. When the standalone option is used, the error does not appear; however, this log entry still occurs on the server side.

ssh-1      | {"cols":80,"device":"d4995bc4de6651e62064aec8ef1c347169755f4708620771d72f4f015d3e81f6","error":"failed to read the message from socket\nread tcp 172.18.0.7:8080-\u003e172.18.0.8:53374: use of closed network connection","ip":"10.10.12.92","level":"error","msg":"failed to read the message from the client","rows":24,"time":"2025-03-11T14:46:07Z","user":"root"}

KimiyaMorozova avatar Mar 12 '25 18:03 KimiyaMorozova

After installing the server in a VM and investigating some more, I found these logs on the client. I hope they help a little more.

Mar 25 08:32:21 ANK-CTDash runc[24762]: time="2025-03-25T08:32:21Z" level=info msg="Starting ShellHub" mode=multi-user version=v0.18.3
Mar 25 08:32:21 ANK-CTDash runc[24762]: time="2025-03-25T08:32:21Z" level=info msg="Listening for connections" mode=multi-user preferred_>
Mar 25 08:32:21 ANK-CTDash runc[24762]: time="2025-03-25T08:32:21Z" level=info msg="Server connection established" hostname=bc-24-11-94-e>
Mar 25 08:32:21 ANK-CTDash runc[24762]: time="2025-03-25T08:32:21Z" level=info msg="Sleeping for 24 hours" mode=multi-user preferred_host>
Mar 25 08:37:25 ANK-CTDash runc[24762]: time="2025-03-25T08:37:25Z" level=info msg="Using password authentication" user=root
Mar 25 08:37:25 ANK-CTDash runc[24762]: time="2025-03-25T08:37:25Z" level=info msg="New session request"
Mar 25 08:37:25 ANK-CTDash runc[24762]: time="2025-03-25T08:37:25Z" level=info msg="Request type got" type=shell
Mar 25 08:37:25 ANK-CTDash runc[24762]: time="2025-03-25T08:37:25Z" level=warning msg="inappropriate ioctl for device"
Mar 25 08:37:25 ANK-CTDash runc[24762]: panic: runtime error: invalid memory address or nil pointer dereference
Mar 25 08:37:25 ANK-CTDash runc[24762]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x95dad3]
Mar 25 08:37:25 ANK-CTDash runc[24762]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x95dad3]
Mar 25 08:37:25 ANK-CTDash runc[24762]: goroutine 29 [running]:
Mar 25 08:37:25 ANK-CTDash runc[24762]: os.(*File).Name(...)
Mar 25 08:37:25 ANK-CTDash runc[24762]:         /usr/local/go/src/os/file.go:56
Mar 25 08:37:25 ANK-CTDash runc[24762]: github.com/shellhub-io/shellhub/pkg/agent/server/modes/host.(*Sessioner).Shell(0xc000494300, {0xc>
Mar 25 08:37:25 ANK-CTDash runc[24762]:         /go/src/github.com/shellhub-io/shellhub/pkg/agent/server/modes/host/sessioner.go:65 +0x173
Mar 25 08:37:25 ANK-CTDash runc[24762]: github.com/shellhub-io/shellhub/pkg/agent/server.(*Server).sessionHandler(0xc000424000, {0xc92708, 0xc0004a0340})
Mar 25 08:37:25 ANK-CTDash runc[24762]:         /go/src/github.com/shellhub-io/shellhub/pkg/agent/server/session.go:121 +0x6cf
Mar 25 08:37:25 ANK-CTDash runc[24762]: github.com/gliderlabs/ssh.(*session).handleRequests.func1()
Mar 25 08:37:25 ANK-CTDash runc[24762]:         /go/pkg/mod/github.com/shellhub-io/[email protected]/session.go:263 
Mar 25 08:37:25 ANK-CTDash runc[24762]: created by github.com/gliderlabs/ssh.(*session).handleRequests in goroutine 27
Mar 25 08:37:25 ANK-CTDash runc[24762]:         /go/pkg/mod/github.com/shellhub-io/[email protected]/session.go:262 >
Mar 25 08:37:25 ANK-CTDash systemd[1]: shellhub-agent.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
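
The trace ends in `os.(*File).Name` called from `Sessioner.Shell`, right after the "inappropriate ioctl for device" warning. That pattern is consistent with (though it does not prove) a nil `*os.File` being used after PTY setup failed. The Go sketch below reproduces that failure mode in isolation and shows a defensive guard; `safeName` and `namePanics` are hypothetical helpers, not ShellHub code.

```go
package main

import (
	"fmt"
	"os"
)

// safeName is a hypothetical guard: it returns a placeholder for a nil
// handle instead of dereferencing it.
func safeName(f *os.File) string {
	if f == nil {
		return "<nil file>"
	}
	return f.Name()
}

// namePanics reports whether calling Name on f panics.
func namePanics(f *os.File) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	_ = f.Name() // dereferences f; a nil f triggers a runtime panic
	return false
}

func main() {
	var pty *os.File // assumed state: PTY allocation failed and left a nil handle
	fmt.Println("Name() on nil handle panics:", namePanics(pty))
	fmt.Println("guarded name:", safeName(pty))
}
```

In an agent, the more useful fix would be to return the PTY setup error to the caller rather than continue with a nil handle; the guard only illustrates where the crash originates.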

KimiyaMorozova avatar Mar 25 '25 08:03 KimiyaMorozova

I am facing the same issue. This is an unprivileged LXC on an up-to-date Proxmox host.

When I watch the agent's log, I get:

time="2025-08-19T17:28:38Z" level=info msg="Using password authentication" user=user
time="2025-08-19T17:28:38Z" level=info msg="New session request"
time="2025-08-19T17:28:38Z" level=info msg="Request type got" type=shell
time="2025-08-19T17:28:38Z" level=info msg="Session started" ispty=true localaddr="<ip>:33382" pty=/dev/pts/4 remoteaddr="<ip>:443" user=corvock
time="2025-08-19T17:28:38Z" level=warning msg="Write failed" err="binary.Write: some values are not fixed-sized in type *utmp.Utmpx" file=/var/run/utmp
time="2025-08-19T17:28:38Z" level=warning msg="Write failed" err="binary.Write: some values are not fixed-sized in type *utmp.Utmpx" file=/var/log/wtmp
time="2025-08-19T17:28:38Z" level=warning msg="exit status 1"
time="2025-08-19T17:28:38Z" level=info msg="Session ended" localaddr="<ip>:33382" pty=/dev/pts/4 remoteaddr="<ip>:443" user=corvock
time="2025-08-19T17:28:38Z" level=warning msg="read /dev/ptmx: input/output error"
time="2025-08-19T17:28:38Z" level=warning msg="Write failed" err="binary.Write: some values are not fixed-sized in type *utmp.Utmpx" file=/var/run/utmp
time="2025-08-19T17:28:38Z" level=warning msg="Write failed" err="binary.Write: some values are not fixed-sized in type *utmp.Utmpx" file=/var/log/wtmp
time="2025-08-19T17:28:38Z" level=info msg="Session ended"
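
The "some values are not fixed-sized" warnings come from Go's `encoding/binary` package, which refuses to encode values whose size is not fixed (for example a platform-sized `int`, a `string`, or a slice). The sketch below reproduces the message with hypothetical record types; they are assumptions for illustration and not the real `utmp.Utmpx` layout.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// looseRecord is hypothetical: the platform-sized int field has no
// fixed encoding size, so binary.Write rejects the whole struct.
type looseRecord struct {
	Pid  int
	Line [32]byte
}

// fixedRecord uses only fixed-size fields and encodes cleanly.
type fixedRecord struct {
	Pid  int32
	Line [32]byte
}

// encode writes v in little-endian order and returns the encoded length.
func encode(v any) (int, error) {
	var buf bytes.Buffer
	err := binary.Write(&buf, binary.LittleEndian, v)
	return buf.Len(), err
}

func main() {
	_, err := encode(&looseRecord{Pid: 1})
	fmt.Println("loose:", err)

	n, err := encode(&fixedRecord{Pid: 1})
	fmt.Println("fixed:", n, err)
}
```

Since these warnings are logged after "Session started", they look like a separate problem from the disconnect itself, although the thread does not confirm that.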

corvock avatar Aug 19 '25 17:08 corvock

Hi @corvock @KimiyaMorozova, could you help us reproduce this issue? We are not familiar with Proxmox, so a detailed guide or step-by-step instructions would be really helpful for us. Thank you!

gustavosbarreto avatar Aug 19 '25 19:08 gustavosbarreto

Instructions on how I installed the agent, or on how the machine is set up within Proxmox? I created an unprivileged Debian 12 LXC, installed Docker in the container, and then ran the script that installs the ShellHub agent from the command line in the LXC. I know there are some issues with running Docker within an unprivileged LXC. I haven't had an opportunity to go down that path, but I saw KimiyaMorozova state that it was also an issue with a privileged one, so I posted what I saw in mine from the agent side.

If you're looking for information on something else, let me know.

--Ryan

corvock avatar Aug 19 '25 20:08 corvock

@corvock Could you try to install it in a virtual machine? As stated above, that also did not work for me. If it does for you, I would love to know what I did wrong.

@gustavosbarreto What kind of step-by-step guide do you need? The entire process from LXC to ShellHub agent?

KimiyaMorozova avatar Aug 19 '25 20:08 KimiyaMorozova

@corvock Could you try to install it in a virtual machine? As stated above, that also did not work for me. If it does for you, I would love to know what I did wrong.

I did a clean install of Debian 13 in a VM. Without Docker installed, ShellHub standalone installed and worked as expected; with Docker installed, the Docker container installed and worked as expected.

I took a non-working LXC and made it privileged, and it logs in without the ns/time error.

So it is an unprivileged-container issue, which is most of what I have, but I know how to solve it if I continue forward.

corvock avatar Aug 19 '25 22:08 corvock