SSH.NET icon indicating copy to clipboard operation
SSH.NET copied to clipboard

SftpClient: Hang after ~500 MB with Dropbear server

Open DavidYawCSpeed opened this issue 6 months ago • 2 comments

After downloading about 500 MB of data, I'm getting a hang in DownloadFile.

I tested with files of various sizes: with a 1 MB (1,048,576 bytes) test file, 475 downloads work, and it deadlocks on the 476th one. With 2 MB files, 237 work, and it locks on the 238th. In all cases, it deadlocks shortly before 499,000,000 bytes.

This only seems to be an issue with the Dropbear SSH server with OpenSSH's sftp. When using OpenSSH on a desktop Linux, this issue does not occur.

I am assuming that this is not (strictly speaking) a bug in SSH.Net. Since it works with OpenSSH, I'm assuming it's Dropbear not acting quite right. But supporting slightly misbehaving SSH servers is a reasonable thing for SSH.Net to do.

Here is my test program.

using Renci.SshNet;
using System.Diagnostics;

namespace SshTransferTest;

internal class Program
{
    // This is my small embedded device. Low power. Running PetaLinux on a Xilinx SoC.
    // SSH Server is Dropbear v2020.80. (No, I can't change that.)

    // (Why 127.0.0.1? I'm on a work computer, with a strict VPN. I have to use an SSH tunnel
    // to make connections to machines on the local network.)
    const string host = "127.0.0.1";
    const int port = 2122;
    const string user = "root";
    const string privKey = "DeviceKey.id_rsa";
    const string largeFile = "/mnt/data/testLargeFile"; // 1.0 MB of /dev/urandom

    static void Main(string[] args)
    {
        Stopwatch sw = new();
        void callback(object? _)
        {
            Console.WriteLine("Running for {0}", sw.Elapsed);
            if (sw.Elapsed > TimeSpan.FromSeconds(30))
                Debugger.Break();
        }
        Timer lockupTimer = new(callback);

        SftpClient sftpClient = new(
            host, port, user,
            new PrivateKeyFile(typeof(Program).Assembly.GetManifestResourceStream(typeof(Program), privKey)!));

        sftpClient.Connect();

        for (int i = 0; i < 1000000; i++)
        {
            Console.WriteLine("{0}: Starting Download", i);
            lockupTimer.Change(TimeSpan.FromSeconds(5), TimeSpan.FromSeconds(5));

            sw.Restart();
            using (MemoryStream ms = new())
            {
                sftpClient.DownloadFile(largeFile, ms);
            }

            sw.Stop();
            lockupTimer.Change(Timeout.InfiniteTimeSpan, Timeout.InfiniteTimeSpan);
            Console.WriteLine("{0}: Complete in {1}", i, sw.Elapsed);

            // I tried having a sleep here, but it didn't seem to make any difference.
            //Thread.Sleep(1000);
        }
    }
}

When it deadlocks, here's what the call stack looks like:

  • Renci.SshNet.dll!Renci.SshNet.Sftp.SftpFileReader.Read() Line 106
  • Renci.SshNet.dll!Renci.SshNet.SftpClient.InternalDownloadFile(string path, System.IO.Stream output, Renci.SshNet.Sftp.SftpDownloadAsyncResult asyncResult, System.Action downloadCallback) Line 1371
  • Renci.SshNet.dll!Renci.SshNet.SftpClient.DownloadFile(string path, System.IO.Stream output, System.Action downloadCallback) Line 801
  • SshTransferTest.dll!SshTransferTest.Program.Main(string[] args) Line 62

Here's the exact line where it's deadlocks in SftpFileReader.Read():

public byte[] Read()
{
	ThrowHelper.ThrowObjectDisposedIf(_disposingOrDisposed, this);
	if (_exception != null)
	{
		ExceptionDispatchInfo.Capture(_exception).Throw();
	}
	if (_isEndOfFileRead)
	{
		throw new SshException("Attempting to read beyond the end of the file.");
	}
	BufferedRead value;
	lock (_readLock)
	{
		while (!_queue.TryGetValue(_nextChunkIndex, out value) && _exception == null)
		{
			Monitor.Wait(_readLock); //<-------------------
		}

Other things I tried

Disconnect/Reconnect

for (int i = 0; i < 1000000; i++)
{
    if (i % 200 == 0)
    {
        sftpClient.Disconnect();
        sftpClient.Connect();
    }

    Console.WriteLine("{0}: Starting Download", i);

With this change, the deadlock does not occur. (Tested about 120,000 iterations.) So it appears to be a limit on the number of bytes transferred per SFTP connection, not per SftpClient object lifetime.

Different Download Method

//using (MemoryStream ms = new())
//{
//    sftpClient.DownloadFile(largeFile, ms);
//}

using (var sftpStream = sftpClient.OpenRead(largeFile))
{
    byte[] temp = new byte[65536];
    long red;
    while ((red = sftpStream.Read(temp)) != 0)
    {
        // Keep reading
        lockupTimer.Change(TimeSpan.FromSeconds(5), TimeSpan.FromSeconds(5));
    }
}

With this change, the issue is still present. It still deadlocks, and still on the 476th download. With a different array size, still a deadlock on the 476th download.

DavidYawCSpeed avatar Jul 15 '25 22:07 DavidYawCSpeed

Here is what I think is happening:

SSH.NET opens a channel with a (very) large initial window size (2GB):

https://github.com/sshnet/SSH.NET/blob/03ae6bfa0f1bda7635d3a4eb8888bebf3fab1105/src/Renci.SshNet/Session.cs#L58

https://github.com/sshnet/SSH.NET/blob/03ae6bfa0f1bda7635d3a4eb8888bebf3fab1105/src/Renci.SshNet/Channels/ChannelSession.cs#L365-L367

dropbear clamps it to 500MB:

https://github.com/mkj/dropbear/blob/16106997d11615e5e2dfe477def062aed7ed0bca/src/sysoptions.h#L246

https://github.com/mkj/dropbear/blob/16106997d11615e5e2dfe477def062aed7ed0bca/src/common-channel.c#L909

SSH.NET only sends a window_adjust message after its window is practically depleted i.e. after 2GB of data received. LocalPacketSize is 64KB here:

https://github.com/sshnet/SSH.NET/blob/03ae6bfa0f1bda7635d3a4eb8888bebf3fab1105/src/Renci.SshNet/Channels/Channel.cs#L775-L780

So dropbear sends 500MB of data, assumes the window is depleted, and doesn't receive a message to adjust the window. So it does not send any more data.

So who is at fault? Well granted, the SSH.NET behaviour is questionable, but it appears to be in spec:

The 'initial window size' specifies how many bytes of channel data can be sent to the sender of this message without adjusting the window.

(https://datatracker.ietf.org/doc/html/rfc4254#section-5.1)

Implementations MUST correctly handle window sizes of up to 2^32 - 1 bytes.

(https://datatracker.ietf.org/doc/html/rfc4254#section-5.2)

Rob-Hague avatar Aug 03 '25 16:08 Rob-Hague

I do agree though that the library should be less unusual here so that this issue would not be hit

Rob-Hague avatar Aug 04 '25 06:08 Rob-Hague