sunset icon indicating copy to clipboard operation
sunset copied to clipboard

Dev/sftp start: Implementing basic SFTP functionality

Open jubeormk1 opened this issue 5 months ago • 18 comments

Hi there @mkj,

This is an early draft of a pull request to help with the SFTP subsystem for sunset. I have been reading and working on proto.rs and I would like you to see my commits and discuss problems and further steps. If you prefer other channels to discuss this I am happy to do it in nay other way.

To begin with, there were a number of build errors and to make incremental steps I opted to comment out and out mod sftpserver from lib.rs and the macro_rules! sftpmessages in recent commits (1,2,3). This way I could focus and use cargo-expand to analyse the SSHEncode/SSHDecode macro expansions.

The 20th of August I was trying to fix small things without getting into how the sftp module would work In the commits from the 20th of August I fixed some minor problems and is likely that I will have to revert some changes later on, like 4, 5.

Before working on the big entity in proto.rs, sftpmessages!, I need to fix some issues with StatusCode. I would like to ask you if I am using SSHEncode/SSHDecode macros correctly for StatusCode since I felt forced to add to it a lifetime which might not be necessary. Also I believe that SSHEncodeEnum is not working as expected (6, 7). I added sftp/out/cargo-expand-sftp.rs as reference to the expanded SSHEncode.

I also would like to mention that I had to modify some files outside of the sftp module as you will see in Files changed tab to implement or alter visibility of elements (src/packets.rs, src/sshwire.rs)

jubeormk1 avatar Aug 22 '25 06:08 jubeormk1

In 8 and 9 I have fixed the CI issue

jubeormk1 avatar Aug 25 '25 07:08 jubeormk1

After that I have:

  • added SSHDecode for StatusCode in 10,
  • added SSHEncode/Decode for name11
  • and in the rest of the commits I have been uncommenting more bits of sftpmessages! but it is not quite ready yet but it is not throwing building errors.

As a minor change I have clarified that the module will follow SFTP v3 specifications in 12

jubeormk1 avatar Aug 25 '25 07:08 jubeormk1

We have a complete, not complaining sftp/src/proto.rs. However I restricted the lifetime for SftpPacket and decode_{response, request} methods so it has the same lifetime as SSHSource. This might have consequences once this structures start being used in sftpserver.rs and future demos.

Apart from that, I have avoided the use or sunset::Error in decode_{response, request} methods. That means that no Eerror::bug() or error::SSHProto.fail() are returned.

There is a number of TODOs in the file, mainly regard to:

  • The points made before
  • handling extension packets or fields
  • Variable number of ExtPair

jubeormk1 avatar Aug 28 '25 05:08 jubeormk1

Hi again,

I am starting to process the sftp subsystem, I have deserialized init packets and the first request SSH_FXP_REALPATH packet using the definitions and functions of sftp/src/proto.rs. The demo sftp server is not ready for any review so I am not pushing it yet.

I have made mayor changes in sftp/src/proto.rs that are worth mentioning. They consist of a redefinition of thesftpmessages! macro. The changes solve the issue of deserializing and serialising the Request Ids while keeping them out of the inner sftp packet definition. Is the complexity introduced worth it versus adding the ReqId to the inner packet definition?

The use of the macro has changed as well and maybe for good.

jubeormk1 avatar Sep 03 '25 10:09 jubeormk1

I am working on providing basic functionality to handle the sftp initialisation and adjusting the sftpserver trait to expose as little as possible protocol details to the user. I also have to handle properly errors inside sftphandle.

The code at this point is able to start an sftp session after running debug_sftp_client.sh but it does not fail gracefully for commands not included in the sftpmessages macro.

jubeormk1 avatar Sep 05 '25 00:09 jubeormk1

At this point, I need to handle the case where after extracting from the buffer_in a sensible SFTP request, I have left only a fragment of a request that cannot be decoded without some extra information that will not be available until the buffer is read from stdio.

I am exploring storing the fragment of the request in an auxiliar buffer to make sense of it after the next buffer read. Did you have to deal with this situation at a different level @mkj?

For now, I am going to use an auxiliary buffer to assemble the request. More coming soon...

jubeormk1 avatar Sep 25 '25 06:09 jubeormk1

For now, I am going to use an auxiliary buffer to assemble the request. More coming soon...

Yeah I think a buffer is probably necessary (if SSHWire had partial decoding it'd still need to store intermediate parts in memory somewhere).

As an optimisation it might be possible to special_case FXP_DATA and avoid needing to store the data in the aux buffer, instead copy it to the destination? There would still be a need for relative large SSH_FXP_NAME lists though I guess.

mkj avatar Sep 26 '25 05:09 mkj

Thanks for the ideas. For now I am writing a helper where I can take enough data from buffer in to get the size and packet type packet type.

In the case of SSH_FXP_WRITE I copying enough data to create a multi part write of the content reusing the current technique for it.

SSH_FXP_NAME is an interesting case. And FXP_DATA will also require some thinking, since the sink length may limit the sftpserver implementer and would create a bottleneck reading.

jubeormk1 avatar Sep 26 '25 07:09 jubeormk1

Thanks to @mkj original design of structures in proto.rs, there is only one entity that for now cannot be used in no_std context. This is Name, the response for SSH_FXP_READDIR and SSH_FXP_READPATH: SSH_FXP_NAME.

The reason is simple and @mkj mentioned before. The response for SSH_FXP_READPATH is a collection of all the elements in the directory.

Since reading a directory is part of the basic features that I have proposed in the library roadmap, I think that I should address it.

I would like to share the approach that I am starting to put together but I am open to more ideas.

I am considering processing the data element by element in the directory to resolve the arbitrary number of elements with new traits and structures for collections of SSH_FXP_READPATH responses. I would like to capture the next workflow:

  • SftpServer readdir return would be a result including: Number of elements in the directory, total length of all the data required to encode the SSH_FXP_NAME , and importantly one reference to a DirectoryElement struct for the first element in the directory
  • the SftpHandler will encode the header of the SSH_FXP_NAME response, wait until the stdio.write is completed,
  • encode the DirectoryElement content, wait for the write completion,
  • obtain the next DirectoryElement reference and repeat until there is no more elements to proceed.

I have a lifetime challenge here because the SftpServer would need to keep an internal state with either the full list of elements in the directory or a slim implementation that only keeps the relevant directory element and provides it to the SftpHandler.

I would like to keep those internal details for the the Sftp library user by providing a trait and maybe a slim proposed implementation for constrained environments.

Does anybody have a simpler or better idea? Maybe there is a better pattern to use here that I am overlooking.

jubeormk1 avatar Oct 08 '25 04:10 jubeormk1

One way to resolve lifetime problems could be to have the readdir be passed a closure to call with entries to add? Either a FnMut or AsyncFnMut. Just an idea, I've found that pattern useful before. There might be a reason it wouldn't fit here though.

mkj avatar Oct 08 '25 10:10 mkj

Thanks @mkj! We've been inspecting this piece of code (and your advice), for readdir and we have a few questions:

We assume that the developer would eventually implement readdir(dir_handle: &T, reply: &mut DirReply), now our question is: which types/data are behind the lifetimes 'g and 'a behind the DirReply/Chanout types?:

pub struct DirReply<'g, 'a> {
    chan: ChanOut<'g, 'a>,
}
(...)
// TODO Implement correct Channel Out
pub struct ChanOut<'g, 'a> {
    _phantom_g: PhantomData<&'g ()>,
    _phantom_a: PhantomData<&'a ()>,
}

We are a bit confused by the 'g and 'a lifetimes? We guess that PhantomData was a prototyping placeholder but you had more concrete types and/or data in mind that didn't want to elaborate on while writing the SftpSever draft? In other words:

What is 'g supposed to hold down the road? What is 'a supposed to hold eventually?

We think that 'g could be a ChanIO perhaps? ...

As in: we noticed the impl DirReply's "reply" method passes a _data &[u8]... is this a buffer that is supposed to shuttle bytes between the user-provided bytes (dropped into data) and ChanIO?

brainstorm avatar Oct 11 '25 00:10 brainstorm

I'm currently travelling, should be able to take a look later next week.

mkj avatar Oct 11 '25 08:10 mkj

Hi again @mkj,

I think that I finally understand what you where suggesting that could work. A visitor pattern to avoid borrowing/lifetime issues while allowing iteratively process the directory elements.

@brainstorm I am implementing a DirReply holding references to the output buffer and a channel out and now I understand better the lifetimes that you propose in the original DirReply prototype.

  • 'g: Probably the sunset ChannelOut
  • 'a: The buffer used for a SSHSink

For now I have a mock structure that simulates sending data. Soon I will push it.

jubeormk1 avatar Oct 16 '25 09:10 jubeormk1

In the commit Fixed Status Encoding..and the previous one, I came across with the fact that sshwire-derive does not encode enum values. Therefore, SSHEncoding structures with enum fields would not encode enum fields. To solve this situation I implemented the SSHEncode for StatusCode.

I identified the relevant function and point in sshwire-derive/src/lib.rs where automatic encoding of numeric enum fields could be done, but the enum data type would need to be provided somehow, maybe as a struct Attribute, to automate the numerical value encoding.

@mkj Do you recognise that implementing an automatic enum value encoding would be a good use of time?

jubeormk1 avatar Nov 11 '25 22:11 jubeormk1

I identified the relevant function and point in sshwire-derive/src/lib.rs where automatic encoding of numeric enum fields could be done, but the enum data type would need to be provided somehow, maybe as a struct Attribute, to automate the numerical value encoding.

@mkj Do you recognise that implementing an automatic enum value encoding would be a good use of time?

Ah I see, that's a bit unfortunate, sorry! sshwire-derive should either implement it or fail if var.value.is_some() here https://github.com/mkj/sunset/blob/351e2ef182d1d11ca34bdc1538972246bbae93d7/sshwire-derive/src/lib.rs#L285-L288

In SSH base protocol there are only a couple of places that use numeric enums ^1. Are there many places you'd use it in SFTP? I think if sshwire-derive were to implement it we'd want to check that all the variants are the same simple value form, and maybe require an attribute like #[sshwire(numeric_enum=u32)] that could define the type.

mkj avatar Nov 12 '25 02:11 mkj

In response to mkj today

think if sshwire-derive were to implement it we'd want to check that all the variants are the same simple value for

No worries! It gave me the excuse to look at sshwire-derive again and learn something.

I am only encoding an enum to return SftpStatus. I should maintain a better testing practices to catch up when enc() does not behave as I expect. I will add that to my bucket list but will not work on that.

Sorry about not reading you earlier and keeping pushing commits. I am finally finishing the SSH_FXP_READDIR request response and I have been tidying things up and documenting.

Thanks to @mkj @brainstorm and @autofix for listening, helping and discussing details of the implementation.

jubeormk1 avatar Nov 12 '25 04:11 jubeormk1

I think I've managed to resolve the stuck read in #37 . It was a stupid mistake I didn't notice in testing, somehow sunsetc never hits it.

For NoRoom I can see where it's happening - the Window Adjust send in finished_read() may occur when there isn't any send space available. That's a more general problem of how to handle protocol messages that need to be send when the buffer is full - I'll have a think of the best solution there. As a workaround I've made it ignore NoRoom failures for that particular send since any later channel read will retrigger the send. (There is a possibility that there won't be any more data in the case that the window was fully used - that will get stuck. It's a bit of a hack!)

My test branch is https://github.com/mkj/sunset/tree/matt/sftptesting , with those fixes. I was testing with --features sunset/backtrace, that will print a backtrace for NoRoom and some other errors.

[2025-11-19T15:43:12.682261825Z WARN  sunset_demo_common::server] Ended with error NoRoom {
        backtrace: Backtrace [
            { fn: "snafu::backtrace_impl::<impl snafu::GenerateImplicitData for std::backtrace::Backtrace>::generate", file: "/home/matt/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/snafu-0.8.6/src/backtrace_impl_std.rs", line: 5 },
            { fn: "sunset::error::NoRoom::build", file: "./src/error.rs", line: 18 },
            { fn: "sunset::error::NoRoom::fail", file: "./src/error.rs", line: 18 },
            { fn: "sunset::traffic::TrafOut::send_packet", file: "./src/traffic.rs", line: 382 },
            { fn: "sunset::traffic::TrafSend::send", file: "./src/traffic.rs", line: 479 },
            { fn: "sunset::channel::Channels::finished_read", file: "./src/channel.rs", line: 224 },
            { fn: "sunset::runner::Runner<CS>::finished_read_channel", file: "./src/runner.rs", line: 571 },
            { fn: "sunset::runner::Runner<CS>::read_channel", file: "./src/runner.rs", line: 539 },
            { fn: "<sunset_async::async_sunset::AsyncSunset<CS> as sunset_async::async_sunset::ChanCore>::poll_read_channel", file: "./async/src/async_sunset.rs", line: 567 },

mkj avatar Nov 19 '25 15:11 mkj

I think I've managed to resolve the stuck read in #37 . It was a stupid mistake I didn't notice in testing, somehow sunsetc never hits it.

For NoRoom I can see where it's happening - the Window Adjust send in finished_read() may occur when there isn't any send space available. That's a more general problem of how to handle protocol messages that need to be send when the buffer is full - I'll have a think of the best solution there. As a workaround I've made it ignore NoRoom failures for that particular send since any later channel read will retrigger the send. (There is a possibility that there won't be any more data in the case that the window was fully used - that will get stuck. It's a bit of a hack!)

My test branch is https://github.com/mkj/sunset/tree/matt/sftptesting , with those fixes. I was testing with --features sunset/backtrace, that will print a backtrace for NoRoom and some other errors.

[2025-11-19T15:43:12.682261825Z WARN  sunset_demo_common::server] Ended with error NoRoom {
        backtrace: Backtrace [
            { fn: "snafu::backtrace_impl::<impl snafu::GenerateImplicitData for std::backtrace::Backtrace>::generate", file: "/home/matt/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/snafu-0.8.6/src/backtrace_impl_std.rs", line: 5 },
            { fn: "sunset::error::NoRoom::build", file: "./src/error.rs", line: 18 },
            { fn: "sunset::error::NoRoom::fail", file: "./src/error.rs", line: 18 },
            { fn: "sunset::traffic::TrafOut::send_packet", file: "./src/traffic.rs", line: 382 },
            { fn: "sunset::traffic::TrafSend::send", file: "./src/traffic.rs", line: 479 },
            { fn: "sunset::channel::Channels::finished_read", file: "./src/channel.rs", line: 224 },
            { fn: "sunset::runner::Runner<CS>::finished_read_channel", file: "./src/runner.rs", line: 571 },
            { fn: "sunset::runner::Runner<CS>::read_channel", file: "./src/runner.rs", line: 539 },
            { fn: "<sunset_async::async_sunset::AsyncSunset<CS> as sunset_async::async_sunset::ChanCore>::poll_read_channel", file: "./async/src/async_sunset.rs", line: 567 },

Thanks for that! I rebase the branch on mkj/sunset/main.

I am working in some edge cases arising from fragmented requests.

jubeormk1 avatar Nov 20 '25 06:11 jubeormk1