Also return remaining bytes in `FromBytes` conversion methods
This change makes the slice and non-slice conversion methods consistent, and provides a lightweight way (i.e., without Ref) to access the leftover bytes after conversions. For instance, this:
trait FromBytes {
fn ref_from_prefix(bytes: &[u8]) -> Option<&Self>
where
Self: KnownLayout + NoCell;
}
...becomes this, in this PR:
trait FromBytes {
fn ref_from_prefix(bytes: &[u8]) -> Option<(&Self, &[u8])>
where
Self: KnownLayout + NoCell;
}
...where the second element of the tuple are the remaining bytes. Providing the remainder eases the pattern where a T is parsed from a buffer, and then the buffer is advanced to exclude the parsed T.
This PR tests out the 'feel' of tuple-returning conversion methods. Alternatively, we could provide a mirror set of _split methods.
Addresses #1051, but is the antithesis of #884. Inspired by https://github.com/google/zerocopy/issues/1051#issuecomment-2007575409. It's syntactically cheap for users to discard remaining bytes at call sites if they wish, but the alternative — discarding the excess and requiring users to go through Ref — is syntactically and cognitively expensive.
While I'm obviously in favor of doing this in general, I feel like modifying the existing methods would be too large of a breaking change to be worthwhile, which was why I suggested new methods.
Hm, we'll need to balance the costs of subjecting current users to breaking changes, against the costs of subjecting current and future users to a more bloated API. Adding _split methods consistently would require adding eight additional methods to FromBytes (which has 14 methods in our latest 0.8 pre-release):
mut_from
mut_from_prefix
+ mut_from_prefix_split
mut_from_suffix
+ mut_from_suffix_split
mut_slice_from
mut_slice_from_prefix
+ mut_slice_from_prefix_split
mut_slice_from_suffix
+ mut_slice_from_suffix_split
read_from
read_from_prefix
+ read_from_prefix_split
read_from_suffix
+ read_from_suffix_split
ref_from
ref_from_prefix
+ ref_from_prefix_split
ref_from_suffix
+ ref_from_suffix_split
slice_from_prefix
slice_from_suffix
That's quite a hefty addition, and whatever course of action we take here will also likely be taken for our upcoming TryFromBytes trait, too.
Since we're preparing for a major breaking release anyways, we have a bit more license than usual to consider breaking changes.
As I discuss in https://github.com/google/zerocopy/issues/884#issuecomment-2027828809, I'm strongly in support of not removing the plain-casting methods that don't return the suffix. Embedded code should do the minimum work necessary for the job. Zerocopy has a few design decisions that are good for networking code, but bad for embedded, like how all of the byteorder items are also unaligned. Removing these methods pushes it further in an embedded-hostile direction.
I personally find the zerocopy APIs lacking in useful transitive operations like this, so I would prefer the "bloat" of adding these methods and keeping the plain-casting ones.
What do you think of these methods starting with split_ instead of ending in _split? split_mut_from_prefix reads clearly in the imperative mood and highlights that it's doing more than just a bounds-check and pointer cast: it's splitting as well. This split is an extra operation, both cognitively and for the compiler.
I don't care about the *_from_suffix API and I don't know who does - we should be consistent where it makes sense to do so. Perhaps as a middle ground, this PR can introduce only splitting versions for the prefix APIs, which adds 5 methods instead of 10:
mut_from
mut_from_prefix
+ split_mut_from_prefix
mut_from_suffix
mut_slice_from
mut_slice_from_prefix
+ split_mut_slice_from_prefix
mut_slice_from_suffix
read_from
read_from_prefix
+ split_read_from_prefix
read_from_suffix
ref_from
ref_from_prefix
+ split_ref_from_prefix
ref_from_suffix
slice_from_prefix
+ split_slice_from_prefix
slice_from_suffix
I don't care about the *_from_suffix API and I don't know who does
We use it for parsing certain things as well as prefix, and would strongly prefer to have split versions of it as well if you decide to go this route.
I've made a discussion post exploring some of the naming and return type design space: https://github.com/google/zerocopy/discussions/1095
I'm inclined to merge this based on the following reasoning:
- The ergonomics of discarding an unneeded prefix/suffix are better than the ergonomics of reconstructing a prefix/suffix if one isn't returned by the API
- From what little visibility we have into use cases in the ecosystem, it seems that wanting the prefix/suffix is the more common situation
- For users who care about not relying on the compiler to perform optimizations, techniques like the one described in this comment are available
- More generally, we are aware that our current internal software organization is based on the assumption that the compiler will aggressively optimize, and we are actively considering how to provide better support for users who aren't comfortable with this situation: https://github.com/google/zerocopy/issues/1125