safety-dance icon indicating copy to clipboard operation
safety-dance copied to clipboard

Audit bytes crate

Open Shnatsel opened this issue 6 years ago • 17 comments

https://docs.rs/bytes/0.4.12/bytes/

Slice-like type with atomic reference counting on top. Insanely popular - 12,000 downloads/day. Used in reqwest, tokio-tcp and hyper, exposed to untrusted data from the network. Contains plenty of unsafe code.

Shnatsel avatar Jul 21 '19 23:07 Shnatsel

I'm maybe a fourth of the way through auditing this, and should be done around Thursday or Friday.

So far everything looks good, assuming that having access to uninitialized memory outside of an untagged union or raw pointer isn't UB (IIRC that's unclear).

nuew avatar Jul 23 '19 19:07 nuew

I think references to uninit are deadly right now.

Lokathor avatar Jul 23 '19 19:07 Lokathor

I believe that crate could also use a conversion to MaybeUninit<T>

Shnatsel avatar Jul 23 '19 19:07 Shnatsel

Apparently there is an issue in bytes crate where they deliberately trigger UB because the alternative costs too much performance. See https://github.com/rust-lang/unsafe-code-guidelines/issues/158

Shnatsel avatar Aug 04 '19 07:08 Shnatsel

Yeah there's a huge comment on all the functions that knowingly take advantage of UB. IIRC (not at my computer so I can't see my draft report) the author missed annotating one case of the same sort of intentional, but (for now) correctly compiled UB.

Also, sorry about not finishing the report yet. The only excuse I can offer is laziness.

nuew avatar Aug 04 '19 17:08 nuew

Correctly compiled on x86_64, at least. I imagine it would be miscompiled on, say, an Alpha AXP (which has an absurdly weak memory model), but Rust doesn't support those so who cares.

nuew avatar Aug 04 '19 17:08 nuew

Doesn't standard ARM have a weaker memory model? Because Rust compiles for that all the time.

Lokathor avatar Aug 04 '19 18:08 Lokathor

Yes the ARM memory model is weaker than x86's but it's not so weak as to require a barrier for relaxed reads, and I'm assuming that LLVM will do the same program-level optimizations independent of architecture (which may, admittedly, be a bad assumption).

Anyways, here's the internals threads on intentional UB in bytes:

  • https://internals.rust-lang.org/t/bit-wise-reasoning-for-atomic-accesses/8853/38
  • https://internals.rust-lang.org/t/writing-through-a-pointer-derived-from-a-shared-reference-after-that-reference-is-dead/9078 (this is another unrelated case that isn't referenced above, so they may have concluded that it is defined)

My observation on the soundness of bytes was premature. Admittedly, everything but src/bytes.rs, the only file I had yet to look at in detail at the time, is fine excepting uninitialized issues.

nuew avatar Aug 04 '19 18:08 nuew

This is an all-volunteer effort, so don't feel pressured to complete the audit. Just note what you've already looked at and what the results were so that someone else could pick up where you left off.

Shnatsel avatar Aug 05 '19 09:08 Shnatsel

Alright, seeing as I'm not sure if I'll ever get back to this (hopefully I will, but...), and I almost wiped my home directory[^wiped] and I didn't have this backed up. The incomplete report is here if somebody else wants to complete it, though I might resume work on it.

[^wiped]: I accidentally set it as swapspace instead of the new partition on the disk. Luckily, it only wiped the LUKS header, which I had backed up. Way too close for comfort though.

nuew avatar Aug 08 '19 05:08 nuew

the author missed annotating one case of the same sort of intentional, but (for now) correctly compiled UB.

FWIW, those annotations were added by me, not the original author. ;) Which case did I miss?

So far everything looks good, assuming that having access to uninitialized memory outside of an untagged union or raw pointer isn't UB (IIRC that's unclear).

Do you remember at which types this happens? For integers (and raw poiners) it is unclear, for pretty much anything else it is fairly certain that uninitialized memory is UB.

I think references to uninit are deadly right now.

No, references pointing to something are never worse than having the data by-value. For now, assume they are equally bad ("references must point to valid data"); this may be relaxed eventually.

RalfJung avatar Aug 26 '19 09:08 RalfJung

  • The missed annotation was on Inner::inline_len.
  • IIRC all the accessible uninitialized memory is through u8 arrays.

The functions I saw potential issues with (again, didn't review all the unsafe functions, see my draft report) that hadn't been noted before were Bytes::to_mut, Inner::inline_ptr, and Inner::inline_len.

nuew avatar Aug 26 '19 21:08 nuew

The missed annotation was on Inner::inline_len.

Thanks! Submitted https://github.com/tokio-rs/bytes/pull/289.

RalfJung avatar Aug 27 '19 07:08 RalfJung

Another issue was discovered recently - a fairly obscure contract is violated: https://github.com/tokio-rs/bytes/issues/328

Another issue was found by accidentally stumbling into a segfault: https://github.com/tokio-rs/bytes/issues/340 and a related issue was later discovered by code analysis: https://github.com/tokio-rs/bytes/issues/343

Shnatsel avatar Dec 13 '19 21:12 Shnatsel

It seems all the issues described in the audit by @nuew above are fixed as of version 0.5.3:

  • Inner::inline_ptr appears to be gone
  • Bytes::to_mut() appears to be gone
  • Examples for advance_mut() appear to be correct now
  • the types BufMut::put uses are converted to MaybeUninit

BufMut::put could drop one unsafe block by switching to MaybeUninit::write but that API is nightly-only for now.

Also *const to *mut conversion discussed earlier here is fixed by accepting &mut Self instead of &Self

Shnatsel avatar Dec 13 '19 22:12 Shnatsel

I was mistaken about the *const to *mut conversion, I've corrected my previous comment.

Shnatsel avatar Dec 13 '19 22:12 Shnatsel

Maybe we can start filing recently added unsound informational advisories to bugs listed here?

Qwaz avatar Jun 25 '20 05:06 Qwaz