bitfields raw pointer accessors
When creating bindings for a struct with bitfields the accessor functions that are created take an immutable receiver (&self). This is a problem if the underlying type is used by both Rust and C, since Rust is not allowed to create &self references, since the C side might modify the value at any time.
One solution could be to create raw accessor functions that allow access via *const Self.
Input C Header
struct foo {
unsigned available:1;
};
Bindgen Invocation
$ bindgen input.h
impl foo {
#[inline]
pub fn available(&self) -> ::std::os::raw::c_uint {
unsafe { ::std::mem::transmute(self._bitfield_1.get(0usize, 1u8) as u32) }
}
#[inline]
pub fn set_available(&mut self, val: ::std::os::raw::c_uint) {
unsafe {
let val: u32 = ::std::mem::transmute(val);
self._bitfield_1.set(0usize, 1u8, val as u64)
}
}
#[inline]
pub fn new_bitfield_1(
available: ::std::os::raw::c_uint,
) -> __BindgenBitfieldUnit<[u8; 1usize]> {
let mut __bindgen_bitfield_unit: __BindgenBitfieldUnit<[u8; 1usize]> = Default::default();
__bindgen_bitfield_unit.set(0usize, 1u8, {
let available: u32 = unsafe { ::std::mem::transmute(available) };
available as u64
});
__bindgen_bitfield_unit
}
}
Expected Results
Additional functions that should be generated:
impl foo {
#[inline]
pub unsafe fn raw_available(this: *const Self) -> ::std::os::raw::c_uint {
unsafe {
::std::mem::transmute(__BindgenBitfieldUnit::raw_get(
addr_of!((*this)._bitfield_1),
0usize,
1u8,
) as u32)
}
}
#[inline]
pub unsafe fn raw_set_available(this: *mut Self, val: ::std::os::raw::c_uint) {
unsafe {
let val: u32 = ::std::mem::transmute(val);
__BindgenBitfieldUnit::raw_set(addr_of!((*this)._bitfield_1), 0usize, 1u8, val as u64)
}
}
}
This also requires changes to the generated __BingenBitfieldUnit.
@pvdrz gentle nudge, do you have any thoughts on the proposal here? I think RFL may have a use case
Additional context since I keep forgetting: https://lore.kernel.org/rust-for-linux/[email protected]/ and its parent
Basically we have a C struct that has interior mutability on some fields, so usually it is in an UnsafeCell. Using these bitfield methods mean that you need to create an &self without the UnsafeCell - that is, you have a reference to something with interior mutability but the compiler has no way of knowing it has that interior mutability. So incorrect things can potentially happen quietly.
When you say "since the C side might modify the value at any time.", I'm unclear if you mean other threads will modify the data concurrently?
Otherwise, if this is all within a single thread, then there's not really a problem as far as I can tell.
In the use case, we have an `UnsafeCell< pointer to a struct with:
- Some fields that never change while we have it (i.e. are const or locking is handled by the caller)
- Some fields that are mutex-protected and could change if we are preempted while using it
The issue is that it is in general incorrect to take &* reference to the struct to access the const fields because the other could change, but that is required for bitfield access. I think the risk for UB here is exceedingly low if you are not also accessing the locked fields, but it is a bit of an easy footgun telling rustc something that isn't true.
I'm not 100% sure I understand this all myself and the locking story is complicated, so I may be misrepresenting some of it.
Well, unsafecell doesn't protect you from any concurrency problems. If there's supposed to be a lock, then the lock must be held to correctly read or write the available field even if a &foo reference is never actually made, because raw pointers still have to follow data race rules.
I don't think the parent issue showcases the problem well, it's more like:
#[repr(C)]
struct foo {
// this is a bitfield
field: u8,
mtx: mutex_t,
// mutex protected
protected: u8
}
// this is how you reference the type normally
struct FooWrapper(UnsafeCell<foo>);
Then you must construct a &foo via unsafe { &*foo_wrapper.0.get() } to use these bitfield accessors. The problem is with:
- Context A creates
&fooand accessfield - Preemption hits, context B locks
mtx, changesprotected, and unlocks - Context A resumes and
&foopoints to changed data,rustchas no way of knowing about it - Context A accesses
field
This is as far as it goes, and I think the incorrectness is purely semantic meaning this is something miri would flag if it could (I believe), but there is no actual UB.
I guess there could be a theoretical race condition if context A then tries to lock mtx and access protected. Locking the mutex will prevent reordering at that level - but maybe rustc could take the "const & with no UnsafeCell never ever changes" very literally and do some reordering itself. So it could load field and protected at the same time (step 4 above) and then lock mtx - meaning you would be getting an unsynchronized read of protected.
That seems like a stretch but maybe not impossible? Probably need @y86-dev to clarify things
I think I follow now. The quick fix might be to distribute the UnsafeCell over the individual fields of foo, but that's got some ergonomic drawbacks i expect
I guess there could be a theoretical race condition if context A then tries to lock
mtxand accessprotected. Locking the mutex will prevent reordering at that level - but mayberustccould take the "const&with noUnsafeCellnever ever changes" very literally and do some reordering itself. So it could loadfieldandprotectedat the same time (step 4 above) and then lockmtx- meaning you would be getting an unsynchronized read ofprotected.That seems like a stretch but maybe not impossible? Probably need @y86-dev to clarify things
Your example from above is much better than mine, since it actually has a mutex. But the protected field should also be a bitfield, so
// bingden generates
struct Foo {
protected: u8, // bitfield
mtx: mutex,
}
The problem lies with the implementation of mutex, since it contains a wait_list, an intrusive circular doubly linked list that contains all threads waiting for the mutex. This list might change even if we lock the mutex (since at any time another thread might try to lock it, but since we hold the lock, it goes to sleep).
I think I follow now. The quick fix might be to distribute the UnsafeCell over the individual fields of
foo, but that's got some ergonomic drawbacks i expect
Since Foo is generated by bingden, I do not know if that is possible.
I definitely don't think it's currently possible, someone would have to add that ability to bindgen.
I would prefer the solution outlined in this issue, since relying on raw pointers also avoids the aliasing problems we would have when having a &mut T instead of a &T.
You do lose the ability to use method syntax, which makes the API kinda weird, but otherwise it's also fine, yes.
Sounds good to add this fwiw.