zig icon indicating copy to clipboard operation
zig copied to clipboard

translate-c: support C bitfields

Open ghost opened this issue 7 years ago • 12 comments

struct Test {
    unsigned foobar : 1;
};
pub const struct_Test = @OpaqueType();

ghost avatar Sep 10 '18 22:09 ghost

Hi, Is there a design for fixing this bug? I would like to help if possible.

d18g avatar Oct 07 '18 18:10 d18g

Hi @d18g. This is tricky because although Zig has bit fields in the form of packed struct, they are more well-defined and thus do not map directly to C's bit fields.

In theory we should be able to support this by emitting code that checks the target and emits different packed structs depending on the target, to match what C does on that target.

So in order to solve this issue, one must do the following:

  • Research what various C compilers do for bitfields (what is the actual memory layout)
  • Edit translate-c code in zig to detect bitfields and emit appropriate AST that will match the C ABI for every target.

The other option, which I think would also be OK is to say that translate-c only emits code for the target specified, and so translate-c only would emit a packed struct that matches the target given to translate-c.

Both options are roughly the same amount of work.

Even just the work of creating documentation for what the C ABI for bit fields is for the various compilers would be 60%+ of the work of solving this issue.

I had this issue marked as a bug, but that was incorrect; this is a missing feature. translate-c downgrades structs to opaque types when it cannot understand all the fields.

andrewrk avatar Oct 08 '18 14:10 andrewrk

Hi @andrewk, Thanks for the detailed answer. Just to be sure I am not expecting any struct to be convertible only for "packed" C structs, while this will narrows the problem the issues you laid out are still standing. My motive here is for protocols or HW registers where packed structs are must and bitfields might play a role.

So I suggest multi-step plan (only for packed structs):

  1. Check that we can find out if a struct is packed, alternatively it can be a flag in translate-c which is ugly IMHO.
  2. Set the desired test cases.
  3. Select several C(/C++?) compilers for the initial research.
  4. Document the bitfield ABI behavior in all supported platforms for the selected compilers.
  5. Analyze the findings.

Next steps might be:

  • Ditch the effort since it intractable.
  • Analyze more compilers/cases/configurations/etc.
  • Implement initial support.

If this is acceptable then we can start addressing steps 1-3. After that a deep dive is needed in steps 4-5.

d18g avatar Oct 09 '18 07:10 d18g

I think your plan makes sense. Step 1 is already solved:

https://github.com/ziglang/zig/blob/d40c4e7c896c5dfed9dd35e8c0d2117f7cf5c53c/src/translate_c.cpp#L4009-L4014

andrewrk avatar Oct 09 '18 17:10 andrewrk

This is bitfield detection which is great. I meant packed struct as in #pragma pack or __attribute__((packed)).

d18g avatar Oct 09 '18 17:10 d18g

Ah I see. Clang provides that information as well. It can be tricky to find out where in the API, but it's there. Using a packed struct in Zig we can always emulate whatever struct layout we need to.

Related: #1512

andrewrk avatar Oct 09 '18 17:10 andrewrk

Clang's logic for this seems to be in MicrosoftRecordLayoutBuilder::layoutBitField and ItaniumRecordLayoutBuilder::LayoutBitField in lib/AST/RecordLayoutBuilder.cpp, for reference.

(Also, it actually exposes this information as part of the AST, so there should be no need to reimplement it.)

tadeokondrak avatar Sep 30 '20 14:09 tadeokondrak

I ran into this problem today so out of curiosity I looked into how bindgen does this. It looks like their implemention is in the bitfields_to_allocation_units function.

Their implementation seems to be pretty simple -- as far as I can tell the only 2 cases it deals with are packed vs unpacked structs and named vs unnamed bitfields. Right now they seem to just be ignoring the Microsoft case (there's an if is_ms_struct but the bool is_ms_struct is hardcoded to false).

jvns avatar Feb 26 '21 19:02 jvns

A few weeks ago I created library in Rust that computes the layouts of C types. I've created this library for two purposes:

  • To use it for code generation.
  • To document the layout algorithms used by various C compilers on various platforms. Apart from my library, there seem to be no complete resources out there that document these algorithms correctly. Worse: Some of the official documentation provided by Microsoft and Gnu is inaccurate.

My library is heavily tested against the layouts produced by the C compilers by generating C code, compiling it with the C compilers, and then extracting the type layouts from the debug information. I'm therefore confident that it has a high degree of accuracy. In the process I was also able to find various bugs in Clang's MicrosoftRecordLayoutBuilder implementation.

The code is heavily documented and it should not be hard to adapt it for use in your compiler. You can find it here: https://github.com/mahkoh/repr-c/tree/master/repc/impl/src/builder

mahkoh avatar Mar 07 '21 11:03 mahkoh

I ran into this problem today so out of curiosity I looked into how bindgen does this. It looks like their implemention is in the bitfields_to_allocation_units function.

This implementation is incorrect.

mahkoh avatar Mar 07 '21 11:03 mahkoh

It appears the current plan for getting support for C bitfields is via a switch in the translate-c backend from clang to arocc, is this correct? I was looking to see what it'd take to write a Linux kernel module in zig but without this I may hold off.

aka-mj avatar Aug 03 '23 16:08 aka-mj

Are there any hacks to get around this current issue? This prevents from being able to use libfuse with Musl libc due to its timespec implementation

Update for anyone else having issues with this: not sure why I hadn't thought of this before, but simply not importing the problematic header is a possible solution. You'll just have to carefully re-implement the type/function definitions in Zig. Not a perfect solution and may be tedious, but at least it can get you something working. For large header files, you should be able to automatically convert most of it with translate-c

mgord9518 avatar Nov 02 '23 08:11 mgord9518

but simply not importing the problematic header is a possible solution.

You can import the problematic header and redefine the problematic functions, right?

You'll just have to carefully re-implement the type/function definitions in Zig.

Would this not involve the same work that people are talking about putting into the compiler?

Radvendii avatar May 31 '24 01:05 Radvendii

@Radvendii

You can import the problematic header and redefine the problematic functions, right?

I don't see why not, but my issue was small enough that I felt re-defining everything I used wasn't a big deal

Would this not involve the same work that people are talking about putting into the compiler?

No, because the compiler issue would require a lot of testing to ensure it doesn't cause issues in edge cases. Parsing something manually is much, much easier than getting it to work reliably in code.

mgord9518 avatar May 31 '24 04:05 mgord9518

https://jkz.wtf/bit-field-packing-in-gcc-and-clang seems like a good reference for the behavior on GCC and Clang, although I've not verified the behavior yet. (Also, bear in mind I think the author has written their binary strings backwards, with the LSB on the left; either that or their rule is incorrectly written.)

mlugg avatar Aug 29 '24 21:08 mlugg