
x86_64: add table-driven instruction encoder

Open kubkon opened this issue 2 years ago • 4 comments

This PR downstreams part of a side project I am working on: a table-driven x86_64 instruction selector and encoder. Previously, we hard-coded the selection and encoding of each instruction variant we used, which meant we needlessly missed out on the most efficient (well, shortest) encoding. We also did not always produce an encoding that is correct in the eyes of the ISA. The new approach tries to rectify this by tabulating all possible instruction variants and then selecting the shortest one that matches the passed operands. This was greatly inspired by NASM's table-driven approach, which came to my attention thanks to @andrewrk, and it really proves to be the best way to go about this.

Currently, the table is encoded as a tuple in the encodings.zig file; however, once Zig gains the ability to import a ZON file, I will convert the table to that format. The actual layout of the table might also get tweaked over time as we add support for more extensions (AVX, AVX512, etc.).
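To make the idea concrete, here is a minimal sketch of shortest-match selection over a tiny table. It is not the code from this PR: the `Op` classes, the `Entry` layout, and the `select` and `immFits` helpers are all hypothetical and cover only two forms of `add` with a 64-bit register destination.

```zig
const std = @import("std");

// Hypothetical, heavily simplified operand classes; a real encoder
// distinguishes many more.
const Op = enum { rm64, imm8, imm32 };

// One row of a toy encoding table (this layout is an assumption).
const Entry = struct {
    mnemonic: []const u8,
    ops: [2]Op,
    len: u8, // total encoded length in bytes for the register-direct form
};

const table = [_]Entry{
    // add r/m64, imm32: REX.W 81 /0 id -> 1 + 1 + 1 + 4 bytes
    .{ .mnemonic = "add", .ops = .{ .rm64, .imm32 }, .len = 7 },
    // add r/m64, imm8: REX.W 83 /0 ib -> 1 + 1 + 1 + 1 bytes
    .{ .mnemonic = "add", .ops = .{ .rm64, .imm8 }, .len = 4 },
};

// Returns true if `imm` is representable in the given immediate class.
fn immFits(op: Op, imm: i64) bool {
    return switch (op) {
        .imm8 => imm >= std.math.minInt(i8) and imm <= std.math.maxInt(i8),
        .imm32 => imm >= std.math.minInt(i32) and imm <= std.math.maxInt(i32),
        .rm64 => false,
    };
}

// Scans every row and keeps the shortest one whose operands match.
fn select(mnemonic: []const u8, imm: i64) ?Entry {
    var best: ?Entry = null;
    for (table) |entry| {
        if (!std.mem.eql(u8, entry.mnemonic, mnemonic)) continue;
        if (entry.ops[0] != .rm64 or !immFits(entry.ops[1], imm)) continue;
        if (best == null or entry.len < best.?.len) best = entry;
    }
    return best;
}

pub fn main() void {
    const small = select("add", 1).?;
    const large = select("add", 1000).?;
    std.debug.print("add rcx, 1    -> {d} bytes\n", .{small.len});
    std.debug.print("add rcx, 1000 -> {d} bytes\n", .{large.len});
}
```

With these two rows, an immediate of 1 matches both forms and the 4-byte imm8 form wins, while 1000 only fits the 7-byte imm32 form.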

I have also taken this opportunity to greatly simplify how we lower AIR to MIR. Instead of manually specifying all the nitty-gritty required to correctly serialise the instruction to the MIR format, we now use a convenient set of helpers that act directly on the operand types used by the encoder. The helpers destructure the operands into a MIR-compatible format, which we then deserialise back into an encoder-compatible format in the emitter. This resulted in an API that looks something like this:

// mov rdx, rcx
try self.asmRegisterRegister(.rdx, .rcx);
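For readers less familiar with x86_64, the sketch below shows by hand what bytes such a register-to-register `mov` can end up as. It is only an illustration: `encodeMovRegReg` and the register constants are hypothetical names, not part of this PR, and the `REX.W 89 /r` form used here is just one valid choice (the `REX.W 8B /r` form is equally short).

```zig
const std = @import("std");

// Hardware register numbers for two of the 64-bit general-purpose registers.
const rcx: u8 = 1;
const rdx: u8 = 2;

// Encodes `mov dst, src` as REX.W 89 /r (MOV r/m64, r64).
// Hypothetical helper; only valid for register numbers 0 through 7.
fn encodeMovRegReg(dst: u8, src: u8) [3]u8 {
    const rex_w: u8 = 0x48; // REX prefix with W=1 selects 64-bit operands
    const opcode: u8 = 0x89;
    // ModRM byte: mod=0b11 (register-direct), reg=src, rm=dst.
    const modrm: u8 = 0xC0 | (src << 3) | dst;
    return .{ rex_w, opcode, modrm };
}

pub fn main() void {
    // mov rdx, rcx
    const bytes = encodeMovRegReg(rdx, rcx);
    std.debug.print("{x:0>2} {x:0>2} {x:0>2}\n", .{ bytes[0], bytes[1], bytes[2] });
}
```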

kubkon avatar Mar 11 '23 19:03 kubkon

I forgot to add that I will downstream the encoder tests from https://github.com/kubkon/zig-dis-x86_64; however, I haven't yet figured out how I want to do this. In the upstream repo, I use the assembler to convert assembly in Intel syntax into an instruction, which is then encoded and compared against the expected encoding. For Zig, I might be able to plug directly into Zig's AST, lexer, and parser to parse inline asm and encode that.

kubkon avatar Mar 11 '23 20:03 kubkon

OK, given that @mlugg is eager to contribute to the x86 backend, I will do a direct port of the test harness from the upstream repo in this PR.

kubkon avatar Mar 11 '23 20:03 kubkon

The aarch64-windows CI is failing for some reason with this error: https://github.com/ziglang/zig/actions/runs/4396410883/jobs/7698825772#step:3:713

fatal error: error in backend: SEH unwind data splitting not yet implemented
Exception Code: 0xE0000046

kubkon avatar Mar 12 '23 08:03 kubkon

Well, that's pretty unfortunate, but it seems my PR is hitting an unimplemented bit of code in LLVM (related: https://github.com/llvm/llvm-project/issues/60681 ). The next step will be rebuilding the zig devkit locally in debug mode, rerunning the build, and submitting the full report, including the stack trace, upstream. I have no clue what to do with this PR to get it merged, though.

kubkon avatar Mar 12 '23 19:03 kubkon

@kubkon What did you do to get aarch64-windows building? It was mentioned that it was failing here: https://github.com/ziglang/zig/actions/runs/4396410883/jobs/7698825772#step:3:713

hmartinez82 avatar Mar 14 '23 02:03 hmartinez82

Hey! Apologies for the late reply, I got caught up in various other things. To answer your question, I don't know what you would need to tweak in qgis to please LLVM. In Zig's case, I had overused inlined for loops over a rather large table, which resulted in a lot of local allocas that somehow interfered with LLVM's ability to correctly generate the unwind info for the affected functions.
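As a rough illustration of the difference (a toy sketch, not the actual change made in this PR; the table values and function names are made up), compare an unrolled `inline for` with a plain runtime `for` over the same table:

```zig
const std = @import("std");

// Toy stand-in for a large comptime-known table (values are made up).
const table = [_]u16{ 0x01, 0x03, 0x81, 0x83, 0x89, 0x8b };

// `inline for` unrolls the loop at compile time: the body is duplicated
// once per element, so a big table produces a big function.
fn containsUnrolled(needle: u16) bool {
    inline for (table) |entry| {
        if (entry == needle) return true;
    }
    return false;
}

// A plain `for` keeps a single loop body and iterates at runtime.
fn containsLooped(needle: u16) bool {
    for (table) |entry| {
        if (entry == needle) return true;
    }
    return false;
}

pub fn main() void {
    std.debug.print("{} {}\n", .{ containsUnrolled(0x89), containsLooped(0x89) });
}
```

Both functions return the same results; only the shape of the generated code differs, which is presumably where the extra locals came from in the unrolled case.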

kubkon avatar Mar 17 '23 14:03 kubkon