Implement RISC-V ASM-to-JSON conversion
CompCert implements ASM-to-JSON conversion for the ARM and PowerPC architectures, but this functionality is missing for RISC-V. This patch implements this functionality. The implementation is based on the existing ARM/PowerPC AsmToJSON.ml implementations.
I'm neither an expert in CompCert or in OCaml, so any feedback on this PR would be appreciated 🙂
I needed this functionality for a research project I'm working on, so I went ahead and implemented it, but I thought I might as well try contributing upstream in case anyone else could benefit from this.
Thanks for the code! Maybe @bschommer or @m-schmidt could review it, as they wrote the ASM-to-JSON converters for the other platforms, and know this code better than me.
Thanks for the code from us too. The ASM-to-JSON converters are currently only used for some internal tool, which is only available for certain Architectures, however that might change in the future. So we cannot guarantee that if we might need the code in the future or that we do not change it.
I needed this functionality for a research project I'm working on, so I went ahead and implemented it, but I thought I might as well try contributing upstream in case anyone else could benefit from this.
Can you already tell for which research project you will us this?
So we cannot guarantee that if we might need the code in the future or that we do not change it.
This is used by the Valex tool, right? Anything else?
I wouldn't mind if this functionality were changed in the future in a backward-incompatible way, I will pin a version of CompCert, and I can also update my code if there are breaking changes in CompCert.
Can you already tell for which research project you will us this?
Sure! I'm working on a follow-up project to Knox (paper, code). I want to dump CompCert's RISC-V Asm AST (before running the Asmexpand stage, so pseudo-instructions like Pallocframe are preserved, and so that I don't have to write a parser for the .s files) so that I can consume this output in a different tool not written in Coq. I'm working on rewriting the CompCert Asm semantics in Rosette, so that I can use Rosette to symbolically execute programs that are compiled using CompCert.
Reusing this AsmToJSON machinery seemed like the easiest way to do it. In my version, I actually need to remove the Asmexpand stage that runs before AsmToJSON. Right now, this is hard-coded, but I was considering adding a flag for that and submitting a PR, so one could do e.g., ccomp -noexpandasm -sdump to dump the pre-expansion version of the Asm. That would require some fixup in the ARM/PPC AsmToJSON code as well, and would also require editing those to support dumping the pseudo-instructions that are currently expanded away (and adding some logic to panic when those pseudo-instructions exist and -noexpandasm is not passed, to match the current behavior and error-checking).
This is used by the Valex tool, right? Anything else?
Yes, it used only by Valex atm.
I wouldn't mind if this functionality were changed in the future in a backward-incompatible way, I will pin a version of CompCert, and I can also update my code if there are breaking changes in CompCert.
Can you already tell for which research project you will us this?
Sure! I'm working on a follow-up project to Knox (paper, code). I want to dump CompCert's RISC-V Asm AST (before running the Asmexpand stage, so pseudo-instructions like
Pallocframeare preserved, and so that I don't have to write a parser for the.sfiles) so that I can consume this output in a different tool not written in Coq. I'm working on rewriting the CompCert Asm semantics in Rosette, so that I can use Rosette to symbolically execute programs that are compiled using CompCert.Reusing this AsmToJSON machinery seemed like the easiest way to do it. In my version, I actually need to remove the Asmexpand stage that runs before AsmToJSON. Right now, this is hard-coded, but I was considering adding a flag for that and submitting a PR, so one could do e.g.,
ccomp -noexpandasm -sdumpto dump the pre-expansion version of the Asm. That would require some fixup in the ARM/PPC AsmToJSON code as well, and would also require editing those to support dumping the pseudo-instructions that are currently expanded away (and adding some logic to panic when those pseudo-instructions exist and-noexpandasmis not passed, to match the current behavior and error-checking).
Okay, sounds interesting. Unfortunately for our use cases it's quite the opposite, we want to have less pseudo-instructions in the dumped assembly output, so I would say that we are not that interested in having a switch -noexpandasm.