LLVM on Aarch64 Tracker
LLVM doesn't work properly on aarch64, particularly when used within RPCS3, and the reasons are architectural. There is some internal progress on all of the items below, so we're tracking things here to collaborate more efficiently.
Pre
- [x] Re-implement VM escape on aarch64 using far-jump and context save/restore.
Main
- [x] PPU LLVM
- [x] Reimplement call gate using register context only.
- [x] Diagnose mysterious ret path causing safety asserts to be hit
- [x] Fix commercial games crashing with nullptr access.
- [x] Refactoring/cleanup (guest/host context switches, to be shared with SPU)
- [x] Identify and fix regression after rebase on recent master
Obsolete implementation
- [x] Breaking from guest to hypervisor (escape) is broken because of callstack unwinding being incompatible with LLVM's GHC implementation on aarch64.
- [x] Worked around by having a manual call stack.
- [x] LLVM is clobbering the link register in GHC blocks making them noreturn.
- [x] Worked around by modifying LLVM's reserved register list for GHC to at least leave the LR alone.
- [x] File a report with upstream with generated blocks.
- [x] SPU LLVM
- [x] Same escape issue as PPU LLVM
- [x] Compilation failures
- [x] Stack smashing
- [x] Stack underrun
- [x] Reimplement unwind + stack scratch hacks as function transformation passes and remove inline asm hooks.
- [x] Properly formalize/refactor guest-host(hypervisor) context switches and have the code in one place. Currently littered everywhere.
- [x] Fix lockup after rebase on current master
- [x] ~~Basic optimizations (call, shufb)~~ (deferred)
- [x] ~~Other undiagnosed issues (list will be added)~~
I had an epiphany about the LLVM incompatibilities. I'm fighting the ABI so hard trying to make everything stack-based on an architecture that is all about registers. There is 0 reason for me to do that. The stack doesn't need any unwinding or preserving at all. I can unwind by just doing a far jump to the dispatcher to simulate the ret chain. It's also much faster and doesn't require a hacked up LLVM branch to work.
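To illustrate the idea, here is a minimal sketch in portable C, using `setjmp`/`longjmp` as a stand-in for the far jump. It is not the RPCS3 implementation: `dispatcher_ctx`, `guest_block`, `guest_escape` and `run_guest` are hypothetical names, and the real code jumps between JIT-generated blocks rather than C functions. The point is only that control gets back to the dispatcher in one jump, without any of the intermediate frames returning.

```c
#include <setjmp.h>

/* All names here are hypothetical; this is an analogy, not RPCS3 code. */
static jmp_buf dispatcher_ctx;  /* saved dispatcher (host) context */
static int exit_code;           /* value handed back by the guest */
static int frames_returned;     /* guest frames that returned normally */

/* "Far jump": restore the dispatcher context directly. */
static void guest_escape(int code) {
    exit_code = code;
    longjmp(dispatcher_ctx, 1);
}

/* Stand-in for a chain of nested guest blocks. */
static void guest_block(int depth) {
    if (depth == 0) {
        guest_escape(42);       /* does not return */
    } else {
        guest_block(depth - 1);
    }
    frames_returned++;          /* never reached: no frame is unwound */
}

/* Dispatcher: save context, enter the guest, resume here on escape. */
int run_guest(int depth) {
    frames_returned = 0;
    exit_code = -1;
    if (setjmp(dispatcher_ctx) == 0) {
        guest_block(depth);
    }
    return exit_code;
}
```

Calling `run_guest(5)` enters six nested blocks and comes back with the escape code while `frames_returned` stays at zero, which is the "simulated ret chain" described above: the stack contents of the abandoned frames are simply discarded.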
Well, we're back to running simple games after the guest enter/exit rewrite.
Now comes the boring task of figuring out why most commercial games crash. The good news is that the new approach does not require a hacked LLVM anymore and upstream works fine. However, I still need to solve the random jumps to uninitialized memory before we can see a PR.
Commercial games now work with PPU LLVM, though there is some severe memory corruption coming from the LLVM-generated subroutines. The odd GHC behavior continues and I can see functions writing to stack-based addresses without first allocating memory. A workaround is to just reserve a working area on the stack in the gateway. The execution context is no longer saved on the stack, to avoid random corruption as well. Next is to get SPU up to speed, refactor and port the changes on top of modern rpcs3.
~~And now we're back to needing a custom LLVM again. My original work was based on a specific branch of rpcs3 that was very old (maybe 1-2 years) since that was the first time I started playing with it. The current emulator code doesn't work so well. I have a workaround in mind, but it may be a long time before I have a chance to touch this subproject again.~~
Turns out this was a disaster of my own making. I had some LLVM modifications breaking my generated code and causing stack smashing. Reverting that crap gets us back in business.
Got the mandelbrot SPU demo to run with some heavy hacks littered around the codebase. I think that's enough experiments. I'll start slowly putting everything together and hopefully we get a PR before the end of the month.
SPU + PPU running native LLVM.
The codegen is unoptimized, so there is a lot of room for improvement. I'm never writing a MachineFunctionPass again, so I hope I can just optimize in IR.
With the research completed, I've created some smaller tasks to help merge in the changes. I'll close this ticket once the other 4 have been closed.
Closing now that we have shipping CI builds. It's now official 🥳