Drop bytecode VM and compile to LLVM IR
This player is going to be feature-complete soon, so I should start thinking about what to do with the current bytecode based virtual machine. It wasn't the best idea because it isn't fast. Scratch code can be compiled to LLVM IR instead which is then compiled to the architecture specific machine code by LLVM.
If anyone has experience with this, feel free to comment with any ideas :)
Notes:
- I don't want to write a JIT. I'd prefer ahead-of-time compilation when projects are loaded (just like the current compiler does it).
- This will require rewriting/refactoring all Scratch blocks because they rely on the bytecode VM.
- Since it'll take a long time to implement, I'll most likely keep the current implementation and make the LLVM-based compiler optional while it's being developed.
How will it work?
- Every Scratch script will have LLVM IR which will be executed using LLVM's ExecutionEngine with JIT compilation. It should be fast enough. If not, it should be possible to implement hacks like compiling to native code and running it from memory, but that's not portable.
- Since Scratch scripts must be able to yield at random points (for example at the end of a loop, if not running without screen refresh), coroutines will be used.
- A coroutine will execute the sequence for the appropriate part of the script (block functions in the sequence will be declared with
extern "C"). - If the code has to yield, the coroutine will suspend (e. g. in a loop). The execution context will store the coroutine handle so it can resume later.
- If the script has to restart (current VM has a function that makes it jump to the beginning), it'll just destroy the coroutine handle and call the coroutine.
- If the code can run without screen refresh, there's no need to use coroutines which should be faster.
Progress:
- [x] API preparation: Refactor Value class
- [x] API preparation: Refactor List class
- [x] API preparation: Add Thread class
- [x] API preparation: #393
- [x] Add Compiler class (in
devfolder to keep the old implementation) - [x] Add code builder class (abstract)
- [x] Add LLVM code builder class
- [x] Add a class for executing compiled code
- [x] Add API for Scratch blocks
- [x] If statements
- [x] Loops
- [x] Yielding
- [x] Operators
- [x] Math instructions
- [x] Variables
- [x] Lists
- [x] Promises
- [x] API for testing Scratch blocks
- [x] Motion blocks
- [x] Looks blocks
- [x] Sound blocks
- [x] Event blocks
- [x] Control blocks
- [ ] Sensing blocks
- [x] Operator blocks
- [x] Custom blocks
- [x] Variable blocks
- [x] List blocks
- [x] Edge-activated hats
- [x] Monitors
- [x] Fix memory leaks in unfinished coroutines
- [x] Implement type prediction within scripts
- [ ] Implement type prediction within sprites (local variables/lists)
- [ ] Implement type prediction within project (global variables/lists)
- [ ] Cache converted constants (custom block arguments, constant values, constant variables, etc.)
- [ ] Implement custom block inlining
- [x] Remove bytecode VM
@Arcnor Just letting you know I've opened the issue to work on this.
@Arcnor Just letting you know I've opened the issue to work on this.
Hey @adazem009 , that's fantastic, and thanks for keeping me posted :). It makes sense not wanting to write a JIT, it's not an simple task and very easy to fall down the rabbit hole :D. But there are some implementations you can use IIRC. There are also other targets that are not LLVM that might help you as well, there are things like LibJIT that provide you a big base to construct upon. Finally LLVM itself has ORC, and the tutorial covers this, although I've never used it (nor the predecesor, can't remember its name) and it's not an actual JIT implementation but "hooks" to write your own.
Anyway, I wish you luck!
@Arcnor Just letting you know I've opened the issue to work on this.
Hey @adazem009 , that's fantastic, and thanks for keeping me posted :). It makes sense not wanting to write a JIT, it's not an simple task and very easy to fall down the rabbit hole :D. But there are some implementations you can use IIRC. There are also other targets that are not LLVM that might help you as well, there are things like LibJIT that provide you a big base to construct upon. Finally LLVM itself has ORC, and the tutorial covers this, although I've never used it (nor the predecesor, can't remember its name) and it's not an actual JIT implementation but "hooks" to write your own.
Anyway, I wish you luck!
Thanks. I mostly think it doesn't make sense to write a JIT because (probably) all Scratch projects compile very quickly with the current compiler, only JSON parsing (to generate AST) can take a long time. I just hope compiling with LLVM won't slow it down.
I'll most likely use LLVM as long as I can call C++ functions from the IR and write unit tests for blocks easily (easier than it is now, with the complicated bytecode VM).
(reply to a discord message, but I'm putting it here since its relevant) The way I set up threespeed is theres a python script that converts a project.json file to c++. This conversion is basically 1:1, ie. each scratch block becomes a single function call to the corresponding C++ implementation. This makes writing the compiler pretty easy. I also decided that each block stack would get its own native thread. This was probably a bad idea since most performance-intensive projects are mostly single threaded anyways and it comes with all the normal multithreading problems.
If you decide to compile the code instead of making a bytecode interpreter, the main problem you'll run into is how scratch handles reentrancy. If you have a long-running event handler and it gets called again, the first instance immediately stops. With a bytecode interpreter this is easy to handle, for native code it isn't since this isn't how most languages handle reentrancy. You might be able to solve this in an elegant way if you don't just generate function calls, although it sounds like you're going down a similar path as what I did.
As for compile time, the sb3 -> c++ compiler written in python is much faster than gcc. The project.json is already an AST and you have to parse at most 5mb of data for most projects.
(reply to a discord message, but I'm putting it here since its relevant) The way I set up threespeed is theres a python script that converts a project.json file to c++. This conversion is basically 1:1, ie. each scratch block becomes a single function call to the corresponding C++ implementation. This makes writing the compiler pretty easy. I also decided that each block stack would get its own native thread. This was probably a bad idea since most performance-intensive projects are mostly single threaded anyways and it comes with all the normal multithreading problems.
If you decide to compile the code instead of making a bytecode interpreter, the main problem you'll run into is how scratch handles reentrancy. If you have a long-running event handler and it gets called again, the first instance immediately stops. With a bytecode interpreter this is easy to handle, for native code it isn't since this isn't how most languages handle reentrancy. You might be able to solve this in an elegant way if you don't just generate function calls, although it sounds like you're going down a similar path as what I did.
As for compile time, the sb3 -> c++ compiler written in python is much faster than gcc. The project.json is already an AST and you have to parse at most 5mb of data for most projects.
These things (thread/script management) are currently handled by the Engine class (https://github.com/scratchcpp/libscratchcpp/blob/537ff52a7500d17d07df7ea31c7023358824e344/src/engine/internal/engine.cpp). I just need to make the native code be able to stop, so it can resume later. That's probably the trickiest part. I'm not sure how that's going to work yet.
I found this, but I'm not sure it's related: https://llvm.org/docs/Coroutines.html It seems I can make a coroutine suspend and then resume, but I don't know if I can do that with the main function as well.
@Arcnor After some experiments with LLVM, I think I've figured out how to be able to suspend and resume compiled code (read @davidtheplatform's comment). See the "How will it work?" section of the issue description.
@Arcnor Just letting you know that this is finally finished after almost one year... All standard Scratch blocks are implemented, but there are still a few optimizations I'll perform later.
Great to hear mate, you're smashing those goals! 🙂
Great to hear mate, you're smashing those goals! 🙂
Static type analysis is one of those optimizations. It's going to be a very interesting task. I'll write algorithms that analyze loops/if statements and determine which types can a variable or list have at a specific point in code.
This is what I have right now, but it's pretty limited and doesn't work for multiple types: https://github.com/scratchcpp/libscratchcpp/blob/f5834a854ab1e87c1eb310376e9bec559036e32e/src/engine/internal/llvm/llvmtypeanalyzer.cpp