oldterror
Terror-based VM.
TerrorVM

A register-based, garbage-collected, stackless, lightweight Virtual Machine for object-oriented programming languages.
Ships with a naive reference-counting Garbage Collector for now; the plan is to replace it with a more advanced GC algorithm, probably Baker's treadmill.
Eventually the VM will be optimized to be fast on ARM processors; for now it compiles on both ARM (tested on Android) and x86 architectures. The plan is also to use LLVM to JIT-compile code to native at runtime.
Anyway, it's a work in progress! :)
Building the VM
$ git clone git://github.com/txus/terrorvm.git
$ cd terrorvm
$ make
To run the tests:
$ make dev
And to clean the mess:
$ make clean
Running programs
TerrorVM runs .tvm bytecode files such as the hello_world.tvm under the
examples directory.
$ ./bin/vm examples/hello_world.tvm
It ships with a simple compiler written in Ruby (Rubinius) that compiles a
tiny subset of Ruby to .tvm files. Check out the compiler directory, which
has its own Readme, and the compiler/examples where we have the
hello_world.rb file used to produce the hello_world.tvm.
TerrorVM doesn't need Ruby to run; even the example compiler is a proof of concept and could be written in any language (even in C obviously).
Implementing your own dynamic language running on TerrorVM
TerrorVM is designed to run dynamic languages. You can easily implement a compiler of your own that compiles your favorite dynamic language down to TVM bytecode.
I've written a demo compiler in Ruby under the compiler/ folder, just to
show how easy it is to write your own. This demo compiler compiles a subset of
Ruby down to TerrorVM bytecode, so you can easily peek at the source code or
just copy and modify it.
You can write your compiler in whatever language you prefer, of course.
Bytecode format
TerrorVM files are encoded with a header containing _main (the method
that will be the entry point), some info about number of registers, local
variables, literals and instructions used by the method, followed by all the
literals, and then all the instructions. It starts like this:
_main
Then info encoded in the format
:num_registers:num_locals:num_literals:num_instructions:
:10:2:4:17
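As a rough illustration of how a tool might read that info line, here is a minimal Ruby sketch. The `MethodHeader` struct and `parse_header` function are hypothetical names made up for this example, not part of TerrorVM. Empty fields are dropped so the line parses with or without a trailing colon.

```ruby
# Hypothetical helper: field names mirror the format description above.
MethodHeader = Struct.new(:num_registers, :num_locals,
                          :num_literals, :num_instructions)

def parse_header(line)
  # Splitting on ":" yields an empty leading field; drop empty fields so a
  # trailing ":" is tolerated too.
  fields = line.strip.split(":").reject(&:empty?)
  unless fields.size == 4
    raise ArgumentError, "expected 4 fields, got #{fields.size}"
  end
  MethodHeader.new(*fields.map { |f| Integer(f, 10) })
end

header = parse_header(":10:2:4:17")
puts header.num_registers    # 10
puts header.num_literals     # 4
puts header.num_instructions # 17
```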
Then all the literals, each one in a line (the ones starting with " are
string literals):
123
"print
"Goodbye world!
"Hello world!
And then all the instructions:
0x2000000
0x51000000
0x9010000
0x51010100
...
Instructions have a compact representation: three operands of 8 bits each, which together with the opcode make a total of 32 bits per instruction.
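To make the literal and instruction lines above more concrete, here is a small Ruby sketch that reads them. It rests on two assumptions that this README does not spell out: that any literal not starting with `"` is an integer, and that the opcode sits in the most significant byte of the 32-bit word, followed by operands A, B and C. The names `parse_literal`, `decode` and `Instruction` are hypothetical.

```ruby
# Hypothetical container for one decoded instruction.
Instruction = Struct.new(:opcode, :a, :b, :c)

# A line starting with a double quote is a string literal (the quote is
# dropped). ASSUMPTION: every other literal is an integer.
def parse_literal(line)
  text = line.chomp
  text.start_with?('"') ? text[1..-1] : Integer(text, 10)
end

# ASSUMPTION: byte order is opcode, A, B, C from most to least significant.
def decode(word)
  Instruction.new((word >> 24) & 0xFF, (word >> 16) & 0xFF,
                  (word >> 8) & 0xFF, word & 0xFF)
end

literals = ['123', '"print', '"Hello world!'].map { |l| parse_literal(l) }
p literals # [123, "print", "Hello world!"]

ins = decode("0x9010000".hex)
puts ins.opcode # 9
puts ins.a      # 1
puts ins.b      # 0
```

Under that byte-order assumption, `0x51000000` and `0x51010100` in the listing above share opcode `0x51` and differ only in their operands.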
VM primitive functions
TerrorVM exposes a VM object that responds to primitive, which returns a
hash with some VM primitive functions exposed as Terror Function objects.
A simple example is the set of arithmetic functions (+, -, *, /) used
by Integer objects. To use one of them in your own functions, do it like
this:
VM.primitive[:+].apply(3, 4) # this is the same as 3 + 4 or 3.+(4)
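If you want to experiment with this calling convention outside the VM, the following plain-Ruby stand-in mimics the shape of the API. It is only a model: the `Function` class and the `VM` module here are hypothetical stand-ins written for this example, and say nothing about how TerrorVM implements its primitives.

```ruby
# Stand-in for a Terror Function object: wraps a block and exposes #apply.
class Function
  def initialize(&body)
    @body = body
  end

  def apply(*args)
    @body.call(*args)
  end
end

# Stand-in for the VM object: #primitive returns a hash of Function objects.
module VM
  PRIMITIVES = {
    :+ => Function.new { |a, b| a + b },
    :- => Function.new { |a, b| a - b },
    :* => Function.new { |a, b| a * b },
    :/ => Function.new { |a, b| a / b }
  }.freeze

  def self.primitive
    PRIMITIVES
  end
end

puts VM.primitive[:+].apply(3, 4) # 7
puts VM.primitive[:*].apply(3, 4) # 12
```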
Examples
- Hello world (Ruby code, TVM bytecode)
- Numbers (Ruby code, TVM code)
- Objects with prototypal inheritance (Ruby code, TVM bytecode)
- Exposed VM primitives (Ruby code, TVM bytecode)
Instruction set
- NOOP: no operation -- does nothing.
Loading values
- MOVE A, B: copies the contents in register B to register A.
- LOADI A, B: loads the integer (from the literal pool) at index B into register A.
- LOADS A, B: loads the string (from the literal pool) at index B into register A.
- LOADNIL A: loads the special value nil into register A.
- LOADBOOL A, B: loads a boolean into register A, being true if B is the number 1 or false if B is 0.
- LOADSELF A: loads the current self into register A.
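The loading instructions can be modelled in a few lines of Ruby. This is a sketch of the semantics described in the list, not the VM's C implementation: instructions are written as symbolic arrays instead of encoded words, and `run_loads` is a hypothetical name.

```ruby
# Executes a list of symbolic load instructions and returns the registers.
def run_loads(instructions, literals, num_registers, current_self = nil)
  registers = Array.new(num_registers)
  instructions.each do |op, a, b|
    case op
    when :MOVE          then registers[a] = registers[b]
    when :LOADI, :LOADS then registers[a] = literals[b] # literal pool lookup
    when :LOADNIL       then registers[a] = nil
    when :LOADBOOL      then registers[a] = (b == 1)
    when :LOADSELF      then registers[a] = current_self
    else raise ArgumentError, "unsupported instruction #{op}"
    end
  end
  registers
end

literals = [123, "print", "Goodbye world!", "Hello world!"]
program = [
  [:LOADI, 0, 0],    # r0 = 123
  [:LOADS, 1, 3],    # r1 = "Hello world!"
  [:MOVE, 2, 0],     # r2 = r0
  [:LOADBOOL, 3, 1], # r3 = true
  [:LOADNIL, 4]      # r4 = nil
]
p run_loads(program, literals, 5) # [123, "Hello world!", 123, true, nil]
```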
Branching
- JMP A: unconditionally jumps A instructions.
- JIF A, B: jumps A instructions if the contents of the register B are either false or nil.
- JIT A, B: jumps A instructions if the contents of the register B are neither false nor nil.
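The branch conditions above boil down to one rule: only false and nil are falsy, so values such as 0 or an empty string are truthy. A minimal Ruby sketch of the decision follows. Note the assumption: this README does not say whether the offset A is counted from the jump instruction itself or from the one after it, so the sketch arbitrarily picks the former. `falsy?` and `next_pc` are hypothetical names.

```ruby
# Only false and nil count as falsy.
def falsy?(value)
  value.nil? || value == false
end

# Returns the index of the next instruction to execute.
# ASSUMPTION: a taken branch moves `offset` instructions from the jump itself.
def next_pc(pc, op, offset, value = nil)
  taken = case op
          when :JMP then true
          when :JIF then falsy?(value)
          when :JIT then !falsy?(value)
          else raise ArgumentError, "not a branch: #{op}"
          end
  taken ? pc + offset : pc + 1
end

puts next_pc(5, :JMP, 3)        # 8
puts next_pc(5, :JIF, 3, nil)   # 8 (nil is falsy, branch taken)
puts next_pc(5, :JIF, 3, 0)     # 6 (0 is truthy, falls through)
puts next_pc(5, :JIT, 3, false) # 6
```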
Local variables
- LOADLOCAL A, B: loads the value in the locals table at index B into the register A.
- SETLOCAL A, B: stores the contents of the register A in the locals table at index B.
Slots
- LOADSLOT A, B, C: loads the slot named C from the object B into the register A.
- SETSLOT A, B, C: sets the slot named B on the object A to the value in the register C.
Arrays
- MAKEARRAY A, B, C: takes C registers starting from register B, creates an array with them and stores it in register A.
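In Ruby terms, MAKEARRAY behaves like taking a slice of the register file. The sketch below models the registers as a plain Ruby array; `make_array` is a hypothetical name used only for illustration.

```ruby
# Takes `c` registers starting at index `b`, builds an array from them and
# stores it in register `a`.
def make_array(registers, a, b, c)
  registers[a] = registers[b, c] # Array#[start, length] returns a new array
  registers
end

regs = [nil, 10, 20, 30, nil]
make_array(regs, 0, 1, 3)
p regs[0] # [10, 20, 30]
```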
Message sending and call frames
- SEND A, B, C: sends a message specified by the string in the literals table at index B to the receiver A, with N arguments depending on the arity of the method, the first argument being in the register C.
- RET A: returns from the current call frame to the caller with the value in the register A.
Debugging
- DUMP: Print the contents of all the registers to the standard output.
Who's this
This was made by Josep M. Bach (Txus) under the MIT license. I'm @txustice on Twitter (where you should probably follow me!).