rosetta-lisp
rosetta-lisp copied to clipboard
A set of Lisp-1 implementations that share the VM instruction set, built-in functions, and the bootstrapping code
Rosetta Lisp
Rosetta Lisp is a set of Lisp-1 implementations that share the VM instruction set, built-in functions, and the bootstrapping code.
Implementations
Currently there are 6 implementations available. Each of them is implemented in different languages.
- ocalisp in OCaml (Reference implementation)
- scalisp in Scala 3
- golisp in Go
- fslisp in F#
- kokalisp in Koka 3
- idrlisp in Idris 1 (Out of date)
- wonderlisp in Rosetta Lisp itself
VM Design
The design of the Rosetta Lisp implementation is heavily inspired by SECD machine. Every S-expression is macro-expanded and then compiled into a code, a sequence of instructions. Rosetta Lisp implementations share an extremely simple, small instruction set that targeting an SECD-like virtual machine.
VM Instruction Set
Rosetta Lisp VM consists of four states:
[S]tack- a list of values[E]nvironment- an abstract representation of a collection of key-value pairs[C]ode- a list of instructions[D]ump- a list of pairs of Environment and Code
ldc value
Push a value on top of S.
(define (ldc value)
(lambda (S E C D k)
(k (cons value S) E C D)))
ldv name
Find a binding for name from E and push the value on top of S.
(define (ldv name)
(lambda (S E C D k)
(k (cons (find-env name E) S) E C D)))
ldf pattern code
Make a function capturing E and push it on top of S.
(define (ldf pattern code)
(lambda (S E C D k)
(k (cons (make-function-closure pattern code E) S) E C D)))
ldm pattern code
Make a macro capturing E and push it on top of S.
(define (ldm pattern code)
(lambda (S E C D k)
(k (cons (make-macro-closure pattern code E) S) E C D)))
ldb name
Find a built-in function named name and push it on top of S.
(define (ldb name)
(lambda (S E C D k)
(k (cons (find-builtin-function name) S) E C D)))
sel then-code else-code
Pop a value from S, set C to then-code if the value is #f, set C to
else-code otherwise. Set E to a new environment derived from E. Push the
previous E and C on top of D.
(define (sel then-code else-code)
(lambda ((s . S) E C D k)
(k S
(new-env E)
(if s then-code else-code)
(cons (cons E C) D))))
leave
Pop a pair of Environment and Code from D and set E and C to it.
(define (leave)
(lambda (S E C ((e . c) . D) k)
(k S e c D)))
app argc
Pop argc values as function arguments from S, pop a function from S, call
the function with the arguments.
(define (app argc)
(lambda ((argN argN-1 ... arg1 f . S) E C D k)
(apply-function f (list arg1 .. argN) S E C D k)))
apply-function is defined for built-in functions and function closures
obtained by ldf. apply-function on function closures is defined as follows:
(define (apply-function (FUNCTION-CLOSURE pattern code env)
args
S E C D k)
(k S
(bind-args-with-pattern args pattern (new-env env))
code
(cons (cons E C) D)))
pop
Pop a value from S.
(define (pop)
(lambda ((s . S) E C D k)
(k S E C D)))
def name
Pop a value, create a binding from name to the value on E.
(define (def name)
(lambda ((s . S) E C D k)
(k S (env-define name s E) C D)))
set name
Pop a value, update a binding from name to the value on E.
(define (set name)
(lambda ((s . S) E C D k)
(k S (env-set name s E) C D)))
Compilation to VM instructions
All literals and quoted expressions are compiled into ldc:
; compiling 123
[0 entry]
ldc 123
All unquoted symbols are compiled into ldv:
; compiling foo
[0 entry]
ldv foo
Lists are compiled into a sequence of element evaluations and an app.
; compiling (compare a b)
[0 entry]
ldv compare
ldv a
ldv b
app 2
; compiling (+ foo (* bar baz) 111)
[0 entry]
ldv +
ldv foo
ldv *
ldv bar
ldv baz
app 2
ldc 111
app 3
All other instructions are produced by Syntax. Applications of Syntax are
not compiled into an usual app.
(builtin cons) produces a ldb.
[0 entry]
ldb hello
(if a b c) produces a sel and two branch codes terminated by a leave.
[0 entry]
ldv a
sel [1 then] [2 else]
[1 then]
ldv b
leave
[2 else]
ldv c
leave
(begin a b c) produces a sequence of evaluations and pops.
[0 entry]
ldv a
pop
ldv b
pop
ldv c
(fun (x y) y) produces a ldf, (macro (x y) x) produces a ldm. Body of
functions are terminated by a leave but macros are not.
[0 entry]
ldf [1 fun (x y)]
[1 fun (x y)]
ldv y
leave
[0 entry]
ldm [1 macro (x y)]
[1 macro (x y)]
ldv x
(def x 123) produces a def, (set! x 123) produces a set. Both of them
also produce a ldc () to adjust stack size.
[0 entry]
ldc 123
def x
ldc ()
[0 entry]
ldc 123
set x
ldc ()
(quote x) is also one of Syntax that produces a ldc.
Built-in functions
All other functionalities are injected as built-in functions. Every implementation provides the same built-in function set. By doing this, Every implementation can use the same bootstrapping code to get frequently used functions and macros.
Built-in functions required by the bootstrapping code
To reduce the size of required built-in functions, there are a lot of functions and macros that are defined in the bootstrapping code instead of in host languages. It's not optimal in terms of performance, but it makes easy to add another Rosetta Lisp implementation.
In the following documentation, result<a> means a result cons cell.
- If
caris#t, the resultais set tocdr. - If
caris#f, the failure error informationstris set tocdr.
Cons cell operators
| name | overloads |
|---|---|
cons |
(_, _) -> cons |
car |
(cons) -> _ |
cdr |
(cons) -> _ |
Terminators
| name | overloads |
|---|---|
exit |
(?num) -> ! |
error |
(?str) -> ! |
Macro supports
NOTE: quote is a syntax, quasiquote and unquote are implemented on
boot.lisp.
| name | overloads | note |
|---|---|---|
gensym |
() -> sym |
Must not overlap with symbol literals |
Control operators
| name | overloads |
|---|---|
apply |
(proc, list) -> _ |
call/cc |
(proc) -> _ |
never |
(proc, ..._) -> ! |
Test operators
| name | overloads | note |
|---|---|---|
num? |
(_) -> bool |
|
sym? |
(_) -> bool |
|
str? |
(_) -> bool |
|
cons? |
(_) -> bool |
|
nil? |
(_) -> bool |
|
bool? |
(_) -> bool |
|
proc? |
(_) -> bool |
Built-in functions and user functions |
meta? |
(_) -> bool |
Language syntax and user macros |
vec? |
(_) -> bool |
Arithmetic operators
| name | overloads |
|---|---|
+ |
(...num) -> num |
- |
(num, ...num) -> num |
* |
(...num) -> num |
/ |
(num, ...num) -> num |
% |
(num, ...num) -> num |
Relational operators
| name | overloads |
|---|---|
= |
(..._) -> bool |
< |
(...num) -> bool, (...str) -> bool |
> |
(...num) -> bool, (...str) -> bool |
<= |
(...num) -> bool, (...str) -> bool |
>= |
(...num) -> bool, (...str) -> bool |
str operators
String encoding is not specified, but is expected to be ASCII compatible through these operators. This requirement is usually satisfied by using a Unicode scalar sequence or byte sequence.
| name | overloads | note |
|---|---|---|
str |
(...num) -> str | ! |
Each num corresponds to a character. Raises an error on invalid characters |
str-char-at |
(str, num) -> num | nil |
Takes an index. Returns a character |
str-length |
(str) -> num |
Returns character count |
str-concat |
(...str) -> str |
|
substr |
(str, num, num) -> str | ! |
Takes a str, index, and length. Raises an error on out of range |
sym->str |
(sym) -> str |
|
num->str |
(num) -> str |
|
str->num |
(str) -> num | nil |
vec operators
vec is a type for minimal support of mutable/fixed-length sequential data.
| name | overloads | note |
|---|---|---|
vec |
(..._) -> vec |
|
vec-make |
(num, _) -> vec |
Takes a length and initial value |
vec-length |
(vec) -> num |
|
vec-get |
(vec, num) -> _ | nil |
Get by index |
vec-set! |
(vec, num, _) -> nil | ! |
Set by index. Raises an error on out of range |
vec-copy! |
(vec, num, vec, num, num) -> nil | ! |
Takes a dest, dest-start-index, src, src-start-index, and length |
I/O operators
| name | overloads | note |
|---|---|---|
read-file-text |
(str) -> result<str> |
Takes a file path. Returns contents of the file |
write-file-text |
(str, str) -> result<nil> |
Takes a file path and contents |
read-console-line |
() -> result<str | nil> |
No line feed at end. Returns nil if terminated |
write-console |
(str) -> result<nil> |
Misc
| name | overloads | note |
|---|---|---|
args |
() -> list |
Returns command line arguments as list of str |
eval |
(_) -> result<_> |
Evaluates a S-expression |
macroexpand |
(_) -> result<_> |
Perform macro expansion |
macroexpand-1 |
(_) -> result<_> |
Perform one step macro expansion |
Tests
There are test suites for the functionalities of the parser, the compiler, and the VM. Rosetta Lisp's bootstrapping code includes unit tests for each built-in function, as comments.
(def cons (builtin cons))
;! > (cons 1)
;! fail
;! > (cons 1 2)
;! (1 . 2)
;! > (cons 1 2 3)
;! fail
Since it's troublesome to parse these comments, there is a unified, easy-to-parse test file available. This test file is generated by ./gentest.