Bootstrap TCC using pnut
Context
This draft PR shows that pnut can provide an alternative path to TCC.
This is still a work in progress: it currently needs a prebuilt pnut-exe binary, since the live-bootstrap environment doesn't provide a POSIX-compliant shell when bootstrapping TCC. However, the binary can be built reproducibly from a POSIX shell with a script in the pnut repository, per the instructions below.
Alternatively, the C subset used by pnut is relatively simple (only 1 struct that could be easily removed, very few dynamic memory allocations, no sizeof, everything is a signed integer or pointer) and could probably be ported to M2-Planet.
Prebuilt binary
To make the prebuilt pnut-exe binary:
> git clone git@github.com:udem-dlteam/pnut.git
> cd pnut
> git checkout origin/laurent/small-fixes-for-TCC
> ./utils/make-pnut-exe-for-tcc.sh --shell <shell> # To make pnut-exe with pnut-sh.sh
> ./utils/make-pnut-exe-for-tcc.sh # To make pnut-exe with pnut-exe-from-gcc
Getting the sources
Running ./download-distfiles.sh will download the .tar.gz file, but GitHub will return 404 for the .tar; that's expected. To get the .tar, use gunzip distfiles/79832069f0d44c20a620a923a15e38a545c5e911.tar.gz.
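Plain gunzip replaces the .tar.gz with the .tar. If keeping the downloaded archive around is useful, a small helper along these lines might do. This is only a sketch: the function name unpack_distfile is made up for illustration and is not part of live-bootstrap or pnut.

```shell
# Hypothetical helper: decompress a .tar.gz into a .tar next to it,
# leaving the original archive in place.
# Usage: unpack_distfile distfiles/<hash>.tar.gz
unpack_distfile() {
    archive="$1"
    # Strip the trailing .gz to get the target .tar path.
    target="${archive%.gz}"
    # gzip -dc writes the decompressed stream to stdout and does not
    # delete the input file.
    gzip -dc "$archive" > "$target"
}
```

It would be called with the same path as the gunzip command above.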
I'm happy to see that you were able to get pnut working in this way!
From my perspective, while pnut-exe is a binary, this cannot be merged. Unfortunately, adding in pnut-exe as a seed is a big negative.
Out of the two options you have given for replacing pnut-exe, having it buildable by M2-Planet is much preferable.
If my understanding is correct, pnut can be used without generating a POSIX shell script at all? That seems to be what is happening here.
It might be interesting to have M2-Planet -> pnut-exe -> tcc....
> From my perspective, while pnut-exe is a binary, this cannot be merged. Unfortunately, adding in pnut-exe as a seed is a big negative.
I agree! This draft PR is meant to demonstrate that pnut can reproduce the tcc-0.9.26 binary, and to serve as a starting point for a potential M2-Planet -> pnut-exe -> tcc path.
> If my understanding is correct, pnut can be used without generating a POSIX shell script at all? That seems to be what is happening here.
Exactly. The prebuilt pnut-exe comes from pnut-exe's code generator, and pnut-exe can be compiled with pnut-sh.sh or with an existing C compiler.
I'll look into porting pnut's source code to M2-Planet over the next few weeks.
I wonder if M2-Planet has any chance of building pnut. If so, that might remove the need for pre-built pnut-exe. Oh yes, I can see you had this question too...
> I wonder if M2-Planet has any chance of building pnut. If so, that might remove the need for pre-built pnut-exe. Oh yes, I can see you had this question too...
I've managed to create a version of pnut-exe that can be built by cc_x86 (or M2-Planet). I did this by porting the x86 version of pnut to a subset of C that is also valid JavaScript.
This script builds pnut-exe using cc_x86 and then uses pnut-exe to build the live-bootstrap bootstrappable version of tcc (with @laurenthuberdeau's patches applied):
https://github.com/cosinusoidally/tcc_simple/blob/master/experiments/mk_pnut_cc_x86
This also needs a checkout of https://github.com/cosinusoidally/tcc_bootstrap_alt/ (as it uses the copy of cc_x86 from that repo).
I also have a couple of other alternative build modes:
https://github.com/cosinusoidally/tcc_simple/blob/master/experiments/mk_pnut_mujs builds pnut-exe using the mujs JavaScript VM. Since I ported pnut to JS (https://github.com/cosinusoidally/tcc_simple/blob/master/experiments/pnut_refactor/pnut.js) it can also be run inside a JS VM. For now it only works in this custom mujs build, but I am planning to get it building with node.js and Spidermonkey.
https://github.com/cosinusoidally/tcc_simple/blob/master/experiments/mk_test_m2_pnut builds with M2-Planet. This requires M2-Planet to be in your PATH. The version of stage0-posix I used is from here https://github.com/cosinusoidally/mishmashvm/tree/master/tcc_js_bootstrap/alt_bootstrap/stage0-posix (this is an older fork I have been using for a while).
https://github.com/cosinusoidally/tcc_simple/blob/master/experiments/mk_test_cc_x86_pnut checks that cc_x86 and M2-Planet produce identical M1 files when compiling pnut-exe. This works because I'm using an older version of M2-Planet that produces output identical to cc_x86.
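The identical-output check boils down to a byte-for-byte comparison of the two M1 files. A minimal sketch of that idea follows; it is not the contents of mk_test_cc_x86_pnut, and both the function name same_output and the file names in the usage comment are placeholders.

```shell
# Hypothetical sketch: report whether two compiler outputs are
# byte-for-byte identical. Returns non-zero when they differ.
# Usage (placeholder file names): same_output pnut_cc_x86.M1 pnut_m2.M1
same_output() {
    # cmp -s compares silently and only sets the exit status.
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
        return 1
    fi
}
```

Using cmp rather than a textual diff keeps the check strict: any difference in whitespace or ordering counts as a mismatch.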
https://github.com/cosinusoidally/tcc_simple/blob/master/experiments/mk_test_pnut makes sure pnut can be built by both gcc and a stock version of tcc-0.9.27.
https://github.com/cosinusoidally/tcc_simple/blob/master/experiments/mk_tcc-boot-mes takes an already built copy of pnut-exe from artfacts/ and uses it to build tcc-boot-mes. This is generally called from other scripts.
Note the changes I made to pnut are probably more extensive than necessary. I suspect M2-Planet/Mesoplanet could build a nearly stock upstream pnut-exe as pnut is already written in a fairly conservative dialect of C. I also didn't port the sh backend, but I might try and re-add it at some point.
https://github.com/cosinusoidally/tcc_simple/blob/master/experiments/mk_coverage_pnut is for testing code coverage. This was useful during porting as it allowed me to cut out code that I didn't need, plus it allowed me to check that I wasn't accidentally breaking code that wasn't touched. I have around 90% code coverage (which I will improve eventually).
I thought I'd mention this here. I've wired up my pnut_js fork to live-bootstrap. My changes to live-bootstrap are definitely not in a mergeable state, but I thought they may be of interest:
https://github.com/cosinusoidally/live-bootstrap/pull/6 is the internal PR in my fork.
See also: https://github.com/cosinusoidally/live-bootstrap/blob/pnut_js/steps/pnut_js-1.0/pass1.kaem and https://github.com/cosinusoidally/live-bootstrap/blob/pnut_js/steps/pnut_js-1.0/pnut_refactor/build_alt.kaem
This avoids the need for a prebuilt pnut-exe binary.
CI also passes (I think it shaves off around 18 mins):
https://github.com/cosinusoidally/live-bootstrap/actions/runs/14814217910
If I were to tidy this up the steps would probably be something like:
- cut a real release of pnut_js
- add a live-bootstrap option "USE_PNUT_JS" to conditionally skip the build of mescc and instead use pnut_js
- add this path to CI in addition to the standard mescc path
- update pnut_js to avoid the use of mmap as builder-hex0 does not support mmap
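To illustrate the "USE_PNUT_JS" idea from the list above, the selection logic could look roughly like this. This is a hypothetical POSIX shell sketch only: live-bootstrap's early steps run under kaem rather than a POSIX shell, so the real wiring would look different, and the function name, the True/False values, and the returned path names are all assumptions made for illustration.

```shell
# Hypothetical sketch: choose which path builds tcc, based on a config flag.
# USE_PNUT_JS is the option name proposed above; "pnut_js" and "mescc"
# are placeholder labels, not real live-bootstrap step names.
select_tcc_path() {
    # Default to the standard mescc path when the flag is unset.
    if [ "${USE_PNUT_JS:-False}" = "True" ]; then
        echo "pnut_js"
    else
        echo "mescc"
    fi
}
```

Defaulting to the mescc path means existing configurations would keep their current behaviour unless the flag is set explicitly.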
> I thought I'd mention this here. I've wired up my pnut_js fork to live-bootstrap. My changes to live-bootstrap are definitely not in a mergeable state, but I thought they may be of interest:
Nice work! Is the plan to use the many JavaScript runtimes for DDC, like pnut does with shells? Or is muJS compatible with M2-Planet?
> Note the changes I made to pnut are probably more extensive than necessary. I suspect M2-Planet/Mesoplanet could build a nearly stock upstream pnut-exe as pnut is already written in a fairly conservative dialect of C. I also didn't port the sh backend, but I might try and re-add it at some point.
I recently removed the use of structs from pnut's code, not sure how much this helps with M2-Planet's support. I'm also considering removing all malloc from the code to further reduce the level of C language support required.
> update pnut_js to avoid the use of mmap as builder-hex0 does not support mmap
I'm not sure I understand this part. pnut-exe doesn't use mmap but implements it with a syscall, which should be independent from builder-hex0? Or is builder-hex0 a minimal operating system without the mmap syscall?
Before pnut-exe started depending on mmap, all globals were allocated on the stack. That worked until we needed larger statically allocated arrays, in particular the code buffer for TCC. Fortunately, I just made the code generator one-pass, so the code buffer only needs to be a few kilobytes instead of megabytes, and the globals should fit on an 8MB stack. The same could be done for the malloc buffer, since only path strings are allocated dynamically.
M2-Planet supports structs. And it actually got lots of new features: see https://github.com/oriansj/stage0-posix/blob/master/CHANGELOG.org for recent changes.