Bootstrap TCC using pnut

Open laurenthuberdeau opened this issue 9 months ago • 7 comments

Context

This draft PR shows that pnut can provide an alternative path to TCC.

This is still a work in progress: it currently needs a prebuilt pnut-exe binary, since the live-bootstrap environment doesn't provide a POSIX-compliant shell at the point where TCC is bootstrapped. However, the binary can be built reproducibly from a POSIX shell with a script in the pnut repository, per the instructions below.

Alternatively, the C subset used by pnut is relatively simple (only 1 struct that could be easily removed, very few dynamic memory allocations, no sizeof, everything is a signed integer or pointer) and could probably be ported to M2-Planet.

Prebuilt binary

To make the prebuilt pnut-exe binary:

> git clone git@github.com:udem-dlteam/pnut.git
> cd pnut
> git checkout origin/laurent/small-fixes-for-TCC
> ./utils/make-pnut-exe-for-tcc.sh --shell <shell> # To make pnut-exe with pnut-sh.sh
> ./utils/make-pnut-exe-for-tcc.sh # To make pnut-exe with pnut-exe-from-gcc

Getting the sources

Running ./download-distfiles.sh will download the .tar.gz file, but GitHub will return a 404 for the .tar; that's expected. To get the .tar, run gunzip distfiles/79832069f0d44c20a620a923a15e38a545c5e911.tar.gz.
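As a small illustration of the gunzip step, the sketch below builds a throwaway archive and decompresses it in place. The file names (example.tar.gz, file.txt) and the temporary directory are hypothetical stand-ins, not the real distfile; the point is only that gunzip replaces NAME.tar.gz with NAME.tar.

```shell
#!/bin/sh
# Sketch: gunzip turns NAME.tar.gz into NAME.tar in place.
# All file names here are hypothetical stand-ins for the real distfile.
set -e

workdir=$(mktemp -d)
mkdir -p "$workdir/distfiles" "$workdir/src"
echo "hello" > "$workdir/src/file.txt"

# Stand-in for the tarball that ./download-distfiles.sh fetches.
tar -C "$workdir/src" -cf - file.txt | gzip > "$workdir/distfiles/example.tar.gz"

# gunzip removes example.tar.gz and leaves example.tar next to it.
gunzip "$workdir/distfiles/example.tar.gz"

# List the archive to confirm it is now a plain, uncompressed tar.
listing=$(tar -tf "$workdir/distfiles/example.tar")
echo "$listing"
```

If the compressed file should be kept as well, decompressing to standard output (gzip -dc FILE.tar.gz > FILE.tar) leaves the original untouched.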

laurenthuberdeau avatar Mar 31 '25 03:03 laurenthuberdeau

I'm happy to see that you were able to get pnut working in this way!

From my perspective, while pnut-exe is a binary, this cannot be merged. Unfortunately, adding in pnut-exe as a seed is a big negative.

Out of the two options you have given for replacing pnut-exe, having it buildable by M2-Planet is much preferable.

If my understanding is correct, pnut can be used without generating a POSIX shell script at all? That seems to be what is happening here.

It might be interesting to have M2-Planet -> pnut-exe -> tcc....

fosslinux avatar Mar 31 '25 08:03 fosslinux

> From my perspective, while pnut-exe is a binary, this cannot be merged. Unfortunately, adding in pnut-exe as a seed is a big negative.

I agree! This draft PR is meant to demonstrate that pnut can reproduce the tcc-0.9.26 binary and as a starting point for a potential M2-Planet -> pnut-exe -> tcc path.

> If my understanding is correct, pnut can be used without generating a POSIX shell script at all? That seems to be what is happening here.

Exactly. The prebuilt pnut-exe comes from pnut-exe's code generator, and pnut-exe can be compiled with pnut-sh.sh or with an existing C compiler.

I'll look into porting pnut's source code to M2-Planet over the next few weeks.

laurenthuberdeau avatar Mar 31 '25 12:03 laurenthuberdeau

I wonder if M2-Planet has any chance of building pnut. If so, that might remove the need for pre-built pnut-exe. Oh yes, I can see you had this question too...

stikonas avatar Mar 31 '25 20:03 stikonas

> I wonder if M2-Planet has any chance of building pnut. If so, that might remove the need for pre-built pnut-exe. Oh yes, I can see you had this question too...

I've managed to create a version of pnut-exe that can be built by cc_x86 (or M2-Planet). I did this by porting the x86 version of pnut to a subset of C that is also valid JavaScript.

This script will build pnut-exe, then build pnut-exe using cc_x86, and then use pnut-exe to build the live-bootstrap bootstrappable version of tcc (with @laurenthuberdeau's patches applied):

https://github.com/cosinusoidally/tcc_simple/blob/master/experiments/mk_pnut_cc_x86

This also needs a checkout of https://github.com/cosinusoidally/tcc_bootstrap_alt/ (as it uses the copy of cc_x86 from that repo).

I also have a couple of other alternative build modes:

https://github.com/cosinusoidally/tcc_simple/blob/master/experiments/mk_pnut_mujs builds pnut-exe using the mujs JavaScript VM. Since I ported pnut to JS (https://github.com/cosinusoidally/tcc_simple/blob/master/experiments/pnut_refactor/pnut.js), it can also be run inside a JS VM. For now it only works in this custom mujs build, but I am planning to get it building with Node.js and SpiderMonkey.

https://github.com/cosinusoidally/tcc_simple/blob/master/experiments/mk_test_m2_pnut builds with M2-Planet. This requires that M2-Planet be in your PATH. The version of stage0-posix I used is from here: https://github.com/cosinusoidally/mishmashvm/tree/master/tcc_js_bootstrap/alt_bootstrap/stage0-posix (this is an older fork I have been using for a while).

https://github.com/cosinusoidally/tcc_simple/blob/master/experiments/mk_test_cc_x86_pnut checks that cc_x86 and M2-Planet produce identical M1 files when compiling pnut-exe. This works because I'm using an older version of M2-Planet that produces output identical to cc_x86.
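For readers unfamiliar with this kind of check, here is a minimal sketch of comparing two compiler outputs byte for byte. It is not taken from mk_test_cc_x86_pnut; the helper name same_output, the file names, and the placeholder file contents are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: check that two build outputs are byte-for-byte identical.
# same_output and the file names are hypothetical, not from the real script.

same_output() {
    # cmp -s prints nothing and returns 0 only when both files match exactly.
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
    fi
}

tmp=$(mktemp -d)
# Placeholder contents standing in for the M1 files each compiler would emit.
printf 'placeholder output A\n' > "$tmp/pnut-cc_x86.M1"
printf 'placeholder output A\n' > "$tmp/pnut-m2planet.M1"
printf 'placeholder output B\n' > "$tmp/pnut-other.M1"

match=$(same_output "$tmp/pnut-cc_x86.M1" "$tmp/pnut-m2planet.M1")
mismatch=$(same_output "$tmp/pnut-cc_x86.M1" "$tmp/pnut-other.M1")
echo "$match $mismatch"
```

A byte-level comparison such as cmp is stricter than a textual diff, which matters when the goal is to show that two independent toolchains agree exactly.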

https://github.com/cosinusoidally/tcc_simple/blob/master/experiments/mk_test_pnut makes sure pnut can be built by both gcc and a stock version of tcc-0.9.27.

https://github.com/cosinusoidally/tcc_simple/blob/master/experiments/mk_tcc-boot-mes takes an already built copy of pnut-exe from artfacts/ and uses it to build tcc-boot-mes. This is generally called from other scripts.

Note the changes I made to pnut are probably more extensive than necessary. I suspect M2-Planet/Mesoplanet could build a nearly stock upstream pnut-exe as pnut is already written in a fairly conservative dialect of C. I also didn't port the sh backend, but I might try and re-add it at some point.

https://github.com/cosinusoidally/tcc_simple/blob/master/experiments/mk_coverage_pnut is for testing code coverage. This was useful during porting as it allowed me to cut out code that I didn't need, plus it allowed me to check that I wasn't accidentally breaking code that wasn't touched. I have around 90% code coverage (which I will improve eventually).

cosinusoidally avatar Apr 15 '25 12:04 cosinusoidally

I thought I'd mention this here. I've wired up my pnut_js fork to live-bootstrap. My changes to live-bootstrap are definitely not in a mergeable state, but I thought they may be of interest:

https://github.com/cosinusoidally/live-bootstrap/pull/6 is the internal PR in my fork.

See also: https://github.com/cosinusoidally/live-bootstrap/blob/pnut_js/steps/pnut_js-1.0/pass1.kaem and https://github.com/cosinusoidally/live-bootstrap/blob/pnut_js/steps/pnut_js-1.0/pnut_refactor/build_alt.kaem

This avoids the need for a prebuilt pnut-exe binary.

CI also passes (I think it shaves off around 18 mins):

https://github.com/cosinusoidally/live-bootstrap/actions/runs/14814217910

If I were to tidy this up, the steps would probably be something like:

  • cut a real release of pnut_js
  • add a live-bootstrap option "USE_PNUT_JS" to conditionally skip the build of mescc and instead use pnut_js
  • add this path to CI in addition to the standard mescc path
  • update pnut_js to avoid the use of mmap as builder-hex0 does not support mmap

cosinusoidally avatar May 11 '25 12:05 cosinusoidally

> I thought I'd mention this here. I've wired up my pnut_js fork to live-bootstrap. My changes to live-bootstrap are definitely not in a mergeable state, but I thought they may be of interest:

Nice work! Is the plan to use the many JavaScript runtimes for DDC (diverse double-compiling), like pnut does with shells? Or is mujs compatible with M2-Planet?

> Note the changes I made to pnut are probably more extensive than necessary. I suspect M2-Planet/Mesoplanet could build a nearly stock upstream pnut-exe as pnut is already written in a fairly conservative dialect of C. I also didn't port the sh backend, but I might try and re-add it at some point.

I recently removed the use of structs from pnut's code, not sure how much this helps with M2-Planet's support. I'm also considering removing all malloc from the code to further reduce the level of C language support required.

> update pnut_js to avoid the use of mmap as builder-hex0 does not support mmap

I'm not sure I understand this part. pnut-exe doesn't use mmap but implements it with a syscall, which should be independent from builder-hex0? Or is builder-hex0 a minimal operating system without the mmap syscall?

Before pnut-exe started depending on mmap, all globals were allocated on the stack. That worked until we needed larger statically allocated arrays, in particular the code buffer for TCC. Fortunately, I just made the code generator one-pass, so the code buffer only needs to be a few kilobytes instead of megabytes, and the globals should fit on an 8 MB stack. The same could be done for the malloc buffer, since only path strings are allocated dynamically.

laurenthuberdeau avatar May 12 '25 10:05 laurenthuberdeau

M2-Planet supports structs. And it actually got lots of new features: see https://github.com/oriansj/stage0-posix/blob/master/CHANGELOG.org for recent changes.

stikonas avatar May 12 '25 11:05 stikonas