Add support for GraalPy
GraalPy is an alternative Python implementation, as per https://github.com/python/pythondotorg/issues/2797, and it is supported by https://github.com/actions/setup-python
@timfel What do you think?
It mostly worked:
5 benchmarks failed:
- dask (Benchmark died)
- gc_collect (Benchmark died)
- networkx (Benchmark died)
- networkx_connected_components (Benchmark died)
- networkx_k_core (Benchmark died)
It was extremely slow:
- https://github.com/maurycy/pyperformance/actions/runs/18823779289/job/53703296682
I cancelled it after an hour.
Is this premature? Does it require a much more powerful runner?
@maurycy we run the pyperformance benchmarks internally, but they require a powerful runner and a lot more warmup per benchmark than PyPy or CPython. One hour is not nearly enough: we give it 18 cores and 64G of RAM, and it runs in about 2.5 hours.
As for the benchmarks, the networkx benchmarks should work; that's a bug on our side. But dask and gc_collect, as they are, won't. The GC benchmarks in general just don't make much sense on GraalPy: we have a completely different GC, so neither the benchmark nor any of its assertions make sense for us. The dask one doesn't work because we don't support the dis module, and we have no current plans to. So those would have to be excluded for GraalPy.
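The exclusion described above could be sketched roughly as follows. This is not pyperformance code: `GRAALPY_EXCLUDED` and `benchmark_selection` are hypothetical names, and the sketch assumes that GraalPy reports `sys.implementation.name` as `"graalpy"` and that the result is passed to a `--benchmarks`-style option where a leading `-` excludes a benchmark.

```python
import sys

# Benchmarks named in this thread as not applicable to GraalPy.
# (Hypothetical constant; the networkx failures are a GraalPy bug, so they stay in.)
GRAALPY_EXCLUDED = ("dask", "gc_collect")


def benchmark_selection(implementation_name: str) -> str:
    """Build a comma-separated exclusion list for the given implementation.

    Returns an empty string when nothing needs to be excluded.
    """
    # Assumption: GraalPy identifies itself as "graalpy".
    if implementation_name != "graalpy":
        return ""
    return ",".join(f"-{name}" for name in GRAALPY_EXCLUDED)


if __name__ == "__main__":
    # Prints "-dask,-gc_collect" on GraalPy (under the assumption above),
    # and an empty line on other implementations.
    print(benchmark_selection(sys.implementation.name))
```

Keeping the list in one place would make it easy to drop an entry if GraalPy later gains support for one of these benchmarks.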
This also increases the CI runs from around 13 minutes to over 2.5 hours.
> @maurycy we run the pyperformance benchmarks internally, but they require a powerful runner and a lot more warmup per benchmark than PyPy or CPython. One hour is not nearly enough: we give it 18 cores and 64G of RAM, and it runs in about 2.5 hours.
That was just tests, though:
- https://github.com/python/pyperformance/blob/main/.github/workflows/main.yml#L79
Other tests finished in ~10 minutes
FYI: The whole benchmark suite runs in ~1h on CPython on an i9-12900K with 128G DDR4.
@maurycy if it's anything like our internal setup, running the pip subprocesses to install dependencies is easily the worst part of the runtime. These "creating venv for benchmark" steps are easily 20x slower than on CPython, and when 2s turns into 40s across ~90 benchmarks, the time adds up quickly.
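To put rough numbers on that estimate, here is a back-of-the-envelope sketch. It uses only the figures quoted above (2s versus 40s per venv setup, ~90 benchmarks); `venv_setup_seconds` is a hypothetical helper, and real per-benchmark times will vary.

```python
def venv_setup_seconds(per_benchmark_seconds: float, benchmark_count: int) -> float:
    """Total time spent creating venvs, assuming a uniform cost per benchmark."""
    return per_benchmark_seconds * benchmark_count


BENCHMARKS = 90  # "~90 benchmarks" from the comment above

cpython_total = venv_setup_seconds(2, BENCHMARKS)   # 2s per venv on CPython
graalpy_total = venv_setup_seconds(40, BENCHMARKS)  # 40s per venv on GraalPy

print(f"CPython venv setup: {cpython_total / 60:.0f} min")  # 3 min
print(f"GraalPy venv setup: {graalpy_total / 60:.0f} min")  # 60 min
```

Under these assumptions, venv creation alone would account for about an hour on GraalPy, before any benchmark has run.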