
sphinxdocs: implement a persistent worker mode

Open rickeylev opened this issue 8 months ago • 1 comment

Right now, sphinxdocs runs Sphinx as a fresh process for every build. For iterative development, this adds unnecessary overhead to Sphinx's setup and execution. It also creates barriers to using Sphinx's incremental rebuild features.

By implementing the sphinx action as a persistent worker, it should be possible to reduce the startup overhead and better detect what has changed.

I see two basic implementation options:

ideal: The worker is the Sphinx process. Build requests then re-use the existing Sphinx process, which eliminates any startup overhead. This requires:

  1. Implementing a wrapper that uses Sphinx as a library, essentially running a `while True: wait_for_request(); build()` loop.
  2. That Sphinx be OK with a single process handling multiple builds, i.e. that the global process state (caches etc.) is safe to reuse between builds.
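The "ideal" option above could look roughly like the sketch below. It assumes the JSON variant of Bazel's persistent worker protocol (one `WorkRequest` per line on stdin, one `WorkResponse` per line on stdout) and a hypothetical argument layout of `srcdir confdir outdir doctreedir builder`; the conservative variant shown re-creates the `Sphinx` application per request, while the truly ideal variant would keep it (and its caches) alive between builds.

```python
# Sketch of a persistent worker that serves Sphinx builds via the JSON
# variant of Bazel's worker protocol. Argument layout is hypothetical.
import json
import sys


def handle_request(request: dict) -> dict:
    """Run one Sphinx build for a WorkRequest; return a WorkResponse dict."""
    request_id = request.get("requestId", 0)
    try:
        # Sphinx's library entry point; imported lazily so protocol errors
        # are reported through the WorkResponse rather than crashing us.
        from sphinx.application import Sphinx

        srcdir, confdir, outdir, doctreedir, builder = request["arguments"][:5]
        # status=None/warning=None keeps Sphinx's chatter off our stdout,
        # which must carry only protocol responses.
        app = Sphinx(srcdir, confdir, outdir, doctreedir, builder,
                     status=None, warning=None)
        app.build()
        return {"exitCode": app.statuscode, "output": "", "requestId": request_id}
    except Exception as e:
        return {"exitCode": 1, "output": str(e), "requestId": request_id}


def worker_loop(stdin=sys.stdin, stdout=sys.stdout):
    """while True: wait_for_request(); build() -- until stdin closes."""
    for line in stdin:
        response = handle_request(json.loads(line))
        stdout.write(json.dumps(response) + "\n")
        stdout.flush()


if __name__ == "__main__":
    worker_loop()
```

Whether this is safe hinges on point 2 above: the sketch keeps no Sphinx state between requests, so it only demonstrates the protocol plumbing, not the cache reuse.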

backup: The worker calls `sphinx-build` as a subprocess. The persistent worker API reports which files changed, so the worker can massage filesystem state so that Sphinx's incremental (timestamp-based) rebuilding works better.
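A minimal sketch of the "backup" option, assuming the worker receives the `inputs` list (path plus digest) that Bazel's worker protocol includes in each `WorkRequest`: track digests across requests, bump the mtime of only the files whose content actually changed, then shell out to `sphinx-build`, which reuses its pickled environment for incremental rebuilds. The class and method names are hypothetical.

```python
# Sketch: use WorkRequest input digests to drive Sphinx's timestamp-based
# incremental rebuilds from a long-lived worker process.
import os
import subprocess
import time


class SphinxBuildWorker:
    def __init__(self):
        # path -> digest from the previous WorkRequest; empty on first build.
        self._last_digests = {}

    def changed_paths(self, inputs):
        """Paths whose digest differs from the previous request's digest."""
        changed = [i["path"] for i in inputs
                   if self._last_digests.get(i["path"]) != i["digest"]]
        self._last_digests = {i["path"]: i["digest"] for i in inputs}
        return changed

    def build(self, inputs, srcdir, outdir):
        now = time.time()
        for path in self.changed_paths(inputs):
            # Make only genuinely changed files look new to Sphinx.
            os.utime(path, (now, now))
        return subprocess.run(
            ["sphinx-build", "-b", "html", srcdir, outdir]).returncode
```

On the first request everything is treated as changed, which matches a cold build; subsequent requests touch only the files whose digests moved.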

The env-related event API might also be relevant to this: https://www.sphinx-doc.org/en/master/extdev/event_callbacks.html

Notably, the events that intercept how source files are read, e.g. intercepting reads to keep an in-memory cache keyed by content hash.

rickeylev avatar May 14 '25 16:05 rickeylev

A detail I ran into: when Bazel re-invokes the action, the output directory is cleared. This means Sphinx has to re-write all the outputs even if the sources didn't change. This can be addressed by having Sphinx write its output to a location that isn't declared as an output, and having the worker copy all those files to the Bazel output location.
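The copy step described above could be as simple as the sketch below: Sphinx builds into a persistent scratch directory that Bazel never clears, and after each build the worker mirrors it into the freshly cleared, declared output directory. Function and parameter names are illustrative.

```python
# Sketch: mirror Sphinx's persistent scratch output into Bazel's
# declared (and freshly cleared) output directory after each build.
import shutil
from pathlib import Path


def publish_outputs(scratch_dir, bazel_out_dir):
    """Copy every built file from the scratch dir to the declared outputs."""
    scratch = Path(scratch_dir)
    for src in scratch.rglob("*"):
        if src.is_file():
            dest = Path(bazel_out_dir) / src.relative_to(scratch)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)  # copy2 preserves timestamps
```

Since the scratch directory survives between invocations, Sphinx's own incremental logic still sees stable timestamps there, and only the cheap copy is repeated.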

rickeylev avatar Jun 02 '25 03:06 rickeylev

A persistent worker mode has been implemented.

rickeylev avatar Oct 12 '25 03:10 rickeylev