Async workers

Open AlliBalliBaba opened this issue 6 months ago • 15 comments

Describe your feature request

If something like https://github.com/true-async/php-async gets merged into php-src what would be the necessary steps to create an 'async' worker? In other words: a worker where the whole request runs inside of a coroutine.

worker {
  file: "public/worker.php"
  async
}

Probably worth thinking about, even though this is still a while off.

AlliBalliBaba avatar Jul 18 '25 15:07 AlliBalliBaba

Would that be particularly useful? Sure, in single threaded scenarios we could achieve parallel execution of multiple requests, but I don't think that's a particularly important aspect.

henderkes avatar Jul 18 '25 15:07 henderkes

In most use cases, running async isn't very useful for application servers IMO since IO usually isn't that slow. Async only shines if you have very slow IO (like multiple seconds), since it becomes more memory efficient at that point.

That being said, async is more restrictive in terms of what you can do with globals, so many of PHP's frameworks/libraries probably aren't really ready for async. It would still be nice to have the option though, like creating a specific 'async worker' that handles requests to a very slow endpoint, such as one hitting an LLM.

AlliBalliBaba avatar Jul 19 '25 08:07 AlliBalliBaba

I think there is a lot to do on the PHP side. For example, allowing function callbacks + state + objects to pass the thread boundary (TSRM makes these things thread-local IIRC).

withinboredom avatar Jul 19 '25 08:07 withinboredom

If you mean running the actual handler inside a coroutine, then you would need to add suspension / resumption to the handler code that does IO, similar to how it's done for streams in https://github.com/true-async/php-src/blob/acced1b4b76e49c6fed2408dcba1379a4e6ce547/main/network_async.c . So you would probably need some sort of notifier that wakes up when the request is ready. The stream implementation wraps the select / poll functions for that purpose.

bukka avatar Jul 19 '25 19:07 bukka

If you mean running the actual handler inside coroutine

This is what I was referring to. If we were to throw these things onto a goroutine, Go may or may not keep it on the same executing thread. Go expects memory to be shared at the runtime level (if I understand the internals correctly -- I once took a deep dive into it to see how SSA was implemented compared to opcache).

But anyway, we can always lock a goroutine to the current thread, but that defeats the purpose? I think we need to address TSRM sooner rather than later. I like opcache's approach of putting everything in shared memory and then just ensuring things don't overlap (similar to Go, really) except for very special cases. For example, there's no reason for CG(compiler_options) to be thread-local, only that we read from the executing request's block of memory.
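For reference, locking a goroutine to its thread is done with `runtime.LockOSThread`. The following is only a minimal sketch of that idea: `pinnedWorker` and its methods are hypothetical names made up for illustration, not FrankenPHP code.

```go
package main

import (
	"fmt"
	"runtime"
)

// pinnedWorker is a hypothetical helper: it owns one goroutine that is
// locked to a single OS thread, so thread-local state on the C side
// would always be accessed from the same thread.
type pinnedWorker struct {
	tasks chan func()
}

func newPinnedWorker() *pinnedWorker {
	w := &pinnedWorker{tasks: make(chan func())}
	go func() {
		// After this call the Go scheduler keeps this goroutine on
		// one OS thread until UnlockOSThread is called.
		runtime.LockOSThread()
		defer runtime.UnlockOSThread()
		for task := range w.tasks {
			task()
		}
	}()
	return w
}

// run executes fn on the pinned thread and waits until it has finished.
func (w *pinnedWorker) run(fn func()) {
	done := make(chan struct{})
	w.tasks <- func() {
		fn()
		close(done)
	}
	<-done
}

func main() {
	w := newPinnedWorker()
	sum := 0
	for i := 1; i <= 3; i++ {
		w.run(func() { sum += i })
	}
	fmt.Println(sum) // 6
}
```

Every task in this sketch runs one after another on the same thread, which is the cost being pointed out: a pinned goroutine cannot be moved to another thread by the Go scheduler.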

withinboredom avatar Jul 19 '25 19:07 withinboredom

We'd probably need https://github.com/php/php-src/pull/16565 as well, so we don't need an exclusive lock to read shared state.

withinboredom avatar Jul 19 '25 19:07 withinboredom

Yeah, it cannot be executed from multiple threads. This might be quite tricky to handle in Go though.

bukka avatar Jul 19 '25 19:07 bukka

Running on Go threads and the request being async are 2 different things. You can run PHP asynchronously in a node/swoole-like model with each thread having its own memory. And you can run PHP on Go threads even without it being async (with a custom TSRM implementation).

Best case scenario would be having both, and I think the first might even need to happen before the second.

AlliBalliBaba avatar Jul 19 '25 20:07 AlliBalliBaba

Yeah, I was actually thinking more about how you could wait for the request IO, but that would probably be quite pointless because it's already distributed between worker threads quite optimally. So what you would probably need is just to call the handler function in frankenphp_handle_request using async spawn, with logic similar to the extension's spawn: https://github.com/true-async/php-async/blob/5d7445e356ce456312ae84a6cd21d9b40376c49b/async.c#L63-L91

bukka avatar Jul 19 '25 22:07 bukka

Yeah you'd probably need to push requests onto the event loop like that instead of pulling them. Also, not sure what to do about request globals, they either need to be in the context or be dependency injected somehow.

AlliBalliBaba avatar Jul 21 '25 16:07 AlliBalliBaba

After looking a bit through the repo, it would be relatively straightforward for Go to be the async backend (just a bit of work). Pretty much all of these async operations could just be handled through a goroutine.

zend_async_reactor_register(
   "Go Async Backend",         // Module name
   false,                      // Allow override
   my_reactor_startup,         // Startup function
   my_reactor_shutdown,        // Shutdown function
   my_reactor_execute,         // Execute function
   my_reactor_loop_alive,      // Loop alive check
   my_new_socket_event,        // Socket event factory
   my_new_poll_event,          // Poll event factory
   my_new_timer_event,         // Timer event factory
   my_new_signal_event,        // Signal event factory
   my_new_process_event,       // Process event factory
   my_new_thread_event,        // Thread event factory
   my_new_filesystem_event,    // Filesystem event factory
   my_getnameinfo,             // DNS nameinfo
   my_getaddrinfo,             // DNS addrinfo
   my_freeaddrinfo,            // DNS cleanup
   my_new_exec_event,          // Exec event factory
   my_exec                     // Exec function
);

my_reactor_execute listens to a channel of event callbacks and event.start creates a goroutine that pushes the result back to the channel upon completion.

func go_look_up_address_start(host string) {
    go func() {
        // the lookup runs off the event loop
        ips, err := net.LookupIP(host)
        // ...modify result
        eventChan <- result
    }()
}

It might be a bit more complex if events also need to be stoppable/disposable.

AlliBalliBaba avatar Jul 27 '25 21:07 AlliBalliBaba

Just a note here that the API will almost certainly not get merged in time for PHP 8.5.

bukka avatar Jul 29 '25 11:07 bukka

Hello all. MVP. You can play around with it. Docker is coming soon. https://github.com/true-async/frankenphp/tree/true-async

EdmondDantes avatar Dec 22 '25 20:12 EdmondDantes

Looks really interesting @EdmondDantes 👍 , will have a look

AlliBalliBaba avatar Dec 25 '25 14:12 AlliBalliBaba

Looks really interesting @EdmondDantes 👍 , will have a look

This solution has several drawbacks that will need to be addressed:

  1. Under high load, an empty page sometimes appears for unclear reasons.
  2. There is a failure while waiting for a context. Possibly the client closed the socket, but this is not fully clear yet.
  3. A robust request distribution algorithm is required. I implemented Round Robin, but it is not suitable for production.
  4. There is an issue with response overhead: memory has to be copied. A 100% solution exists, but it requires more effort to implement.
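On point 3, one commonly used alternative to Round Robin is to send each request to the worker with the fewest requests in flight. The sketch below only illustrates that idea; the `worker` type and `pickLeastBusy` are hypothetical, and nothing here says which algorithm the branch will end up using.

```go
package main

import "fmt"

// worker is a hypothetical stand-in for a PHP thread running an event loop.
type worker struct {
	id       int
	inFlight int // requests currently being handled
}

// pickLeastBusy returns the worker with the fewest in-flight requests.
// Ties go to the worker that appears first in the slice.
func pickLeastBusy(workers []*worker) *worker {
	if len(workers) == 0 {
		return nil
	}
	best := workers[0]
	for _, w := range workers[1:] {
		if w.inFlight < best.inFlight {
			best = w
		}
	}
	return best
}

func main() {
	workers := []*worker{
		{id: 0, inFlight: 3},
		{id: 1, inFlight: 1},
		{id: 2, inFlight: 2},
	}
	w := pickLeastBusy(workers)
	w.inFlight++ // the caller accounts for the newly dispatched request
	fmt.Println(w.id) // 1
}
```

The counters are not synchronized here, so a real dispatcher would have to guard them with a mutex or atomics and decrement them when a request completes.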

EdmondDantes avatar Dec 25 '25 17:12 EdmondDantes

Empty pages probably appear due to race conditions between writing and flushing; both would need to happen on the same channel.

Another drawback to async writes is that the buffered chunks will just queue up in memory if the response is big and the client has a slow network. A proper solution would probably need to still block but give control back to the C-scheduler on network writes.

I saw you mention somewhere that it's also possible to tie globals to the request coroutine. Would it also be possible to tie the request ID to the coroutine? In that case we wouldn't even need a request and response object and could just use PHP's traditional globals/headers and unbuffered writing.

Do static variables also swap with the coroutine? Looking at your Laravel test implementation, it probably would still need a lot of work to be fully async-aware. The current (blocking but stateful) Octane runtime has a ton of event listeners to reset facades and singletons back to their initial state after a request. It would probably take a lot of refactoring in the Laravel framework itself to make singletons stateless, which is required for async.

AlliBalliBaba avatar Dec 27 '25 12:12 AlliBalliBaba

Another drawback to async writes is that the buffered chunks will just queue up in memory if the response is big and the client has a slow network. A proper solution would probably need to still block but give control back to the C-scheduler on network writes.

In this version, I removed chunks, and the output goes straight into memory. This is more a matter of implementation complexity than something impossible. It needs some thought.

Would it also be possible to tie the request ID to the coroutine? In that case we wouldn't even need a request and response object and could just use PHP's traditional globals/headers and unbuffered writing.

Yes, that can be done, but SuperGlobals have the drawback of lacking immutability. That’s why request and response objects are the most correct approach, since they immediately localize memory and access to it.

Do static variables also swap with the coroutine?

Yes, that can be done, but only for functions. For classes, it’s an almost meaningless feature due to performance issues.

Looking at your Laravel test implementation, it probably would still need a lot of work to be fully async-aware. The current (blocking but stateful) Octane runtime has a ton of event listeners to reset facades and singletons back to their initial state after a request. It would probably take a lot of refactoring in the Laravel framework itself to make singletons stateless, which is required for async.

There’s no doubt that Laravel will require refactoring. Although I’m not sure that all facades need to be reset. Problems should arise only when I/O is combined with shared state, and there’s much less of that code than everything else. But yes, absolutely, Laravel will never work on its own for stateful mode.

EdmondDantes avatar Dec 27 '25 19:12 EdmondDantes

https://github.com/laravel/octane/blob/ae618600cb54826a21f67d130a39446f68be1a9a/src/Listeners/CloseMonologHandlers.php

I looked at this code, and it seems there’s a different issue here: the log reset point. After each request, logs must be properly flushed/reset.

EdmondDantes avatar Dec 27 '25 19:12 EdmondDantes

Yes, that can be done, but SuperGlobals have the drawback of lacking immutability. That’s why request and response objects are the most correct approach, since they immediately localize memory and access to it.

SuperGlobals are the simplest things to use (it's a blessing and a curse for PHP) and shouldn't be disregarded. If you can swap out other values, you can swap these out. Underneath, pretty much all of PHP is HashMaps.

withinboredom avatar Dec 27 '25 19:12 withinboredom

SuperGlobals are the simplest things to use (it's a blessing and a curse for PHP) and shouldn't be disregarded. If you can swap out other values, you can swap these out. Underneath, pretty much all of PHP is HashMaps.

The benefit of using SuperGlobals is quite small. The gain from them is minimal because, if the code is properly designed, request parameters are not needed in 98% of the code. There are DTOs, contexts, and middleware, and that covers everything. The motivation for localizing SuperGlobals sounds like this: to fix poor decisions from the past, we’ll come up with a not-so-good solution in the future.

SuperGlobals would be a great solution if they had a semantic immutability guarantee. Or if they were mapped to some kind of SuperGlobal object like $_REQUEST (not an array, but an object with an interface). Then that would be the right approach.

Or if such a variable were defined using a separate keyword like superglobals type $request;, thereby enforcing a contract.

Although personally, I would prefer to introduce effects for functions and solve about 50% of similar problems that way.

EdmondDantes avatar Dec 27 '25 20:12 EdmondDantes

The fact that superglobals are not disregarded in FrankenPHP is one of the reasons I even contributed in the first place. You may feel they are a "mistake" in the past, but this is just an opinion. A somewhat popular one, to be sure. But as someone who writes many one-off scripts in my day-job, I really do appreciate them. For the people using frameworks, they basically don't exist; depending on the framework. But every framework builds from the globals. Every request object library does as well -- and there are a lot of them that implement the PSR standard, and even add in additional features.

I don't think it would be wise to throw away over a decade of distributed work in that arena. The fact that there are so many implementations means the community hasn't settled on a "right way" of determining what a request/response object even is.

withinboredom avatar Dec 28 '25 00:12 withinboredom

I agree that superglobals are non-ideal for multiple reasons, not necessarily because they are global or mutable, but also because of how $_SERVER is structured in general. A request object feels like a separate discussion though, one that requires some kind of consensus across SAPIs. Ideally the object would be good enough to use without a framework, maybe conforming to PSR-7 or Go's net/http.

If the end goal is to just transfer the data into a framework-specific request, I'd probably value BC via superglobals more.

Yes, that can be done, but only for functions. For classes, it’s an almost meaningless feature due to performance issues.

The only other way I can think of to make frameworks like Laravel compatible without some core adaptions would be to have event listeners on coroutine swap. Is something like that planned?

AlliBalliBaba avatar Dec 28 '25 16:12 AlliBalliBaba

The only other way I can think of to make frameworks like Laravel compatible without some core adaptions would be to have event listeners on coroutine swap. Is something like that planned?

This is not difficult to implement. However, this approach has several drawbacks:

  1. The code becomes hard to maintain and poorly predictable. That is, a developer cannot reliably reason about how the code behaves. This is a serious drawback.
  2. Asynchrony loses performance if parts of the code block I/O.

EdmondDantes avatar Dec 28 '25 17:12 EdmondDantes

Yeah it's probably a bad idea, just wondering how async compatibility could be made less risky. Otherwise I fear most frameworks won't ever bother unless they were explicitly designed around concurrency like hyperf.

AlliBalliBaba avatar Dec 28 '25 17:12 AlliBalliBaba

Yeah it's probably a bad idea, just wondering how async compatibility could be made less risky. Otherwise I fear most frameworks won't ever bother unless they were explicitly designed around concurrency like hyperf.

This is not a question about risks, but about the stateful mode. Whether PHP code should be stateful is apparently a choice that everyone will have to make in the future.

EdmondDantes avatar Dec 28 '25 18:12 EdmondDantes

Please no PSR-7, immutable objects are not even compatible with the HTTP/2(+) spec where everything is a stream.

dunglas avatar Dec 28 '25 20:12 dunglas

Please no PSR-7, immutable objects are not even compatible with the HTTP/2(+) spec where everything is a stream.

I think the time has come for a new PSR, without the mistakes of past decisions.

EdmondDantes avatar Dec 28 '25 20:12 EdmondDantes

@EdmondDantes Does the true-async API somehow wrap libuv's uv_async_send(), or is something like that on the roadmap? It would make it a lot simpler to inject requests into the event loop instead of having custom polling and notifiers via eventfd.

Would a response-write that suspends the coroutine until complete always need an async_response_write event?

Which PHP IO would still be blocking in the current implementation? Would it be necessary to adjust all PHP IO operations separately to make them non-blocking?

I'm also getting the following error under high concurrency if suspending during a request, could this be a bug in the scheduler or the FrankenPHP implementation?

async/scheduler.c:1389: fiber_entry: Assertion `circular_buffer_is_not_empty(resumed_coroutines) == 0

AlliBalliBaba avatar Jan 02 '26 18:01 AlliBalliBaba

Does the true-async API somehow wrap libuv's uv_async_send(), or is something like that on the roadmap? It would make it a lot simpler to inject requests into the event loop instead of having custom polling and notifiers via eventfd.

The uv_async_send function internally uses eventfd together with a non-blocking data structure for inter-thread communication. In other words, it is literally the same thing.
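The general shape of such a mechanism (a queue plus a wake-up signal that coalesces) can be sketched in Go. This is an analogy, not libuv or true-async code: `inbox` is a hypothetical type, a one-slot channel plays the role of the wake-up descriptor, and a mutex is used for brevity where the comment above describes a non-blocking structure.

```go
package main

import (
	"fmt"
	"sync"
)

// inbox is a hypothetical cross-thread mailbox: a locked queue plus a
// one-slot wake-up channel.
type inbox struct {
	mu    sync.Mutex
	items []string
	wake  chan struct{}
}

func newInbox() *inbox {
	return &inbox{wake: make(chan struct{}, 1)}
}

// push enqueues an item and signals the loop. If a wake-up is already
// pending, the signal is dropped, so many pushes coalesce into one wake-up.
func (b *inbox) push(item string) {
	b.mu.Lock()
	b.items = append(b.items, item)
	b.mu.Unlock()
	select {
	case b.wake <- struct{}{}:
	default:
	}
}

// wait blocks until a wake-up is pending, then takes everything queued.
func (b *inbox) wait() []string {
	<-b.wake
	b.mu.Lock()
	defer b.mu.Unlock()
	items := b.items
	b.items = nil
	return items
}

func main() {
	b := newInbox()
	b.push("req-1")
	b.push("req-2")
	fmt.Println(b.wait()) // [req-1 req-2]
}
```

The loop wakes up once and drains the whole queue, so the number of wake-ups does not have to match the number of injected requests.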

Which PHP IO would still be blocking in the current implementation? Would it be necessary to adjust all PHP IO operations separately to make them non-blocking?

fwrite / flock. There are definitely more. Pipes are not supported. There are also some issues on Windows with the accept function. There is no support for SQLite or PostgreSQL.

I'm also getting the following error under high concurrency if suspending during a request, could this be a bug in the scheduler or the FrankenPHP implementation?

It could be a bug. I need to check under what conditions it occurs. Do you have a way to reproduce it?

EdmondDantes avatar Jan 02 '26 18:01 EdmondDantes

Of course, I would prefer to make changes to ZendMM so that input and output buffers can be efficiently passed to Caddy without copying memory. It would also be good to have a built-in queue mechanism for freeing memory directly in the core.

EdmondDantes avatar Jan 02 '26 20:01 EdmondDantes