Fix #46
This is the bare minimum required to make fibers work within the go runtime; The go runtime uses split stacks.
This is not so simple, I don't know for sure if there is a solution to this problem. The stack size for goroutines is determined by the go runtime (aot, obviously), and they use split stacks.
Compiling with split stack flags seemed to superficially solve the problem, but it's not reliable, I'm still getting "g0 morestack" aborts; The go runtime is creating threads that we don't have control over, probably with too small a stack size, while we could edit the thread pool library to use larger stacks we cannot exert control over go-runtime-created threads ...
I'm on sick leave, and this is mostly about reading/research ... which I'll try to continue when able ...
If anyone has any better understanding of what is going on, and if there is a solution too it, I'd be interested to hear it ...
According to https://groups.google.com/g/golang-nuts/c/VUKub2nIt3M the stack size of the C threads can be configured using ulimit.
I also wonder if using the master branch of PHP (which has the SA_ONSTACK signal handler flag set) could help, but to be honest I've not a good understanding of how Fibers work internally and I may be totally wrong.
I added it to the CFLAGS when compiling PHP:
Index: Dockerfile
<+>UTF-8
===================================================================
diff --git a/Dockerfile b/Dockerfile
--- a/Dockerfile (revision 0a2e2de8428ab440b536d652d834e82e602abb7c)
+++ b/Dockerfile (date 1668063664875)
@@ -18,7 +18,7 @@
; \
\
export \
- CFLAGS="$PHP_CFLAGS" \
+ CFLAGS="$PHP_CFLAGS -fsplit-stack" \
CPPFLAGS="$PHP_CPPFLAGS" \
LDFLAGS="$PHP_LDFLAGS" \
; \
And this seems to work "in real life" (as in not getting any errors when testing on real files with a real server) but the test segfaults 100% of the time.
Ah, I had to load test it to get it to crash.
Edit:
Stack size appears to be related to RPS. I set ulimit -s 16384 (the maximum for my system) and I'm able to sustain well over 1000 requests-per-second and only hitting this very rarely.
It appears PHP does some pretty low-level modifications to the stack when using fibers. I suppose it is possible there is some sort of bug there.
This is all really fascinating. CGO apparently makes a copy of the stack before calling cgo as do Fibers using assembly upstreamed at Boost context. I suspect this looks something like this:
- Go gets an HTTP request.
- Go copies the stack, and calls C code.
- PHP creates a Fiber, and copies the stack.
- Control returns to Go, the original stack is copied in, and it handles another HTTP request (now more copies of the stack)
I'm nowhere near 100% certain this is what is going on, but maybe this is why increasing the stack size probably helps.
Upon further testing (at least on x64 machine), with the Dockerfile + this change running with --ulimit stack=33554432 is required to handle fibers reliably. I suspect this is due to some assumptions built into PHP/Boost. Each thread will get 1/4 of of the stack limit, which is 8mb (the default size on Linux) when using 32mb limit as mentioned above. Thus each thread will be dealing with the stack size it was originally written for, as I suspect PHP expects to be running in control.
RESULT
-------------------------------------
Success Count: 5000 (100%)
Failed Count: 0 (0%)
Durations (Avg):
DNS :0.0469s
Connection :0.0981s
TLS :0.1101s
Request Write :0.0200s
Server Processing :1.5924s
Response Read :0.0024s
Total :1.8700s
Status Code (Message) :Count
200 (OK) :5000
With this, Frankenphp is able to quite handily handle quite a bit of connections and fibers without any crashes. Using stackusage, I was able to determine the average usage to be about 8mb on average, with a maximum of 21mb. Obviously this will be impacted by the number of Fibers created.
I'm going to do a couple of tests with regular PHP and smaller stack size limits, since this may be a bug with PHP.
Result: non-issue.
Actually, this change causes a segfault when running the php cli. If you set the stack size limit to 32 mb, this change isn't required at all.
@withinboredom maybe could we at least patch the Docker image to have Fibers support out of the box?
It's a specific runtime configuration, so probably just needs documentation:
Docker Run: --ulimit stack=33554432
Docker Compose: ulimits: {"stack": 33554432}
Kubernetes: (not supported: see https://github.com/kubernetes/kubernetes/issues/3595)
Alternatively, if there is some way to get PHP + golang stacks to play nice together, that would be a better solution.