Optimize `Deferred` execution
Resolves https://github.com/webonyx/graphql-php/issues/1803
Follow up to https://github.com/webonyx/graphql-php/pull/1790
Summary
This PR optimizes memory usage for deferred execution while preserving the DataLoader pattern that was broken in #1790.
Problem
PR #1790 attempted to optimize memory but broke DataLoader batching - batch loads executed incrementally before all IDs were collected, causing null values with 600+ items.
Solution
Reduces memory usage by replacing closure-based queue storage with direct object/array storage:
-
Store executor on promise instance -
$executorproperty instead of closure context -
Queue promise objects directly -
enqueue($this)instead ofenqueue(function() {...}) -
Use arrays for waiting handlers -
[$promise, $onFulfilled, $onRejected, $state, $result]instead of closures -
Explicit cleanup -
$task->executor = nullandunset($task)enable GC during long queues
Tests Added
-
testDataLoaderPatternWithManyDeferredObjects()- 600 items, verifies single batch load -
testNestedDataLoaderPattern()- 750 items, reproduces #1803 scenario
Verification
- [x] Tests pass with optimization
- [x] Tests fail when #1790 changes are cherry-picked (confirms they catch the regression)
- [x] DataLoader pattern preserved - all IDs collected before first batch executes
@shadowhand @zimzat @joshbmarshall Can you give this a review?
I loaded it up on the system that was having problems with v15.27.0
It still has issues with v15.27.0 tag, but this is working for me at 1954f81fe437d910c0283307d8f17d125b4a3a11
For my query, the memory usage is identical for this pull request as for master branch. But I think it's because I have nested resolvers:
event date > event > event dates
I think it's waiting for all the event dates to be queued before resolving, which is how I'd expect it to be. I don't think that can be managed!
@spawnia This looks promising; I don't see any issues with load order or breaking batches.
Below are some extra thoughts based on my own trial-error:
The different signatures in the queue feels concerning. I had started experimenting with the idea of switching out the anonymous functions with invokable classes. Your implementation of $this->executor wins out over the cost of instantiating a new class, while a new class for the waiting queue item as a replacement for the array is a little more efficient there, so the two patterns combined reduces peak usage slightly further.
Using SplFixedArray for $this->waiting[] = SplFixedArray would also get half way to the same peak memory reduction.
https://gist.github.com/zimzat/310e9121225ef18640fad2f4ab675ad4
@zimzat I did follow up on your suggestion to use SplFixedArray, seems like a useful optimization.
Can you elaborate on what specifically you find concerning? The methods in SyncPromise are not marked with @api, so they are not supposed to be part of the public API.
Can you elaborate on what specifically you find concerning?
The implicit array shape versus an explicit type check of a named and structured class. It's not a big deal, especially here and internally to the class (like you've said, this isn't technically a public API). The if (x) elseif (y) elseif (z) felt verbose compared to the simplicity of the queue always being callable(). ¯\_(ツ)_/¯
@zimzat I did another round which should hopefully alleviate your concern. Benchmarks look good too, what do you think?
@spawnia That looks nice; a simple signature, self-contained handling, and still a massive drop in memory usage in the simple many deferred (no then) scenario. Thanks!
Memory Benchmark Results
Ran memory benchmarks comparing master against the optimized implementation using a benchmark script adapted from @zimzat's test.
Results
| Metric | Master | Optimized | Change |
|---|---|---|---|
| Single Deferred | 888 bytes | 144 bytes | -84% |
Nested Deferred (with then) |
1,272 bytes | 528 bytes | -58% |
| 1,000 Deferreds | 896 KB | 144 KB | -84% |
| 5,000 nested Deferreds | 6.5 MB | 2.8 MB | -57% |
| Peak memory | 13.9 MB | 9.3 MB | -33% |
| Final memory | 517 KB | 263 KB | -49% |
Key Optimization: Garbage Collection
During benchmarking, we found that RootSyncPromise::runQueuedTask() must null out $executor after invocation to allow immediate garbage collection of the closure and its captured variables:
public function runQueuedTask(): void
{
// Clear reference to allow immediate garbage collection
$executor = $this->executor;
$this->executor = null;
// ...
}
Without this, closures stay alive until the promise object is freed, causing significant memory accumulation when many deferred operations are queued. This pattern is now documented in the code with an explanatory comment.
Benchmark Results Summary
Ran benchmarks comparing this branch against master on PHP 8.3.6.
Key Changes
- Promise state changed from strings (
"pending","fulfilled","rejected") to integers (0,1,2) - Extracted
SyncPromiseQueueto a dedicated class - Moved
$executorhandling fromSyncPromiseconstructor toDeferred - Added comprehensive
DeferredBenchfor tracking deferred execution performance
Memory Improvements
| Benchmark | Master | Branch | Change |
|---|---|---|---|
| benchManyNestedDeferreds | 19.39 MB | 17.67 MB | -8.9% |
| bench1000Chains | 10.08 MB | 9.72 MB | -3.6% |
| benchManyDeferreds | 6.45 MB | 6.09 MB | -5.6% |
Execution Time
| Benchmark | Master | Branch | Change |
|---|---|---|---|
| benchManyNestedDeferreds | 17,801 μs | 16,651 μs | -6.5% |
| bench1000Chains | 4,042 μs | 3,919 μs | -3.0% |
| benchManyDeferreds | 673 μs | 922 μs | +37% |
| benchChain100 | 97 μs | 145 μs | +50% |
| benchChain5 | 6.0 μs | 10.1 μs | +68% |
| benchSingleDeferred | 0.9 μs | 1.2 μs | +38% |
Summary
The branch achieves significant memory savings (up to 9%) for heavy deferred workloads while also improving execution time for the most demanding benchmarks (benchManyNestedDeferreds, bench1000Chains).
The microbenchmarks with simple promise chains show some CPU overhead, likely due to structural changes that benefit memory at a small cost to simple cases. For real-world GraphQL queries with many nested deferreds, the branch performs better overall.
Next steps for me:
- try this in my real-world projects
- try this in nuwave/lighthouse
Lookin' good!
I tested this in a private project with a couple of thousand GraphQL tests, no issues. Can't say anything about real life impact though.
Lighthouse and a large project of mine work.
Released with https://github.com/webonyx/graphql-php/releases/tag/v15.29.0.
very early at the moment, but a number of my sites have gone down because of this update. I've reverted back to v15.28.0 for now which gets them going. I'll have to get you details but this is causing issues
My sentry is reporting:
Call to undefined method GraphQL\Deferred::runQueue()
vendor/overblog/dataloader-php/lib/promise-adapter/src/Adapter/WebonyxGraphQLSyncPromiseAdapter.php in Overblog\PromiseAdapter\Adapter\WebonyxGraphQLSyncPromiseAdapter::await at line 115
public function await($promise = null, $unwrap = false)
{
if (null === $promise) {
Deferred::runQueue();
SyncPromise::runQueue();
$this->cancellers = [];
return null;
}
$promiseAdapter = $this->getWebonyxPromiseAdapter();