node icon indicating copy to clipboard operation
node copied to clipboard

AsyncLocalStorage loses the store if used together with createHook if there is a async function in between

Open mcollina opened this issue 1 year ago • 25 comments

Consider the following code:

import { AsyncLocalStorage, createHook } from 'node:async_hooks';
import { notEqual } from 'assert/strict';

const hook = createHook({
  init () {},
})

// Commenting out this line will make the bug go away
hook.enable();

const asyncLocalStorage = new AsyncLocalStorage();

async function main () {
  // Commenting out this line will make the bug go away
  await 1
  asyncLocalStorage.enterWith({ foo: 'bar' });
}

await main()

console.log(executionAsyncResource());

// Should not be undefined
notEqual(asyncLocalStorage.getStore(), undefined);

Note that:

  1. disabling the hook solve the problem
  2. removing the await 1 promise solves the problem

Debugging the executionAsyncResource() in this case:

Promise {
  <pending>,
  [Symbol(async_id_symbol)]: 12,
  [Symbol(trigger_async_id_symbol)]: 9
}

while in without the createHook().enable() call:

{ [Symbol(kResourceStore)]: { foo: 'bar' } }

mcollina avatar May 17 '24 17:05 mcollina

cc @bengl @Qard

mcollina avatar May 17 '24 17:05 mcollina

Additional notes:

I attempted with the following changes, and it still happens:

  • Replace the async_hook with a promise hook. This narrows it down to promise hooks, I think.
  • Moved the hook after the storage construction. The order here doesn't seem to matter, regardless of what type of hook is used, except that it must be before the first await in the example below.
  • Replace the main() function, and awaiting it, with a function that reduces the number of awaits used, in case that's an issue. I could not reproduce without awaiting here. Simply putting a then() containing the checks at the end did not suffice to reproduce.
import { AsyncLocalStorage, executionAsyncResource } from 'node:async_hooks';
import { promiseHooks } from 'node:v8';
import { notEqual } from 'assert/strict';

const asyncLocalStorage = new AsyncLocalStorage();

// Commenting out this line will make the bug go away
promiseHooks.onInit(() => {});

await Promise.resolve().then(() => {
  asyncLocalStorage.enterWith({ foo: 'bar' });
})

console.log(executionAsyncResource());

// Should not be undefined
notEqual(asyncLocalStorage.getStore(), undefined);

bengl avatar May 17 '24 18:05 bengl

This is known and (unfortunately) expected. It's also presently encoded into the AsyncContext spec proposal, which I'm trying to explain to people why it should not work that way. Specifically the issue is that PromiseHook, ContinuationPreservedEmbedderData, and the design of the future AsyncContext API all follow the path of continuation registration rather than promise resolution, so effectively context routes around awaits instead of through them.

The reason it works before the first await is that the code up until that first await runs synchronously and so it changes the current context before the await happens. That await then stores that context and restores it when it resumes. However if you put it after the first await then it will have already captured the context to restore coming out of the await before that happens. It only flows within that async function, which effectively makes it the same as a local variable. 😐

I'll continue to work on explaining to TC39 and V8 folks why this doesn't work.

Qard avatar May 18 '24 08:05 Qard

Oh, and the reason why the hooks.enable() in the first example changes the behaviour is because PromiseHook would otherwise not be enabled until that enterWith() is reached and therefore there would be no captured context from the await to restore when it resumes. If you put the hooks.enable() right before that enterWith() it would work because it would have already passed the point when it could have captured a context value for the await.

Qard avatar May 18 '24 09:05 Qard

Can you expand on why this works:

import { AsyncLocalStorage, createHook, executionAsyncResource } from 'node:async_hooks';
import { notEqual } from 'assert/strict';

const hook = createHook({
  init () {},
})

hook.enable();

const asyncLocalStorage = new AsyncLocalStorage();

await 1

asyncLocalStorage.enterWith({ foo: 'bar' });

console.log(executionAsyncResource());

// Should not be undefined
notEqual(asyncLocalStorage.getStore(), undefined);

While this doesn't?

import { AsyncLocalStorage, createHook, executionAsyncResource } from 'node:async_hooks';
import { notEqual } from 'assert/strict';

const hook = createHook({
  init () {},
})

hook.enable();

const asyncLocalStorage = new AsyncLocalStorage();

await Promise.resolve(1).then(() => {
  asyncLocalStorage.enterWith({ foo: 'bar' })
});

console.log(executionAsyncResource());

// Should not be undefined
notEqual(asyncLocalStorage.getStore(), undefined);

This is really unintuitive.

mcollina avatar May 18 '24 13:05 mcollina

The enterWith(...) call will set the context until something would change it. In your first example the gap between where you set it and where you try to read it is sync so nothing would have changed it. Whereas in your second example it is set within an awaited promise continuation so the context value captured when the await started would replace that value when the await resumes.

Both async functions and promise continuations occur within this sort of dead space where context will only flow internally. In the case of async/await, the context at the point the await yields is restored when the await resumes. In the case of promises the context is captured when a continuation is added and restored when it runs, but it will do this for each point in the chain, so if you chain multiple continuations in place they will all link back to the same context rather than nesting. You would have to nest the continuation attachment for it to nest the contexts.

The more user-intuitive flow one would expect it to have is that promise continuations and await resumes are treated in the same way as callbacks. The point of confusion is that callbacks have a single point that functions both as callback registration which naturally leads into resolution, but the registration is the final point in the flow before it goes to the native side. Whereas promises split the two concerns and people are conflating the current design of AsyncLocalStorage binding callbacks as meaning that the appropriate place to bind such context is at registration point. Which is actually not the case, just the resolve happens in unobservable code and, due to implementation details, async_hooks and therefore AsyncLocalStorage needs that context binding to occur in observable code.

We don't flow context into internals, so there are a bunch of points where the value stored in context would be technically incorrect or empty, but it doesn't matter because it's impossible to acquire the context value at those points as it's in native code at the time. My rewrite to use AsyncContextFrame was an attempt to begin making that more correct, but that got blocked on V8 internal politics and I haven't got back to it yet...hopefully I will soon.

Qard avatar May 19 '24 03:05 Qard

Isn't this an issue that constructing an asyncLocalStorage doesn't enable promise hooks for performance optimization, until setting a value in an asyncLocalStorage? Enabling promise hooks before and after a promise creation/resolution would change how AsyncLocalStorage propagates a value is a problem on either semantics.

legendecas avatar May 20 '24 10:05 legendecas

Isn't this an issue that constructing an asyncLocalStorage doesn't enable promise hooks for performance optimization, until setting a value in an asyncLocalStorage?

Partially, but I think it's a good tradeoff.

We might want to document this "odd" behavior.

mcollina avatar May 20 '24 11:05 mcollina

I'm confused about @Qard 's above comment that this issue exists in AsyncContext. The performance optimization that @legendecas mentions would be disabled for any native AsyncContext implementation. (Also enterWith doesn't exist in AsyncContext.) Could you explain the relationship?

littledan avatar May 20 '24 13:05 littledan

The issue is that the user is expecting the context to flow out of an await.

async function main () {
  await 1
  asyncLocalStorage.enterWith({ foo: 'bar' });
}

await main()

// Should not be undefined
notEqual(asyncLocalStorage.getStore(), undefined);

While this does use the current scope modifying enterWith(...) rather than run(...), it could equally be expressed like this and have the same problem:

await asyncLocalStorage.run({ foo: 'bar' }, async () => {
  await 1
})

// Should not be undefined
notEqual(asyncLocalStorage.getStore(), undefined);

The expected flow is that the context flows into asynchronously following code, which it does not do whereas callback code does. A user could reasonably expect rewriting of callback-based code into promise-based code to continue to flow in roughly the same way, but this is not the case.

Qard avatar May 20 '24 13:05 Qard

@Qard Do you agree that the bug as originally filed can be fixed without that other change?

littledan avatar May 20 '24 14:05 littledan

The issue is that the user is expecting the context to flow out of an await.

I think the issue is that the user expects these two following code to be identical:

async function main () {
  await 1
  asyncLocalStorage.enterWith({ foo: 'bar' });
}

await main()

// Should not be undefined

notEqual(asyncLocalStorage.getStore(), undefined);
await 1
asyncLocalStorage.enterWith({ foo: 'bar' });
await 1

// Should not be undefined
notEqual(asyncLocalStorage.getStore(), undefined);

Most developers would think those are identical. Only experts would notice the difference (I was fooled by it on first sight, even after implmeneting a chunk of it).

mcollina avatar May 20 '24 15:05 mcollina

@Qard Do you agree that the bug as originally filed can be fixed without that other change?

How? What other change? I'm not clear what you are referring to.

Qard avatar May 20 '24 15:05 Qard

The other change being turning off the optimization that legendecas identified in https://github.com/nodejs/node/issues/53037#issuecomment-2120190830 . Such an optimization will not apply to AsyncContext.

littledan avatar May 20 '24 16:05 littledan

That doesn't fix the flow problem I described. It would make the behaviour consistent, but consistently not what the user is expecting.

Qard avatar May 21 '24 04:05 Qard

I think the issue is that the user expects these two following code to be identical:

Most developers would think those are identical. Only experts would notice the difference (I was fooled by it on first sight, even after implmeneting a chunk of it).

enterWith is the factor that make the pattern unintuitive. If the AsyncLocalStorage was accessed with run instead, the two examples would be identical:

async function main () {
  await 1
  asyncLocalStorage.run({ foo: 'bar' }, () => {});
}

await main()

equal(asyncLocalStorage.getStore(), undefined);
await 1
asyncLocalStorage.run({ foo: 'bar' }, () => {});
await 1

equal(asyncLocalStorage.getStore(), undefined);

legendecas avatar May 21 '24 09:05 legendecas

Indeed, but that destroys the goal. asyncLocalStorage.run() leads to worse DX, which is the whole point of this exercise.

import { AsyncLocalStorage, createHook, executionAsyncResource } from 'node:async_hooks';
import { notEqual } from 'assert/strict';

const asyncLocalStorage = new AsyncLocalStorage();

await asyncLocalStorage.run({ foo: 'bar' }, async () => {
  await 1
  console.log(executionAsyncResource());
  // Should not be undefined
  notEqual(asyncLocalStorage.getStore(), undefined);
})

vs

import { AsyncLocalStorage, createHook, executionAsyncResource } from 'node:async_hooks';
import { notEqual } from 'assert/strict';

const asyncLocalStorage = new AsyncLocalStorage();

asyncLocalStorage.enterWith({ foo: 'bar' })

await 1

console.log(executionAsyncResource());

// Should not be undefined
notEqual(asyncLocalStorage.getStore(), undefined);

The latter is 100% more idiomatic javascript.

mcollina avatar May 21 '24 10:05 mcollina

worse DX, which is the whole point of this exercise.

Which exercise? If you were talking about AsyncContext: I don’t think the point of AsyncContext is to bring DX improvements vs AsyncLocalStorage. It is more about being a subset which can be implemented across environments, optimized well, etc.

littledan avatar May 21 '24 20:05 littledan

Indeed, but that destroys the goal. asyncLocalStorage.run() leads to worse DX, which is the whole point of this exercise. … The latter is 100% more idiomatic javascript.

Note that we've discussed how to bring DX improvements to AsyncContext proposal in a way that would be consistent to developers:

const ctx = new AsyncContext.Variable();

async function main() {
  using _ = ctx.scope('foo');
  equal(ctx.get(), 'foo');
  await 1
  equal(ctx.get(), 'foo');
}

const m = main();
// Doesn't matter that main started executing, the suspend point restores previous context
equal(ctx.get(), undefined);

await m;
// Doesn't matter that main finished executing, the await's resume restores previous context
equal(ctx.get(), undefined);

This won't solve @Qard's goal, but it does make OP's problematic code consistent. It shouldn't matter how you structure the code, the variable stores the same undefined value outside the using scope.

jridgewell avatar May 21 '24 20:05 jridgewell

That flow makes it essentially only a local variable though. Users are expecting it to flow into continuations as is shown by the assumption at the start that the context set within main is available in the top-level scope after main completes. This is the point I'm trying to make--code after await points are functionally the same as nested callbacks and those do receive the context, so the behaviour of not flowing through those points is inconsistent and confusing to users.

Qard avatar May 22 '24 02:05 Qard

I think this issue has multiple possible goals at the same time:

  • Inconsistent behavior for seemingly equivalent code structure
  • Improved DX without nesting in callbacks
  • Flow-through contexts

I don't know what priority @mcollina sets for these, or even if he intended to hit on all these points in the OP. I'm only saying that we can solve the first 2 without the 3rd. AC already fixes the inconsistent behavior, and we've discussed how to improve DX. If the third is lowest priority, or even unintentional, then it doesn't necessitate a change in design for AC or ALS.

jridgewell avatar May 22 '24 16:05 jridgewell

Improved DX without nesting in callbacks

How can you achieve this without flow-through contexts?

mcollina avatar May 22 '24 16:05 mcollina

See the async function body in https://github.com/nodejs/node/issues/53037#issuecomment-2123389594. It is possible to mutate the current context in scope without it leaking out of the scope. It's not defined in the spec yet, needs a extra syntax to be usable, but it's not unsolvable.

If that's the only DX improvement you're looking for, then it's a separate issue from flow-through. If the DX you expect is to leak the value out of the scope (which isn't flow-through semantics either), then that's not possible. If the DX you want is explicitly flow-through (context active during the promise's resolution is used after the await), then that's not the same as scoped mutations.

jridgewell avatar May 22 '24 21:05 jridgewell

If the DX you want is explicitly flow-through (context active during the promise's resolution is used after the await)

Yes. I think this is what the community needs. It appears that quite a few of you are being dismissive of this feedback. Users would expect this to happen, and be surprised when it doesn't. Framework authors would need to create convoluted APIs to leverage this to work around this limitation.

Why do you think it cannot be done? What are the tradeoffs implementation wise? No one has provided any explanation whatsoever on why, the only comments I have received are "because we say so". As one of the people that made this possible to begin with (pulling an all nighter in Munich eons ago), I think I have enough context to understand the tradeoffs at play.

mcollina avatar May 23 '24 06:05 mcollina

Users would expect this to happen, and be surprised when it doesn't. Framework authors would need to create convoluted APIs to leverage this to work around this limitation.

Are there links to issues from users showing this confusion? This issue is the only one I've been linked to.

It appears that quite a few of you are being dismissive of this feedback.

Speaking for myself, I think the current semantics that AC, ALS, Zone.js, and CPED expose is the correct one. It's exactly how Vercel implemented it's request store before before I inherited the project (which is the reason I started working on the AC proposal to begin with). If so many implementations land on the same thing, I think that means the semantics must be good.

Why do you think it cannot be done?

I'm assuming this is in response to me saying "If the DX you expect is to leak the value out of the scope (which isn't flow-through semantics either), then that's not possible". My statement is about the leak caused by enterWith. enterWith is not necessary for flow-through semantics. Implementing flow-through semantics in AC is easy, and could be done in a way that does not cause a permanent memory leak of the last stored item.

So, there are 2 orthogonal DX discussions going on here:

  1. Do users expect in-scope mutations?
  2. Do users expect flow-around or flow-through semantics?

We can implement scoped mutations with either flow-around or flow-through semantics. We can chose flow-around or flow-through semantics without implementing scoped mutations.

jridgewell avatar May 23 '24 19:05 jridgewell

I think the current semantics that AC, ALS, Zone.js, and CPED expose is the correct one. It's exactly how Vercel implemented it's request store.

That is targeting React though, which inherently has a unidirectional execution model--it flows inward to the inner components and only render results flow out so there is no following execution to flow to. Whereas Node.js most definitely has all sorts of branching and converging async which context needs to flow through. The use cases are entirely different, and there are multiple highly experienced Node.js developers in here that have interacted with a huge number of other Node.js developers in the community using AsyncLocalStorage and have a good idea of what user expectations are in that regard. All of those people in this issue are stating the same opinion of how context should flow for typical use cases in Node.js and are seemingly not being heard.

If so many implementations land on the same thing, I think that means the semantics must be good.

No, it just means they're all copying each other because other JS implementations are the closest examples to look to for inspiration.

If you look at literally any other language they all came up with the same different design from this, so you could make the same argument that because there's so many others that came to that same design that there must be something inherently wrong with our current model for being the outlier, and indeed there is. We built our model in this way because of deficiencies in the runtime to be able to express it any differently.


For similar issues demonstrating that ALS flow is not matching expectations:

  • https://github.com/nodejs/node/issues/46262 (Expected unhandledrejection to have flow-through semantics)
  • https://github.com/nodejs/node/issues/45848 (Same as this issue)
  • https://github.com/nodejs/node/issues/42237 (Expected context to flow through async generators)
  • https://github.com/nodejs/node/issues/36683 (Assuming run(...) meant contexts were stored in a stacked form)

There's plenty more, that's just the first page of ALS-related issues. The point is, we get lots of usability complaints both within GH issues and I see many more when dealing with APM customers expecting span data to flow and connect in particular ways.

Qard avatar May 27 '24 19:05 Qard

@Qard I agree that AsyncLocalStorage could be improved like how it interacts with async generators. But the issues mentioned are not universally agreed on the behavior mentioned described as flow-through in this thread, like https://github.com/nodejs/node/issues/46262#issuecomment-1405722308.

legendecas avatar May 28 '24 12:05 legendecas

That linked comment is conflating implementation details rather than consider what actually is the user expectation. Users expect context to flow from execution which, from their perspective, caused the particular continuation to execute. So context flows from calls to callbacks and from resolves to then handlers.

Qard avatar May 28 '24 19:05 Qard

That is targeting React though, which inherently has a unidirectional execution model--it flows inward to the inner components and only render results flow out so there is no following execution to flow to.

Vercel's edge runtime isn't specifically React. The use case we had to solve is essentially request logging, so that every console.log() the dev writes is tied to the request that caused it. A particular use case that I had to solve was sharing promises across requests, and if request B starts appending logs using request A's log id because the user awaited request A's promise, they're gonna confuse a bunch of users.

If you look at literally any other language they all came up with the same different design from this, so you could make the same argument that because there's so many others that came to that same design that there must be something inherently wrong with our current model for being the outlier, and indeed there is.

No language that I familiar with uses flows-through semantics, and some but not all implement mutability. Ignoring Node:

  • Go's context:
    • https://gist.github.com/jridgewell/ca75e91f78c6fa429af10271451a437d#file-test-go
    • context is immutable, must clone context to set a new value
    • propagation is explicit, must pass context to fn
  • Rust's LocalKey:
    • https://gist.github.com/jridgewell/ca75e91f78c6fa429af10271451a437d#file-test-rs
    • implements immutable scoped runs (it's what I based my initial AC prototype on)
    • there is no way to mutate the current local, it has to be scoped to a new function execution
  • Ruby's Fiber
    • https://gist.github.com/jridgewell/ca75e91f78c6fa429af10271451a437d#file-test-rb
    • implements mutability
    • does not implement flows-through
      • resuming a fiber does not set the current fiber's storage to the inner fiber's storage after execution
  • Python's contextvars
    • https://gist.github.com/jridgewell/ca75e91f78c6fa429af10271451a437d#file-test-py
    • implements mutability
    • does not implement flows-through
      • running a function within a cloned context does not assume that context outside
      • running a new task within a cloned context does not assume that context outside
      • confusingly: creating a coroutine in a cloned context doesn't keep that cloned context in the coroutine, the coroutine will use whatever context is active during the await
        • I don't believe anyone is advocating for this behavior, and it's not the same as either flows-through or flows-around.

Responding to each of the linked ASL issues:

This was actually created in response to our discussions in AC proposal. unhandledrejection is a global handler that can't be installed per request context, so we have to treat it special. The OP's semantics can be solved with either promise construct time's context or the reject time's, but it's cheaper for us to implement the reject time, so we used that. unhandledrejection's registration time never really makes sense, but this isn't an await for flows-through.

The OP there seems to advocate for consistency, and doesn't care if it's flows-around or flows-through. There seems to be a desire for code like the following to work:

describe("store", () => {
  beforeEach(async () => {
    store.enterWith("value");
  });

  it("the store should be set", () => {
    expect(store.getStore()).toEqual("value");
  });
});

This seems like the first mild +1 for flows-through semantics. But Jest could expose APIs for setting up the context snapshot to make this work in either semantics.

This doesn't bring up either flows-around or flows-through, just that async generators should have some context captured during init. Flow-around solves OP by binding the init concrete. Flow-though might actually break OP’s expectations, wouldn’t the .next() call’s context be required since it’s the executions context?

This seems to be an issue with exit() setting the current value to undefined instead of operating as enterWith's pair (restoring the previous value)? I don't think either semantics are asked for here, and AC solves the stacking by having .run(), and potentially if we have a scoped-mutation.

jridgewell avatar May 28 '24 22:05 jridgewell

Vercel's edge runtime isn't specifically React. The use case we had to solve is essentially request logging, so that every console.log() the dev writes is tied to the request that caused it. A particular use case that I had to solve was sharing promises across requests, and if request B starts appending logs using request A's log id because the user awaited request A's promise, they're gonna confuse a bunch of users.

This is the most interesting use-case/feedback of the current behavior and something I haven't considered before (even if "cross the promises" all the time). I can see how a flow-through semantics would be problematic in this case, and how it would break quite a lot of code that make this assumption.

mcollina avatar May 29 '24 12:05 mcollina