add namespace support for async storage
What does this PR do?
Adds centralized named storages similar to channels.
Motivation
Right now we are mainly relying on a single global storage with products reading or writing data to it according to their needs. This has a few downsides. First, it means that holding onto one piece of data ends up holding on everything else on that store. Second, there is no control over the scope of the individual properties of the store, as everything just ends up in whatever scope the main store is.
The goal of this PR is similar to why there is a channel(name) helper for diagnostics channel, to provide a central registry where it doesn't matter who initializes the storage or even if it was initialized at all, and to properly isolate the scope of each individual store.
Overall package size
Self size: 7.46 MB Deduped: 63.23 MB No deduping: 63.51 MB
Dependency sizes
| name | version | self size | total size | |------|---------|-----------|------------| | @datadog/native-appsec | 8.1.1 | 18.67 MB | 18.68 MB | | @datadog/native-iast-taint-tracking | 3.1.0 | 12.27 MB | 12.28 MB | | @datadog/pprof | 5.3.0 | 9.85 MB | 10.22 MB | | protobufjs | 7.2.5 | 2.77 MB | 5.16 MB | | @datadog/native-iast-rewriter | 2.5.0 | 2.51 MB | 2.59 MB | | @opentelemetry/core | 1.14.0 | 872.87 kB | 1.47 MB | | @datadog/native-metrics | 2.0.0 | 898.77 kB | 1.3 MB | | @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB | | jsonpath-plus | 9.0.0 | 580.4 kB | 1.03 MB | | import-in-the-middle | 1.11.2 | 112.74 kB | 826.22 kB | | msgpack-lite | 0.1.26 | 201.16 kB | 281.59 kB | | opentracing | 0.14.7 | 194.81 kB | 194.81 kB | | pprof-format | 2.1.0 | 111.69 kB | 111.69 kB | | @datadog/sketches-js | 2.1.0 | 109.9 kB | 109.9 kB | | semver | 7.6.3 | 95.82 kB | 95.82 kB | | lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB | | lru-cache | 7.14.0 | 74.95 kB | 74.95 kB | | ignore | 5.3.1 | 51.46 kB | 51.46 kB | | int64-buffer | 0.1.10 | 49.18 kB | 49.18 kB | | shell-quote | 1.8.1 | 44.96 kB | 44.96 kB | | istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB | | rfdc | 1.3.1 | 25.21 kB | 25.21 kB | | tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB | | limiter | 1.1.5 | 23.17 kB | 23.17 kB | | dc-polyfill | 0.1.4 | 23.1 kB | 23.1 kB | | retry | 0.13.1 | 18.85 kB | 18.85 kB | | jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB | | crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB | | koalas | 1.0.2 | 6.47 kB | 6.47 kB | | path-to-regexp | 0.1.10 | 6.38 kB | 6.38 kB | | module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |🤖 This report was automatically generated by heaviest-objects-in-the-universe
Benchmarks
Benchmark execution time: 2024-10-11 19:53:50
Comparing candidate commit 4ddfce358fbafa9f4c4227f8ff7b3a881df4441b in PR branch storage-namespace with baseline commit 60529442d2184aa8f05ed906c8f834e51849ad26 in branch master.
Found 0 performance improvements and 0 performance regressions! Performance is the same for 259 metrics, 7 unstable metrics.
that means every one of our namespace has to do enterWith() and all the continuation fixes right ? The good thing about the current single ALS store is that I can just rely on it. I don't need to maintain it. If I needed my own namespace I'd just do new AsyncLocalStorage(), but I don't because that's not what we need ?
I'm not sure I get the point of doing this really.
that means every one of our namespace has to do enterWith() and all the continuation fixes right
Not sure I understand the question. Continuation fixes are done with AsyncResource since we're fixing the library itself which is broken. This is not done at the AsyncLocalStorage level.
If you are referring instead to the use of enterWith when tracing, then this doesn't apply to other products since the scoping is different (for example AppSec wants a request-level scope, LLMObs wants an LLM span scope, etc). So then yes every product would have to maintain their own async stack, but that's how it should be, otherwise everything ends up being either attached to a span (or its corresponding store) or you need to maintain your own AsyncLocalStorage instance which has other problems (described below).
The good thing about the current single ALS store is that I can just rely on it. I don't need to maintain it.
I don't understand this statement either. What additional maintenance burden is there between doing const asmStorage = storage('asm') versus const asmStorage = new AsyncLocalStorage()?
If I needed my own namespace I'd just do new AsyncLocalStorage(), but I don't because that's not what we need ?
You can definitely do that and it's done in a few places today. The problem with this approach is that it means that in order to share that store with other products, either some interdependency has to be introduced by explicitly importing the other product, or the stores need to all be defined in a centralized place (which could be an option too but it pushes product knowledge to that centralized place which should be generic). You also need to determine where to instantiate the storage, which is not necessarily a big problem but not having to worry about that means that the storage can be lazily instantiated to avoid wasting memory (or eventually even the whole async_hooks import when not needed)
Basically, this does exactly what is says in the description, which is to provide a channel-like API to storages for all the same reason that the API for that is channel(name) instead of new Channel().
@simon-id Thoughts?
I guess this is fine, especially as part of a bigger project of separating core and products. As long as we can read the data we all need, like current span, current request etc, i'm fine with this :+1: