cyclonedx-node-yarn feat: Support for projects with multiple/nested workspaces

Currently this plugin only resolves the dependencies/components for a single workspace at a time, which can add complexity when working in a multi-workspace monorepo. Enhancing the plugin to recursively traverse all nested workspaces is quite a small change and would simplify the usage of this plugin in more complex projects, especially in cases where isolated reporting of BOM information is not required.
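
To illustrate the idea, here is a minimal sketch of a recursive traversal that collects the dependencies of a root workspace and all of its nested workspaces. The `WorkspaceNode` shape and the `collectWorkspaces` / `collectDependencies` helpers are hypothetical and only stand in for the real structures; this is not the plugin's code or Yarn's API.

```typescript
// Hypothetical, simplified shape of a workspace (an assumption for illustration).
interface WorkspaceNode {
  name: string;
  dependencies: string[]; // e.g. "lodash@4.17.21"
  children: WorkspaceNode[]; // nested workspaces
}

// Depth-first walk that returns the root and every nested workspace.
// `seen` ensures a workspace is visited only once, even if it is reachable twice.
function collectWorkspaces(
  root: WorkspaceNode,
  seen: Set<string> = new Set<string>()
): WorkspaceNode[] {
  if (seen.has(root.name)) {
    return [];
  }
  seen.add(root.name);
  const result: WorkspaceNode[] = [root];
  for (const child of root.children) {
    result.push(...collectWorkspaces(child, seen));
  }
  return result;
}

// Union of the dependencies of all traversed workspaces, sorted for stable output.
function collectDependencies(root: WorkspaceNode): string[] {
  const deps = new Set<string>();
  for (const workspace of collectWorkspaces(root)) {
    for (const dep of workspace.dependencies) {
      deps.add(dep);
    }
  }
  return Array.from(deps).sort();
}
```

Called on the root workspace, `collectDependencies` would yield one combined list in which a dependency shared by several workspaces appears only once.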

Mar 15 '24 12:03 MLSTRM

See https://github.com/MLSTRM/cyclonedx-node-yarn/tree/feature/nested-workspace-traversal for an initial proof-of-concept branch, on my own fork for now, showing what this change could look like. I need to rebase past some changes from the last couple of days before I can raise a proper PR, if this is deemed a useful thing to have.

Mar 15 '24 12:03 MLSTRM

PRs are welcome. Please make sure to follow the contribution guidelines and add relevant test cases and test beds.

Mar 15 '24 12:03 jkowalleck

PR raised and test added. I also noticed the old snapshots hadn't been updated since the license resolution removal, so I've updated those at the same time.

Mar 18 '24 10:03 MLSTRM

With the v1.0 release candidate being public for some time now, I do not expect any internal refactoring or changes soon. This means the implementation is ready to be extended.

Currently, it should be possible to call the plugin like yarn workspaces foreach cyclonedx ... to generate the SBOM for each workspace, which makes total sense, since each workspace is an individual product. It should also be possible to call yarn cyclonedx in an individual workspace.

I still do not understand the need to combine an SBOM over all the workspaces, as this is already done when you call yarn cyclonedx in a (root) workspace that depends on the other workspaces ... but anyway ...

@MLSTRM are you still interested in working on this feature?

Jun 07 '24 17:06 jkowalleck

Hi @jkowalleck, apologies for the delayed response. I've ended up finding a different approach for my use-case so I'm happy to close this if you don't wish to continue with it.

I at least have several projects where the individual workspaces aren't necessarily anything that would be deployed separately, but are instead separate modules of a larger application. Also, given how I was looking to integrate SBOM into my CI/CD process and reporting, a single file per repo was a much better fit for my aims.

Jul 04 '24 08:07 MLSTRM

re: https://github.com/CycloneDX/cyclonedx-node-yarn/issues/35#issuecomment-2208434505

Thanks for the update, @MLSTRM.

I wonder how your process would be shaped if you had individual packages, but just one big BOM for many of these. What tooling are you using to analyze this huge BOM that consists of many "components" that are actually independent (sub-)projects?

Could you elaborate on that? Getting insight here would help me understand the implications and the actual needs.

Thanks in advance.

Jul 04 '24 10:07 jkowalleck

Maybe it's more of a consequence of the size of the projects I'm mostly working on, but the generated BOM doesn't end up being that big overall since each sub-package is quite small. A lot of my work is around AWS microservices, where a single repo would be a microservice but might also expose some other packages: a client library (i.e. for other services to interact with this repo), core processing logic separated from AWS Lambda specifics, or storage implementations split out for DynamoDB vs Postgres vs another storage operator. Each of these component projects may only have a couple of dependencies of its own, and each package isn't necessarily a deployable unit on its own.

At the same time, the tooling I'm using (which includes Aqua amongst other things) is also handling BOMs from larger monolith applications that have nearly 1k dependencies (whereas my largest BOM from the microservice side is ~350 dependencies across a workspace of 4 packages), so while the BOM is larger than it would be on a per-package level, it's still quite "small" in the wider view.

Also, the feedback loops I have around audit checks / compliance violations happen at CI/CD time as well as being reported after the fact. Having that attached directly to the repo the violation comes from, rather than the specific subpackage, makes it easier to address at the right level and cuts down on some of the noise (since hoisted dependencies are then only reported once rather than multiple times).
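
As a rough illustration of the noise reduction, here is a minimal sketch of merging per-workspace component lists so that a shared (hoisted) dependency is reported once. The `name`, `version`, and `purl` fields follow the CycloneDX JSON component shape, but the `Bom` subset and the `mergeComponents` helper are hypothetical; this is not how the plugin or any of the tools mentioned here work.

```typescript
// Minimal subset of a CycloneDX JSON component (other fields omitted).
interface Component {
  name: string;
  version: string;
  purl?: string;
}

// Minimal subset of a BOM: only the component list matters for this sketch.
interface Bom {
  components: Component[];
}

// Merge several BOMs into one component list, keeping the first occurrence
// of each component. The package URL is used as the identity when present;
// falling back to "name@version" is an assumption made for this sketch.
function mergeComponents(boms: Bom[]): Component[] {
  const byKey = new Map<string, Component>();
  for (const bom of boms) {
    for (const component of bom.components) {
      const key = component.purl ?? `${component.name}@${component.version}`;
      if (!byKey.has(key)) {
        byKey.set(key, component);
      }
    }
  }
  return Array.from(byKey.values());
}
```

With per-workspace BOMs as input, a dependency used by four packages would show up four times across the separate files but only once in the merged list.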

At least for my current use-case, I've been evaluating trivy as a more drop-in, language/structure-agnostic solution and it's serving the requirements quite nicely (with the slight annoyance of it not being so tightly integrated into the build tooling). I'm fairly happy to write this off as just a bad fit, where the intended direction of this plugin doesn't quite align with the structure I'm working with.

Jul 08 '24 12:07 MLSTRM