Bug: API doesn't respect memory limit with PG backend
Description:
The API doesn't respect global memory limit with a PG backend. This leads to incomplete results for objects with large controllers and controllables metrics.
Are you intending to fix this bug?
No
Component(s) Affected:
- API
- GUI
Steps to Reproduce:
- Ingest a sufficiently large Entra ID dataset into a PG backend
- Configure BHCE with a 0 (unlimited) graph query memory limit via `bhe_graph_query_memory_limit=0`
- Request `/api/v2/azure/tenants?object_id={AZ_TENANT_ID}` and observe an HTTP 500 error: "db error: graph query required more memory than allowed - Limit: 1024.00 MB - Memory In-Use: XXXX MB"
- Request `/api/v2/azure/tenants?object_id={AZ_TENANT_ID}&type=list&limit=1000&related_entity_type=descendent-az-users` and observe an HTTP 500 error: "calculating the request results exceeded memory limitations due to the volume of objects involved"
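The two failing requests above can be sketched as follows. Host, tenant ID, and token are placeholders (not values from the report); the commands are printed rather than executed so the sketch runs without a live BHCE instance.

```shell
# Placeholders: substitute your own instance URL and tenant object ID.
BHCE="http://localhost:8080"
AZ_TENANT_ID="<tenant-object-id>"

# Case 1: fetch the tenant object itself (HTTP 500 once the bug triggers).
URL1="${BHCE}/api/v2/azure/tenants?object_id=${AZ_TENANT_ID}"

# Case 2: list descendant AZUser objects (also HTTP 500).
URL2="${URL1}&type=list&limit=1000&related_entity_type=descendent-az-users"

# Run each printed command with your own Authorization header to reproduce.
printf 'curl -s -H "Authorization: Bearer <api-token>" "%s"\n' "$URL1" "$URL2"
```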
Expected Behavior:
I expect the query to return the tenant object in the first case, and a list of AZUser objects in the second.
Actual Behavior:
The query runs out of memory, even though the application is configured not to have a memory limit. The memory limit doesn't seem to be related to the application's default value (2G).
Screenshots/Code Snippets/Sample Files:
The error hit in the second case (descendant object access) is here: https://github.com/SpecterOps/BloodHound/blob/main/cmd/api/src/api/v2/azure.go#L331
Environment Information:
BloodHound: v8.3.1
Go (if API related): 1.24.4
Additional Information:
The same error is emitted when selecting the tenant object in the GUI. This visually manifests with a red exclamation mark instead of the descendant object count. The app consumes a lot of memory while the tenant object remains selected in the GUI. Memory consumption on the DB side stays rather low.
It is possible to return AZUser nodes through cypher using a tenantid filter.
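The Cypher workaround mentioned above might look roughly like this. This is a hypothetical sketch: the `tenantid` property name follows the report, `<tenant-id>` is a placeholder, and the exact form accepted may differ depending on where the query is run (the Explore UI, for instance, expects path-returning queries).

```cypher
// Sketch: list AZUser nodes for one tenant via a tenantid filter.
MATCH (u:AZUser)
WHERE u.tenantid = '<tenant-id>'
RETURN u
LIMIT 1000
```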
Related Issues:
None I could find.
Contributor Checklist:
- [x] I have searched the issue tracker to ensure this bug hasn't been reported before or is not already being addressed.
- [x] I have provided clear steps to reproduce the issue.
- [x] I have included relevant environment information details.
After further testing, it appears that this affects the AD endpoints as well.
Querying `/api/v2/groups/<ENTERPRISE_ADMINS_SID>?counts=true` fails to return. The application logs a "graph query required more memory than allowed" error with the 1024 MB limit, even though it is configured with a memory limit of 0.
This behaviour was likely already present in v8.2.0. However, it looks like v8.1.3 did not exhibit this defect.
Thanks for the report @0xd6cb6d73! Turns out a version of this issue has existed for a while... #106
Going to echo what's posted there, we're going to be putting more effort behind this soon 😄
Hi @0xd6cb6d73, wanted to give you a quick update.
Our wonderful @mistahj67 managed to chase this down and we have a PR open to resolve this upstream: https://github.com/SpecterOps/DAWGS/pull/19
I don't have any expectation on when this will be resolved in the product, but if you're desperate for this fix I could help you build from source
Hello @wes-mil, thanks a lot for the update.
Integrating this PR into our build is something we're interested in as this issue is a blocker for us. I'll take you up on your offer.
If you're on BloodHound Community slack, feel free to start a thread in bloodhound-chat so that we can communicate a little more fluidly! (If not, we can also continue to use GitHub. No problem either way!) Have a few plates spinning, but will get to this as soon as I can 😄
As discussed through slack, here are the steps I took to integrate the branch containing the fix into a BHCE build:
- Forked DAWGS and tagged the tip of the `plumb-memory-cfg-to-pg` branch as `v0.3.1-test1`: https://github.com/0xd6cb6d73/DAWGS/tree/v0.3.1-test1
- Used a `replace` directive in BHCE's go.mod to point to this new version: https://github.com/0xd6cb6d73/BloodHound/commit/177c7645de90d66c4182fb9d6c5045cf4bdbebbb
- Built BHCE using the following command: `docker build -t bloodhound:my . -f dockerfiles/bloodhound.Dockerfile`
- Tested using a slightly modified default compose project that uses PG as the graph backend: https://github.com/0xd6cb6d73/BloodHound/commit/af8e334183eb05606405ed8feaf5de106ef764b5
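For reference, the `replace` directive from the second step would look roughly like this. The module paths are assumed from the repository names, not taken from the report; verify them against the `module` line in each repository's go.mod before copying.

```
// In BHCE's go.mod — assumed module paths, verify against the actual repos.
replace github.com/specterops/dawgs => github.com/0xd6cb6d73/DAWGS v0.3.1-test1
```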
I uploaded the Entra and AD sample data into the instance for testing. I then tried to perform the second test listed in this issue, listing all descendant user objects from the AZ tenant. The results are as follows:
- Using 2G of RAM (default) there is an error complaining about exceeding 2G of RAM usage
- Using 4G of RAM there is an error complaining about exceeding 4G of RAM usage
- Using 6G of RAM the API returns the expected result
This confirms that the plumb-memory-cfg-to-pg feature branch is correctly integrated into the BHCE build, which resolves the issue.