
Bug: API doesn't respect memory limit with PG backend

Open • 0xd6cb6d73 opened this issue 3 months ago • 1 comment

Description:

The API doesn't respect global memory limit with a PG backend. This leads to incomplete results for objects with large controllers and controllables metrics.

Are you intending to fix this bug?

No

Component(s) Affected:

  • API
  • GUI

Steps to Reproduce:

  1. Ingest a sufficiently large Entra ID dataset into a PG backend
  2. Configure BHCE with an unlimited (0) graph query memory limit via bhe_graph_query_memory_limit=0
  3. Request /api/v2/azure/tenants?object_id={AZ_TENANT_ID}
  4. See the HTTP 500 error: "db error: graph query required more memory than allowed - Limit: 1024.00 MB - Memory In-Use: XXXX MB"
  5. Request /api/v2/azure/tenants?object_id={AZ_TENANT_ID}&type=list&limit=1000&related_entity_type=descendent-az-users
  6. See the HTTP 500 error: "calculating the request results exceeded memory limitations due to the volume of objects involved"
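The two failing requests from steps 3 and 5 can be scripted roughly as follows. This is a sketch: BHE_URL, AZ_TENANT_ID, and TOKEN are placeholders for your deployment, and the bearer-token auth header is an assumption about how the instance is accessed.

```shell
#!/bin/sh
# Sketch only: BHE_URL, AZ_TENANT_ID, and TOKEN are deployment placeholders.
BHE_URL="${BHE_URL:-http://localhost:8080}"
AZ_TENANT_ID="${AZ_TENANT_ID:-00000000-0000-0000-0000-000000000000}"

# Step 3: fetch the tenant object (HTTP 500 under this bug)
TENANT_URL="$BHE_URL/api/v2/azure/tenants?object_id=$AZ_TENANT_ID"

# Step 5: list descendant AZUser objects (also HTTP 500)
LIST_URL="$TENANT_URL&type=list&limit=1000&related_entity_type=descendent-az-users"

# Print the curl invocations (bearer-token auth is an assumption):
printf 'curl -s -H "Authorization: Bearer %s" "%s"\n' "$TOKEN" "$TENANT_URL"
printf 'curl -s -H "Authorization: Bearer %s" "%s"\n' "$TOKEN" "$LIST_URL"
```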

Expected Behavior:

I expect the query to return the tenant object in the first case, and a list of AZUser objects in the second.

Actual Behavior:

The query runs out of memory even though the application is configured with no memory limit. The 1024 MB limit in the error also doesn't match the application's default value (2G).

Screenshots/Code Snippets/Sample Files:

The error hit in the second case (descendant object access) is here: https://github.com/SpecterOps/BloodHound/blob/main/cmd/api/src/api/v2/azure.go#L331

Environment Information:

BloodHound: v8.3.1

Go (if API related): 1.24.4

Additional Information:

The same error is emitted when selecting the tenant object in the GUI. This visually manifests as a red exclamation mark in place of the descendant object count. The app consumes a lot of memory while the tenant object remains selected in the GUI; memory consumption on the DB side stays rather low.

It is possible to return AZUser nodes through Cypher using a tenantid filter.
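For reference, a Cypher query of roughly this shape (run through the app's Cypher search; the tenant id value is a placeholder) does return the expected nodes:

```cypher
MATCH (u:AZUser)
WHERE u.tenantid = "00000000-0000-0000-0000-000000000000"
RETURN u
LIMIT 100
```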

Related Issues:

None I could find.

Contributor Checklist:

  • [x] I have searched the issue tracker to ensure this bug hasn't been reported before or is not already being addressed.
  • [x] I have provided clear steps to reproduce the issue.
  • [x] I have included relevant environment information details.

0xd6cb6d73 (Nov 18 '25 16:11)

After further testing, it appears that this affects the AD endpoints as well.

Querying /api/v2/groups/<ENTERPRISE_ADMINS_SID>?counts=true fails to return. The application logs a "graph query required more memory than allowed" error with the 1024 MB limit, even though it is configured with a memory limit of 0.

This behaviour was likely already present in v8.2.0. However, it looks like v8.1.3 did not exhibit this defect.

0xd6cb6d73 (Nov 27 '25 09:11)

Thanks for the report @0xd6cb6d73! Turns out a version of this issue has existed for a while... #106

Going to echo what's posted there, we're going to be putting more effort behind this soon 😄

wes-mil (Dec 02 '25 16:12)

Hi @0xd6cb6d73, wanted to give you a quick update.

Our wonderful @mistahj67 managed to chase this down and we have a PR open to resolve this upstream: https://github.com/SpecterOps/DAWGS/pull/19

I don't have any expectation on when this will be resolved in the product, but if you're desperate for this fix, I could help you build from source.

wes-mil (Jan 05 '26 18:01)

Hello @wes-mil, thanks a lot for the update.

Integrating this PR into our build is something we're interested in as this issue is a blocker for us. I'll take you up on your offer.

0xd6cb6d73 (Jan 05 '26 18:01)

If you're on BloodHound Community slack, feel free to start a thread in bloodhound-chat so that we can communicate a little more fluidly! (If not, we can also continue to use GitHub. No problem either way!) Have a few plates spinning, but will get to this as soon as I can 😄

wes-mil (Jan 05 '26 19:01)

As discussed through slack, here are the steps I took to integrate the branch containing the fix into a BHCE build:

  • Forked DAWGS and tagged the tip of the plumb-memory-cfg-to-pg branch as v0.3.1-test1
    • https://github.com/0xd6cb6d73/DAWGS/tree/v0.3.1-test1
  • Used a replace directive in BHCE's go.mod to point to this new version
    • https://github.com/0xd6cb6d73/BloodHound/commit/177c7645de90d66c4182fb9d6c5045cf4bdbebbb
  • Built BHCE using the following command: docker build -t bloodhound:my . -f dockerfiles/bloodhound.Dockerfile
  • Tested using a slightly modified default compose project that uses PG as the graph backend
    • https://github.com/0xd6cb6d73/BloodHound/commit/af8e334183eb05606405ed8feaf5de106ef764b5
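For anyone repeating this, the replace directive takes roughly this shape. This is a sketch: the DAWGS module path and whether the fork's go.mod needs its module line adjusted are assumptions; see the linked commit for the exact change.

```go
// In BHCE's go.mod (sketch; the dawgs module path is an assumption):
replace github.com/specterops/dawgs => github.com/0xd6cb6d73/DAWGS v0.3.1-test1
```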

I uploaded the Entra and AD sample data into the instance for testing. I then tried to perform the second test listed in this issue, listing all descendant user objects from the AZ tenant. The results are as follows:

  • Using 2G of RAM (default) there is an error complaining about exceeding 2G of RAM usage
  • Using 4G of RAM there is an error complaining about exceeding 4G of RAM usage
  • Using 6G of RAM the API returns the expected result
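The memory limit above was varied through the same bhe_graph_query_memory_limit setting used in the repro steps, for example via a compose override. This is a sketch assuming environment-variable configuration is in effect and that the value is in gigabytes, matching the 2G default; the service name is a placeholder.

```yaml
# docker-compose override sketch (assumptions: env-var configuration,
# value in gigabytes, service named "bloodhound")
services:
  bloodhound:
    environment:
      - bhe_graph_query_memory_limit=6
```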

This confirms that the plumb-memory-cfg-to-pg feature branch is correctly integrated into the BHCE build, which resolves the issue.

0xd6cb6d73 (Jan 05 '26 20:01)