[BUG] graphrag pods cannot reach CosmosDB when accelerator is deployed in Azure Government

Open timothymeyers opened this issue 1 year ago • 2 comments

Describe the bug

Deploying the accelerator to Azure Government results in the following CrashLoopBackOff error for both the -index and -query pods.

azure.cosmos.exceptions.CosmosHttpResponseError: (Forbidden) Request originated from IP 52.XXX.XXX.XXX through public internet. This is blocked by your Cosmos DB account firewall settings. More info:
https://aka.ms/cosmosdb-tsg-forbidden
ActivityId: XXXX, Microsoft.Azure.Documents.Common/2.14.0
Code: Forbidden
Message: Request originated from IP 52.XXX.XXX.XXX through public internet. This is blocked by your Cosmos DB account firewall settings. More info:
https://aka.ms/cosmosdb-tsg-forbidden
ActivityId: XXXX, Microsoft.Azure.Documents.Common/2.14.0

This happens because the CosmosDB firewall has public network access disabled, and the pods in AKS need access via the AKS API server public IP (PIP).

I'm not sure why this is not a problem in Azure Commercial.

To Reproduce

Steps to reproduce the behavior:

az cloud set --name "AzureUSGovernment"
az login
  • Follow the Deployment guide but deploy to Azure Government instead of Azure Commercial. Do not use either the -d or -g option.

The following additional params are required in deploy.parameters.json:

  "AISEARCH_ENDPOINT_SUFFIX": "search.azure.us",
  "AISEARCH_AUDIENCE": "https://search.azure.us",
  "CLOUD_NAME": "AzureUSGovernment",
  "GRAPHRAG_COGNITIVE_SERVICES_ENDPOINT": "https://cognitiveservices.azure.us/.default"

timothymeyers • Jul 10 '24 14:07

Some notes/workarounds

  • Deploying with -g or -d does not have this issue.
  • You can manually add the AKS VNET to the CosmosDB Firewall via the Portal. Networking -> Public Access -> Selected Networks -> Existing Virtual Network -> (select your AKS vnet in the MC_xxx resource group created by the deployment) -> Save
  • I believe this can also be done via the Azure CLI (az), but I have not tested that yet.
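
For reference, here is a rough sketch of what the CLI equivalent of the Portal steps might look like. It is untested against this deployment. Every resource name is a placeholder, and the `run` wrapper, the `DRY_RUN` variable, and the `allow_aks_subnet_in_cosmos` function are hypothetical helpers added for illustration. It assumes the AKS VNET sits in the MC_xxx resource group rather than the one holding Cosmos DB, so the subnet is passed by resource ID. It does not touch the account's public network access setting, which the Portal steps switch to Selected Networks.

```shell
#!/usr/bin/env bash
# Untested sketch: every name below is a placeholder, not a value from the deployment.
set -euo pipefail

COSMOS_RG="${COSMOS_RG:-<cosmos-resource-group>}"
COSMOS_ACCOUNT="${COSMOS_ACCOUNT:-<cosmos-account-name>}"
AKS_NODE_RG="${AKS_NODE_RG:-MC_xxx}"   # the MC_xxx group created by the deployment
AKS_VNET="${AKS_VNET:-<aks-vnet-name>}"
AKS_SUBNET="${AKS_SUBNET:-<aks-subnet-name>}"

# Hypothetical helper: with DRY_RUN=1 (the default) commands are printed, not executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

allow_aks_subnet_in_cosmos() {
  local subnet_id="<subnet-resource-id>"

  # Enable the Cosmos DB service endpoint on the AKS subnet.
  # Caution: this may replace the subnet's existing service endpoint list,
  # so include any endpoints that are already configured.
  run az network vnet subnet update \
    --resource-group "$AKS_NODE_RG" --vnet-name "$AKS_VNET" --name "$AKS_SUBNET" \
    --service-endpoints Microsoft.AzureCosmosDB

  # Look up the full subnet ID (only when actually executing).
  if [ "${DRY_RUN:-1}" != "1" ]; then
    subnet_id="$(az network vnet subnet show \
      --resource-group "$AKS_NODE_RG" --vnet-name "$AKS_VNET" --name "$AKS_SUBNET" \
      --query id --output tsv)"
  fi

  # Add the subnet as a virtual network rule on the Cosmos DB account.
  run az cosmosdb network-rule add \
    --resource-group "$COSMOS_RG" --name "$COSMOS_ACCOUNT" \
    --subnet "$subnet_id"
}

allow_aks_subnet_in_cosmos
```

Run it as-is to review the printed commands, then set `DRY_RUN=0` along with the real resource names to apply them. Make sure `az cloud set --name "AzureUSGovernment"` and `az login` have been run first.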

A proper fix is likely to deploy the AKS cluster in private cluster mode with public fqdn disabled, and establish a private endpoint between the AKS cluster and Cosmos (and the other resources it needs to reach).

timothymeyers • Jul 10 '24 14:07

Need to test this again after changes from #123 were introduced.

timothymeyers • Aug 16 '24 17:08

I believe this is OBE (overtaken by events) now.

timothymeyers • Feb 13 '25 18:02

Actually, no, this is still an issue when deploying GraphRAG in Azure Government without the -g option.

timothymeyers • Feb 18 '25 16:02