[Bug]: 401 on privileged actions after cold restart despite valid login
🐞 Bug Summary
After a cold restart of the server/Kubernetes node (e.g., powered off overnight), the Admin Web UI intermittently returns 401 Unauthorized for privileged actions even though I appear logged in. Affected actions include adding MCP servers, viewing metrics, and creating servers.
🧩 Affected Component
Select the area of the project impacted:
- [x] `mcpgateway` - API
- [x] `mcpgateway` - UI (admin panel)
- [ ] `mcpgateway.wrapper` - stdio wrapper
- [ ] Federation or Transports
- [ ] CLI, Makefiles, or shell scripts
- [ ] Container setup (Docker/Podman/Compose)
- [ ] Other (explain below)
🔁 Steps to Reproduce
- Deploy `ghcr.io/ibm/mcp-context-forge:latest` on Kubernetes with the UI and Admin API enabled and auth required (env excerpt below). The DB is SQLite on a PVC at `/data`.
- Power off the host (or shut down the cluster) at the end of the day; power back on the next day. (A cold start of the pod may also reproduce it.)
- Log into the Admin UI (Basic Auth).
- Try any privileged action: Add MCP server, Metrics tab, Create server, etc.
- Those API calls return “401 Unauthorized” while the UI still indicates I’m logged in.
🤔 Expected Behavior
Admin actions should succeed when authenticated (200/201 responses), without requiring any extra steps after a cold restart.
📓 Logs / Error Output
Network panel shows 401 on endpoints such as /admin/servers, /admin/metrics, and related admin routes.
Pod logs primarily show 401 responses for those requests (no stacktrace).
⚠️ No secrets included. (Can provide additional sanitized logs if needed.)
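To take the browser out of the loop, here is a minimal curl repro sketch. The Service name `mcpgateway` is an assumption (it isn't in the report); the namespace `mcp` comes from the environment table below:

```bash
# Forward the in-cluster Service locally (Service name is an assumption).
kubectl -n mcp port-forward svc/mcpgateway 4444:4444 &

# Unauthenticated request: expect 401, matching what the UI sees.
curl -i http://localhost:4444/admin/servers

# Same request with Basic Auth: if this returns 200 while the browser still
# gets 401, the problem is cookie transmission, not the credentials.
curl -i -u "$BASIC_AUTH_USER:$BASIC_AUTH_PASSWORD" http://localhost:4444/admin/servers
```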
🧠 Environment Info
You can retrieve most of this from the /version endpoint.
| Key | Value |
|---|---|
| Version or commit | ghcr.io/ibm/mcp-context-forge:latest (as of 2025-08-27) |
| Runtime | Containerized in Kubernetes (auth required; UI + Admin API enabled) |
| Platform / OS | Kubernetes cluster (Namespace mcp) |
| Container | Deployed via Deployment + PVC; Service is ClusterIP (HTTP to port 4444) |
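For completeness, the `/version` endpoint mentioned above can be queried directly (a sketch reusing the port-forward from the repro section; whether `/version` itself requires auth is an assumption):

```bash
# Pull the environment details the template asks for.
curl -s -u "$BASIC_AUTH_USER:$BASIC_AUTH_PASSWORD" http://localhost:4444/version
```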
🧩 Additional Context (optional)
Kubernetes manifest (relevant bits):
```yaml
env:
  - { name: HOST, value: "0.0.0.0" }
  - { name: MCPGATEWAY_UI_ENABLED, value: "true" }
  - { name: MCPGATEWAY_ADMIN_API_ENABLED, value: "true" }
  - { name: AUTH_REQUIRED, value: "true" }
  - name: BASIC_AUTH_USER
    valueFrom: { secretKeyRef: { name: mcpgateway-secret, key: BASIC_AUTH_USER } }
  - name: BASIC_AUTH_PASSWORD
    valueFrom: { secretKeyRef: { name: mcpgateway-secret, key: BASIC_AUTH_PASSWORD } }
  - name: JWT_SECRET_KEY
    valueFrom: { secretKeyRef: { name: mcpgateway-secret, key: JWT_SECRET_KEY } }
  - name: DATABASE_URL
    value: "sqlite:////data/gateway/mcp.db"
```
Notes / hypotheses to help triage (diagnostic sketches follow this list):
- If cookies are marked `Secure` and the UI is accessed over plain HTTP, the browser won’t send the cookie, which could present as 401s on admin routes after a restart/session change. Consider reproducing with HTTPS or, only for testing, `SECURE_COOKIES=false`.
- Confirm whether admin auth relies on a cookie vs. a header in the UI; check `COOKIE_SAMESITE` and related settings.
- Verify that the JWT signing key (`JWT_SECRET_KEY`) and server time are stable across restarts (clock skew can invalidate tokens).
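Sketches for the first and third checks (the login path `/admin/` and the Deployment name `mcpgateway` are assumptions, not taken from the gateway source):

```bash
# 1) Which flags does the server set on its auth cookie at login?
#    (-D - dumps response headers to stdout.)
curl -s -D - -o /dev/null -u "$BASIC_AUTH_USER:$BASIC_AUTH_PASSWORD" \
  http://localhost:4444/admin/ | grep -i '^set-cookie'

# 2) Is the pod clock aligned? Compare container time against local time.
kubectl -n mcp exec deploy/mcpgateway -- date -u && date -u
```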
Potential directions:
- Provide guidance on expected cookie settings for HTTP vs HTTPS deployments.
- Clarify whether the UI refreshes/rotates tokens after pod restarts, and if any cache needs to be cleared.
- Any known issues with SQLite + PVC on restart that could affect session storage would be helpful to rule in/out.
Hi @InigoGastesi - thanks for the detailed bug report! I was able to reproduce the issue:
You're hitting a cookie security configuration issue. Your deployment has SECURE_COOKIES: "true" (the Helm chart default; your manifest doesn't set it explicitly) but you're accessing over plain HTTP (ClusterIP without TLS).
When cookies are marked Secure, browsers won't send them over HTTP - it's a security feature. So what happens is:
- Login works, cookie gets set
- Browser stores the cookie but refuses to send it on subsequent HTTP requests
- Server sees no auth token → 401
- UI still thinks you're logged in (cookie exists locally) but every API call fails
This explains why it's intermittent after restart - the cookie is there but never gets transmitted.
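You can watch the mechanism with curl's cookie jar (a sketch; the login path and the `$USER`/`$PASS` variables are placeholders):

```bash
# Log in and save the cookie; -c writes a Netscape-format cookie jar.
curl -s -c jar.txt -o /dev/null -u "$USER:$PASS" http://localhost:4444/admin/

# A TRUE in the "secure" column marks the cookie HTTPS-only.
cat jar.txt

# curl honors the Secure flag like a browser does: over plain HTTP the cookie
# is withheld, so this should come back 401 even though jar.txt holds it.
curl -i -b jar.txt http://localhost:4444/admin/servers
```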
Quick Fix
Add this to your values or ConfigMap:
```yaml
mcpContextForge:
  config:
    SECURE_COOKIES: "false"
```
Then restart your pods. Should work immediately.
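If the value comes from a ConfigMap, the pods need a restart to pick it up; for example (the Deployment name is an assumption, substitute your release name):

```bash
kubectl -n mcp rollout restart deployment/mcpgateway
kubectl -n mcp rollout status deployment/mcpgateway
```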
Proper Fix (for production)
Enable TLS on your ingress:
```yaml
mcpContextForge:
  ingress:
    enabled: true
    tls:
      enabled: true
```
Keep SECURE_COOKIES: "true" when using HTTPS.
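Once TLS is in place you can confirm the cookie round-trips (a sketch; the hostname is a placeholder for your ingress host):

```bash
# Log in over HTTPS and replay the saved cookie; expect 200/201 this time.
curl -s -c jar.txt -o /dev/null -u "$USER:$PASS" https://gateway.example.com/admin/
curl -i -b jar.txt https://gateway.example.com/admin/servers
```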
Why the confusion
The current Helm chart defaults to SECURE_COOKIES: "true" even though most dev setups use plain HTTP. The warning in the logs only shows up on initial login failure, not on the 401s afterward, making it hard to diagnose.
We should probably:
- Default to `false` in the Helm chart for easier setup
- Add better warnings when secure cookies are used over HTTP
- Auto-detect the protocol and adjust cookie flags accordingly
Let me know if setting SECURE_COOKIES: "false" resolves it!