MCP Fleet Authentication (JWT + Eunomia)

How every MCP server we deploy authenticates callers and authorizes tool calls, and how the multiplexer reaches JWT-protected children. The whole fleet is built from one factory, so this is configured once and applies everywhere.

The model

Claude Code ──► mcp-multiplexer ──►  child MCP servers (57×, *.arpa/mcp)
                     │                     │  AUTH_TYPE=jwt
                     │ client_credentials  │  ├─ JWTVerifier (Keycloak JWKS, aud=agent-services)
                     │ service token       │  └─ Eunomia middleware (policy, fail-closed)
                     ▼                     ▼
                 Keycloak  ◄────────  validates bearer (JWKS)
              (realm master)

Every -mcp service is built by create_mcp_server (agent_utilities/mcp/server_factory.py), so all of them honor the same env:

Env	Meaning
`AUTH_TYPE=jwt`	Verify a Keycloak-issued bearer with `JWTVerifier`.
`FASTMCP_SERVER_AUTH_JWT_ISSUER`	Expected token issuer: `http://keycloak.arpa/realms/master`
`FASTMCP_SERVER_AUTH_JWT_JWKS_URI`	Keycloak JWKS endpoint for signing keys: `.../protocol/openid-connect/certs`
`FASTMCP_SERVER_AUTH_JWT_AUDIENCE`	Required `aud` claim: `agent-services`
`EUNOMIA_TYPE=remote` + `EUNOMIA_REMOTE_URL`	Authorize each tool call against the policy server.

These are non-secret internal URLs and live in the compose template (scripts/gen_mcp_service_stacks.py COMPOSE_TMPL), so newly generated stacks are auth-on by default and compose.dev.yml inherits them via make_editable.
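As a rough illustration of how a server could turn the variables in the table above into verifier settings, here is a minimal sketch. The helper `resolve_jwt_settings` and the returned dictionary shape are hypothetical and are not taken from `create_mcp_server`; only the environment variable names come from this page.

```python
from typing import Mapping, Optional

# Variable names come from the table above; the helper itself is hypothetical.
REQUIRED_JWT_VARS = (
    "FASTMCP_SERVER_AUTH_JWT_ISSUER",
    "FASTMCP_SERVER_AUTH_JWT_JWKS_URI",
    "FASTMCP_SERVER_AUTH_JWT_AUDIENCE",
)


def resolve_jwt_settings(env: Mapping[str, str]) -> Optional[dict]:
    """Return verifier settings when AUTH_TYPE=jwt, or None when auth is off."""
    if env.get("AUTH_TYPE") != "jwt":
        return None
    missing = [name for name in REQUIRED_JWT_VARS if not env.get(name)]
    if missing:
        # Refuse to start half-configured rather than silently skipping checks.
        raise ValueError(f"AUTH_TYPE=jwt but missing: {', '.join(missing)}")
    return {
        "issuer": env["FASTMCP_SERVER_AUTH_JWT_ISSUER"],
        "jwks_uri": env["FASTMCP_SERVER_AUTH_JWT_JWKS_URI"],
        "audience": env["FASTMCP_SERVER_AUTH_JWT_AUDIENCE"],
    }
```

Calling `resolve_jwt_settings(os.environ)` at startup would surface a missing variable immediately instead of at the first request.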

Two properties you must design around

Eunomia fails closed. With default_effect: deny (agent_utilities/mcp/eunomia_principal.py), a JWT service with no policy for the caller's principal denies every call. So a baseline policy that allows the multiplexer's service principal must exist at eunomia.arpa before a service is flipped to jwt.
The multiplexer must present a token. Children are configured per-entry in mcp_config.json; historically none carried an Authorization header, so a child flipped to jwt became unreachable (401) through the multiplexer. (Local stdio children like graph-os are exempt, because stdio has no HTTP auth.)
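The fail-closed behaviour in the first property can be sketched as a toy decision function. This is not Eunomia's API or policy format: `is_allowed`, the policy mapping, and the principal and tool names are hypothetical, chosen only to show why a missing policy denies every call.

```python
from typing import Mapping, Set

DEFAULT_EFFECT = "deny"  # mirrors default_effect: deny from the text


def is_allowed(principal: str, tool: str, policies: Mapping[str, Set[str]]) -> bool:
    """Allow only when the principal's policy lists the tool (or "*")."""
    allowed_tools = policies.get(principal)
    if allowed_tools is None:
        # No policy for this principal: fall through to the default effect.
        return DEFAULT_EFFECT == "allow"
    return "*" in allowed_tools or tool in allowed_tools


# Hypothetical baseline policy for the multiplexer's service principal.
baseline = {"mcp-multiplexer": {"*"}}
```

With `baseline` loaded, `is_allowed("mcp-multiplexer", "any_tool", baseline)` is true, while the same call against an empty policy set is false. That difference is why the baseline policy has to exist before a service is flipped to jwt.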

Multiplexer outbound auth (client-credentials)

agent_utilities/mcp/client_credentials.py gives the multiplexer one service identity. When MCP_CLIENT_AUTH=oidc-client-credentials, it mints a Keycloak service-account token (OAuth2 client_credentials, audience agent-services, the same audience children verify), then caches and refreshes it. The multiplexer attaches Authorization: Bearer <token> to requests for every remote child that doesn't declare its own header. It never overrides an explicit header, and a mint failure degrades to no header: the child then returns 401, which shows up in metrics and logs rather than crashing the multiplexer.
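The caching and header rules described above might look roughly like the following. This is a sketch under stated assumptions, not the contents of client_credentials.py: `ServiceTokenCache`, `outbound_headers`, the 30-second refresh margin, and the injected `mint` callable (standing in for the Keycloak token request) are all hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# mint() stands in for the Keycloak client_credentials call and returns
# (access_token, lifetime_in_seconds). All names here are hypothetical.
Mint = Callable[[], Tuple[str, float]]


class ServiceTokenCache:
    def __init__(self, mint: Mint, skew: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._mint = mint
        self._skew = skew  # refresh this many seconds before expiry
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> Optional[str]:
        """Return a cached token, minting a new one when missing or near expiry."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._skew:
            try:
                token, lifetime = self._mint()
            except Exception:
                # Mint failure degrades to "no token"; the child will answer 401.
                self._token = None
                return None
            self._token = token
            self._expires_at = now + lifetime
        return self._token


def outbound_headers(declared: Dict[str, str],
                     cache: ServiceTokenCache) -> Dict[str, str]:
    """Add a bearer token unless the child entry already sets Authorization."""
    headers = dict(declared)
    if any(name.lower() == "authorization" for name in headers):
        return headers  # never override an explicit header
    token = cache.get()
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

Injecting `mint` and `clock` keeps the refresh logic testable without a running Keycloak; a real implementation would also want to log or count mint failures so the resulting 401s are easy to trace.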

Configuration (multiplexer service):

Env	Value
`MCP_CLIENT_AUTH`	`oidc-client-credentials`
`OIDC_CLIENT_ID`	`mcp-multiplexer` (Keycloak confidential client)
`OIDC_CLIENT_SECRET`	injected from OpenBao at deploy
`OIDC_AUDIENCE`	`agent-services` (default)
`OIDC_TOKEN_URL`	derived from the JWT issuer if unset
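The defaults in the table above can be illustrated with a small resolver. Two things here are assumptions rather than documented behaviour: that the issuer is read from `FASTMCP_SERVER_AUTH_JWT_ISSUER`, and that the token URL is derived by appending Keycloak's standard `/protocol/openid-connect/token` path to it. The function name `resolve_client_credentials` is hypothetical.

```python
from typing import Mapping


def resolve_client_credentials(env: Mapping[str, str]) -> dict:
    """Collect the multiplexer's OIDC settings, applying the table's defaults."""
    if env.get("MCP_CLIENT_AUTH") != "oidc-client-credentials":
        return {"enabled": False}
    token_url = env.get("OIDC_TOKEN_URL")
    if not token_url:
        # Assumption: the issuer comes from this variable, and derivation
        # appends Keycloak's standard token endpoint path to it.
        issuer = env.get("FASTMCP_SERVER_AUTH_JWT_ISSUER", "").rstrip("/")
        if not issuer:
            raise ValueError("set OIDC_TOKEN_URL or the JWT issuer")
        token_url = f"{issuer}/protocol/openid-connect/token"
    return {
        "enabled": True,
        "client_id": env["OIDC_CLIENT_ID"],
        "client_secret": env["OIDC_CLIENT_SECRET"],  # injected at deploy
        "audience": env.get("OIDC_AUDIENCE", "agent-services"),
        "token_url": token_url,
    }
```

An explicit `OIDC_TOKEN_URL` always wins over the derived value, which matches the "if unset" wording in the table.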

/metrics and /health are unauthenticated

create_mcp_server registers GET /metrics and GET /health as custom routes outside the auth/eunomia path (same pattern as graph-os /health), so Prometheus and blackbox probes need no token. They are overlay-network-scoped (no Caddy route). See Observability.

Rollout

Auth is rolled out in phased waves so each flip is verified before the next; the two gates above (token provisioning + baseline policy) come first. The end-to-end procedure — creating the Keycloak client, loading the policy, flipping services, and rolling back — is in the MCP Fleet Auth & Monitoring runbook.