Configuration Reference¶

This document provides a unified reference for all environment variables, configuration files, and CLI flags used across the agent-utilities ecosystem.

Environment Variables¶

LLM Configuration¶

All LLM configuration (models, API keys, endpoints) is now managed centrally via the XDG configuration file at ~/.config/agent-utilities/config.json.

Environment variables for LLM_BASE_URL, LLM_MODEL_ID, etc., are deprecated and will be ignored. API keys can optionally be provided in the .env file or directly inside the config.json model entries.

Graph Database¶

Variable	Default	Description
`GRAPH_BACKEND`	`tiered` (epistemic_graph L1 + LadybugDB L2; zero-infra)	Backend to use. Primary tiers: `tiered`, `memory`, `file`, `epistemic_graph`, `postgresql`. Contrib (opt-in, under `backends/contrib/`): `ladybug`, `falkordb`, `neo4j`. For a Postgres durable L2 set `GRAPH_DB_URI` (or `GRAPH_BACKEND_L2=postgresql`)
`GRAPH_BACKEND_L1`	`epistemic_graph`	L1 working store for the `tiered` backend
`GRAPH_BACKEND_L2`	`ladybug` (or `postgresql` when a DSN is set)	L2 durable store for the `tiered` backend
`GRAPH_DB_PATH`	`knowledge_graph.db`	File path for the `file`/`ladybug` backends
`GRAPH_DB_HOST`	`localhost`	Host for Neo4j/FalkorDB
`GRAPH_DB_PORT`	`7687`	Port for Neo4j/FalkorDB
`GRAPH_DB_URI`	None	Direct connection URI (overrides Host/Port; e.g. for PostgreSQL bolt/URI)
`GRAPH_DB_USER`	`neo4j`	Username for remote DBs (Neo4j/PostgreSQL)
`GRAPH_DB_PASSWORD`	None	Password for remote DBs (Neo4j/PostgreSQL)
`GRAPH_DB_NAME`	`agent_graph`	Database/graph name for FalkorDB/PostgreSQL
`GRAPH_POOL_MIN`	`2`	Minimum PostgreSQL connection pool size
`GRAPH_POOL_MAX`	`10`	Maximum PostgreSQL connection pool size
`GRAPH_PGGRAPH_SCHEMA`	`public`	Schema for pg-age table registration

OWL Reasoning¶

Variable	Default	Description
`OWL_BACKEND`	`stardog`	Reasoning backend (`stardog`, `hermit`)
`OWL_DB_PATH`	`ontology.owl`	Path to local OWL ontology file
`STARDOG_ENDPOINT`	`http://localhost:5820`	Stardog server URL
`STARDOG_USERNAME`	`admin`	Stardog authentication user
`STARDOG_PASSWORD`	`admin`	Stardog authentication password

Document Storage¶

Variable	Default	Description
`DOC_BACKEND`	`sqlite`	Backend for document pipeline (`sqlite`, `mongodb`, `postgresql`)
`DOC_DB_URI`	None	Connection string for MongoDB/Postgres document stores

Secrets & Auth (CONCEPT:OS-5.1)¶

Variable	Default	Description
`SECRETS_BACKEND`	`inmemory`	Storage for secrets (`inmemory`, `sqlite`, `vault`). See secrets-auth.md
`SECRETS_SQLITE_PATH`	`~/.agent-utilities/secrets.db`	Path for SQLite secrets DB
`SECRETS_VAULT_URL`	None	URL for HashiCorp Vault & OpenBao
`SECRETS_VAULT_MOUNT`	`secret`	Vault/OpenBao KV v2 mount point
`ENABLE_API_AUTH`	`False`	Enable JWT validation on server endpoints
`AUTH_JWT_JWKS_URI`	None	URI to fetch JSON Web Key Sets
`AUTH_JWT_ISSUER`	None	Expected JWT issuer
`AUTH_JWT_AUDIENCE`	None	Expected JWT audience
`AGENT_API_KEY`	None	Static API key for basic auth
`ALLOWED_ORIGINS`	`*`	Comma-separated CORS origins
`ALLOWED_HOSTS`	`*`	Comma-separated trusted hosts

Graph Execution¶

Variable	Default	Description
`GRAPH_DIRECT_EXECUTION`	`True`	Direct graph dispatch in AG-UI/ACP (bypasses LLM tool-call hop)
`VALIDATION_MODE`	`False`	Disables real LLM calls for unit testing and CI
`WORKSPACE_TOOLS`	`True`	Enable workspace filesystem and grep tools
`GIT_TOOLS`	`True`	Enable Git tools
`BROWSER_TOOLS`	`True`	Enable browser and web search tools
`A2A_TOOLS`	`True`	Enable Agent-to-Agent discovery and messaging

RLM & AHE Observability¶

Variable	Default	Description
`ENABLE_RLM`	`True`	Enable Recursive Language Model execution
`RLM_MAX_DEPTH`	`3`	Maximum recursion depth for RLM sub-shells
`RLM_USE_CONTAINER`	`True`	Run RLM in an isolated container if available
`AHE_TRACE_THRESHOLD`	`0.7`	Quality threshold triggering distillation traces

Swarm & First Principles¶

Variable	Default	Description
`SWARM_MODE`	`False`	Enable swarm orchestration in dispatcher
`SWARM_MAX_DEPTH`	`3`	Maximum recursion depth for sub-swarms
`SWARM_MAX_AGENTS`	`10`	Maximum agents per swarm

Observability¶

Variable	Default	Description
`OTEL_ENABLE_OTEL`	`False`	Enable OpenTelemetry exports
`LANGFUSE_PUBLIC_KEY`	None	Langfuse integration key
`LANGFUSE_SECRET_KEY`	None	Langfuse integration secret
`LOGFIRE_TOKEN`	None	Pydantic Logfire token

MCP Tooling¶

Variable	Default	Description
`MCP_CONFIG`	`mcp_config.json`	Path to the MCP server configuration map
`MCP_SEMAPHORE_LIMIT`	`30`	Max parallel subprocesses during tool discovery
`TOOL_GUARD_MODE`	`on`	Strictness of the tool execution guard (`on`, `off`, `custom`)
`DISABLE_TOOL_GUARD`	`False`	Completely bypass tool elicitation and safety checks

A2A Agent Discovery (CONCEPT:ECO-4.0)¶

Variable	Default	Description
`A2A_CONFIG`	None	Path to `a2a_config.json` for external A2A agent discovery
`A2A_REFRESH_INTERVAL`	`300`	Seconds between periodic `.well-known/agent-card.json` re-fetch

CLI Execution¶

The preferred method for running agent-utilities servers is via the standardized uv scripts:

Script	Command	Description
KG Server	`uv run graph-os`	Launches the Knowledge Graph (graph-os) MCP server
Main Server	`python -m agent_utilities`	Launches the unified protocol server (ACP/A2A/AG-UI)

CLI Flags¶

When running agent-utilities commands (or python -m agent_utilities), the following standard flags are available:

Flag	Equivalent Env Var	Description
`--base-url`	Base URL (Overrides `config.json`)
`--api-key`	API Key (Overrides `config.json`)
`--port`	None	Server listen port (default: 8000)
`--host`	None	Server bind host (default: 0.0.0.0)
`--web`	None	Enables the bundled web UI proxy if present
`--mcp-config`	`MCP_CONFIG`	Path to MCP config file
`--debug`	None	Sets log level to DEBUG
`--skill-types`	None	Comma-separated list of skills to load (`universal`, `graphs`)

Configuration & Environment Variables¶

All LLM and embedding configuration now routes exclusively through the chat_models and embedding_models registries in config.json.

Unified Agent Configuration (`config.json`)¶

The centralized config.json at ~/.config/agent-utilities/config.json (XDG-compliant) is the single source of truth for all configuration.

Configuration Precedence Chain¶

config.json registry → AgentConfig defaults

Environment variables are no longer part of the LLM configuration chain. API keys can be specified per-model in the registry.

Full `config.json` Schema¶

{
  // ── Agent Identity ──────────────────────────────────────────────
  "default_agent_name": "Agent",
  "agent_description": "AI Agent",
  "agent_system_prompt": null,

  // ── Server ──────────────────────────────────────────────────────
  "host": "0.0.0.0",
  "port": 9000,
  "debug": false,
  "enable_web_ui": false,
  "enable_terminal_ui": false,
  "enable_web_logs": true,
  "enable_acp": false,
  "acp_port": 8001,
  "acp_session_root": ".acp-sessions",
  "mcp_config": null,
  "max_upload_size": 10485760,

  // ── Authentication & Security ───────────────────────────────────
  "agent_api_key": null,
  "enable_api_auth": false,
  "auth_jwt_jwks_uri": null,
  "auth_jwt_issuer": null,
  "auth_jwt_audience": null,
  "allowed_origins": null,
  "allowed_hosts": null,
  "tool_guard_mode": "strict",
  "sensitive_tool_patterns": [".*delete.*", ".*remove.*", "..."],

  // ── Secrets Backend ─────────────────────────────────────────────
  "secrets_backend": "inmemory",
  "secrets_sqlite_path": null,
  "secrets_vault_url": null,
  "secrets_vault_mount": "secret",

  // ── Graph Execution ─────────────────────────────────────────────
  "routing_strategy": "hybrid",
  "graph_persistence_type": "file",
  "graph_persistence_path": "~/.local/share/agent-utilities/graph_state",
  "enable_llm_validation": false,
  "graph_router_timeout": 300.0,
  "graph_verifier_timeout": 300.0,
  "graph_direct_execution": true,
  "min_confidence": 0.4,
  "validation_mode": false,
  "approval_timeout": 0.0,

  // ── Knowledge Graph ─────────────────────────────────────────────
  "enable_kg_embeddings": true,
  "kg_backups": 3,
  "knowledge_graph_sync_background": true,

  // ── Observability (OTEL / Langfuse) ─────────────────────────────
  "enable_otel": true,
  "otel_exporter_otlp_endpoint": "http://langfuse.example.com/api/public/otel",
  "otel_exporter_otlp_headers": null,
  "otel_exporter_otlp_public_key": "lf_pk_...",
  "otel_exporter_otlp_secret_key": "lf_sk_...",
  "otel_exporter_otlp_protocol": "http/protobuf",
  "langfuse_host": "http://langfuse.example.com",
  "langfuse_public_key": "lf_pk_...",
  "langfuse_secret_key": "lf_sk_...",
  "langfuse_dataset_capture_threshold": 0.0,

  // ── A2A Agent Discovery ─────────────────────────────────────────
  "a2a_broker": "in-memory",
  "a2a_broker_url": null,
  "a2a_storage": "in-memory",
  "a2a_storage_url": null,
  "a2a_config": null,
  "a2a_refresh_interval": 300,

  // ── LLM Inference Parameters ────────────────────────────────────
  "max_tokens": 16384,
  "temperature": 0.7,
  "top_p": 1.0,
  "timeout": 32400.0,
  "tool_timeout": 32400.0,
  "parallel_tool_calls": true,
  "seed": null,
  "presence_penalty": 0.0,
  "frequency_penalty": 0.0,
  "logit_bias": null,
  "stop_sequences": null,
  "extra_headers": null,
  "extra_body": null,

  // ── Cognitive Scheduler & Agent Policies ─────────────────────────
  "cognitive_scheduler_enabled": true,
  "max_concurrent_agents": 5,
  "agent_token_quota": 100000,
  "preemption_threshold_pct": 0.85,
  "agent_policies_path": null,
  "permissions_signing_key": null,
  "specialist_registry_path": null,
  "homeostatic_downgrade_enabled": true,
  "adversarial_verification": false,
  "maintenance_token_budget": 0,
  "maintenance_priority": "LOW",
  "watchdog_patterns": ["pyproject.toml", "mcp_config.json", "requirements*.txt"],

  // ── Skills ──────────────────────────────────────────────────────
  "custom_skills_directory": null,
  "skill_types": null,

  // ── Model Registries (PRIMARY CONFIG) ───────────────────────────
  "chat_models": [
    {
      "id": "qwen/qwen3.5-9b",
      "provider": "openai",
      "base_url": "http://vllm.arpa/v1",
      "supports_json": false,
      "vision": true,
      "reasoning": true,
      "tools_enabled": true,
      "parallel_instances": 3,
      "context_window": 256000,
      "intelligence_level": "normal",
      "can_route": true,
      "can_kg": true
    }
  ],
  "embedding_models": [
    {
      "id": "text-embedding-nomic-embed-text-v2-moe",
      "provider": "openai",
      "base_url": "http://vllm-embed.arpa/v1",
      "parallel_instances": 4,
      "chunk_size": 768
    }
  ],

  // ── Workspace & Paths ───────────────────────────────────────────
  "workspace_path": "/home/apps/workspace",
  "agent_utilities_config_dir": "~/.config/agent-utilities"
}

Note: JSON does not support comments. The // annotations above are for documentation purposes only. Your actual config.json must not include comments.

Chat Model Fields¶

Field	Type	Required	Description
`id`	string	✅	Model identifier (e.g., `gpt-4o-mini`, `qwen/qwen3.5-9b`)
`provider`	string	✅	Provider name (`openai`, `anthropic`, `google`, etc.)
`base_url`	string	❌	Override API endpoint (e.g., for LM Studio, Ollama)
`api_key`	string	❌	Per-model API key override
`intelligence_level`	string	❌	Routing hint: `light`, `normal`, `high`
`supports_json`	bool	❌	Whether the model supports structured JSON output
`vision`	bool	❌	Whether the model supports image inputs
`reasoning`	bool	❌	Whether the model supports extended reasoning/thinking
`tools_enabled`	bool	❌	Whether the model supports tool/function calling
`parallel_instances`	int	❌	Max concurrent requests to this model
`context_window`	int	❌	Maximum context window in tokens
`can_route`	bool	❌	Whether the model can serve as a router in graph orchestration
`can_kg`	bool	❌	Whether the model can serve KG analysis tasks

Embedding Model Fields¶

Field	Type	Required	Description
`id`	string	✅	Model identifier
`provider`	string	✅	Provider name
`base_url`	string	❌	Override API endpoint
`api_key`	string	❌	Per-model API key override
`parallel_instances`	int	❌	Max concurrent embedding requests
`chunk_size`	int	❌	Embedding dimension size (default: 768)

Per-Model Provider Routing¶

The registry supports per-model base_url and api_key overrides, enabling configurations like: - LM Studio local: base_url: "http://vllm.arpa/v1" (your GPU server) - Official OpenAI: api_key: "sk-..." (no base_url needed, hits api.openai.com) - Ollama: base_url: "http://localhost:11434/v1", api_key: "ollama" - Azure OpenAI: base_url: "https://my-resource.openai.azure.com", api_key: "..."

This allows configuring multiple models from the same provider hitting different endpoints.

Migration from `.env` to `config.json`¶

Move all LLM_*, LITE_LLM_*, SUPER_LLM_*, and EMBEDDING_* variables from .env into chat_models/embedding_models registry entries
API keys go directly in per-model entries via the api_key field
Non-LLM environment variables (e.g., GRAPH_BACKEND, OTEL_ENABLE_OTEL) are now also configurable via config.json

Full Documentation: See docs/models.md for advanced schema options, local model fallbacks, and routing logic.