Architecture¶
Core Architecture Diagram¶
graph TD
User("ORCH-1.0: User Request + Images")
WebUI["ECO-4.0: agent-webui"]
TUI["ECO-4.0: agent-terminal-ui"]
Backend["ECO-4.0: agent-utilities Server"]
External["ECO-4.0: External AG-UI Client"]
User --> WebUI
User --> TUI
WebUI -- ACP Protocol /acp --> Backend
TUI -- AG-UI /ag-ui --> Backend
TUI -- ACP Protocol /acp --> Backend
External -- Legacy Protocol /ag-ui --> Backend
subgraph AgentUtilities [agent-utilities]
Backend --> UnifiedExec["Unified Execution Layer<br/>(graph/protocol_agnostic_execution.py)"]
UnifiedExec --> Graph[Pydantic Graph Agent]
Graph --> KG[ORCH-1.0: Intelligence Graph Engine]
subgraph MemoryArchitecture [Autonomous Memory Architecture]
KG --> MAGMA[ORCH-1.0: MAGMA: Orthogonal Views]
KG --> Lightning[AHE-3.3: Autonomous Self-Improvement]
KG --> UnifiedDB[(GraphBackend: epistemic-graph L1 + pg-age durable)]
MAGMA --> Semantic[KG-2.3: Semantic View]
MAGMA --> Temporal[KG-2.6: Temporal View]
MAGMA --> Causal[KG-2.5: Causal View]
MAGMA --> Entity[KG-2.0: Entity View]
Lightning --> Rewards[AHE-3.1: Outcome Rewards]
Lightning --> Critiques[Textual Gradients]
Lightning --> Evolution[AHE-3.2: Prompt/Skill Evolution]
end
subgraph ProtocolAdapters [Protocol Adapters]
AGUI_Adapter[ECO-4.0: AG-UI Adapter]
ACP_Adapter["ACP Adapter<br/>(graph-backed)"]
SSE_Adapter[ECO-4.0: SSE Stream]
A2A_Adapter[ECO-4.1: A2A Adapter]
end
Backend --> AGUI_Adapter
Backend --> A2A_Adapter
Backend --> ACP_Adapter
Backend --> SSE_Adapter
AGUI_Adapter --> UnifiedExec
A2A_Adapter --> UnifiedExec
ACP_Adapter --> UnifiedExec
SSE_Adapter --> UnifiedExec
subgraph UnifiedDiscovery ["Unified Discovery Layer (core/config.py)"]
DAL["ORCH-1.2: get_discovery_registry()"]
KG_Registry["<b>Knowledge Graph</b><br/><i>(Unified Specialist Registry)</i>"]
KG_Registry --> DAL
DAL --> DSRoster["ECO-4.6: MCPAgentRegistryModel"]
end
DSRoster --> Graph
Graph --> Specialists["ORCH-1.2: Specialist Superstates"]
Specialists --> MCP["ECO-4.0: MCP Servers"]
Specialists --> Skills["ECO-4.6: Universal Skills, Skill Graphs"]
subgraph ElicitationFlow [Human-in-the-Loop Flow]
MCP -- 1. Tool needs approval --> TG[OS-5.2: tool_guard: requires_approval]
TG -- 2. DeferredToolRequests --> AM[OS-5.1: ApprovalManager]
AM -- 3. asyncio.Future await --> EQ[ORCH-1.21: Event Queue]
EQ -- 4. SSE sideband event --> Backend
MCP -- 1b. ctx.elicit --> GEC[ORCH-1.3: global_elicitation_callback]
GEC -- 2b. Queue + Future --> AM
end
end
Backend -- 4. approval_required event --> WebUI
Backend -- 4. approval_required event --> TUI
WebUI -- 5. POST /api/approve --> Backend
TUI -- 5. POST /api/approve --> Backend
Backend -- 6. Future.resolve --> AM
Protocol Layer Architecture¶
The framework provides three canonical protocol adapters:
- ACP (Agent Communication Protocol): Primary protocol for standardized sessions, planning, and streaming
- A2A (Agent-to-Agent): Peer-to-peer agent communication and coordination
- AG-UI: Legacy streaming interface for backward compatibility with native Pydantic AI clients
All protocol adapters are centralized in agent_utilities/protocols/:
- acp_adapter.py: ACP envelope formatting, session management, per-session agent_factory
- a2a.py: A2A peer discovery, JSON-RPC client, registry management
- agui_emitter.py: AG-UI wire format translator for direct graph execution events
- Server endpoints: /acp (MOUNT), /a2a (MOUNT), /ag-ui (POST)
Direct Graph Execution (Fast Path)¶
When a graph_bundle is present and GRAPH_DIRECT_EXECUTION=true (default), the AG-UI endpoint bypasses the outer LLM agent entirely:
# Legacy (Agent-mediated):
User Query → /ag-ui → Agent.run() → LLM → "call run_graph_flow" → graph.run()
# Direct (Fast Path):
User Query → /ag-ui → graph.iter() → [step events] → AGUIGraphEmitter → wire format
This eliminates one full LLM inference round-trip per request. The fast path uses graph.iter() (pydantic-graph beta API) for step-by-step execution, yielding per-node events that are translated to AG-UI wire format by AGUIGraphEmitter.
The fast path is gated on:
1. graph_bundle containing a real Graph object with .iter() support
2. GRAPH_DIRECT_EXECUTION env var set to true (default)
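The two gating conditions above can be expressed as a small predicate. This is a sketch under stated assumptions: the helper names are hypothetical, the set of "falsy" spellings is a guess, and the function takes the candidate graph object directly because the internal layout of graph_bundle is not described here.

```python
from __future__ import annotations

import os
from typing import Any, Mapping


def direct_execution_enabled(env: Mapping[str, str] | None = None) -> bool:
    """True unless GRAPH_DIRECT_EXECUTION is explicitly switched off."""
    env = os.environ if env is None else env
    # Assumption: an unset variable means enabled (the documented default).
    value = env.get("GRAPH_DIRECT_EXECUTION", "true").strip().lower()
    return value not in {"0", "false", "no", "off"}


def can_use_fast_path(graph: Any, env: Mapping[str, str] | None = None) -> bool:
    """Both gates must hold: the env flag and a graph exposing .iter()."""
    has_iter = callable(getattr(graph, "iter", None))
    return has_iter and direct_execution_enabled(env)
```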
The ACP adapter uses pydantic-acp's agent_factory callback for per-session agent creation, binding graph context directly to each session's closure.
The A2A path now supports two modes:
1. Graph-native (CONCEPT:ECO-4.0): When a graph_bundle is present, PlannerGraphSkill is registered as an A2A skill, enabling direct graph-backed planning without LLM orchestration overhead.
2. LLM-mediated (fallback): For multi-agent negotiation scenarios, the legacy run_graph_flow tool call path is retained.
3-Stage Hybrid Routing (CONCEPT:ORCH-1.2, CONCEPT:AHE-3.3, CONCEPT:KG-2.1)¶
The router implements a cascading 3-stage routing strategy that avoids unnecessary LLM inference:
Stage 1: TeamConfig Match (CONCEPT:AHE-3.3)
└─ Check KG for a proven specialist coalition matching the query pattern
└─ If found → skip LLM, dispatch the team directly
Stage 2: Self-Model Bias (CONCEPT:KG-2.1)
└─ Inject domain proficiency scores into the specialist prompt
└─ High-proficiency domains are weighted higher in LLM selection
Stage 3: LLM Planning (filtered via CONCEPT:ORCH-1.2)
└─ Registry Hot Cache provides only top-7 relevant specialists
└─ LLM sees a focused prompt instead of 50+ specialist descriptions
This strategy means the system progressively learns: the more queries it handles, the more TeamConfigs accumulate, and the fewer LLM planning round-trips are needed.
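The cascade can be illustrated with a short sketch. It is not the router's implementation: RouteDecision, the substring-based pattern match, and the rank_specialists and llm_plan callables are hypothetical placeholders for the KG lookup, the Registry Hot Cache, and the LLM planner.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class RouteDecision:
    stage: str
    specialists: list[str]
    used_llm: bool


def route(
    query: str,
    team_configs: dict[str, list[str]],
    proficiency: dict[str, float],
    rank_specialists: Callable[[str], list[str]],
    llm_plan: Callable[[str, list[str], list[tuple[str, float]]], list[str]],
    top_k: int = 7,
) -> RouteDecision:
    # Stage 1: a proven team for this query pattern skips the LLM entirely.
    lowered = query.lower()
    for pattern, team in team_configs.items():
        if pattern in lowered:
            return RouteDecision("team_config", list(team), used_llm=False)

    # Stage 2: order domain proficiency scores so strong domains come first.
    bias = sorted(proficiency.items(), key=lambda kv: kv[1], reverse=True)

    # Stage 3: the LLM only sees the top-k candidates, not the full registry.
    shortlist = rank_specialists(query)[:top_k]
    chosen = llm_plan(query, shortlist, bias)
    return RouteDecision("llm_planning", chosen, used_llm=True)
```

As more entries accumulate in team_configs, more queries return at Stage 1 and never reach llm_plan.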
Authentication Passthrough (custom_headers)¶
create_agent_server() and create_graph_agent_server() accept a generic custom_headers: dict[str, Any] | None = None kwarg that is propagated verbatim to the LLM HTTP client as request headers. agent-utilities itself is auth-agnostic: it does not ship provider-specific auth code (OIDC, client-credentials flows, bearer-token fetchers, etc.) and has no opinion about where those headers come from. Downstream packages are free to populate the dict from any source: environment variables, a token-fetching library, static config, a secret manager, or a callable that refreshes on every run. The same passthrough pattern is reused by the ssl_verify kwarg for self-signed gateways. See agents/repository-manager/repository_manager/agent_server.py for a reference implementation that builds the dict from LLM_CUSTOM_HEADERS / LLM_HEADER_* environment variables without pulling any provider-specific dependency into this core package.
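As an illustration of populating the dict from the environment, the sketch below builds headers from LLM_CUSTOM_HEADERS and LLM_HEADER_* variables. The encoding is an assumption made for this example (a JSON object for LLM_CUSTOM_HEADERS, and LLM_HEADER_X_API_KEY mapping to X-Api-Key); the reference implementation in repository-manager may parse these variables differently, and headers_from_env is a hypothetical name.

```python
import json
from typing import Mapping


def headers_from_env(env: Mapping[str, str]) -> dict[str, str]:
    """Build a custom_headers dict from environment-style variables."""
    headers: dict[str, str] = {}

    # Assumption: LLM_CUSTOM_HEADERS holds a JSON object of header names/values.
    raw = env.get("LLM_CUSTOM_HEADERS", "")
    if raw:
        parsed = json.loads(raw)
        if not isinstance(parsed, dict):
            raise ValueError("LLM_CUSTOM_HEADERS must be a JSON object")
        headers.update({str(k): str(v) for k, v in parsed.items()})

    # Assumption: LLM_HEADER_<NAME> becomes header <Name> with "-" separators.
    prefix = "LLM_HEADER_"
    for key, value in env.items():
        if key.startswith(prefix) and len(key) > len(prefix):
            name = key[len(prefix):].replace("_", "-").title()
            headers[name] = value
    return headers
```

The result could then be passed as custom_headers=headers_from_env(os.environ) when creating the server, keeping all provider-specific logic in the downstream package.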
Graph Orchestration Architecture¶
graph TB
Start("ORCH-1.0: User Query + Images")
End("ORCH-1.21: End Result")
UsageGuard["ORCH-1.3: Usage Guard: Rate Limiting"]
router_step["ORCH-1.2: Router: Topology Selection"]
dispatcher["ORCH-1.0: Dispatcher: Dynamic Routing"]
mem_step["KG-2.3: Memory: Context Retrieval"]
ACPLayer["<b>ACP / AG-UI / SSE</b><br/><i>(Unified Protocol Layer)</i>"]
Start --> ACPLayer
ACPLayer --> UsageGuard
UsageGuard -- "Allow" --> router_step
UsageGuard -- "Block" --> End
router_step -- "Trivial Query" --> End
router_step -- "Full Pipeline" --> dispatcher
dispatcher -- "First Entry" --> mem_step
mem_step --> dispatcher
subgraph DiscoveryPhase ["Discovery Phase"]
direction TB
Researcher["<b>Researcher</b><br/>---<br/><i>u-skill:</i> web-search, web-crawler, web-fetch<br/><i>t-tool:</i> project_search, read_workspace_file"]
Architect["<b>Architect</b><br/>---<br/><i>u-skill:</i> c4-architecture, spec-generator, product-strategy, user-research, brainstorming<br/><i>t-tool:</i> developer_tools"]
MCPDiscovery["<b>Unified Registry</b><br/>---<br/><i>source:</i> Knowledge Graph"]
res_joiner[ORCH-1.0: Research Joiner: Barrier Sync]
end
dispatcher -- "Research First" --> Researcher
dispatcher -- "Research First" --> Architect
dispatcher -- "Research First" --> MCPDiscovery
Researcher --> res_joiner
Architect --> res_joiner
MCPDiscovery --> res_joiner
res_joiner -- "Coalesced Context" --> dispatcher
subgraph ExecutionPhase ["Execution Phase"]
direction TB
subgraph Programmers ["Programmers"]
direction LR
PyP["<b>Python</b><br/>---<br/><i>u-skill:</i> agent-builder, tdd-methodology, mcp-builder, jupyter-notebook<br/><i>g-skill:</i> python-docs, fastapi-docs, pydantic-ai-docs<br/><i>t-tool:</i> developer_tools"]
TSP["<b>TypeScript</b><br/>---<br/><i>u-skill:</i> react-development, web-artifacts, tdd-methodology, canvas-design<br/><i>g-skill:</i> nodejs-docs, react-docs, nextjs-docs, shadcn-docs<br/><i>t-tool:</i> developer_tools"]
GoP["<b>Go</b><br/>---<br/><i>u-skill:</i> tdd-methodology<br/><i>g-skill:</i> go-docs<br/><i>t-tool:</i> developer_tools"]
RustP["<b>Rust</b><br/>---<br/><i>u-skill:</i> tdd-methodology<br/><i>g-skill:</i> rust-docs<br/><i>t-tool:</i> developer_tools"]
CSP["<b>C Programmer</b><br/>---<br/><i>u-skill:</i> developer-utilities<br/><i>g-skill:</i> c-docs<br/><i>t-tool:</i> developer_tools"]
CPP["<b>C++ Programmer</b><br/>---<br/><i>u-skill:</i> developer-utilities<br/><i>t-tool:</i> developer_tools"]
JSP["<b>JavaScript</b><br/>---<br/><i>u-skill:</i> web-artifacts, canvas-design, developer-utilities<br/><i>g-skill:</i> nodejs-docs, react-docs<br/><i>t-tool:</i> developer_tools"]
end
subgraph InfraGroup ["Infrastructure"]
direction LR
DevOps["<b>DevOps</b><br/>---<br/><i>u-skill:</i> cloudflare-deploy<br/><i>g-skill:</i> docker-docs, terraform-docs<br/><i>t-tool:</i> developer_tools"]
Cloud["<b>Cloud</b><br/>---<br/><i>u-skill:</i> c4-architecture<br/><i>g-skill:</i> aws-docs, azure-docs, gcp-docs<br/><i>t-tool:</i> developer_tools"]
DBA["<b>Database</b><br/>---<br/><i>u-skill:</i> database-tools<br/><i>g-skill:</i> postgres-docs, mongodb-docs, redis-docs<br/><i>t-tool:</i> developer_tools"]
end
subgraph Specialized ["Specialized & Quality"]
direction LR
Sec["<b>Security</b><br/>---<br/><i>u-skill:</i> security-tools<br/><i>g-skill:</i> linux-docs<br/><i>t-tool:</i> developer_tools"]
QA["<b>QA</b><br/>---<br/><i>u-skill:</i> spec-verifier, tdd-methodology<br/><i>g-skill:</i> testing-library-docs<br/><i>t-tool:</i> developer_tools"]
UIUX["<b>UI/UX</b><br/>---<br/><i>u-skill:</i> theme-factory, brand-guidelines, algorithmic-art<br/><i>g-skill:</i> shadcn-docs, framer-docs<br/><i>t-tool:</i> developer_tools"]
Debug["<b>Debugger</b><br/>---<br/><i>u-skill:</i> developer-utilities, agent-builder<br/><i>t-tool:</i> developer_tools"]
end
end
dispatcher -- "Parallel Dispatch" --> Programmers
dispatcher -- "Parallel Dispatch" --> InfraGroup
dispatcher -- "Parallel Dispatch" --> Specialized
exe_joiner["ORCH-1.0: Execution Joiner: Barrier Sync"]
verifier["AHE-3.1: Verifier: Quality Gate"]
council["ORCH-1.2: Council: Multi-Perspective Deliberation"]
synthesizer["ORCH-1.0: Synthesizer: Response Composition"]
planner_step["ORCH-1.1: Planner: Re-plan"]
Programmers --> exe_joiner
InfraGroup --> exe_joiner
Specialized --> exe_joiner
exe_joiner -- "Implementation Results" --> dispatcher
dispatcher -- "Plan Complete" --> verifier
dispatcher -- "Council" --> council
council --> exe_joiner
verifier -- "Pass: Score >= 0.7" --> synthesizer
verifier -- "Fail: Score < 0.7" --> dispatcher
dispatcher -- "Terminal Failure" --> End
planner_step --> dispatcher
synthesizer -- "Final Response" --> End
subgraph SDD_Lifecycle ["Spec-Driven Development"]
direction TB
Const["<b>Constitution</b><br/>(Governance)"] --> Spec["<b>Specification</b><br/>(Spec)"]
Spec --> SDDPlan["<b>Technical Plan</b><br/>(ImplementationPlan)"]
SDDPlan --> SDDTasks["<b>Tasks</b><br/>(Tasks)"]
SDDTasks --> SDDExec["<b>Execution</b><br/>(Parallel Dispatch)"]
SDDExec --> SDDVerify["<b>Verification</b><br/>(Spec Audit)"]
end
style Researcher fill:#e1d5e7,stroke:#9673a6,stroke-width:2px
style Architect fill:#e1d5e7,stroke:#9673a6,stroke-width:2px
style MCPDiscovery fill:#e1d5e7,stroke:#9673a6,stroke-width:2px
style Programmers fill:#dae8fe,stroke:#6c8ebf,stroke-width:2px
style InfraGroup fill:#fad9b8,stroke:#d6b656,stroke-width:2px
style Specialized fill:#e0d3f5,stroke:#82b366,stroke-width:2px
style verifier fill:#fff2cc,stroke:#d6b656,stroke-width:2px
style council fill:#fce4ec,stroke:#c62828,stroke-width:2px
style synthesizer fill:#d5e8d4,stroke:#82b366,stroke-width:2px
style planner_step fill:#dae8fe,stroke:#6c8ebf,stroke-width:2px
style End fill:#f8cecc,stroke:#b85450,stroke-width:2px
style res_joiner fill:#f5f5f5,stroke:#666,stroke-dasharray: 5 5
style exe_joiner fill:#f5f5f5,stroke:#666,stroke-dasharray: 5 5
style dispatcher fill:#d5e8d4,stroke:#666,stroke-width:2px
style Start fill:#38B6FF
style ACPLayer fill:#38B6FF,stroke-width:2px
Note: MCP ecosystem agents (AdGuard, Jellyfin, Ansible Tower, etc.) are dynamically spawned as CallableResource nodes in the Knowledge Graph. They are discovered at runtime from mcp_config.json and do not appear in this static diagram.
Unified Toolkit Ingestion (CONCEPT:ECO-4.0)¶
This pipeline allows single-shot ingestion of all agent capabilities, bridging the gap between isolated codebases and the unified Knowledge Graph.
graph LR
subgraph Ingestion Pipeline
Sources[Sources: mcp_config.json, SKILL.md dirs, A2A URLs] --> AutoDetect{Auto-Detect}
AutoDetect -- "mcp_config" --> ParseMCP[Extract Servers & Flags]
AutoDetect -- "skill_directory" --> ParseYAML[Parse Frontmatter]
AutoDetect -- "a2a_url" --> FetchCard[Fetch /.well-known/agent.json]
end
ParseMCP --> LiveDiscovery[Live Tool Discovery<br><i>(list_tools)</i>]
LiveDiscovery --> KGInsert[Insert CallableResource]
ParseYAML --> KGInsert
FetchCard --> KGInsert
KGInsert --> KG[(Knowledge Graph)]
style Sources fill:#f5f5f5,stroke:#666
style AutoDetect fill:#dae8fe,stroke:#6c8ebf
style KG fill:#e1d5e7,stroke:#9673a6,stroke-width:2px
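The Auto-Detect branch of the ingestion pipeline can be sketched as a simple classifier over source strings. The heuristics here are assumptions for illustration only (any http(s) URL is treated as an A2A endpoint, any .json path as an MCP config, and any directory containing a SKILL.md as a skill directory), and both function names are hypothetical.

```python
from pathlib import Path
from urllib.parse import urlparse


def detect_source_kind(source: str) -> str:
    """Classify an ingestion source as mcp_config, skill_directory, or a2a_url."""
    if urlparse(source).scheme in {"http", "https"}:
        return "a2a_url"
    path = Path(source)
    if path.suffix == ".json":
        return "mcp_config"
    if path.is_dir() and any(path.rglob("SKILL.md")):
        return "skill_directory"
    raise ValueError(f"cannot classify ingestion source: {source!r}")


def agent_card_url(base_url: str) -> str:
    """Location of the agent card fetched for an A2A source."""
    return base_url.rstrip("/") + "/.well-known/agent.json"
```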
Council Deliberation Node¶
The Council is a specialized graph node that implements Karpathy's LLM Council pattern for high-stakes decision-making. It provides a 4-stage deliberative pipeline:
graph LR
Q[ORCH-1.0: Query] --> A1[ORCH-1.2: Contrarian]
Q --> A2[KG-2.2: First Principles]
Q --> A3[ORCH-1.2: Expansionist]
Q --> A4[ORCH-1.2: Outsider]
Q --> A5[ORCH-1.21: Executor]
A1 --> Anon[OS-5.1: Anonymize]
A2 --> Anon
A3 --> Anon
A4 --> Anon
A5 --> Anon
Anon --> R1[ORCH-1.2: Reviewer 1]
Anon --> R2[ORCH-1.2: Reviewer 2]
Anon --> R3[ORCH-1.2: Reviewer 3]
R1 --> Chair[ORCH-1.2: Chairman]
R2 --> Chair
R3 --> Chair
Chair --> V[ORCH-1.2: CouncilVerdict]
style Anon fill:#fff2cc,stroke:#d6b656
style Chair fill:#d5e8d4,stroke:#82b366
style V fill:#c8e6c9,stroke:#2e7d32
| Stage | Purpose | Implementation |
|---|---|---|
| 1. Advisors | 5 parallel agents with distinct thinking styles | run_orthogonal_regions / sequential dispatch |
| 2. Anonymize | Shuffle identities behind labels (A-E) | Pure Python, zero LLM cost |
| 3. Peer Review | 3 reviewers rank, critique, find collective gaps | Independent reviewer agents |
| 4. Chairman | Synthesize into structured CouncilVerdict | output_type=CouncilVerdict |
Key features:
- Hybrid model routing: Uses ModelRegistry to assign different real LLM models to different advisor roles
- Generalized transcripts: AgentTranscript and render_agent_transcript_markdown() work for any agent output, not just council
- KG persistence: Verdicts are stored as DecisionNode entries for future reference
- Trigger modes: Auto-routed by the Router, keyword-triggered ("council this"), or invocable as a tool
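Stage 2 (Anonymize) needs no model call, which a few lines of Python can show. This is a sketch of the idea, not the project's code: the anonymize function, its return shape, and the role names used as keys are assumptions.

```python
from __future__ import annotations

import random
import string


def anonymize(
    responses: dict[str, str], rng: random.Random | None = None
) -> tuple[dict[str, str], dict[str, str]]:
    """Hide advisor identities behind shuffled labels (A, B, C, ...).

    Returns (label -> response text, label -> advisor name). Reviewers see
    only the first mapping; the second is kept aside to de-anonymize later.
    """
    if len(responses) > len(string.ascii_uppercase):
        raise ValueError("more advisors than available single-letter labels")
    rng = rng or random.Random()
    advisors = list(responses)
    rng.shuffle(advisors)  # break any link between label order and role
    key = dict(zip(string.ascii_uppercase, advisors))
    anonymized = {label: responses[advisor] for label, advisor in key.items()}
    return anonymized, key
```

Passing a seeded random.Random makes the shuffle reproducible in tests while leaving production runs unpredictable.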
Package Structure¶
With the recent modularization, agent-utilities has been restructured to cleanly separate routing, protocols, execution, and discovery mechanisms into isolated domains.
| Directory | Purpose | Key Modules |
|---|---|---|
| core/ | Foundational primitives, exceptions, and decorators. | workspace.py, exceptions.py, decorators.py |
| agent/ | Bootstrapping and configuring agent ecosystems from workspace.yml and CLI. | factory.py, registry_builder.py |
| protocols/ | Interface adapters connecting outer HTTP/RPC boundaries to inner graphs. | acp_adapter.py, a2a.py, agui_emitter.py |
| graph/ | The core Pydantic-Graph routing and orchestration machinery. | protocol_agnostic_execution.py, steps.py, executor.py, routing/, planning/ |
| mcp/ | Specific wrappers for fastmcp to normalize tool discovery and error handling. | server_factory.py, context_helpers.py, agent_manager.py |
| security/ | Centralized identity verification, JWT validation, and API authentication. | auth.py, browser_auth.py |
| prompts/ | Version-controlled JSON schema blueprints that replace unstructured text prompts. | *.json |
| knowledge_graph/ | The unified semantic and structural memory backbone over the layered GraphBackend interface. | facade.py, core/engine.py, core/maintainer.py, retrieval/hybrid_retriever.py |
| harness/ | Agentic Harness Engineering (AHE) tools for execution observability and prompt evaluation. | verifier.py, trace_backend.py, evolve_agent.py |
| rlm/ | Recursive Language Model handlers for autonomous sub-shells and self-prompting loops. | repl.py |
| sdd/ | Spec-Driven Development pipelines decomposing .specify files into actionable graphs. | orchestrator.py |
| server/ | FastAPI applications hosting all HTTP, ACP, and SSE routes. | app.py, routers/ |
| gateway/ | Homepage-style service dashboard data layer (CONCEPT:OS-5.9). 50 widget types, aggregator, REST+WS API. Synthesized from former service-dashboard-core. | models.py, registry.py, config.py, aggregator.py, api.py, ws.py, widgets/ |
Hierarchical State Machine (HSM) Architecture¶
The graph orchestration system is a Hierarchical State Machine. It follows the same formal model used in robotics, game engines, UML statecharts, and SCXML workflow engines.
HSM Level Mapping¶
Level 0: Root Graph (N Orchestration Nodes)
├── usage_guard → router → dispatcher → memory_selection → dispatcher
├── researcher, architect, verifier (discovery/validation)
├── parallel_batch_processor → expert_executor (fan-out)
├── research_joiner, execution_joiner (fan-in)
├── verifier → synthesizer → END (quality gate + response composition)
└── planner (re-planning on verification failure)
Level 1: Superstates - Specialist Agents
├── Specialist Roster (Dynamically discovered from the **Knowledge Graph**)
│ Each loads: name-matched prompt + discovered capabilities + mapped MCP toolsets
│ Supports: 'prompt' (local), 'mcp' (stdio), and 'a2a' (remote) agent types
└── Unified Execution: Dynamic routing based on registry-provided metadata
Level 2: Substates - Agent Internal Loop
└── Pydantic AI Agent.run() = UserPromptNode → ModelRequestNode → CallToolsNode → ...
Multi-turn tool iteration (max 3 iterations per specialist)
Level 3: Leaf States - MCP Tool Execution
└── Each tool call invokes an MCP server subprocess via stdio/HTTP
Atomic operations: get_project(), list_branches(), run_cypher_query(), etc.
Concept Mapping¶
| agent-utilities Concept | HSM Concept | Details |
|---|---|---|
| Root graph | Root state machine | N Orchestration nodes |
| Router -> Dispatcher | Top-level transitions | Router generates plan, dispatcher executes |
| Planner (re-plan only) | Re-entry transition | Invoked by verifier on score < 0.4 |
| Synthesizer | Terminal action | Composes final response from the results |
| NODE_SKILL_MAP agents | Superstates (L1) | N hardcoded domains |
| Dynamic agents (unified) | Superstates (L1) | N from discover_all_specialists() (MCP + A2A) |
| _execute_specialized_step() | Enter superstate | Loads prompt + skills + deduplicated MCP toolsets |
| Agent.run() internal loop | Substates (L2) | Model request/tool cycles |
| MCP tool call (stdio) | Leaf states (L3) | Atomic operations |
| Verifier feedback loop | Re-entry transition | Parent re-dispatches to child |
| Circuit breaker (open) | Guard condition | Blocks entry to failed state |
| node_transitions guard | Watchdog timer | Force-terminates after 50 transitions |
| Memory-first dispatch | Entry action | Enriches context before first step |
| Research-before-execution | Phase ordering | Discovery completes before execution |
| Process-Guided Planning | Knowledge Influx | KG-native SOPs injected into Planner context |
| Policy Guardrails | Transition Guard | Policies enforce constraints at state boundaries |
HSM Design Principles¶
- Treat subgraphs as macro-states. A specialist should behave as a single opaque state to the dispatcher. Define clear input/output contracts.
- Scale horizontally, not vertically. Add new subgraphs (new MCP servers, new agent packages) instead of adding nodes to existing graphs.
- Plan enhancements by level. Routing concern -> L0. Domain behavior -> L1 specialist. Tool-level fix -> L3 MCP.
- Use types as boundaries. ExecutionStep, GraphPlan, GraphResponse, and MCPAgent are the boundary contracts between levels.
- Defer flattening. Never visualize the full system as one graph. Visualize one level at a time.
- The growth test: If tempted to add more nodes to a graph, ask whether you should add a new state machine instead.
Behavior Tree (BT) Concepts¶
The graph incorporates key Behavior Tree patterns inside the HSM structure.
| agent-utilities Concept | BT Concept | Details |
|---|---|---|
| _attempt_specialist_fallback, static_route_query | Selector (priority/fallback) | Specialist fallback chain, static route before LLM |
| dispatcher_step, assert_state_valid | Sequence (fail-fast) | Plan step execution with cursor |
| _execute_dynamic_mcp_agent, expert_executor_step | Retry decorator | Tool-level retries with exponential backoff |
| asyncio.wait_for() in specialist execution | Timeout decorator | Per-node timeout via ExecutionStep.timeout |
| check_specialist_preconditions | Precondition guard | Check server health before entering specialist |
| assert_state_valid() | Boundary re-evaluation | State invariants at dispatcher and verifier boundaries |
Design rule: If logic chooses between options -> BT concept. If logic defines long-lived phases -> HSM concept.
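The Retry and Timeout decorator rows in the table above combine naturally in asyncio. The helper below is a minimal sketch, assuming a zero-argument coroutine factory and treating timeouts and ConnectionError as retryable; run_with_retry and its parameters are hypothetical and do not reflect the signatures of _execute_dynamic_mcp_agent or expert_executor_step.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def run_with_retry(
    make_call: Callable[[], Awaitable[T]],
    *,
    timeout: float,
    retries: int = 3,
    base_delay: float = 0.5,
) -> T:
    """Run make_call() with a per-attempt timeout and exponential backoff."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(retries):
        try:
            # Timeout decorator: bound each attempt individually.
            return await asyncio.wait_for(make_call(), timeout=timeout)
        except (asyncio.TimeoutError, ConnectionError):
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            # Retry decorator: wait base_delay, 2x, 4x, ... between attempts.
            await asyncio.sleep(base_delay * 2**attempt)
    raise AssertionError("unreachable")
```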
Server Endpoint Reference¶
| Endpoint | Method | Tag | Description |
|---|---|---|---|
| /health | GET | Core | Health check and server metadata |
| /ag-ui | POST | Agent UI | AG-UI streaming endpoint with sideband graph events |
| /stream | POST | Agent UI | Generic SSE stream endpoint for graph agent execution |
| /acp | MOUNT | ACP | Agent Communication Protocol (pydantic-acp) |
| /a2a | MOUNT | A2A | Agent-to-Agent (fastA2A) JSON-RPC endpoint |
| /api/approve | POST | Human-in-the-Loop | Resolves pending tool approvals and MCP elicitation requests |
| /chats | GET | Core | List all stored chat sessions |
| /chats/{chat_id} | GET | Core | Get full message history for a specific chat |
| /chats/{chat_id} | DELETE | Core | Delete a specific chat session |
| /mcp/config | GET | Interoperability | Return the current MCP server configuration |
| /mcp/tools | GET | Interoperability | List all tools from connected MCP servers |
| /mcp/reload | POST | Interoperability | Hot-reload MCP servers and rebuild graph |
| /api/dashboard/layout | GET/PUT | Dashboard (OS-5.9) | Get or save dashboard service layout |
| /api/dashboard/data | GET | Dashboard (OS-5.9) | Fetch all widget data from 50 services |
| /api/dashboard/data/{id} | GET | Dashboard (OS-5.9) | Fetch single service widget data |
| /api/dashboard/full | GET | Dashboard (OS-5.9) | Layout + data in single request (initial load) |
| /api/dashboard/widgets | GET | Dashboard (OS-5.9) | List available widget types |
| /api/dashboard/health | GET | Dashboard (OS-5.9) | Health check across all services |
| /api/dashboard/discover | GET | Dashboard (OS-5.9) | Auto-discover services from mcp_config |
| /ws/dashboard | WS | Dashboard (OS-5.9) | Real-time streaming updates |
The Complete Execution Journey¶
Phase 1: Ingress & Protocol Handling¶
- Entry: A user query (text + optional images) arrives via any supported protocol: AG-UI (/ag-ui), ACP (/acp), SSE (/stream), or REST (/api/chat).
- Direct Dispatch Check: If a graph_bundle is present and GRAPH_DIRECT_EXECUTION=true, AG-UI routes directly to execute_graph_iter(), bypassing the outer LLM agent.
- Unified Execution: All protocols funnel through the same graph engine via graph/protocol_agnostic_execution.py. The execute_graph_iter() entry point uses graph.iter() for step-by-step control.
- State Initialization: A fresh GraphState is initialized with the synthesized query_parts.
Phase 2: Safety & Policy Enforcement¶
- Usage Guard: The usage_guard_step checks the session's token usage and estimated cost against safety limits.
- Policy Check: If enabled, a lightweight LLM check validates the query against security policies.
Phase 3: Routing & Planning¶
- Fast-Path Check: Trivial or conversational queries are answered directly, bypassing the full graph pipeline.
- Routing: The router_step analyzes the multi-modal intent and generates a GraphPlan.
- Infinite-Loop Guard: A node_transitions counter (max 50) prevents runaway graph execution.
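The infinite-loop guard amounts to a counter checked on every node entry. The sketch below assumes a state object with a node_transitions field; GraphStateSketch, TransitionLimitExceeded, and record_transition are hypothetical names, and raising an exception is only one possible way to force termination.

```python
from dataclasses import dataclass

MAX_NODE_TRANSITIONS = 50  # limit stated in the text above


class TransitionLimitExceeded(RuntimeError):
    """Raised when the graph has moved between nodes too many times."""


@dataclass
class GraphStateSketch:
    node_transitions: int = 0


def record_transition(
    state: GraphStateSketch, node_name: str, limit: int = MAX_NODE_TRANSITIONS
) -> None:
    """Count one node entry and abort once the limit is exceeded."""
    state.node_transitions += 1
    if state.node_transitions > limit:
        raise TransitionLimitExceeded(
            f"aborting at {node_name!r}: more than {limit} node transitions"
        )
```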
Phase 4: Context Enrichment & Dispatch¶
- Memory Selection: On first entry, the dispatcher routes to memory_selection_step for RAG-style context injection.
- Research-Before-Execution: The dispatcher reorders the plan to guarantee research steps execute before specialist steps.
- Dispatch: The dispatcher spawns selected specialist nodes with concurrent execution via parallel_batch_processor.
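Research-before-execution reordering can be done with a stable partition, which moves research steps to the front while preserving the relative order within each group. This is a sketch only: the Step type and the RESEARCH_NODES set are assumptions, and the real ExecutionStep and plan structures may differ.

```python
from dataclasses import dataclass

# Assumption: which node names count as "research" for this illustration.
RESEARCH_NODES = {"researcher", "architect"}


@dataclass(frozen=True)
class Step:
    node: str
    task: str


def research_first(plan: list[Step]) -> list[Step]:
    """Return a new plan with research steps ahead of specialist steps.

    sorted() is stable, and False sorts before True, so steps keep their
    original relative order inside the research and specialist groups.
    """
    return sorted(plan, key=lambda step: step.node not in RESEARCH_NODES)
```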
Phase 5: Parallel Execution¶
- Specialist Loop: Each specialist enters a high-fidelity Agent.run() loop with dedicated system prompts, domain-specific toolsets, and original multi-modal query parts.
- Convergence: Results are coalesced at the execution_joiner and written to the results_registry.
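The fan-out and fan-in shape of this phase maps onto asyncio.gather. In the sketch below, run_specialist is a placeholder for a specialist's Agent.run() loop and the returned dict stands in for the results_registry; both are assumptions, as is the choice to capture exceptions instead of cancelling the batch.

```python
import asyncio


async def run_specialist(name: str, task: str) -> str:
    """Placeholder for one specialist's agent loop."""
    await asyncio.sleep(0)  # yield control, as a real model call would
    return f"{name} finished: {task}"


async def execute_batch(steps: dict[str, str]) -> dict[str, object]:
    """Run all specialists concurrently and join their results by name."""
    names = list(steps)
    outcomes = await asyncio.gather(
        *(run_specialist(name, steps[name]) for name in names),
        return_exceptions=True,  # one failure does not cancel the others
    )
    # gather() preserves input order, so names and outcomes line up.
    return dict(zip(names, outcomes))
```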
Phase 6: Verification & Synthesis¶
- Verification: The verifier_step compares results against user intent using a ValidationResult score (0.0-1.0).
- Feedback Loop: Score 0.4-0.7 -> re-dispatch same plan with feedback. Score < 0.4 -> full re-plan via planner_step.
- Synthesis: Once validated (score >= 0.7), the synthesizer_step composes the final markdown response.
- Memory Persistence: Execution metadata is persisted to the Knowledge Graph as a historical_execution memory.
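The score thresholds above reduce to a three-way branch. The function below is a sketch with a hypothetical name and string return values; it only encodes the documented cut-offs (0.7 and 0.4), treating a score of exactly 0.4 as a re-dispatch.

```python
def next_node_for_score(score: float) -> str:
    """Map a verification score to the next node in the graph."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be within [0.0, 1.0], got {score}")
    if score >= 0.7:
        return "synthesizer"  # validated: compose the final response
    if score >= 0.4:
        return "dispatcher"  # re-dispatch the same plan with feedback
    return "planner"  # full re-plan
```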