Epistemic Graph Service Layer Architecture¶
CONCEPT:KG-2.7 — Tokio-first graph service
Overview¶
The epistemic-graph service layer is a long-running Tokio process that holds multiple named graphs in memory and serves requests over Unix Domain Socket (UDS) or TCP. It replaces the previous PyO3 in-process FFI approach as the primary compute backend.
Python (agent-utilities API gateway)
└── GraphComputeEngine (async UDS/TCP client)
└── epistemic-graph-server (Tokio, long-running)
├── GraphRegistry (named graphs)
├── ChannelManager (P2P, 1:1, many:many, bus)
├── IsolationLayer (ACL enforcement)
└── Checkpoint/Persistence
Graph Topology¶
Graph Types¶
| Type | Naming Convention | Access | Purpose |
|---|---|---|---|
| Bus | __bus__ |
All agents R/W | Global event broadcast, inter-agent messaging |
| Agent | agent:<id> |
Owner: full, Manager: full, Peers: denied | Private agent knowledge, episode memory |
| Team | team:<name> |
Members: read, Manager: R/W | Shared team context, project knowledge |
| Global | global:<name> |
All: read-only | System ontology, tool registry |
Isolation Rules¶
- Peer isolation: Agent graphs are invisible to peer agents
- Hierarchical access: Manager agents have full access to subordinate graphs
- Bus is public:
__bus__readable/writable by all authenticated agents - Team scoping: Team graphs are read-only for members, read-write for manager
- Global read-only: Global graphs are system-managed, agent-readable
Dynamic Communication Channels¶
Agents can create ephemeral channels for P2P or group communication:
- 1:1 channels:
channel:p2p:<agent_a>:<agent_b>— direct messaging - Many:many channels:
channel:group:<uuid>— group created by any agent - Lifecycle: Create → Join → Leave → Close
- KG Imprint: On close, the channel creates a permanent KG record with:
- Vectorized embedding of the conversation summary
- Participant edges preserved permanently
- Topic metadata and timestamps
Configuration¶
All settings are available in the XDG config.json:
| Field | Env Var | Default | Description |
|---|---|---|---|
graph_service_socket |
GRAPH_SERVICE_SOCKET |
$XDG_RUNTIME_DIR/epistemic-graph.sock |
UDS socket path |
graph_service_tcp_addr |
GRAPH_SERVICE_TCP_ADDR |
None |
TCP address (e.g., 0.0.0.0:9100) |
graph_service_auth_secret |
GRAPH_SERVICE_AUTH_SECRET |
None |
HMAC-SHA256 shared secret |
graph_service_checkpoint_secs |
GRAPH_SERVICE_CHECKPOINT_SECS |
300 |
Auto-checkpoint interval |
graph_service_persist_on_shutdown |
GRAPH_SERVICE_PERSIST_ON_SHUTDOWN |
true |
Serialize on shutdown |
Authentication¶
All connections require HMAC-SHA256 authentication:
- Client computes HMAC-SHA256(secret, request_id) and sends it as auth_token
- Server verifies the token before processing any request
- For UDS-only deployments, Unix file permissions provide additional isolation
- TCP deployments require authentication
API Gateway Integration¶
The service lifecycle is tied to the agent-utilities API gateway:
- Startup: Gateway sends Reconcile to push authoritative state from the backend
- Shutdown: Gateway sends Checkpoint to persist all graphs
- The service process is the epistemic-graph-server Rust daemon (run via
cargo run -p epistemic-graph). GraphComputeEngine will auto-start it on
first connect when EPISTEMIC_GRAPH_AUTOSTART=1 is set; otherwise the daemon
must already be running.
Migration from PyO3¶
The PyO3 in-process FFI path has been fully removed. GraphComputeEngine
talks to the epistemic-graph-server Tokio daemon exclusively over the
out-of-process MessagePack/UDS (or TCP) client (epistemic_graph.client); there
is no in-process embedded fallback. If the daemon is unreachable, set
EPISTEMIC_GRAPH_AUTOSTART=1 to have the engine spawn it on first connect.
Sharding (Stage 2)¶
With 2+ entries in GRAPH_SERVICE_ENDPOINTS, GraphComputeEngine routes each
named graph to its owning shard by HRW rendezvous hashing (tenant → named
graph → HRW → shard, CONCEPT:KG-2.58). Autostart then applies only to the
local unix:// endpoint — an unreachable remote shard is a fail-loud
ConnectionError naming the shard. See
engine_sharding.md.