Agentic Harness Engineering (AHE): Architecture
CONCEPT:AHE-3.0 — Agentic Harness Engineering
Overview
AHE is a closed-loop optimization framework in which an Evolve Agent iteratively improves the agent harness (its tools, middleware, memory, skills, sub-agents, and system prompt), guided by three pillars of structured observability: component, experience, and decision observability.
Hybrid State Model
| State Layer | Implementation | Storage |
|---|---|---|
| Epistemic (what the agent knows) | IntelligenceGraphEngine + MAGMA views | knowledge_graph.db |
| Normative (what the agent is allowed to do) | Component files (prompts, middleware, tools) | Filesystem + git |
| Causal (what caused improvement) | Change Manifests | .specify/manifests/ + KG |
AHE Evolution Loop
graph LR
A[OS-5.4: Langfuse Traces] --> B[AHE-3.1: Automated Distillation]
B --> C["KG-2.6: Summaries & Clusters"]
C --> D[ORCH-1.21: Failure Taxonomies]
D --> E[KG-2.6: Layered Evidence Corpus]
E --> F[ORCH-1.1: Evolve Agent Decisions]
B -.-> G[OS-5.4: langfuse-agent API]
C -.-> H[KG-2.6: RLM Summarizer]
D -.-> I[KG-2.0: KG Semantic Clustering]
E -.-> J[KG-2.0: Versioned Files + KG Nodes]
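The loop above can be read as a pipeline from raw traces to an evidence corpus that the Evolve Agent consumes. The following is a minimal sketch under stated assumptions: `Trace`, `distill`, `cluster_failures`, and `build_corpus` are hypothetical names used only for illustration, not the AHE API, and a crude first-word grouping stands in for the RLM summarizer and KG semantic clustering.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Trace:
    """Hypothetical stand-in for one recorded agent trace."""
    trace_id: str
    success: bool
    error: str = ""


def distill(traces):
    """Keep only failed traces and reduce each to a short summary."""
    return [(t.trace_id, t.error.strip().lower()) for t in traces if not t.success]


def cluster_failures(summaries):
    """Group summaries by first word (an assumed proxy for semantic clustering)."""
    clusters = defaultdict(list)
    for trace_id, summary in summaries:
        key = summary.split()[0] if summary else "unknown"
        clusters[key].append(trace_id)
    return dict(clusters)


def build_corpus(clusters):
    """Build taxonomy entries with supporting trace ids, largest cluster first."""
    ordered = sorted(clusters.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [{"failure_type": k, "count": len(v), "traces": v} for k, v in ordered]


traces = [
    Trace("t1", True),
    Trace("t2", False, "Timeout calling search tool"),
    Trace("t3", False, "Timeout while reading file"),
    Trace("t4", False, "Permission denied on write"),
]
corpus = build_corpus(cluster_failures(distill(traces)))
```

Each stage is a pure function here so the stages can be tested in isolation; the real pipeline presumably persists intermediate results as versioned files and KG nodes, as the diagram indicates.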
Component Types
AHE decomposes the harness into 7 independently editable component types:
graph TD
subgraph "AHE Component Types"
SP["System Prompt<br>prompt_builder.py<br>structured_prompts.py"]
TD["Tool Description<br>tool_filtering.py<br>SKILL.md frontmatter"]
TI["Tool Implementation<br>tools/*.py<br>mcp_server.py"]
MW["OS-5.3: Middleware<br>middlewares.py<br>guardrails.py<br>tool_guard.py"]
SK["Skills<br>universal-skills/"]
SA["Sub-Agents<br>graph/steps/<br>HSM specialist nodes"]
LM["Long-Term Memory<br>knowledge_graph/<br>MemoryNode"]
end
subgraph "OS-5.4: Observability Pillars"
CO["OS-5.4: Component Observability<br>File-level diffs + git"]
EO["OS-5.4: Experience Observability<br>TraceDistiller → EvidenceCorpus"]
DO["OS-5.4: Decision Observability<br>ChangeManifest + VerificationResult"]
end
SP --> CO
TD --> CO
TI --> CO
MW --> CO
SK --> CO
SA --> CO
LM --> CO
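To make the seven component types concrete, here is a rough sketch of how they might be modelled alongside a change manifest. The class names `ComponentType`, `ComponentEdit`, and `ChangeManifest` appear in `manifest.py`, but every field, enum value, and the `touched_types` helper below are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ComponentType(Enum):
    """The seven editable component types (string values are assumed)."""
    SYSTEM_PROMPT = "system_prompt"
    TOOL_DESCRIPTION = "tool_description"
    TOOL_IMPLEMENTATION = "tool_implementation"
    MIDDLEWARE = "middleware"
    SKILL = "skill"
    SUB_AGENT = "sub_agent"
    LONG_TERM_MEMORY = "long_term_memory"


@dataclass
class ComponentEdit:
    """One file-level edit to a single component (fields are assumed)."""
    component_type: ComponentType
    path: str
    rationale: str


@dataclass
class ChangeManifest:
    """A group of edits proposed together (fields are assumed)."""
    manifest_id: str
    edits: List[ComponentEdit] = field(default_factory=list)

    def touched_types(self):
        """Return the set of component types this manifest modifies."""
        return {edit.component_type for edit in self.edits}


manifest = ChangeManifest(
    manifest_id="example-001",
    edits=[
        ComponentEdit(ComponentType.MIDDLEWARE, "guardrails.py", "block unsafe writes"),
        ComponentEdit(ComponentType.SYSTEM_PROMPT, "prompt_builder.py", "restate the rule"),
    ],
)
```

Recording the component type on each edit is what lets file-level diffs be attributed to a component, which is the role of component observability in the diagram.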
Constraint Hierarchy
Constraints escalate through 4 enforcement levels when violations are detected:
graph LR
P["PROMPT<br>Level 1: Advisory"] --> TD2["TOOL_DESCRIPTION<br>Level 2: Descriptive"]
TD2 --> M["MIDDLEWARE<br>Level 3: Blocking"]
M --> TI2["TOOL_IMPLEMENTATION<br>Level 4: Hardcoded"]
style P fill:#4caf50,color:#fff
style TD2 fill:#ff9800,color:#fff
style M fill:#f44336,color:#fff
style TI2 fill:#9c27b0,color:#fff
When a constraint is violated at the prompt level, the ConstraintEngine
auto-escalates it to middleware-level enforcement after the escalation
threshold is reached. This ensures the agent cannot repeatedly "forget"
important constraints.
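The escalation behaviour can be sketched as a per-constraint violation counter. The names `ConstraintLevel` and `ConstraintEngine` come from `constraint_engine.py`, but the methods, the default threshold, and the counter reset below are assumptions. This sketch also assumes escalation moves one level per threshold crossing, following the diagram; the real engine may jump straight to middleware.

```python
from enum import IntEnum


class ConstraintLevel(IntEnum):
    """Enforcement levels, ordered from weakest to strongest."""
    PROMPT = 1
    TOOL_DESCRIPTION = 2
    MIDDLEWARE = 3
    TOOL_IMPLEMENTATION = 4


class ConstraintEngine:
    """Hypothetical engine: escalates a constraint after repeated violations."""

    def __init__(self, escalation_threshold=3):
        self.escalation_threshold = escalation_threshold
        self._levels = {}
        self._violations = {}

    def register(self, name, level=ConstraintLevel.PROMPT):
        """Start tracking a constraint at the given level."""
        self._levels[name] = level
        self._violations[name] = 0

    def record_violation(self, name):
        """Count a violation and return the (possibly escalated) level."""
        self._violations[name] += 1
        level = self._levels[name]
        at_threshold = self._violations[name] >= self.escalation_threshold
        if at_threshold and level < ConstraintLevel.TOOL_IMPLEMENTATION:
            self._levels[name] = ConstraintLevel(level + 1)
            self._violations[name] = 0  # assumed: counter restarts per level
        return self._levels[name]


engine = ConstraintEngine(escalation_threshold=2)
engine.register("no_force_push")
levels = [engine.record_violation("no_force_push") for _ in range(4)]
```

With a threshold of 2, four violations take the constraint from `PROMPT` through `TOOL_DESCRIPTION` to `MIDDLEWARE`, at which point enforcement is blocking rather than advisory.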
Package Structure
agent_utilities/harness/
├── __init__.py # Package exports (CONCEPT:AHE-3.0)
├── manifest.py # ComponentType, ComponentEdit, ChangeManifest
├── evidence_corpus.py # EvidenceLayer, EvidenceEntry, EvidenceCorpus
├── component_registry.py # HarnessComponentRegistry
├── trace_backend.py # TraceBackend ABC + Langfuse/OTel/File backends
├── evolve_agent.py # EvolveAgent (lightweight + full modes)
├── verifier.py # ManifestVerifier + auto-revert
└── constraint_engine.py # ConstraintLevel, ConstraintEngine
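As one illustration of how these modules might fit together, the sketch below shows the verify-then-revert idea behind `verifier.py`. `ManifestVerifier` and `VerificationResult` are names from this page, but the constructor, the `verify` signature, the result fields, and the toy `score` function are all assumptions. An in-memory dict stands in for the filesystem and git.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class VerificationResult:
    """Outcome of verifying one set of edits (fields are assumed)."""
    passed: bool
    baseline_score: float
    candidate_score: float
    reverted: bool


class ManifestVerifier:
    """Applies edits, re-scores the harness, and reverts on regression."""

    def __init__(self, evaluate: Callable[[Dict[str, str]], float]):
        self.evaluate = evaluate

    def verify(self, files: Dict[str, str], edits: Dict[str, str]) -> VerificationResult:
        baseline = self.evaluate(files)
        snapshot = dict(files)      # in-memory stand-in for a git commit
        files.update(edits)         # apply the proposed edits
        candidate = self.evaluate(files)
        passed = candidate >= baseline
        if not passed:
            files.clear()
            files.update(snapshot)  # auto-revert to the snapshot
        return VerificationResult(passed, baseline, candidate, reverted=not passed)


def score(files):
    """Toy evaluation: reward prompts that ask for citations."""
    return 1.0 if "cite sources" in files["system_prompt.md"] else 0.0


harness = {"system_prompt.md": "You are a helpful agent."}
verifier = ManifestVerifier(score)
good = verifier.verify(harness, {"system_prompt.md": "You are a helpful agent. Always cite sources."})
bad = verifier.verify(harness, {"system_prompt.md": "Be brief."})
```

The second edit lowers the score, so the harness is restored to the state left by the first edit. In a real setup the evaluation would presumably be a benchmark run rather than a string check.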
Integration Points
- SDD Pipeline: Manifests stored in `.specify/manifests/` alongside specs/plans
- Knowledge Graph: `ChangeManifest`, `ComponentEditRecord`, `EvidenceRecord`, `ConstraintState` node types
- RLM: `TraceDistiller` (`knowledge_graph/adaptation/trace_distiller.py`) uses RLM for deep failure analysis on massive trace data
- Langfuse Agent: Direct API import via `from langfuse_agent.api_client import LangfuseApi` (see `harness/trace_backend.py` `LangfuseTraceBackend`)
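The trace backends listed above share an abstract base class, which might look roughly like the sketch below. Only the name `TraceBackend` and the existence of a file-based backend come from this page. The `fetch_traces` method, the `FileTraceBackend` class name, and the JSON Lines file format are assumptions, and no Langfuse call is shown because its API is not described here.

```python
import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import List


class TraceBackend(ABC):
    """Common interface for trace sources (method name is assumed)."""

    @abstractmethod
    def fetch_traces(self, limit: int) -> List[dict]:
        """Return up to `limit` traces as plain dicts."""


class FileTraceBackend(TraceBackend):
    """Reads traces from a JSON Lines file, one trace per line (assumed format)."""

    def __init__(self, path):
        self.path = Path(path)

    def fetch_traces(self, limit: int) -> List[dict]:
        traces = []
        with self.path.open(encoding="utf-8") as fh:
            for line in fh:
                if len(traces) >= limit:
                    break
                line = line.strip()
                if line:  # skip blank lines
                    traces.append(json.loads(line))
        return traces


with tempfile.TemporaryDirectory() as tmp:
    trace_file = Path(tmp) / "traces.jsonl"
    trace_file.write_text(
        '{"id": "t1", "status": "ok"}\n'
        '{"id": "t2", "status": "error"}\n'
        '{"id": "t3", "status": "ok"}\n',
        encoding="utf-8",
    )
    recent = FileTraceBackend(trace_file).fetch_traces(limit=2)
```

Keeping the interface this narrow means downstream code such as a distiller can be exercised against a local file without a running observability service.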