Semantic Compactor & Refactorer (CONCEPT:KG-2.7)

Overview

The Semantic Compactor addresses the unbounded graph growth caused by active agents generating millions of step-by-step reasoning traces.

It periodically executes an offline compilation and refactoring sweep. The compactor aggregates raw execution traces, interaction logs, and tool parameters, replacing thousands of individual AgentProcess nodes with synthesized high-level declarative state triples (e.g., summarizing total execution steps, token usage, and outcomes).

Problem Statement

At 1M+ agent scale, each agent execution generates 10–100 AgentProcess trace nodes. After 10K executions:

  • 100K–1M trace nodes accumulate in the graph
  • Query latency degrades from sub-ms to 100ms+
  • Memory consumption grows linearly without bound
  • Graph traversal algorithms (centrality, PageRank) become O(n²)

The Semantic Compactor reduces the trace node count by 95–99% while preserving aggregate semantics.
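
As a rough illustration of what those figures imply, the following back-of-the-envelope calculation applies the stated 95–99% reduction to the stated trace volumes. The helper name `remaining_nodes` is made up for this sketch and is not part of the library.

```python
# Back-of-the-envelope estimate using only the figures quoted in this section.
EXECUTIONS = 10_000
NODES_PER_EXECUTION = (10, 100)   # low and high estimates per execution
REDUCTION_PCT = (95, 99)          # stated compaction range, in percent


def remaining_nodes(total: int, reduction_pct: int) -> int:
    """Nodes left after removing `reduction_pct` percent of `total`."""
    return total * (100 - reduction_pct) // 100


low_total = EXECUTIONS * NODES_PER_EXECUTION[0]    # 100,000 trace nodes
high_total = EXECUTIONS * NODES_PER_EXECUTION[1]   # 1,000,000 trace nodes

best_case = remaining_nodes(low_total, REDUCTION_PCT[1])     # fewest nodes left
worst_case = remaining_nodes(high_total, REDUCTION_PCT[0])   # most nodes left

print(f"before: {low_total:,} to {high_total:,} nodes")
print(f"after:  {best_case:,} to {worst_case:,} nodes")
```

Under these assumptions the graph shrinks from between 100,000 and 1,000,000 trace nodes to between 1,000 and 50,000 nodes.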

Lifecycle

1. QUERY: Find AgentProcess nodes for a given agent_id
2. THRESHOLD: If count < threshold (default 10), skip
3. AGGREGATE: Sum token usage, count state distributions
4. CREATE: Merge a SemanticSummary node with aggregated stats
5. LINK: Connect agent → SemanticSummary via HAS_COMPACTED_HISTORY
6. PRUNE: DETACH DELETE all original AgentProcess nodes
7. LOG: Report compaction count and summary node ID
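
The seven steps can be sketched against a small in-memory stand-in for the graph. This is an illustration of the lifecycle only, not the actual SemanticCompactor implementation: the `InMemoryGraph` class, the `state_counts` property, and the summary ID format (inferred from the schema example in this document) are all assumptions.

```python
import logging
from collections import Counter
from dataclasses import dataclass, field

logger = logging.getLogger("compactor_sketch")


@dataclass
class InMemoryGraph:
    """Hypothetical stand-in for the KG engine; not the real backend."""
    processes: dict = field(default_factory=dict)  # agent_id -> list of trace dicts
    summaries: dict = field(default_factory=dict)  # summary_id -> properties
    links: set = field(default_factory=set)        # (agent_id, rel_type, summary_id)


def compact_traces(graph: InMemoryGraph, agent_id: str, threshold: int = 10) -> int:
    # 1. QUERY: find the AgentProcess traces for this agent
    traces = graph.processes.get(agent_id, [])

    # 2. THRESHOLD: skip when there are too few traces to be worth compacting
    if len(traces) < threshold:
        return 0

    # 3. AGGREGATE: sum token usage and count the state distribution
    total_tokens = sum(t.get("tokens_used", 0) for t in traces)
    state_counts = Counter(t.get("state", "unknown") for t in traces)

    # 4. CREATE: upsert the summary node (ID format is an assumption)
    summary_id = f"summary:{agent_id}:{len(traces)}_compacted"
    graph.summaries[summary_id] = {
        "compacted_count": len(traces),
        "total_tokens_consumed": total_tokens,
        "agent_id": agent_id,
        "state_counts": dict(state_counts),  # assumed property name
    }

    # 5. LINK: agent -> summary via HAS_COMPACTED_HISTORY
    graph.links.add((agent_id, "HAS_COMPACTED_HISTORY", summary_id))

    # 6. PRUNE: drop the original trace nodes
    deleted = len(traces)
    graph.processes[agent_id] = []

    # 7. LOG: report the compaction count and summary node ID
    logger.info("Compacted %d traces for %s into %s", deleted, agent_id, summary_id)
    return deleted


graph = InMemoryGraph()
graph.processes["agent:planner"] = [
    {"id": f"proc:{i}",
     "state": "failed" if i % 10 == 0 else "completed",
     "tokens_used": 1000 + i}
    for i in range(1, 13)
]
deleted = compact_traces(graph, "agent:planner", threshold=10)
```

Running the sketch on the twelve sample traces deletes all twelve and leaves a single summary node; a second call returns 0 because the agent is now below the threshold.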

API Surface

from agent_utilities.knowledge_graph.memory import SemanticCompactor

# kg_engine is assumed to be an already-initialized knowledge graph engine
compactor = SemanticCompactor(engine=kg_engine)

# Compact traces for a specific agent (threshold: 10 traces)
deleted = compactor.compact_traces("agent:planner", threshold=10)
# Returns: number of trace nodes deleted (e.g., 47)

# Custom threshold for high-frequency agents
deleted = compactor.compact_traces("agent:scraper", threshold=5)

Compaction Thresholds

| Agent Type | Recommended Threshold | Rationale |
|---|---|---|
| Background scrapers | 5 | High volume, low value per trace |
| Orchestration routers | 10 (default) | Medium volume, moderate diagnostic value |
| Critical planners | 25 | Lower volume, high forensic value |
| Evaluation runners | 50 | Traces needed for statistical analysis |
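
One way to apply these recommendations is a simple lookup that falls back to the default threshold. The dictionary keys and the helper names `threshold_for` and `should_compact` are illustrative labels for this sketch, not identifiers from the library.

```python
DEFAULT_THRESHOLD = 10

# Threshold values come from the recommendations above;
# the agent-type keys are hypothetical labels.
RECOMMENDED_THRESHOLDS = {
    "background_scraper": 5,
    "orchestration_router": 10,
    "critical_planner": 25,
    "evaluation_runner": 50,
}


def threshold_for(agent_type: str) -> int:
    """Return the recommended threshold, or the default for unknown types."""
    return RECOMMENDED_THRESHOLDS.get(agent_type, DEFAULT_THRESHOLD)


def should_compact(trace_count: int, agent_type: str) -> bool:
    """Compaction is skipped while the trace count is below the threshold."""
    return trace_count >= threshold_for(agent_type)
```

The resulting value would then be passed as the `threshold` argument of `compact_traces`.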

Graph Schema

Before Compaction

(Agent {id: "agent:planner"})
  -[:HAS_PROCESS]→ (AgentProcess {id: "proc:1", state: "completed", tokens_used: 1500})
  -[:HAS_PROCESS]→ (AgentProcess {id: "proc:2", state: "completed", tokens_used: 2300})
  -[:HAS_PROCESS]→ (AgentProcess {id: "proc:3", state: "failed", tokens_used: 800})
  ... (47 more)

After Compaction

(Agent {id: "agent:planner"})
  -[:HAS_COMPACTED_HISTORY]→ (SemanticSummary {
      id: "summary:agent:planner:50_compacted",
      compacted_count: 50,
      total_tokens_consumed: 85000,
      agent_id: "agent:planner"
  })
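
Assuming a Cypher-compatible backend (the lifecycle mentions DETACH DELETE), the transition between the two shapes might be expressed with parameterized statements like the ones below. The actual queries issued by SemanticCompactor are not shown in this document, so the statement text, the helper names, and the summary ID format are assumptions.

```python
def summary_id_for(agent_id: str, count: int) -> str:
    """Build a summary ID in the style of the example above (assumed format)."""
    return f"summary:{agent_id}:{count}_compacted"


# Upsert the summary node and link it to the agent.
MERGE_SUMMARY = """
MATCH (a:Agent {id: $agent_id})
MERGE (s:SemanticSummary {id: $summary_id})
SET s.compacted_count = $compacted_count,
    s.total_tokens_consumed = $total_tokens_consumed,
    s.agent_id = $agent_id
MERGE (a)-[:HAS_COMPACTED_HISTORY]->(s)
"""

# Remove the original trace nodes together with their relationships.
PRUNE_PROCESSES = """
MATCH (:Agent {id: $agent_id})-[:HAS_PROCESS]->(p:AgentProcess)
DETACH DELETE p
"""


def build_statements(agent_id: str, compacted_count: int, total_tokens_consumed: int):
    """Return (query, params) pairs in the order they should be executed."""
    summary_params = {
        "agent_id": agent_id,
        "summary_id": summary_id_for(agent_id, compacted_count),
        "compacted_count": compacted_count,
        "total_tokens_consumed": total_tokens_consumed,
    }
    return [
        (MERGE_SUMMARY, summary_params),
        (PRUNE_PROCESSES, {"agent_id": agent_id}),
    ]


statements = build_statements("agent:planner", 50, 85000)
```

Creating and linking the summary before pruning means a failure partway through leaves the original traces in place rather than losing history.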

OWL Alignment

The SemanticSummary node type should be declared in the SDD ontology as:

:SemanticSummary a owl:Class ;
    rdfs:subClassOf :KnowledgeNode ;
    rdfs:comment "Synthesized execution trace summary produced by KG-2.7 compaction" ;
    :hasProperty :compacted_count, :total_tokens_consumed, :agent_id .

:HAS_COMPACTED_HISTORY a owl:ObjectProperty ;
    rdfs:domain :Agent ;
    rdfs:range :SemanticSummary ;
    rdfs:comment "Links an agent to its compacted trace history" .

This lets SHACL validators verify that every Agent node whose historical process count has reached the compaction threshold has a corresponding SemanticSummary.

Error Handling

| Condition | Behavior |
|---|---|
| No KG engine provided | Returns 0 (no-op) |
| Query fails | Catches exception, logs error, returns 0 |
| Partial deletion failure | Logs individual failures, returns count of successful deletes |
| Agent has no traces | Returns 0 (below threshold) |
| Backend returns unexpected row format | Handles dict, list/tuple, and object formats defensively |
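
The defensive row handling could look roughly like the sketch below, which normalizes the three row shapes into plain dicts before aggregation. The field names follow the AgentProcess examples in this document; the positional column order for list/tuple rows and the function name `normalize_row` are assumptions.

```python
from types import SimpleNamespace
from typing import Any

# Assumed column order for positional (list/tuple) rows.
FIELDS = ("id", "state", "tokens_used")


def normalize_row(row: Any) -> dict:
    """Convert a dict, list/tuple, or attribute-style row into a plain dict."""
    if isinstance(row, dict):
        return {name: row.get(name) for name in FIELDS}
    if isinstance(row, (list, tuple)):
        # Pad short rows with None so missing columns do not raise.
        padded = list(row) + [None] * (len(FIELDS) - len(row))
        return dict(zip(FIELDS, padded))
    # Fall back to attribute access for record-like objects.
    return {name: getattr(row, name, None) for name in FIELDS}


rows = [
    {"id": "proc:1", "state": "completed", "tokens_used": 1500},
    ("proc:2", "completed", 2300),
    SimpleNamespace(id="proc:3", state="failed", tokens_used=800),
]
normalized = [normalize_row(r) for r in rows]

# Treat missing token counts as zero when aggregating.
total_tokens = sum(r["tokens_used"] or 0 for r in normalized)
```

Missing fields become None instead of raising, so a malformed row degrades the aggregate rather than aborting the whole compaction pass.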

Integration Points

  • Cognitive Scheduler (OS-5.2): The scheduler can trigger periodic compaction after it has preempted and completed many processes
  • Telemetry Engine (OS-5.6): Compaction events are logged for observability dashboards
  • Memory Tiers (KG-2.6): Compaction operates on the episodic tier, preserving semantic and procedural memories

Implementation Details

  • Source Code: agent_context.py (SemanticCompactor)
  • Classes: SemanticCompactor
  • Tests: test_synergies.py
  • Pillar: KG
  • Package Export: agent_utilities.knowledge_graph.memory.SemanticCompactor