Epistemic Resource Scheduler (CONCEPT:OS-5.8)

Overview

Traditional schedulers allocate compute using simple heuristics such as nice values or first-come, first-served queues. The Epistemic Resource Scheduler instead scales scheduling priority, CPU affinity, and execution token budgets dynamically, based on the topological importance of each agent in the active Knowledge Graph.

By computing the degree centrality of active specialists, the scheduler prioritizes nodes that act as reasoning bottlenecks (e.g., shared planners or legal validators).

Priority Formula

The scheduler applies a two-stage priority adjustment when an agent process is submitted:

Stage 1: Centrality Computation

centrality = degree(agent_id) / (num_nodes - 1)

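Where degree(agent_id) is the number of edges, incoming and outgoing, connected to the agent's node in the active Knowledge Graph, and num_nodes is the total number of nodes in that graph. Degree centrality is used here as an inexpensive proxy for eigenvector centrality on sparse graphs; it counts direct connections only and does not weight neighbours by their own importance.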

Centrality Range    Interpretation
0.0 – 0.3           Peripheral agent (leaf node, few connections)
0.3 – 0.6           Mid-tier agent (moderate connectivity)
0.6 – 1.0           Hub agent (routing bottleneck, many dependents)
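
The Stage 1 formula and the range table can be sketched in plain Python as follows. This is a minimal illustration, not the scheduler's code: the function names, the edge-list representation, and the choice to treat each range's lower bound as inclusive are assumptions. Because incoming and outgoing edges are both counted, a node with reciprocal edges can score above 1.0; this sketch does not clamp the value.

```python
def degree_centrality(edges, agent_id, num_nodes):
    """Return degree(agent_id) / (num_nodes - 1) for a directed edge list.

    `edges` is a list of (source, target) pairs; both directions count.
    """
    if num_nodes < 2:
        return 0.0  # a single-node graph has no possible neighbours
    degree = sum((src == agent_id) + (dst == agent_id) for src, dst in edges)
    return degree / (num_nodes - 1)


def classify(centrality):
    """Map a centrality score to the bands in the table (lower bound inclusive: an assumption)."""
    if centrality >= 0.6:
        return "hub"
    if centrality >= 0.3:
        return "mid-tier"
    return "peripheral"


# Hypothetical five-node graph: a shared planner connected to every other agent.
EDGES = [
    ("planner", "a"),
    ("planner", "b"),
    ("planner", "c"),
    ("d", "planner"),
    ("a", "b"),
]

for agent in ("planner", "a", "c"):
    score = degree_centrality(EDGES, agent, num_nodes=5)
    print(agent, score, classify(score))
```

Here the planner touches four of the four other nodes and lands in the hub band, while an agent with a single edge stays peripheral.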

Stage 2: Dynamic Scaling

Two adjustments are made based on centrality:

Token Quota Scaling

if centrality > 0.5:
    final_quota = base_quota * (1.0 + centrality)
else:
    final_quota = base_quota

Base Quota    Centrality    Final Quota
100,000       0.3           100,000 (unchanged)
100,000       0.6           160,000 (+60%)
100,000       0.8           180,000 (+80%)
100,000       1.0           200,000 (+100%)
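
As a quick check of the rule above, the following sketch reproduces the table rows. The function name and the rounding to a whole number of tokens are assumptions made for illustration.

```python
def scale_token_quota(base_quota, centrality, threshold=0.5):
    """Scale the quota by (1 + centrality) only when centrality exceeds the threshold."""
    if centrality > threshold:
        # Round to a whole token count (an assumption; the source does not say).
        return round(base_quota * (1.0 + centrality))
    return base_quota


for c in (0.3, 0.6, 0.8, 1.0):
    print(c, scale_token_quota(100_000, c))
```

Note that the comparison is strict, so an agent with centrality exactly 0.5 keeps its base quota, while one just above 0.5 already receives more than a 50% increase.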

Priority Boost

if centrality > 0.6 and priority > CRITICAL:
    adjusted_priority = max(CRITICAL, priority - 1)
else:
    adjusted_priority = priority

Original Priority    Centrality    Adjusted Priority
LOW (3)              0.7           NORMAL (2)
NORMAL (2)           0.8           HIGH (1)
HIGH (1)             0.9           CRITICAL (0)
CRITICAL (0)         1.0           CRITICAL (0) (unchanged)
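
The boost can be sketched with an integer enum in which lower values mean higher priority, matching the numbers in the table. The Priority enum and the boost_priority name are illustrative assumptions rather than the scheduler's actual types.

```python
from enum import IntEnum


class Priority(IntEnum):
    # Lower numeric value means higher priority, as in the table above.
    CRITICAL = 0
    HIGH = 1
    NORMAL = 2
    LOW = 3


def boost_priority(priority, centrality, threshold=0.6):
    """Raise priority by one level for hub agents; never go past CRITICAL."""
    if centrality > threshold and priority > Priority.CRITICAL:
        return Priority(max(Priority.CRITICAL, priority - 1))
    return priority


for prio, c in [(Priority.LOW, 0.7), (Priority.NORMAL, 0.8),
                (Priority.HIGH, 0.9), (Priority.CRITICAL, 1.0)]:
    print(prio.name, c, boost_priority(prio, c).name)
```

The boost moves a process up by at most one level per submission, so a LOW process on a hub node becomes NORMAL, not CRITICAL.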

Preemption Protocol

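The Cognitive Scheduler implements a four-stage preemption cascade that escalates from a logged warning, to a model downgrade, to checkpointing and pausing the process: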

┌──────────────────────────────────────────────────────────────┐
│ Stage 1: Budget Warning (85% threshold)                     │
│   → Log NEAR_QUOTA warning                                  │
│   → No action taken                                         │
├──────────────────────────────────────────────────────────────┤
│ Stage 2: Cost-Aware Auto-Downgrade (70% cost threshold)     │
│   → Switch to cheaper model tier (super → standard → lite)  │
│   → Continue execution with degraded quality                │
├──────────────────────────────────────────────────────────────┤
│ Stage 3: Token Quota Exceeded (100%)                        │
│   → Checkpoint context to KG                                │
│   → Move process to PAUSED state                            │
│   → Schedule next waiting process                           │
├──────────────────────────────────────────────────────────────┤
│ Stage 4: Cost Budget Exceeded                               │
│   → Try one more auto-downgrade                             │
│   → If no cheaper tier: checkpoint + preempt                │
└──────────────────────────────────────────────────────────────┘
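
One way to read the cascade is as a decision function over two usage fractions. The sketch below is an interpretation under stated assumptions: the function name, the action strings, the has_cheaper_tier flag, and the order in which the stages are checked (most severe first) are all hypothetical; only the thresholds come from the diagram.

```python
def cascade_action(token_frac, cost_frac, has_cheaper_tier):
    """Pick one action from token-quota and cost-budget usage (each 0.0 to 1.0+)."""
    if token_frac >= 1.0:
        # Stage 3: token quota exceeded.
        return "checkpoint_and_pause"
    if cost_frac >= 1.0:
        # Stage 4: try one more downgrade, otherwise checkpoint and preempt.
        return "downgrade_tier" if has_cheaper_tier else "checkpoint_and_pause"
    if cost_frac >= 0.70 and has_cheaper_tier:
        # Stage 2: cost-aware auto-downgrade.
        return "downgrade_tier"
    if token_frac >= 0.85:
        # Stage 1: warning only, no action taken.
        return "log_near_quota"
    return "continue"


print(cascade_action(0.90, 0.10, True))
print(cascade_action(0.50, 0.75, True))
print(cascade_action(0.50, 1.20, False))
```

Checking the most severe conditions first keeps a process that has crossed several thresholds at once from receiving only the mildest response.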

Context Paging

When a process is preempted, its context is serialized to a KG checkpoint:

import time
import uuid

# proc is the AgentProcess being preempted; the ID is a short random tag.
checkpoint_id = f"ckpt:{uuid.uuid4().hex[:8]}"
proc.checkpoint_id = checkpoint_id
proc.state = ProcessState.PAUSED
proc.preempted_at = time.time()

Resumption restores the checkpoint:

scheduler.resume(process_id)
# → RUNNING if capacity available, WAITING if full
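
The preempt and resume steps can be put together in a small self-contained model. ToyScheduler, its max_running capacity limit, and the stand-in ProcessState and AgentProcess types are assumptions made for illustration; the real CognitiveScheduler also serializes context to the Knowledge Graph, which is omitted here.

```python
import time
import uuid
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ProcessState(Enum):  # stand-in for the scheduler's own enum
    WAITING = "waiting"
    RUNNING = "running"
    PAUSED = "paused"


@dataclass
class AgentProcess:  # stand-in with only the fields used here
    process_id: str
    state: ProcessState = ProcessState.RUNNING
    checkpoint_id: Optional[str] = None
    preempted_at: Optional[float] = None


class ToyScheduler:
    def __init__(self, max_running):
        self.max_running = max_running
        self.processes = {}

    def add(self, proc):
        self.processes[proc.process_id] = proc

    def preempt(self, process_id):
        """Record a checkpoint ID and pause the process."""
        proc = self.processes[process_id]
        proc.checkpoint_id = f"ckpt:{uuid.uuid4().hex[:8]}"
        proc.state = ProcessState.PAUSED
        proc.preempted_at = time.time()
        return proc.checkpoint_id

    def resume(self, process_id):
        """Return to RUNNING if a slot is free, otherwise queue as WAITING."""
        proc = self.processes[process_id]
        if proc.state is not ProcessState.PAUSED:
            raise ValueError(f"{process_id} is not paused")
        running = sum(p.state is ProcessState.RUNNING for p in self.processes.values())
        proc.state = ProcessState.RUNNING if running < self.max_running else ProcessState.WAITING
        return proc.state
```

With max_running=1, resuming a paused process while another process occupies the only slot leaves it in WAITING, which matches the comment on scheduler.resume above.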

Inference Budget Control

Each AgentProcess carries an InferenceBudget with cost-aware tier management (Research: 2605.05701v1):

budget = InferenceBudget(
    cost_budget_usd=1.0,        # Max $1 spend
    current_tier="standard",     # Start with GPT-4o class
    auto_downgrade=True,         # Degrade before preempting
    fallback_chain=["super", "standard", "lite"],
    downgrade_threshold=0.70,    # Downgrade at 70% budget usage
)
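
To make the interaction between downgrade_threshold and fallback_chain concrete, here is a hedged sketch of how such a budget object might behave. BudgetSketch is a hypothetical stand-in, not the InferenceBudget class itself: the cost_spent_usd field and both methods are assumptions, and only the constructor fields mirror the example above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BudgetSketch:
    cost_budget_usd: float = 1.0
    current_tier: str = "standard"
    auto_downgrade: bool = True
    fallback_chain: List[str] = field(
        default_factory=lambda: ["super", "standard", "lite"]
    )
    downgrade_threshold: float = 0.70
    cost_spent_usd: float = 0.0  # hypothetical running total

    def next_cheaper_tier(self) -> Optional[str]:
        """Return the tier after the current one in the chain, or None at the end."""
        idx = self.fallback_chain.index(self.current_tier)
        if idx + 1 < len(self.fallback_chain):
            return self.fallback_chain[idx + 1]
        return None

    def record_cost(self, cost_usd: float) -> bool:
        """Add spend; step down one tier once usage reaches the threshold.

        Returns True if a downgrade happened.
        """
        self.cost_spent_usd += cost_usd
        usage = self.cost_spent_usd / self.cost_budget_usd
        cheaper = self.next_cheaper_tier()
        if self.auto_downgrade and usage >= self.downgrade_threshold and cheaper:
            self.current_tier = cheaper
            return True
        return False


budget = BudgetSketch()
print(budget.record_cost(0.50), budget.current_tier)
print(budget.record_cost(0.25), budget.current_tier)
```

In this reading the chain is ordered from most to least expensive, so a process that starts on "standard" has exactly one downgrade available before preemption becomes the only option.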

Tier Cost Model

Tier        Cost per 1K Tokens    Example Models
lite        $0.00015              Gemini Flash, GPT-4o-mini
standard    $0.002                Gemini Pro, GPT-4o
super       $0.015                Gemini Ultra, o3
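
The per-call cost follows directly from the table: tokens divided by 1,000, multiplied by the tier's rate. The sketch below uses the rates listed above; the dictionary and function names are assumptions.

```python
# Rates in USD per 1,000 tokens, copied from the tier cost table.
TIER_COST_PER_1K_USD = {
    "lite": 0.00015,
    "standard": 0.002,
    "super": 0.015,
}


def inference_cost(tokens, tier):
    """Return the USD cost of one inference call at the given tier."""
    return tokens / 1000 * TIER_COST_PER_1K_USD[tier]


# Compare the same 5,000-token call across tiers.
for tier in ("lite", "standard", "super"):
    print(tier, f"${inference_cost(5000, tier):.5f}")
```

At these rates the super tier costs 7.5 times as much per token as standard and 100 times as much as lite, which is why a single downgrade step stretches the remaining budget considerably.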

API

# Record an inference call with cost tracking
result = scheduler.record_inference(proc_id, tokens=5000, model_tier="standard")
# → {"within_budget": True, "cost_incurred": 0.01, "recommended_tier": "standard", "downgraded": False}

# Get budget statistics
stats = scheduler.get_budget_stats(proc_id)
# → {"budget_usage_pct": 45.2, "cost_remaining_usd": 0.548, ...}

# Get recommended tier
tier = scheduler.get_recommended_tier(proc_id)
# → "lite" (if budget pressure is high)

Process Lifecycle States

WAITING ──(capacity available)──► RUNNING ──(complete)──► COMPLETED
   ▲                                │
   │                                ├──(fail)──► FAILED
   │                                │
   └──(resume, no capacity)──── PAUSED ◄──(preempt)
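
The diagram can also be written as a transition table, which makes illegal moves easy to reject. The enum below is a stand-in for the scheduler's own ProcessState, and the PAUSED to RUNNING edge is taken from the resume behaviour described under Context Paging; the ALLOWED mapping and can_transition helper are hypothetical.

```python
from enum import Enum


class ProcessState(Enum):  # stand-in for the scheduler's own enum
    WAITING = "waiting"
    RUNNING = "running"
    PAUSED = "paused"
    COMPLETED = "completed"
    FAILED = "failed"


S = ProcessState

# Each state maps to the set of states it may move to next.
ALLOWED = {
    S.WAITING: {S.RUNNING},                        # capacity available
    S.RUNNING: {S.COMPLETED, S.FAILED, S.PAUSED},  # complete, fail, preempt
    S.PAUSED: {S.RUNNING, S.WAITING},              # resume with or without capacity
    S.COMPLETED: set(),                            # terminal
    S.FAILED: set(),                               # terminal
}


def can_transition(src, dst):
    """Return True if the lifecycle permits moving from src to dst."""
    return dst in ALLOWED[src]


print(can_transition(S.RUNNING, S.PAUSED))
print(can_transition(S.COMPLETED, S.RUNNING))
```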

Integration Points

  • Cognitive Scheduler (OS-5.2): This concept is implemented directly within CognitiveScheduler
  • Knowledge Graph (KG-2.0): Centrality computed from engine.graph (NetworkX)
  • Convergence Monitor (AHE-3.2): Optional multi-loop convergence tracking
  • AgentProcessNode (KG model): Processes persisted as KG nodes for observability

Implementation Details