Generalizing GEPA (CONCEPT:ORCH-1.30)
Overview
Makes GEPA-optimized skills transfer beyond the optimization split, the property behind the AppWorld RLM-GEPA result
(the +7.2 pp SGC gain that held on the held-out test set). Adds the GEPA paper's held-out feedback/Pareto split,
predict-rlm's AgentSpec anti-overfit grounding, and held-out candidate selection. Extends ORCH-1.13.
How it works
- Held-out split (`split_dataset`): `D_train` → `D_feedback` (propose on) + `D_pareto` (held-out, select on). With `dev_fraction > 0`, `GEPAOptimizer.optimize` proposes and evaluates on the feedback set and selects the final candidate by held-out score (`select_best_on_heldout` over `_score_candidate_on`), so a candidate that merely memorized the minibatch does not win.
- `AgentSpec` grounding: `use_cases` + `runtime_grounding` + `scoring_rule` + `counterfactual_axis` are prepended (`as_prompt()`) to the reflective-mutation prompt, steering the proposer toward a general standard operating procedure rather than rules that overfit the examples.
- Patch-merge selection: `select_best_on_heldout` picks the winning instruction graft on the held-out set (ties → earlier/simpler generation).
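The split and the held-out selection described above can be sketched in a few lines. This is a minimal illustration, not the code in `rlm/gepa.py`: the `_sketch` function names, their signatures, the seeded shuffle, and the mean-score aggregation are all assumptions made for the example. Only the ideas (a feedback/held-out split controlled by `dev_fraction`, and ties going to the earlier generation) come from the text.

```python
import random
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def split_dataset_sketch(
    train: Sequence[T], dev_fraction: float, seed: int = 0
) -> tuple[list[T], list[T]]:
    """Split D_train into (D_feedback, D_pareto); D_pareto is the held-out part.

    Hypothetical signature: the real split_dataset may differ.
    """
    if not 0.0 <= dev_fraction < 1.0:
        raise ValueError("dev_fraction must be in [0, 1)")
    items = list(train)
    # Seeded shuffle (an assumption) keeps the split reproducible across runs.
    random.Random(seed).shuffle(items)
    n_heldout = int(len(items) * dev_fraction)
    # The first n_heldout items are reserved for selection only.
    return items[n_heldout:], items[:n_heldout]


def select_best_on_heldout_sketch(
    candidates: Sequence[str],
    heldout: Sequence[T],
    score: Callable[[str, T], float],
) -> str:
    """Return the candidate with the highest mean held-out score.

    Candidates are assumed to be ordered by generation, so the strict ">"
    comparison keeps the earlier candidate when scores tie.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    best, best_score = candidates[0], float("-inf")
    for cand in candidates:
        mean = sum(score(cand, ex) for ex in heldout) / max(len(heldout), 1)
        if mean > best_score:
            best, best_score = cand, mean
    return best
```

Because the proposer only ever sees the feedback part, a candidate that scores well by memorizing those examples gains no advantage when `select_best_on_heldout_sketch` ranks it on the held-out part.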
Key files / API
| Piece | Location |
|---|---|
| Generalization core | `rlm/gepa.py` (`AgentSpec`, `split_dataset`, `select_best_on_heldout`, `GEPAOptimizer.optimize(dev_fraction=...)`, `_score_candidate_on`) |
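For orientation, the `AgentSpec` grounding listed in the table might look roughly like the sketch below. Only the four field names and the `as_prompt()` method name come from this page; the class name `AgentSpecSketch`, the string field types, the rendered layout, and the helper `build_reflection_prompt` are hypothetical and do not reflect the actual schema in `rlm/gepa.py` or predict-rlm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpecSketch:
    """Hypothetical stand-in for AgentSpec; field types are assumed to be str."""

    use_cases: str
    runtime_grounding: str
    scoring_rule: str
    counterfactual_axis: str

    def as_prompt(self) -> str:
        # Section labels and ordering are illustrative, not the real template.
        return "\n".join(
            [
                f"Use cases: {self.use_cases}",
                f"Runtime grounding: {self.runtime_grounding}",
                f"Scoring rule: {self.scoring_rule}",
                f"Counterfactual axis: {self.counterfactual_axis}",
            ]
        )


def build_reflection_prompt(spec: AgentSpecSketch, reflection_prompt: str) -> str:
    """Prepend the rendered spec so it precedes the example-specific feedback."""
    return spec.as_prompt() + "\n\n" + reflection_prompt
```

Placing the spec first means the proposer reads the general operating context before any individual failure trace, which is the anti-overfit intent described under "How it works".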
Wiring (≤3 hops)
`graph_orchestrate(action="rlm_optimize")` → `optimize_rlm_skill` → `GEPAOptimizer.optimize`.
Research provenance
GEPA paper (Agrawal et al., ICLR 2026; `D_feedback`/`D_pareto` split, Algorithm 1); predict-rlm `src/rlm_gepa/schema.py` (`AgentSpec`); verified.