Code Intelligence: type/scope-resolved call graph (CONCEPT:KG-2.100)
The Knowledge Graph models code as :Code symbols linked by :calls, :inherits,
:realizes, :dependsOn, :covers. The accuracy of those links is set by how
calls are resolved. Before KG-2.100, resolution was name-only: a method call
obj.run() bound to any symbol named run, and a callee name shared by more than
ten symbols was dropped entirely. KG-2.100 makes resolution type- and
scope-aware, computed in the Rust engine and shipped already-resolved.
What changed
- Resolution is in Rust, in one round-trip. The `epistemic-graph` engine's `IndexRepository` op parses every file (rayon) and resolves cross-file calls over the whole batch, returning one merged graph. The Python pipeline calls it once instead of doing a separate parse pass plus a per-symbol Python resolution loop.
- A resolution ladder, most-specific first (`crates/eg-compute/src/parser/resolve.rs`):
  - `same_file`: a definition in the caller's own file.
  - `scoped`: a `self`/`this`/`super` (or implicit-this) call, or an explicit receiver naming a class, binds to that class's method or an inherited one.
  - `arity`: same-name overloads disambiguated by argument count.
  - `unique`: a single definition anywhere in the batch.
  - otherwise unresolved: an ambiguous callee is never guessed.

  Every resolved `calls` edge carries a `strategy` and a `confidence`.
- Structural class edges. Class base/interface lists produce `inherits` (subclass to base) and `realizes` (class to interface) edges, resolved the same conservative way.
- The OWL layer reasons over them. `:inherits` is transitive, so the reasoner extrapolates inheritance chains; `:calls` reachability stays available to the graph algorithms (PageRank / community detection) over the now-accurate edges.
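The ladder above can be pictured with a small Python model. This is only a sketch of one plausible reading of the rungs, not the Rust code in `resolve.rs`: the `Definition` and `CallSite` shapes, the `bases` map, and the rule that the `arity` rung only applies when several overloads exist are all assumptions made for illustration. The `confidence` value is left out because the text does not say how it is computed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Definition:
    name: str
    file: str
    owner: Optional[str] = None  # enclosing class if the symbol is a method
    arity: int = 0               # declared parameter count


@dataclass(frozen=True)
class CallSite:
    callee: str
    file: str
    receiver_class: Optional[str] = None  # class implied by self/this/super or an explicit receiver
    argc: int = 0


def resolve_site(site, defs, bases):
    """Return (definition, strategy), or (None, "unresolved") when ambiguous."""
    candidates = [d for d in defs if d.name == site.callee]

    # 1. same_file: exactly one definition in the caller's own file.
    local = [d for d in candidates if d.file == site.file]
    if len(local) == 1:
        return local[0], "same_file"

    # 2. scoped: look on the receiver's class, then walk up its bases.
    if site.receiver_class is not None:
        seen, queue = set(), [site.receiver_class]
        while queue:
            cls = queue.pop(0)
            if cls in seen:
                continue
            seen.add(cls)
            owned = [d for d in candidates if d.owner == cls]
            if len(owned) == 1:
                return owned[0], "scoped"
            queue.extend(bases.get(cls, []))

    # 3. arity: several overloads, exactly one matches the argument count.
    if len(candidates) > 1:
        by_arity = [d for d in candidates if d.arity == site.argc]
        if len(by_arity) == 1:
            return by_arity[0], "arity"

    # 4. unique: a single definition anywhere in the batch.
    if len(candidates) == 1:
        return candidates[0], "unique"

    # Ambiguous callees are never guessed.
    return None, "unresolved"


# Hypothetical batch used only to exercise each rung.
DEFS = [
    Definition("run", "a.py", owner="Base"),
    Definition("run", "b.py", owner="Other"),
    Definition("load", "c.py", arity=1),
    Definition("load", "d.py", arity=2),
    Definition("helper", "e.py"),
]
BASES = {"Child": ["Base"]}
```

With this toy batch, a `run` call on a `Child` receiver binds to `Base.run` through the `scoped` rung, while a bare `run()` from an unrelated file stays unresolved because two candidates remain.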
The two ingest paths, one resolver
Both code-ingest paths consume the same Rust resolver:
- Local `EnrichmentPipeline` (`enrichment/pipeline.py`): when the engine advertises `IndexRepository`, one `index_repository` call yields the symbols and the resolved `CALLS`/`INHERITS`/`REALIZES` edges. Name-only resolution remains only as the engine-unreachable fallback.
- GitLab / source-sync (`core/gitlab_indexer.py`): already shipped each project to `index_repository`; it now also passes the `inherits`/`realizes` edges through, namespaced per instance.
Surfaces
The resolved graph is queryable on both surfaces (same _execute_tool core):
- MCP: `graph_analyze(action="call_graph", node_id=<symbol>, target=<callees|callers|inherits>)`.
- REST: `GET /graph/analyze/call-graph?id=<symbol>&direction=<callees|callers|inherits>` (the action-routed `POST /graph/analyze` also accepts it).
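As a rough illustration of the REST form, the helper below builds the query URL with the standard library. The base URL and the example symbol id are placeholders (the text does not specify the host or the id format); only the path and the `id` / `direction` parameters come from the description above.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"  # assumption: replace with the real KG server address
DIRECTIONS = {"callees", "callers", "inherits"}


def call_graph_url(symbol_id: str, direction: str = "callees", base_url: str = BASE_URL) -> str:
    """Build the GET URL for the call-graph query described above."""
    if direction not in DIRECTIONS:
        raise ValueError(f"direction must be one of {sorted(DIRECTIONS)}")
    query = urlencode({"id": symbol_id, "direction": direction})
    return f"{base_url}/graph/analyze/call-graph?{query}"


# "Runner.run" is a made-up symbol id used only for illustration.
url = call_graph_url("Runner.run", "callers")
```

The resulting string can be passed to any HTTP client; validating `direction` locally avoids a round-trip for an obviously malformed request.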
Flow
flowchart LR
Files["Source files"] --> Index["index_repository RPC"]
Index --> Parse["tree-sitter parse: scope, arity, call_sites, bases"]
Parse --> Resolve["resolve_site ladder: same_file, scoped, arity, unique"]
Resolve --> Edges["calls with strategy + confidence; inherits; realizes"]
Edges --> KG["Knowledge Graph :Code symbols"]
KG --> OWL["OWL reasoning: transitive inherits, call reachability"]
KG --> Query["call_graph query: MCP action and REST twin"]
OWL --> Query
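The "transitive inherits" step in the flow amounts to closing the direct `inherits` edges under reachability. The sketch below shows that idea in plain Python over a hypothetical class map; it is not the OWL reasoner itself, and the class names are invented.

```python
def inherits_closure(direct: dict[str, set[str]]) -> dict[str, set[str]]:
    """Map each class to every ancestor reachable over direct inherits edges."""
    closure = {}
    for cls in direct:
        seen = set()
        stack = list(direct[cls])
        while stack:
            base = stack.pop()
            if base in seen:
                continue  # guards against diamonds and accidental cycles
            seen.add(base)
            stack.extend(direct.get(base, ()))
        closure[cls] = seen
    return closure


# Hypothetical chain: C inherits B, B inherits A.
DIRECT = {"C": {"B"}, "B": {"A"}}
ANCESTORS = inherits_closure(DIRECT)
```

Here `ANCESTORS["C"]` contains both `B` and `A`, which is the kind of inferred edge a transitive property yields without anyone asserting `C inherits A` directly.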
Key files
| Layer | File |
|---|---|
| Rust extraction | epistemic-graph/crates/eg-compute/src/parser/tree_sitter.rs |
| Rust resolution | epistemic-graph/crates/eg-compute/src/parser/resolve.rs |
| Pipeline consumer | agent_utilities/knowledge_graph/enrichment/pipeline.py, enrichment/extractors/code_test.py |
| GitLab consumer | agent_utilities/knowledge_graph/core/gitlab_indexer.py |
| Ontology | agent_utilities/knowledge_graph/ontology_software.ttl (:inherits, :realizes) |
| Reasoning | agent_utilities/knowledge_graph/core/owl_bridge.py |
| Surfaces | agent_utilities/mcp/tools/analysis_tools.py, agent_utilities/mcp/kg_server.py |
Model-free similarity (CONCEPT:KG-2.101)
Code search and clone detection must keep working when the embedder is offline (the recurring GB10 502s). So similarity is model-free, computed in the same Rust round-trip:
- Each symbol gets a MinHash signature over its normalized AST-leaf trigrams — identifiers/strings/numbers/types are abstracted to class tokens (so a renamed-variable clone still matches) while keywords/operators/punctuation are kept verbatim (so structure is preserved).
- The resolver LSH-bands the signatures: symbols colliding in any band are candidate pairs, linked with a symmetric scored `similar_to` edge when their estimated Jaccard ≥ 0.5 (capped per node; mega-buckets skipped). The signature is a compute-only input and is stripped from the graph nodes.
- `:similarTo` is a symmetric OWL property; the reasoner closes it both ways.
- Query it embedder-free: `graph_analyze(action="similar_code", node_id=…)` / `GET /graph/analyze/similar-code?id=…`.
This is the near-clone signal B4 reuses for CodeClone.
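The MinHash and LSH-banding steps can be sketched in a few lines of Python. This is a simplified stand-in, not the engine's implementation: a regex tokenizer replaces real AST leaves, the keyword set is a tiny sample, and the band count, row count, and bucket cap are assumed values. Only the 0.5 Jaccard threshold comes from the text, and the per-node cap is omitted.

```python
import hashlib
import re
from itertools import combinations

KEYWORDS = {"def", "return", "if", "else", "for", "while", "in"}  # illustrative subset
TOKEN_RE = re.compile(r'[A-Za-z_]\w*|\d+|"[^"]*"|\S')


def normalize(source):
    """Abstract identifiers/numbers/strings to class tokens; keep the rest verbatim."""
    out = []
    for tok in TOKEN_RE.findall(source):
        if tok in KEYWORDS:
            out.append(tok)
        elif tok[0].isalpha() or tok[0] == "_":
            out.append("ID")
        elif tok[0].isdigit():
            out.append("NUM")
        elif tok[0] == '"':
            out.append("STR")
        else:
            out.append(tok)  # operators and punctuation preserve structure
    return out


def trigrams(tokens):
    return {tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)}


def minhash(shingles, num_perm):
    """One minimum per seeded hash function; the seed goes in the blake2b salt."""
    def h(shingle, seed):
        digest = hashlib.blake2b(repr(shingle).encode(), digest_size=8,
                                 salt=seed.to_bytes(8, "little")).digest()
        return int.from_bytes(digest, "little")
    return [min(h(s, seed) for s in shingles) for seed in range(num_perm)]


def similar_pairs(sources, bands=16, rows=4, threshold=0.5, max_bucket=50):
    sigs = {}
    for name, src in sources.items():
        shingles = trigrams(normalize(src))
        if shingles:  # symbols too short to form a trigram are skipped
            sigs[name] = minhash(shingles, bands * rows)
    buckets = {}
    for name, sig in sigs.items():
        for b in range(bands):
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets.setdefault(key, []).append(name)
    pairs = {}
    for members in buckets.values():
        if len(members) > max_bucket:
            continue  # skip mega-buckets
        for a, b in combinations(sorted(members), 2):
            score = sum(x == y for x, y in zip(sigs[a], sigs[b])) / (bands * rows)
            if score >= threshold:
                pairs[(a, b)] = score  # estimated Jaccard
    return pairs


# Hypothetical symbols: "add" and "plus" differ only in names.
SOURCES = {
    "add": "def add(x, y): return x + y",
    "plus": "def plus(a, b): return a + b",
    "count": "while n > 0: n = n - 1",
}
PAIRS = similar_pairs(SOURCES)
```

Because renamed identifiers collapse to the same class token, `add` and `plus` produce identical signatures and are paired, while `count` shares no trigram with them and never lands in a common band.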
Code ↔ service linking (CONCEPT:KG-2.102)
Routes are the seam between a service's code and the live ecosystem. From the route
decorators the parser captured, the routes pass emits Route nodes (method+path)
and serves edges (handler Code → Route); a best-effort name match links each
Route to a deployed ecosystem Service (servedBy). This is where the OWL layer goes
beyond per-repo tooling: reasoning chains Code –serves→ Route –servedBy→ Service
–deployedOn→ Node, a fact a siloed per-repo code tool can't produce because it
never sees the topology. Query it: graph_analyze(action="routes") /
GET /graph/analyze/routes. (gRPC/GraphQL detection and event channels are later
increments.)
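To make the chain concrete, here is a minimal sketch that follows the three predicates over an in-memory edge list. The node identifiers are invented for illustration and the traversal is plain Python, not the OWL reasoner or the real graph backend.

```python
# Hypothetical edges: (subject, predicate, object).
EDGES = [
    ("code:orders.create_order", "serves", "route:POST /orders"),
    ("route:POST /orders", "servedBy", "service:orders-api"),
    ("service:orders-api", "deployedOn", "node:host-a"),
    ("code:users.get_user", "serves", "route:GET /users/{id}"),
]


def follow(edges, start, path):
    """Follow a sequence of predicates from a start node; return the reachable nodes."""
    frontier = {start}
    for predicate in path:
        frontier = {o for s, p, o in edges if p == predicate and s in frontier}
    return frontier


CHAIN = ["serves", "servedBy", "deployedOn"]
nodes = follow(EDGES, "code:orders.create_order", CHAIN)
```

A handler whose route has no `servedBy` match (such as `users.get_user` in this toy data) yields an empty set, reflecting that the name match is best-effort.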
Infra, coupling, clones, decisions (CONCEPT:KG-2.103–2.105)
The graph spans past the code itself:
- IaC → Resource (KG-2.103). Dockerfiles, K8s/Kustomize manifests, and Terraform are parsed into `Resource` nodes (image/kind/name) and linked to the deployed `Service` they `provision`, so code → infra → topology is one graph.
- Git change-coupling → FILE_CHANGES_WITH (KG-2.104). Files that keep changing together get a symmetric weighted edge: the hidden blast radius the AST can't see. `graph_analyze(action="change_coupling", target=<repo>)`.
- Near-clones are the `similar_to` edges from KG-2.101 (MinHash); no separate pass is needed.
- ADRs (KG-2.105). `graph_analyze(action="adr")` creates/lists `ArchitectureDecisionRecord` nodes so design decisions live in the same KG.
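The change-coupling idea can be illustrated by counting how often two files appear in the same commit. This is a sketch under assumptions: the text does not say how the edge weight is computed, so a raw co-change count with a hypothetical `min_shared` cutoff is used here, and the commit data is made up.

```python
from collections import Counter
from itertools import combinations


def change_coupling(commits, min_shared=2):
    """Count co-changes per unordered file pair; keep pairs seen at least min_shared times."""
    pair_counts = Counter()
    for files in commits:
        # Sorting makes (a, b) canonical, so the edge is symmetric by construction.
        for a, b in combinations(sorted(set(files)), 2):
            pair_counts[(a, b)] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_shared}


# Hypothetical history: each inner list is the set of files touched by one commit.
COMMITS = [
    ["api.py", "schema.sql"],
    ["api.py", "schema.sql", "README.md"],
    ["README.md"],
]
COUPLED = change_coupling(COMMITS)
```

In this toy history only `api.py` and `schema.sql` change together twice, so they are the single pair that would receive an edge, even though nothing in the AST links a Python file to a SQL schema.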
Grammar coverage (CONCEPT:KG-2.106)
The core ast tier parses 9 languages (Python/JS/TS/Go/Rust/Java/C/C++/C#). The
feature-gated ast-extended tier (folded into the engine's full build) adds
Ruby, PHP, Bash, Scala, and Lua — so a slim build stays lean while the deployed
engine spans the common ecosystem languages. New grammars wire in by adding the
crate to the gate, an extension→grammar arm, and any new node-kind mappings; the
resolver/similarity/route passes are language-agnostic and benefit for free.