egeria-mcp Overview
egeria-mcp wraps the Apache Egeria OMAG platform's View Server (OMVS) REST
surface with thin MCP tools, and is the write side of the Egeria↔KG federation.
Egeria is federated into agent-utilities as the metadata / governance / lineage system-of-record, alongside the epistemic-graph Knowledge Graph (the cognition/orchestration plane). Two hard invariants hold:
- The KG never becomes the lineage store: Egeria owns data lineage, the business glossary, and data-governance classifications.
- Egeria never orchestrates: `graph_orchestrate` / the policy router stay the orchestration brain; Egeria is the metadata oracle they query and write provenance back to.
What it provides
- `EgeriaApi` (`egeria_mcp.api.api_client_egeria`): a tolerant raw-`httpx` REST facade over the View Server. It deliberately avoids the `pyegeria` runtime dependency: pyegeria's synchronous wrappers call `asyncio.get_event_loop()`, which raises on Python 3.14; plain `httpx` works identically on 3.11 and 3.14. Every call degrades to `[]` / a clear error rather than raising. It is also the injected `config["client"]` for the KG `egeria` enrichment extractor.
- MCP tools (`egeria-mcp` console script), 21 tools:
  - Read (granular): `egeria_asset_search`, `egeria_glossary_lookup`, `egeria_glossary_categories`, `egeria_lineage`, `egeria_governance_for`, `egeria_list_policies`.
  - Read (broad, action-dispatch): `egeria_catalog` (assets, connections, connector-types, endpoints, infrastructure, technology types, schema types/attributes), `egeria_data_design` (data structures/fields/value specs), `egeria_collection` (collections, digital products), `egeria_solution` (information supply chains, blueprints, components), `egeria_governance_catalog` (governance definitions, external references, valid values), `egeria_actors` (actor profiles/roles, user identities, projects, communities, locations, cohorts), `egeria_metadata` (generic find/get across all element types).
  - Routing: `egeria_governed_route`, the federation delivering a decision.
  - Harvest (write-gated): `egeria_harvest(layer)` dispatches any of 34 bottom-up source layers (or `all`) via the runner, spanning infrastructure (hosts, Swarm/containers, DNS, Caddy), data stores (Postgres, MongoDB, Qdrant, Jena RDF, Kafka), business systems (Firefly + emerald-exchange markets, Twenty CRM, ERPNext, Listmonk, Nextcloud, data-science ML), identity/governance (Keycloak, OpenBao, ServiceNow), knowledge/EA (Confluence, M365, ArchiMate, LeanIX), automation/code/work (Ansible, Camunda, Plane/Jira, GitLab, GitHub), and collaboration/observability (Mattermost, Grafana, Uptime Kuma, ArchiveBox, Langfuse). (The original ten also keep dedicated `egeria_harvest_*` tools.)
  - Reconcile (write-gated): `egeria_reconcile` cross-links the harvested layers into one graph (`reconcile()`): deterministic matchers create labelled `DataFlow` edges across layers (host→asset, service↔store, dataset→store, ingress→service, monitor→target, CMDB identity, access-control, glossary semantic assignment) and propagate confidentiality up hosting chains, so `governed_route` impact is cross-layer-aware. Idempotent. Also available as `python -m egeria_mcp.harvest reconcile`.
  - Audit (read-only): `egeria_audit` reports unlinked "island" assets and per-layer lineage coverage %, that is, what reconciliation/harvest still misses.

    The reconciled graph also federates back into the KG: `EgeriaApi.list_data_flows()` enumerates the catalogue's lineage edges and the agent-utilities `egeria` extractor turns each into a `:flowsTo` (data movement) or `:dependsOn` (structural) edge, so the epistemic-graph sees the whole estate as one dependency graph.
  - Write (gated by `EGERIA_ENABLE_WRITE`): `egeria_classify`, `egeria_create_term`, `egeria_create_asset`, `egeria_create_collection`, `egeria_create_project`, `egeria_assert_lineage`.
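The "degrades to `[]` / a clear error rather than raising" behaviour of the facade can be illustrated with a small wrapper. This is a minimal sketch, not the `EgeriaApi` implementation: the `tolerant` decorator, the `last_error` attribute, and the `find_assets` function are hypothetical names introduced only for illustration.

```python
import functools
from typing import Any, Callable


def tolerant(default_factory: Callable[[], Any] = list):
    """Wrap a call so a failure yields a default value instead of raising.

    Hypothetical helper: the real facade may report errors differently.
    """
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                result = fn(*args, **kwargs)
                wrapper.last_error = None
                return result
            except Exception as exc:
                # Record a readable error and fall back to the default ([]).
                wrapper.last_error = f"{fn.__name__} failed: {exc}"
                return default_factory()
        wrapper.last_error = None
        return wrapper
    return decorate


@tolerant()
def find_assets(search: str) -> list:
    # Stand-in for a REST call that fails because the server is down.
    raise ConnectionError("view server unreachable")


assets = find_assets("orders")
print(assets, "|", find_assets.last_error)
```

The caller always receives a list and can inspect the recorded message, which is the property the KG extractor relies on when the client is injected as `config["client"]`.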
Governed routing: the federation delivering value
`egeria_governed_route(asset_guid)` consults Egeria's Confidentiality
classification and downstream `DataFlow` lineage for an asset and returns an
enforceable decision the policy router acts on:
| Confidentiality level | Downstream lineage | Decision |
|---|---|---|
| ≥ 2 (Confidential/Sensitive/Restricted) | any | require_approval |
| < 2 | > 0 | review |
| < 2 | 0 | proceed |
Egeria's Confidentiality scale: 0 Unclassified · 1 Internal · 2 Confidential ·
3 Sensitive · 4 Restricted.
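The decision table above reduces to two comparisons. The following is a minimal sketch of that rule, assuming the confidentiality level and the count of downstream `DataFlow` edges have already been fetched; the names `Confidentiality` and `route_decision` are hypothetical and not taken from the egeria-mcp source.

```python
from enum import IntEnum


class Confidentiality(IntEnum):
    """Egeria's Confidentiality scale as listed above."""
    UNCLASSIFIED = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    SENSITIVE = 3
    RESTRICTED = 4


def route_decision(level: int, downstream_count: int) -> str:
    """Map (confidentiality level, downstream lineage count) to a decision."""
    if level >= Confidentiality.CONFIDENTIAL:
        # Level >= 2 always needs approval, whatever the lineage looks like.
        return "require_approval"
    if downstream_count > 0:
        # Low confidentiality, but other assets consume this one.
        return "review"
    return "proceed"


for level, downstream in [(3, 0), (1, 4), (0, 0)]:
    print(level, downstream, route_decision(level, downstream))
```

Checking confidentiality first means a Confidential asset with no downstream consumers still requires approval, matching the "any" in the first table row.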
Bottom-up harvest
`egeria_mcp.harvest` populates Egeria from the data estate in lineage order so
every edge resolves to an already-catalogued target. The data-store layer
(`harvest/datastores.py`, declared in `harvest/topology.py`) is the anchor: it
idempotently catalogs the business-glossary backbone, the data-store servers +
databases (with Confidentiality classifications), and the `DataFlow` lineage
between them. Run it with `python -m egeria_mcp.harvest` (needs
`EGERIA_ENABLE_WRITE=true`).

Higher layers follow the same config-driven, tolerant pattern:
- Camunda (`harvest_processes`): BPMN definitions → Process assets; configured with `CAMUNDA7_URL`.
- ERPNext (`harvest_erpnext`): DocTypes → data assets with confidentiality by data kind; configured with `ERPNEXT_URL` + `ERPNEXT_TOKEN`.
- GitLab (`harvest_repositories`): projects → DeployedSoftwareComponent assets; configured with `GITLAB_URL` + `GITLAB_TOKEN`.

Each skips gracefully (reported, not raised) when its source is unconfigured/unreachable.

The estate is declared in `harvest/topology.py` as a generic, non-sensitive
example. Point the harvest at your real data stores with the
`EGERIA_HARVEST_TOPOLOGY` environment variable (path to a JSON file of the same
shape). Keep that file outside any public repository so internal
hostnames/addresses are never published.
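The override can be pictured as a simple "environment variable wins, built-in example otherwise" lookup. This is a sketch under assumptions: `load_topology` and `EXAMPLE_TOPOLOGY` are hypothetical names, and the `{"servers": []}` shape is a placeholder, since the real shape is whatever `harvest/topology.py` declares.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional

# Placeholder for the generic example estate; the real shape lives in
# harvest/topology.py and is NOT reproduced here.
EXAMPLE_TOPOLOGY = {"servers": []}


def load_topology(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the topology from EGERIA_HARVEST_TOPOLOGY, else the example."""
    env = os.environ if env is None else env
    path = env.get("EGERIA_HARVEST_TOPOLOGY")
    if not path:
        # No override configured: use the non-sensitive built-in example.
        return EXAMPLE_TOPOLOGY
    # The override is a JSON file kept outside the public repository.
    return json.loads(Path(path).read_text(encoding="utf-8"))


print(load_topology({}))
```

Passing the environment as a mapping keeps the lookup easy to exercise without touching the process environment.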
Egeria 6.0 REST contract (verified)
| Operation | Endpoint | Notes |
|---|---|---|
| Token | `POST {platform}/api/token` | `{userId,password}` → bearer JWT |
| Find | `POST .../{service}/{noun}/by-search-string` | `SearchStringRequestBody`; empty string = match-all |
| Create | `POST .../{service}/{noun}` | `NewElementRequestBody`, typed `properties.class` |
| Read-back | `POST .../asset-maker/assets/{guid}/retrieve` | classifications are named `elementHeader` keys |
| Classify | `POST .../classification-explorer/elements/{guid}/{name}` | bean field is `confidentialityLevel` |
| Lineage write | `POST .../lineage-linker/from-elements/{a}/via/DataFlow/to-elements/{b}/attach` | `DataFlowProperties` |
| Lineage read | `POST .../asset-catalog/assets/{guid}/as-lineage-graph` | edges in `element.lineageLinkage` |
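Two rows of the contract can be sketched as plain request builders using only the standard library. This is illustrative, not the client's code: the function names are hypothetical, the `service_base` argument stands in for the URL prefix that the table elides as `...` (it is not spelled out here), and sending the JWT as an `Authorization: Bearer` header is an assumption based on the "bearer JWT" note.

```python
from urllib.parse import quote


def token_request(platform: str, user_id: str, password: str) -> tuple[str, dict]:
    """URL and JSON body for `POST {platform}/api/token`."""
    url = f"{platform.rstrip('/')}/api/token"
    return url, {"userId": user_id, "password": password}


def bearer_headers(jwt: str) -> dict:
    """Headers for later calls (assumed standard bearer scheme)."""
    return {"Authorization": f"Bearer {jwt}"}


def lineage_attach_url(service_base: str, from_guid: str, to_guid: str) -> str:
    """URL for the DataFlow lineage write between two element GUIDs.

    `service_base` is the caller-supplied prefix shown as `...` in the table.
    """
    a, b = quote(from_guid, safe=""), quote(to_guid, safe="")
    return (
        f"{service_base.rstrip('/')}"
        f"/lineage-linker/from-elements/{a}/via/DataFlow/to-elements/{b}/attach"
    )


url, body = token_request("https://localhost:9443", "erinoverview", "secret")
print(url, sorted(body))
```

Keeping URL construction separate from the HTTP call makes the contract testable without a running OMAG platform.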
Integration with agent-utilities
- Extractor: `agent_utilities/knowledge_graph/enrichment/extractors/egeria.py` (CONCEPT:KG-2.9), a pure transform; `EgeriaApi` is injected as `config["client"]`.
- Ontology: `agent_utilities/knowledge_graph/ontology_egeria.ttl`, an ArchiMate crosswalk reusing the enterprise classes (GlossaryTerm→`:Concept`, Asset/Connection→`:DataConnector`, Policy→`:Policy`, DataFlow→`:flowsTo`).
- Federation key: every Egeria-sourced node carries `externalToolId` (the Egeria GUID) + `domain="egeria"`, so it reconciles with ServiceNow / ERPNext / Camunda / infra nodes by GUID/hostname rather than forking parallel nodes.
Configuration (environment)
| Var | Default | Meaning |
|---|---|---|
| `EGERIA_PLATFORM_URL` | `https://localhost:9443` | OMAG platform URL |
| `EGERIA_VIEW_SERVER` | `qs-view-server` | View server name |
| `EGERIA_USER` | `erinoverview` | User id |
| `EGERIA_USER_PASSWORD` | `secret` | Password / token |
| `EGERIA_VERIFY_SSL` | `False` | Verify TLS (self-signed homelab) |
| `EGERIA_ENABLE_WRITE` | `False` | Gate every write/harvest tool |
| `EGERIATOOL` | `True` | Register the Egeria tool set |
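The table can be read as a settings object with defaults plus a write gate. The sketch below is hypothetical: `EgeriaSettings`, `load_settings`, and `write_blocked_reason` are illustrative names, and the set of strings accepted as "true" is an assumption, since the document only shows `EGERIA_ENABLE_WRITE=true`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

_TRUTHY = {"1", "true", "yes", "on"}  # assumed spellings of "enabled"


def _flag(env: Mapping[str, str], name: str, default: bool) -> bool:
    raw = env.get(name)
    return default if raw is None else raw.strip().lower() in _TRUTHY


@dataclass(frozen=True)
class EgeriaSettings:
    platform_url: str
    view_server: str
    user: str
    password: str
    verify_ssl: bool
    enable_write: bool


def load_settings(env: Optional[Mapping[str, str]] = None) -> EgeriaSettings:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return EgeriaSettings(
        platform_url=env.get("EGERIA_PLATFORM_URL", "https://localhost:9443"),
        view_server=env.get("EGERIA_VIEW_SERVER", "qs-view-server"),
        user=env.get("EGERIA_USER", "erinoverview"),
        password=env.get("EGERIA_USER_PASSWORD", "secret"),
        verify_ssl=_flag(env, "EGERIA_VERIFY_SSL", False),
        enable_write=_flag(env, "EGERIA_ENABLE_WRITE", False),
    )


def write_blocked_reason(settings: EgeriaSettings) -> Optional[str]:
    """Return a clear error message when writes are gated off, else None."""
    if settings.enable_write:
        return None
    return "write tools are disabled; set EGERIA_ENABLE_WRITE=true"


print(write_blocked_reason(load_settings({})))
```

Returning a message instead of raising mirrors the tolerant style described for the facade; a write or harvest tool would check it before issuing any request.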