ADR-0001 — Model Authority: Hybrid (YAML canonical, RDF projected) with a trigger
Status: Accepted
Date: 2026-05-26
Supersedes: none
Superseded by: none
Context
Middle-core is a model-driven generative runtime. One governed model compiles to disposable contracts that a hand-authored C# runtime executes to produce evidence. The north-star is "One Model, Many Projections."
Today the pipeline is:
  model/middle-core/model.yaml (CANONICAL, hand-authored)
    + model/middle-core/ontology/*.ttl (UFO-lite + BFO sidecar, hand-authored)
    + workflows/*.bpmn + decisions/*.dmn + projections/arcadedb.yaml
  -> tools/modelgen/ (deterministic Python generator)
  -> templates/middle-core/generated/
       * middle-core.linkml.yaml (structural IR)
       * *.g.cs (C# typed contracts)
       * js/, rust/ (other-language emits)
       * model-runtime.owl.ttl (OWL projection: verified, not source)
       * model-runtime.shacl.ttl (SHACL projection: verified, not source)
       * model-runtime.fixture.ttl (RDF instances)
  -> hand-authored C# runtime over an in-memory object graph -> evidence packs
The platform vision pushes this further. As deployed assets generate feedback, the ontology should evolve, and an RDF graph engine (likely backend-core's ArcadeDB, which is multi-model) is the natural long-term home for that growth — not a hand-edited YAML file. Pre-built foundational ontologies (UFO, BFO, PROV-O) would be loaded into that engine; the transformation suite would read RDF and emit C#, Ruby, JavaScript, Python and other languages; feedback closes the outer loop back into the store.
The question this ADR answers: should authority flip from YAML to RDF now, later, or never?
Decision drivers
Three specialist judgments were collected independently and converged:
- ontologist-ufo (UFO/OntoUML authoring lineage): hybrid now; flip when the YAML stereotype enum (currently 4 values) stops being expressive enough.
- knowledge-engineer (KG operationalization, reasoners, SPARQL): defer the full flip; the determinism boundary around extract and the content-address of model.yaml are the production guarantees and must not regress.
- architect-reviewer (independent architecture review): a mild two-sources-of-truth smell exists today (hand-edited TTLs alongside YAML) but is not yet rot; reversibility of the flip is asymmetric: cheap before authoring moves into the graph, expensive after.
Round-trip fidelity is the disqualifier today: OWL → YAML loses axiomatization (disjointness, property chains, equivalent-class, OntoUML stereotype-derived axioms); YAML → OWL loses OntoUML stereotype expressivity beyond the 4 values the validator currently allows. A lossless round trip in both directions is a precondition for flipping, and today neither direction is lossless.
Decision
Hybrid authority, with a measurable trigger.
- Domain schema, projections, runtime config: YAML remains canonical. model/middle-core/model.yaml is the single source for object types, state machines, workflow steps, scenarios, projections, and data objects. The generator continues to emit OWL/SHACL/RDF as verified projections.
- Foundational-ontology layer: RDF is canonical for the imported foundations we don't own (gUFO, BFO 2020, PROV-O, and, when imported, external domain ontologies). These are not generated; they are reused. The hand-authored model/middle-core/ontology/top-level-ufo-lite.ttl and middle-core.ttl are the local glue between the imported foundations and the YAML domain model. This carves the current "two sources of truth" smell into a defensible split: foundations live where their authors keep them (RDF), domain schema lives where humans review it (YAML).
- Trigger condition. Flip authority for the domain schema to RDF when two or more of the following hold simultaneously:
  - ≥ 3 imported external foundational ontologies (today: ~2).
  - ≥ 7 OntoUML stereotypes in regular use on the domain model (today: 4 in middle_core_model.py's UFO_STEREOTYPES; the langgraph emit's GUFO_TIER already covers 8 because that subsystem already needs them).
  - Loop 5 evolution exceeds ~1 accepted proposal per week (today: ~0).
  - ≥ 5 cross-ontology subsumption axioms YAML cannot express.
  - model.yaml exceeds ~2000 lines (today: 387).
  - Feedback-write-back from deployed assets actually ships and writes to a persistent graph store on a regular cadence.

  None of these is met today. Re-evaluate at every quarterly model review and whenever Loop 5 graduates from skeleton to production.
- Preserve the determinism boundary unconditionally. The only non-deterministic step in the pipeline is extract (knowledge → observations). Everything downstream of it (propose, govern, apply, regenerate, build) is a pure function of structured data and gate-testable against SHACL, OWL consistency, drift, and model-health. The flip must not dissolve that boundary. If/when RDF becomes canonical, the build reads from a content-addressed snapshot of the store, not from the live store.
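The trigger condition is mechanical enough to check in a script at each quarterly review. The following is a minimal sketch, not part of the existing tooling: the `ModelMetrics` fields and function names are hypothetical, the "today" values are the ones quoted in the list, and the subsumption-axiom count is a placeholder because the ADR gives no current figure for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelMetrics:
    """Inputs to the trigger check (hypothetical field names)."""
    imported_foundations: int
    stereotypes_in_use: int
    accepted_proposals_per_week: float
    inexpressible_subsumption_axioms: int
    model_yaml_lines: int
    feedback_write_back_live: bool


def criteria_met(m: ModelMetrics) -> list[str]:
    """Return the names of the six trigger criteria that currently hold."""
    checks = {
        "imported_foundations>=3": m.imported_foundations >= 3,
        "stereotypes_in_use>=7": m.stereotypes_in_use >= 7,
        "proposals_per_week>1": m.accepted_proposals_per_week > 1,
        "inexpressible_axioms>=5": m.inexpressible_subsumption_axioms >= 5,
        "model_yaml_lines>2000": m.model_yaml_lines > 2000,
        "feedback_write_back_live": m.feedback_write_back_live,
    }
    return [name for name, holds in checks.items() if holds]


def should_flip(m: ModelMetrics, threshold: int = 2) -> bool:
    """The flip is due when at least `threshold` criteria hold at once."""
    return len(criteria_met(m)) >= threshold


today = ModelMetrics(
    imported_foundations=2,
    stereotypes_in_use=4,
    accepted_proposals_per_week=0.0,
    inexpressible_subsumption_axioms=0,  # placeholder: no current figure in the ADR
    model_yaml_lines=387,
    feedback_write_back_live=False,
)
print(criteria_met(today), should_flip(today))  # -> [] False
```

Returning the list of satisfied criteria, rather than only a boolean, lets a review record which conditions are approaching the threshold.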
Design that the flip will follow when triggered
Recording the design now so the trigger is actionable, not a research project:
Reproducible builds against a stateful authority
When RDF becomes canonical, the generator does not query the live store mid-build. Each build:
- Exports the canonical named graph(s) as canonical N-Quads (sorted, normalized).
- Computes a SHA-256 of the export — call it the model snapshot hash.
- Embeds the hash in the PROV-O provenance header that every generator output already carries (mirroring ProvenanceStamp.AgentId, activity_id, schema_version, recorded_at).
- The N-Quads file is the build's reproducible input, equivalent to what model.yaml is today. The live store is the authoring surface, not the build input.
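The export-and-hash steps can be sketched as follows. This is an illustration under a strong assumption: the exported graphs contain no blank nodes, so sorting and de-duplicating the N-Quads lines is enough to normalize them. A store that emits blank nodes would need a real canonicalization algorithm such as W3C RDF Dataset Canonicalization. The function names are hypothetical.

```python
import hashlib


def canonical_export(nquads: str) -> bytes:
    """Normalize an N-Quads dump: trim lines, drop blanks and comments,
    de-duplicate, and sort. Assumes the dump contains no blank nodes."""
    lines = {line.strip() for line in nquads.splitlines()}
    statements = sorted(s for s in lines if s and not s.startswith("#"))
    return ("\n".join(statements) + "\n").encode("utf-8")


def snapshot_hash(nquads: str) -> str:
    """SHA-256 hex digest of the normalized export (the model snapshot hash)."""
    return hashlib.sha256(canonical_export(nquads)).hexdigest()


# Two dumps of the same quads in different order, one with noise lines.
export_a = (
    "<urn:example:s> <urn:example:p> <urn:example:o1> <urn:example:g> .\n"
    "<urn:example:s> <urn:example:p> <urn:example:o2> <urn:example:g> .\n"
)
export_b = (
    "# exported by a different store run\n"
    "<urn:example:s> <urn:example:p> <urn:example:o2> <urn:example:g> .\n"
    "\n"
    "<urn:example:s> <urn:example:p> <urn:example:o1> <urn:example:g> .\n"
)
print(snapshot_hash(export_a) == snapshot_hash(export_b))  # -> True
```

Because the hash is taken over the normalized bytes, two exports of the same graph content produce the same snapshot hash regardless of the order in which the store serialized them.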
Shared structural IR across emit targets
middle-core.linkml.yaml continues to serve as the frozen structural IR per
build. All language generators (C#, Ruby, JavaScript, Python, Rust, etc.)
consume the frozen IR, never the RDF directly. This preserves the pure-function
guarantee downstream of extraction even when the upstream authority is a
stateful graph store, and keeps the "One Model" promise concrete: every emit
target derives from the same byte-identical IR per build.
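One way to picture the frozen-IR rule: each emitter is a pure function of the IR and stamps its output with the hash of the IR bytes, so a build can check that every target came from the same input. The sketch below is illustrative only. It uses a JSON stand-in for middle-core.linkml.yaml to stay within the standard library, and the class names, emitter functions, and `ir-sha256` header are hypothetical.

```python
import hashlib
import json
from types import MappingProxyType
from typing import Callable, Mapping

# Stand-in for the bytes of the frozen IR file (hypothetical content).
IR_BYTES = b'{"classes": {"EvidencePack": {}, "IngestRun": {}}}'


def freeze_ir(raw: bytes) -> tuple[Mapping, str]:
    """Parse the IR once and pair it with the SHA-256 of its exact bytes.
    MappingProxyType makes the top level read-only (nested dicts are not)."""
    return MappingProxyType(json.loads(raw)), hashlib.sha256(raw).hexdigest()


def emit_csharp(ir: Mapping, ir_hash: str) -> str:
    """Emit C# source text from the IR alone; no store or file access."""
    body = "\n".join(f"public sealed record {n};" for n in sorted(ir["classes"]))
    return f"// ir-sha256: {ir_hash}\n{body}\n"


def emit_python(ir: Mapping, ir_hash: str) -> str:
    """Emit Python source text from the same IR."""
    body = "\n".join(f"class {n}: ..." for n in sorted(ir["classes"]))
    return f"# ir-sha256: {ir_hash}\n{body}\n"


EMITTERS: dict[str, Callable[[Mapping, str], str]] = {
    "csharp": emit_csharp,
    "python": emit_python,
}

ir, ir_hash = freeze_ir(IR_BYTES)
outputs = {target: emit(ir, ir_hash) for target, emit in EMITTERS.items()}

# Every target carries the same IR hash in its first line.
assert all(text.splitlines()[0].endswith(ir_hash) for text in outputs.values())
```

Sorting the class names inside each emitter keeps the output independent of dictionary ordering, which is what makes re-running the emitters on the same IR bytes yield identical files.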
Loop 5 (propose → govern → apply) under RDF authority
proposal.json remains the human-readable decision-record surface (diff-friendly,
PR-reviewable). apply emits a SPARQL UPDATE wrapped in a PROV-O
prov:Activity with prov:used pointing at the snapshot hash and
prov:wasAttributedTo pointing at the agent. Each accepted proposal lands in
a timestamped named graph, giving bitemporal history without breaking
the current governance gate shape.
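A rough sketch of what apply could render for one accepted proposal is shown below. All IRIs, the `urn:example:middle-core` namespace, the separate provenance graph, and the function name are hypothetical. PROV-O defines prov:wasAttributedTo on entities rather than activities, so this sketch attaches it to the new named graph and links that graph to the activity with prov:wasGeneratedBy; that placement is one possible reading, not something this ADR specifies.

```python
from datetime import datetime, timezone

BASE = "urn:example:middle-core"  # hypothetical IRI namespace


def build_apply_update(proposal_id: str, statements: list[str],
                       snapshot_hash: str, agent_iri: str,
                       applied_at: datetime) -> str:
    """Render one accepted proposal as a SPARQL UPDATE string.

    `statements` are pre-serialized N-Triples lines; `applied_at` must be
    timezone-aware so the graph name and timestamp are unambiguous.
    """
    utc = applied_at.astimezone(timezone.utc)
    graph = f"<{BASE}:graph:{utc:%Y%m%dT%H%M%SZ}:{proposal_id}>"
    activity = f"<{BASE}:activity:{proposal_id}>"
    snapshot = f"<{BASE}:snapshot:sha256:{snapshot_hash}>"
    body = "\n".join(f"    {s}" for s in statements)
    return f"""PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
INSERT DATA {{
  GRAPH {graph} {{
{body}
  }}
  GRAPH <{BASE}:graph:provenance> {{
    {activity} a prov:Activity ;
      prov:used {snapshot} ;
      prov:endedAtTime "{utc.isoformat()}"^^xsd:dateTime .
    {graph} a prov:Entity ;
      prov:wasGeneratedBy {activity} ;
      prov:wasAttributedTo <{agent_iri}> .
  }}
}}
"""


update = build_apply_update(
    proposal_id="p-0001",
    statements=["<urn:example:middle-core:Phase> "
                "<http://www.w3.org/2000/01/rdf-schema#label> \"Phase\" ."],
    snapshot_hash="ab" * 32,
    agent_iri="urn:example:middle-core:agent:governor",
    applied_at=datetime(2026, 5, 26, 12, 0, 0, tzinfo=timezone.utc),
)
print(update)
```

The function only builds text; sending it to a store, and escaping untrusted input, are left out of the sketch.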
Hypergraph semantics, not graph semantics
The C# runtime today walks binary edges, but the model already commits to UFO
relators as n-ary mediating endurants (the "hyperedge-as-vertex" pattern,
per UfoStereotype.cs:36-40). The intent is hypergraph, not graph. As the
schema grows past ingest-evidence, the IR and the Runtime ports for every
language target (C#, Ruby, JavaScript, Python, Rust) must encode n-ary
mediation explicitly — typed roles on a relator, not a chain of binary
relationships. The relator surface gets its own ADR (ADR-0002, forthcoming) and
its own YAML block; today relators are encoded only via ontology_concept tags
with stereotype: Relator.
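To make "typed roles on a relator" concrete, here is a small sketch of the hyperedge-as-vertex pattern. It is not the shape of the forthcoming ADR-0002 surface: the class names, the `EvidenceReview` relator, and its roles are hypothetical. The point is that one relator object binds all participants by named, typed role and is rejected if a role is missing or filled by the wrong type.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Entity:
    id: str
    type_name: str


@dataclass(frozen=True)
class RelatorType:
    name: str
    roles: Mapping[str, str]  # role name -> required entity type


@dataclass(frozen=True)
class Relator:
    """An n-ary mediation: one vertex that binds every participant by role."""
    id: str
    relator_type: RelatorType
    bindings: Mapping[str, Entity]  # role name -> mediated entity

    def __post_init__(self) -> None:
        declared, bound = set(self.relator_type.roles), set(self.bindings)
        if declared != bound:
            raise ValueError(
                f"{self.relator_type.name}: role mismatch {sorted(declared ^ bound)}"
            )
        for role, entity in self.bindings.items():
            expected = self.relator_type.roles[role]
            if entity.type_name != expected:
                raise TypeError(
                    f"role {role!r} expects {expected}, got {entity.type_name}"
                )


# Hypothetical three-role relator; a chain of binary edges could not
# state that these three participants belong to the same review.
EVIDENCE_REVIEW = RelatorType(
    "EvidenceReview",
    {"submitter": "Agent", "reviewer": "Agent", "evidence": "EvidencePack"},
)
review = Relator(
    "rev-1",
    EVIDENCE_REVIEW,
    {
        "submitter": Entity("a-1", "Agent"),
        "reviewer": Entity("a-2", "Agent"),
        "evidence": Entity("e-1", "EvidencePack"),
    },
)
print(sorted(review.bindings))  # -> ['evidence', 'reviewer', 'submitter']
```

Validating in `__post_init__` means an incomplete mediation can never exist as an object, which is the property each language port would need to preserve.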
Consequences
Positive
- The current pipeline keeps its production guarantees: PR-reviewable diffs, reproducible builds, deterministic gates, no live-store dependency at build time.
- The two-sources-of-truth smell is carved into a defensible split rather than ignored: foundations are RDF-canonical, domain is YAML-canonical.
- The trigger gives future agents a concrete, measurable condition for the flip. No more "should we do this?" — only "have we hit ≥ 2 of the six?"
- The hypergraph commitment is now binding across emit targets, not implicit in the C# runtime.
Negative
- Two authoring surfaces (YAML and TTL) continue to coexist, and contributors must understand which lives where. Mitigation: the generator already references and verifies the foundational TTLs — drift would surface on the L3 verification chain.
- OntoUML stereotype expressivity remains capped at what the YAML enum allows, and the validator currently allows only Kind | SubKind | EventType | Relator. The companion change in this PR extends the enum to add Phase and Role (safely emittable as direct gUFO type-tier classes); Mode and Situation are deferred because they require the gUFO individual-tier pattern that lives in tools/modelgen/langgraph_ontology.py:37 but not yet in the main generator (tools/modelgen/generate_middle_core.py:1037-1040).
- When the flip happens, contributors must learn SPARQL UPDATE for model evolution. Mitigation: the decision-record surface stays YAML/JSON, and apply does the translation.
Companion change shipped with this ADR
This PR extends UFO_STEREOTYPES in tools/modelgen/middle_core_model.py to
add Phase and Role, and extends the matching C# UfoStereotype enum in
templates/middle-core/Runtime/Pinning/UfoStereotype.cs. No domain concept
uses the new stereotypes yet; this is scaffolding so that future ontology
contributions (and propose_model_evolution.py outputs) can land
anti-rigid sortals (Phase) and externally-dependent sortals (Role)
without re-opening the validator on every evolution.
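For orientation, the validator change amounts to widening an allow-list. The sketch below is an assumption about its shape, not a copy of tools/modelgen/middle_core_model.py: the real UFO_STEREOTYPES may be a different container, and `DEFERRED_STEREOTYPES` and `validate_stereotype` are hypothetical names introduced here.

```python
# Allowed after this PR: the original four values plus Phase and Role.
UFO_STEREOTYPES = frozenset(
    {"Kind", "SubKind", "EventType", "Relator", "Phase", "Role"}
)

# Named in this ADR as deferred until the main generator supports the
# gUFO individual-tier pattern (hypothetical constant).
DEFERRED_STEREOTYPES = frozenset({"Mode", "Situation"})


def validate_stereotype(concept: str, stereotype: str) -> None:
    """Raise ValueError unless `stereotype` is currently allowed."""
    if stereotype in UFO_STEREOTYPES:
        return
    if stereotype in DEFERRED_STEREOTYPES:
        raise ValueError(
            f"{concept}: stereotype {stereotype!r} is deferred, not yet emittable"
        )
    raise ValueError(f"{concept}: unknown stereotype {stereotype!r}")


validate_stereotype("ReviewerAssignment", "Role")  # passes silently
```

Distinguishing "deferred" from "unknown" in the error message gives a contributor a pointer to this ADR rather than suggesting a typo.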
References
- Charter: middle-core-charter.md
- Knowledge loop design (Loop 5): middle-core-knowledge-loop.md
- Runtime ADR: middle-core-model-runtime.md
- Inter-layer contracts (hub-published): https://nickpclarke.github.io/AgentArmy/contracts/
- gUFO tier mapping (the existing langgraph subsystem's reference set): tools/modelgen/langgraph_ontology.py:37 (GUFO_TIER).
- The verified foundational TTLs that this ADR designates RDF-canonical:
  - model/middle-core/ontology/top-level-ufo-lite.ttl
  - model/middle-core/ontology/middle-core.ttl
  - model/middle-core/ontology/langgraph.ttl (generator-emitted from langgraph-meta.yaml)
  - model/middle-core/ontology/langgraph.bfo.ttl (derived BFO sidecar)