Middle-Core Knowledge Loop (Loop 5)
Status: design note + deterministic skeleton
Why
Today middle-core is an open-loop compiler:
model.yaml -> generator -> contracts -> runtime -> evidence
Evidence is terminal. Knowledge landed by the knowledge-drop scenario becomes
searchable knowledge-chunks and a knowledge-graph-snapshot, and then nothing
flows back. The Labs north-star — "One Model, Many Projections" — implies the
inverse arc too: what the platform ingests should be able to reshape the platform's
own governed model. That closes the loop:
knowledge -> (extract) -> observations -> (propose) -> model delta -> (govern) -> model.yaml -> regenerate -> contracts -> ...
This note defines that loop and ships the deterministic middle arc (propose) as a
skeleton. The non-deterministic arc (extract) is an explicit, pluggable boundary so
the rest of the system keeps its deterministic, gate-testable guarantees.
The loop, arc by arc
| Arc | Input → Output | Determinism | Owner |
|---|---|---|---|
| 1. Ingest (exists) | knowledge-source → knowledge-chunks, knowledge-graph-snapshot, evidence-pack | deterministic runtime | KnowledgeDropScenarioRunner |
| 2. Extract (boundary, new) | chunks → observations.json (candidate concepts / object types / relationships) | non-deterministic (NLP/LLM), pluggable | nlp-engineer → knowledge-engineer |
| 3. Propose (skeleton, new) | observations.json + model.yaml → proposal.json (model delta) | deterministic | tools/modelgen/propose_model_evolution.py |
| 4. Govern (vocabulary exists) | proposal.json → decision-record (proposed → accepted) | human/agent gated | ontologist-ufo / knowledge-engineer + reviewer |
| 5. Apply + regenerate (exists) | accepted delta merged into model.yaml → regenerate | deterministic | generator + drift gate |
The determinism boundary
The only non-deterministic step is Extract. Everything downstream of it is a pure
function of structured data, so the existing gates (model validation, drift, SHACL, OWL)
still hold. propose never sees free text — it consumes a structured observations.json
that an extractor (or a human, or an ontology agent) produced. This keeps the loop
honest: a model change is only ever proposed by deterministic, reviewable diffing, and
only ever applied through governance.
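One way to picture this boundary is as an interface whose only output is structured data. The sketch below is illustrative: the `Extractor` protocol, the `Observations` shape, and `propose_ids` are hypothetical names, and the real observations.json schema may differ.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Observations:
    """Hypothetical structured payload standing in for observations.json."""
    ontology_concepts: tuple[str, ...] = ()
    object_types: tuple[str, ...] = ()
    relationship_types: tuple[str, ...] = ()


class Extractor(Protocol):
    """Non-deterministic side of the boundary (NLP, LLM, human, ontology agent)."""

    def extract(self, chunks: list[str]) -> Observations: ...


class HandWrittenExtractor:
    """A human-curated extractor: ignores the text and returns fixed observations."""

    def extract(self, chunks: list[str]) -> Observations:
        return Observations(ontology_concepts=("Supplier",), object_types=("supplier",))


def propose_ids(observed: Observations, model_ids: set[str]) -> list[str]:
    """Deterministic side: a pure function of structured data, never of free text."""
    seen = observed.ontology_concepts + observed.object_types + observed.relationship_types
    return sorted(i for i in seen if i not in model_ids)
```

Whatever sits behind `extract`, the downstream function only ever receives ids, so swapping the extractor cannot change how a given set of observations is diffed.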
Governed, never auto-applied
propose does not mutate model.yaml. It emits a proposal.json shaped like the
input to a decision-record (it carries a PROV-O-aligned provenance header: agent_id,
activity_id, schema_version, recorded_at — mirroring ProvenanceStamp). A reviewer
(or an evidence gate; see Loop 1) accepts it before any model edit. This reuses the
governance primitives the model already defines (decision-record, evidence-pack,
capability-exercise) rather than inventing a side channel.
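As a rough illustration of that envelope, the sketch below builds an empty proposal around the provenance header. The four provenance field names come from this note; the `ProposalProvenance` class, the `new_proposal` helper, and the exact nesting are assumptions made for the example.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ProposalProvenance:
    """Hypothetical mirror of the provenance header described above."""
    agent_id: str
    activity_id: str
    schema_version: str
    recorded_at: str  # ISO-8601 timestamp supplied by the caller, not read from the wall clock


def new_proposal(provenance: ProposalProvenance) -> dict:
    """Build an empty proposal envelope; nothing here reads or writes model.yaml."""
    return {
        "provenance": asdict(provenance),
        "proposed_additions": {
            "ontology_concepts": [],
            "object_types": [],
            "relationship_types": [],
        },
        "already_present": [],
        "conflicts": [],
        "status": "proposed",  # acceptance happens later, in the governance step
    }
```

Keeping the envelope a plain dictionary means a reviewer or an evidence gate can inspect it without importing any tool-specific types.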
The skeleton: propose_model_evolution.py
A deterministic "lift" that diffs structured observations against the current model and proposes only the genuinely new elements.
    python tools/modelgen/propose_model_evolution.py \
      --model model/middle-core/model.yaml \
      --observations model/middle-core/examples/knowledge-loop-observations.example.json \
      --agent knowledge-engineer \
      --activity knowledge-drop \
      --recorded-at 2026-05-25T00:00:00Z  # optional; omit to stamp "now"
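The flags above could be parsed along the following lines. This is a sketch that assumes `argparse` and treats every flag except `--recorded-at` as required; neither assumption is confirmed by the tool itself, and `build_parser` and `resolve_recorded_at` are hypothetical names.

```python
import argparse
from datetime import datetime, timezone


def build_parser() -> argparse.ArgumentParser:
    """Declare the five flags shown in the invocation above."""
    parser = argparse.ArgumentParser(prog="propose_model_evolution.py")
    parser.add_argument("--model", required=True)
    parser.add_argument("--observations", required=True)
    parser.add_argument("--agent", required=True)
    parser.add_argument("--activity", required=True)
    parser.add_argument("--recorded-at", default=None)  # optional injected timestamp
    return parser


def resolve_recorded_at(value, now=lambda: datetime.now(timezone.utc)):
    """Prefer the injected timestamp; otherwise stamp the current UTC time."""
    if value is not None:
        return value
    return now().strftime("%Y-%m-%dT%H:%M:%SZ")
```

Passing the clock as a parameter lets a test pin the timestamp, which is what makes a reproducible run possible when `--recorded-at` is omitted.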
Output (proposal.json):
- provenance: agent_id, activity_id, schema_version (read from the model), recorded_at (injected clock, like ISerializationClock, so output is reproducible).
- proposed_additions: ontology_concepts, object_types, relationship_types that are not already in the model (sorted by id).
- already_present: observed ids the model already has (the proposal is idempotent: observations ⊆ model ⇒ empty proposal).
- conflicts: observed elements that cannot be added consistently (e.g. a relationship whose from/to is neither in the model nor in the same proposal, or an object type referencing an unknown ontology concept).
- status: always proposed.
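The three diff buckets described above can be sketched as a single pure function. This is not the shipped implementation: it assumes every element is a dictionary with an `id`, that object types name their concept in a hypothetical `concept` field, and that a relationship's `from`/`to` refer to object type ids.

```python
def diff_observations(observed: dict, model: dict) -> dict:
    """Split observed elements into additions, already-present ids, and conflicts."""

    def split(kind):
        known = {e["id"] for e in model.get(kind, [])}
        seen = sorted(observed.get(kind, []), key=lambda e: e["id"])
        new = [e for e in seen if e["id"] not in known]
        old = [e["id"] for e in seen if e["id"] in known]
        return known, new, old

    concept_ids, new_concepts, old_concepts = split("ontology_concepts")
    object_ids, new_objects, old_objects = split("object_types")
    _, new_rels, old_rels = split("relationship_types")
    conflicts = []

    # An object type may reference a concept from the model or from this proposal.
    concept_ids |= {c["id"] for c in new_concepts}
    ok_objects = []
    for obj in new_objects:
        if obj.get("concept") in concept_ids:
            ok_objects.append(obj)
        else:
            conflicts.append({"id": obj["id"], "reason": "unknown ontology concept"})

    # A relationship endpoint must be in the model or among the accepted new object types.
    object_ids |= {o["id"] for o in ok_objects}
    ok_rels = []
    for rel in new_rels:
        if rel.get("from") in object_ids and rel.get("to") in object_ids:
            ok_rels.append(rel)
        else:
            conflicts.append({"id": rel["id"], "reason": "unknown from/to endpoint"})

    return {
        "proposed_additions": {
            "ontology_concepts": new_concepts,
            "object_types": ok_objects,
            "relationship_types": ok_rels,
        },
        "already_present": sorted(old_concepts + old_objects + old_rels),
        "conflicts": sorted(conflicts, key=lambda c: c["id"]),
    }
```

Feeding it observations that are a subset of the model yields empty additions, which is the idempotence property stated in the list.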
Determinism: ids validated (kebab-case for object/relationship ids, PascalCase for
concepts), all lists sorted, json.dumps(..., sort_keys=True, indent=2) + trailing
newline. Re-running with the same inputs yields byte-identical output.
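A minimal sketch of those two guarantees follows. The `json.dumps` call is the one named above; the regular expressions are one plausible reading of "kebab-case" and "PascalCase" rather than the tool's actual patterns, and `validate_id` and `render` are hypothetical helper names.

```python
import json
import re

# Assumed patterns: lowercase alphanumeric segments joined by single hyphens,
# and an uppercase-led alphanumeric word.
KEBAB = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
PASCAL = re.compile(r"[A-Z][A-Za-z0-9]*")


def validate_id(value: str, *, concept: bool) -> str:
    """Return the id unchanged if it matches its naming rule, else raise ValueError."""
    pattern = PASCAL if concept else KEBAB
    if not pattern.fullmatch(value):
        raise ValueError(f"invalid id: {value!r}")
    return value


def render(proposal: dict) -> str:
    """Serialize with sorted keys, two-space indent, and a trailing newline."""
    return json.dumps(proposal, sort_keys=True, indent=2) + "\n"
```

Because `sort_keys=True` removes any dependence on dictionary insertion order, two runs over equal data produce the same bytes as long as the lists inside were sorted beforehand.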
How this composes with the other loops
- Loop 1 (evidence-gated promotion) is the natural Govern gate: a proposal can be required to carry a passing capability-exercise + complete evidence-pack before a decision-record may move proposed → accepted.
- Loop 4a (single-source gUFO stereotypes) is what lets a proposed ontology_concept carry a real stereotype (Kind/SubKind/EventType/Relator), so a lifted concept lands as a first-class gUFO commitment, not a bare string.
- Agent visibility (Loop 6, delivered): the model now declares an Actor gUFO Kind with agent as its SubKind, plus the provenance links agent --performs--> work-packet and evidence-pack --attributed-to--> agent (PROV-O wasAttributedTo). The proposal's agent_id and ProvenanceStamp.AgentId now correspond to a first-class ontology node. Remaining: wiring the runtime to attach an agent node and an attributed-to edge to the pinned scenario graph (changes the UI node/edge counts, so it ships with the UI update).
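The Loop 1 gate from the first bullet could be expressed as a guard on the status transition. The sketch below is speculative: the `result` and `complete` fields, and the `may_accept` and `accept` helpers, are hypothetical stand-ins for however capability-exercise and evidence-pack actually record their outcome.

```python
def may_accept(decision_record: dict, capability_exercise: dict, evidence_pack: dict) -> bool:
    """True only for a proposed record backed by a passing exercise and a complete pack."""
    return (
        decision_record.get("status") == "proposed"
        and capability_exercise.get("result") == "passed"
        and evidence_pack.get("complete") is True
    )


def accept(decision_record: dict, capability_exercise: dict, evidence_pack: dict) -> dict:
    """Return a copy of the record moved to accepted, or refuse the transition."""
    if not may_accept(decision_record, capability_exercise, evidence_pack):
        raise PermissionError("evidence gate not satisfied; record stays proposed")
    return {**decision_record, "status": "accepted"}
```

Returning a new dictionary instead of mutating the input keeps the proposed record available for audit after the transition.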
What is intentionally not here
- Extraction. No NLP/LLM. The extractor is a boundary; observations.json is its contract. A reference extractor can be added later behind that contract without touching the deterministic core.
- Auto-apply. Merging an accepted delta into model.yaml is deliberately left to the governance step (human/agent + the existing drift gate), not automated by the skeleton.
Future arcs
- A reference extractor (extract_observations.py) over knowledge-chunk excerpts.
- Round-trip lift from external OWL/LinkML contributions into observations.json (the LinkML projection is already the structural IR).
- An apply_proposal.py that merges an accepted proposal and runs the generator + gates in one governed step, emitting the decision-record and evidence-pack as it goes.