MCR-F4 — Middle-core data-platform contract (consumer binding for the UDA)

Context and Problem Statement

The Universal Data Adapter (ADR 0001, epic #13) is meant to consume middle-core's data-platform objects and projection interfaces: the contract the hub registry calls MCR-F4 (DataPlatformContracts.g.cs). The registry lists MCR-F4 with middle-core (RT7) as producer and the backend-core UDA (RT6) as consumer, and with status PENDING, tracked by middle-core #11 (draft). backend-core #40 ("Bind UDA to MCR-F4") is therefore BLOCKED.

Two problems block a binding even once #11 ships:

  1. No language-neutral artifact. MCR-F4 is generated C#. backend-core's UDA is Python (FastAPI) with a Rust serving path — neither can bind to C# types. A neutral on-the-wire schema is required so both sides generate from one source of truth.
  2. No agreed type system / versioning across the hop. The UDA's own type system is Apache Arrow (ADR 0001: introspect_schema → pyarrow.Schema). The contract must map cleanly to Arrow and to C#, and be versioned + drift-guarded (the stack already uses Pact-style provider/consumer tests + a CI drift gate, ARC-ADR-005).

This ADR proposes the neutral wire contract for MCR-F4 so middle-core can ratify it and #11 can publish authoritative records/projections against it. It adds no consumer code — only the contract artifact and this rationale.

Decision Drivers

  • Language-neutral — one schema; C# (producer) and Python/Rust (consumer) are generated, no language privileged.
  • Arrow- and CDM-aligned — types map 1:1 to Arrow (UDA introspection) and to a CDM semantic vocabulary for cross-source alignment.
  • Contract-first + drift-guarded — versioned (schemaVersion), pinned by the consumer, enforced in CI (consistent with ARC-ADR-005).
  • Read-only projections — projection interfaces are views; they never mutate (the UDA maps each to a connector read).
  • Minimal + extensible — scalars + array/map/record cover v1; additive evolution is non-breaking.

Considered Options

  1. Schema-first neutral contract — a JSON-Schema meta-model is the source of truth; middle-core's C# (DataPlatformContracts.g.cs) and backend-core's consumer types are both generated from it. (recommended)
  2. C#-first — keep DataPlatformContracts.g.cs authoritative; generate the neutral schema from the C#. Workable, but privileges one language and couples the cross-layer contract to a C# toolchain.
  3. Share C# types directly — rejected: no viable Python/Rust binding; violates the neutral-contract principle.
  4. OpenAPI-only or AsyncAPI-only — premature until we know whether the data platform is request/response (serve) or push (event stream). See open questions.

Decision Outcome

Proposed: Option 1 — a schema-first, language-neutral MCR-F4 contract. The artifact is contracts/proposed/mcr-f4.data-platform.schema.json (JSON Schema 2020-12), with a worked mcr-f4.example.json that validates against it and doubles as the future consumer-test fixture.

Shape

  • Envelope: schemaVersion (semver). Additive (new optional field / new record / new projection) = MINOR; breaking (remove/rename/retype required field, change cardinality) = MAJOR. Consumer pins MAJOR; CI drift gate compares pinned vs published.
  • DataRecord: name, fields[] (name, type, nullable, semanticType), primaryKey[].
  • ProjectionInterface: name, source record, cardinality (one|many), readOnly:true, fields[] (subset/renamed view), parameters[] (filters the consumer supplies).
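The versioning rule in the Envelope bullet can be sketched in Python as follows. This is an illustration only, not part of the proposed artifact: the dataclasses model just a subset of the shape, the `classify_change` helper and the `Customer` record are hypothetical, and two readings are assumptions of this sketch: "optional" is taken to mean `nullable`, and removing any field (required or not) is conservatively treated as breaking.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Field:
    name: str
    type: str                          # neutral type name, e.g. "string", "int64"
    nullable: bool = False
    semanticType: Optional[str] = None


@dataclass(frozen=True)
class DataRecord:
    name: str
    fields: tuple[Field, ...]
    primaryKey: tuple[str, ...] = ()


def classify_change(old: DataRecord, new: DataRecord) -> str:
    """Return "NONE", "MINOR" or "MAJOR" for a change to a single record."""
    old_fields = {f.name: f for f in old.fields}
    new_fields = {f.name: f for f in new.fields}

    # A field that disappeared (removed or renamed) or changed type is breaking.
    # Assumption: this sketch applies that to nullable fields as well.
    for name, f in old_fields.items():
        if name not in new_fields or new_fields[name].type != f.type:
            return "MAJOR"

    added = [f for name, f in new_fields.items() if name not in old_fields]
    # Assumption: a new non-nullable field is not "optional", so it is breaking.
    if any(not f.nullable for f in added):
        return "MAJOR"
    return "MINOR" if added else "NONE"


# Hypothetical record, used only to exercise the rule.
v1 = DataRecord("Customer", (Field("id", "uuid"), Field("name", "string")), ("id",))
v2 = DataRecord("Customer", v1.fields + (Field("email", "string", nullable=True),), ("id",))
v3 = DataRecord("Customer", (Field("id", "uuid"),), ("id",))

print(classify_change(v1, v2))  # MINOR: one new nullable field
print(classify_change(v1, v3))  # MAJOR: "name" was removed
```

A real drift gate would also need to cover new records, new projections and cardinality changes, which this sketch leaves out.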

Cross-language type mapping (normative)

| Neutral | Arrow (UDA) | C# (MCR-F4) | JSON wire |
|---|---|---|---|
| string | large_utf8 | string | string |
| int32 | int32 | int | number |
| int64 | int64 | long | number |
| float32 | float32 | float | number |
| float64 | float64 | double | number |
| bool | bool_ | bool | boolean |
| timestamp | timestamp[us, UTC] | DateTimeOffset | RFC 3339 string |
| date | date32 | DateOnly | ISO date string |
| time | time64[us] | TimeOnly | ISO time string |
| bytes | large_binary | byte[] | base64 string |
| decimal | decimal128 | decimal | string |
| uuid | large_utf8 (logical) | Guid | string |
| json | large_utf8 | JsonElement | any |
| array<T> | list<T> | IReadOnlyList<T> | array |
| map<string,T> | map<utf8,T> | IReadOnlyDictionary<string,T> | object |
| record<R> | struct | nested type | object |
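On the consumer side, the JSON wire column implies a decoding step before values are handed to Arrow. The following is a minimal standard-library sketch of that step for the scalar rows only; the `DECODERS` table and `decode` function are hypothetical names, not part of the contract, and the container types (array, map, record) and `json` are deliberately left out.

```python
import base64
import uuid
from datetime import date, datetime, time
from decimal import Decimal


def _parse_timestamp(value: str) -> datetime:
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on,
    # so normalise it to an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


# Neutral scalar type -> function turning the parsed JSON value into a Python value.
DECODERS = {
    "string": str,
    "int32": int,
    "int64": int,
    "float32": float,
    "float64": float,
    "bool": bool,
    "timestamp": _parse_timestamp,   # RFC 3339 string
    "date": date.fromisoformat,      # ISO date string
    "time": time.fromisoformat,      # ISO time string
    "bytes": base64.b64decode,       # base64 string
    "decimal": Decimal,              # string on the wire, so no float rounding
    "uuid": uuid.UUID,
}


def decode(neutral_type: str, wire_value):
    """Decode one already-JSON-parsed value; null stays None."""
    if wire_value is None:
        return None
    return DECODERS[neutral_type](wire_value)


print(decode("decimal", "12.50"))                   # 12.50
print(decode("bytes", "aGk="))                      # b'hi'
print(decode("timestamp", "2024-01-02T03:04:05Z"))  # 2024-01-02 03:04:05+00:00
```

Before Python 3.11, `fromisoformat` is stricter than RFC 3339 (for example about the number of fractional-second digits), so a production consumer would likely want a dedicated parser.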

Transport binding (proposed)

Sync request/response over HTTP/JSON for v1: each ProjectionInterface is invoked with its parameters; the UDA materialises results as Arrow. (If middle-core's platform pushes records, this becomes an AsyncAPI channel instead — see open questions.)
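As a rough illustration of the request/response binding, the sketch below builds (but does not send) an HTTP request for one projection. Almost everything concrete here is an assumption made for the example: the `/projections/{name}` path, the use of POST with a `{"parameters": ...}` body, the host name, the `ActiveCustomers` projection, and the idea that each declared parameter carries a `name`. None of these are fixed by this ADR.

```python
import json
from urllib.request import Request


def build_projection_request(base_url: str, projection: dict, params: dict) -> Request:
    """Build the HTTP/JSON request that would invoke one projection."""
    # Reject parameters the projection does not declare.
    declared = {p["name"] for p in projection["parameters"]}
    unknown = set(params) - declared
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    body = json.dumps({"parameters": params}).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/projections/{projection['name']}",  # hypothetical path
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",  # assumption: the ADR does not fix the HTTP method
    )


# Hypothetical projection with a single filter parameter.
projection = {"name": "ActiveCustomers", "parameters": [{"name": "region"}]}
req = build_projection_request("https://middle-core.example/", projection, {"region": "EU"})
print(req.get_method(), req.full_url)
```

The response body would then be decoded per the type mapping and materialised as Arrow by the UDA.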

Consumer obligations (backend-core UDA, when ratified — not in this ADR)

  • Pin schemaVersion (MAJOR) and validate incoming records/projections against the schema.
  • Map each projection → a connector read; return Arrow/JSON.
  • Add a Pact-style consumer contract test drift-guarded in CI, using mcr-f4.example.json as the seed fixture; register in the hub contracts registry.
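The first obligation (pin MAJOR, compare pinned vs published) could look roughly like the following check in a CI step. This is a sketch under assumptions: `parse_semver` and `check_drift` are hypothetical helpers, and how the published schemaVersion is obtained (for example, read from the schema artifact) is left open.

```python
def parse_semver(version: str) -> tuple[int, int, int]:
    """Parse "MAJOR.MINOR.PATCH", ignoring any pre-release or build suffix."""
    core = version.split("-", 1)[0].split("+", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return major, minor, patch


def check_drift(pinned_major: int, published_version: str) -> None:
    """Raise if the published contract has moved to a different MAJOR."""
    major, _, _ = parse_semver(published_version)
    if major != pinned_major:
        raise RuntimeError(
            f"MCR-F4 drift: consumer pins MAJOR {pinned_major}, "
            f"producer published {published_version}"
        )


check_drift(1, "1.4.0")        # same MAJOR: passes silently
try:
    check_drift(1, "2.0.0")    # different MAJOR: the gate fails
except RuntimeError as err:
    print(err)
```

MINOR and PATCH differences pass by design, since additive changes are defined as non-breaking.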

Open questions (for hub / middle-core ratification)

  1. Source of truth: schema-first vs C#-first (Option 1 vs 2) — the key governance call.
  2. Serve vs push — request/response (OpenAPI) or event stream (AsyncAPI)? Decides the transport binding above.
  3. CDM vocabulary — adopt a shared semanticType term set (ties to the BigQuery introspect_schema CDM work in ADR 0001).
  4. Artifact home — promote this from contracts/proposed/ to the hub contracts registry once ratified, so middle-core and backend-core consume one copy.

Consequences

  • Positive — unblocks a concrete, drift-safe path for #40; gives middle-core #11 a schema to publish against; no language is privileged; aligns with Arrow + ARC-ADR-005.
  • Negative / risk — adds a schema-generation step to middle-core's build if Option 1 is chosen; the neutral type set may need extension as real records land.
  • #40 stays BLOCKED until (a) this contract is ratified at the hub and (b) middle-core #11 publishes authoritative records/projections; only then is the consumer adapter + test built.

Links

  • backend-core #40 (UDA ↔ MCR-F4 binding), epic #13, ADR 0001 (UDA)
  • Hub: ARC-ADR-002 (JWT forwarding, accepted), ARC-ADR-005 (provider/consumer contracts), inter-layer contracts registry (contracts_url in .agent/hub.json)
  • Artifacts: contracts/proposed/mcr-f4.data-platform.schema.json, contracts/proposed/mcr-f4.example.json