
ARC-ADR-003: No LLM key in the browser — Empty adapter security boundary

Metadata

| Field | Value |
| --- | --- |
| ID | ARC-ADR-003 |
| Status | Proposed |
| Date | 2026-05-25 |
| Deciders | Architecture Review |
| Supersedes | |
| Superseded by | |
| Tags | security, copilotkit, empty-adapter, secrets, llm, boundary |

Spoke-authored draft. Referenced as proposed by epic #12 / issue #13 at the hub path docs/decisions/ARC-ADR-003-no-llm-key-in-browser.md, but not yet published. This file mirrors that ID/filename so the issue links resolve once upstreamed.


Context and Problem Statement

The CopilotKit architecture isolates all LLM work in middle-core (Python / LangGraph / Cerebras). frontend-core hosts only a thin Next.js runtime route plus React generative-UI components. The LLM provider key (CEREBRAS_API_KEY) is a high-value secret: if it reaches the browser — in a bundle, an env var, or a network response — anyone can exfiltrate it and run up arbitrary spend or abuse the model.

CopilotKit's CopilotRuntime normally pairs with a service adapter that calls an LLM (e.g., OpenAI/Anthropic/Cerebras adapter), which needs a key wherever the runtime runs. But in this architecture the agent in middle-core already performs all LLM calls; the Next.js route should do no LLM calls of its own — it only proxies to middle-core's /copilotkit via remoteEndpoints. We need to decide which adapter the route uses and how we guarantee no LLM key is ever present client-side.

A common anti-pattern makes this urgent: in Next.js, any env var prefixed NEXT_PUBLIC_ that client code references is inlined into the client bundle at build time. A single mislabeled key (NEXT_PUBLIC_CEREBRAS_API_KEY) or a real adapter constructed in a client component would leak the secret.
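To make the naming hazard concrete, here is a minimal sketch of a lint-style check over env var names. The function names and the `SECRET_HINT` heuristic are illustrative assumptions, not part of Next.js or CopilotKit; only the `NEXT_PUBLIC_` prefix comes from Next.js itself.

```typescript
// Next.js exposes env vars with this prefix to client code.
const CLIENT_PREFIX = /^NEXT_PUBLIC_/;

// Hypothetical heuristic: name fragments that suggest a credential.
const SECRET_HINT = /(API_KEY|SECRET|TOKEN|PASSWORD)/;

// True when the variable would be readable from browser code.
function isClientExposed(name: string): boolean {
  return CLIENT_PREFIX.test(name);
}

// Return the names that are both client-exposed and look like secrets.
function findExposedSecretNames(names: string[]): string[] {
  return names.filter((n) => isClientExposed(n) && SECRET_HINT.test(n));
}

// Example input: only the mislabeled provider key is flagged.
const flagged = findExposedSecretNames([
  "NEXT_PUBLIC_CEREBRAS_API_KEY",
  "NEXT_PUBLIC_APP_NAME",
  "CEREBRAS_API_KEY",
  "MIDDLE_CORE_URL",
]);
```

A check like this catches the mislabeling case only. It does not detect a real adapter imported into a client component, which needs a separate source or bundle scan.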


Decision Drivers

| # | Driver |
| --- | --- |
| D1 | No LLM provider key may appear in any client bundle, network request, or browser-readable env var. |
| D2 | All LLM calls happen in middle-core; the Next.js route must not call an LLM, so it needs no LLM key at all. |
| D3 | The boundary must be enforceable and verifiable in CI (bundle analysis), not just by convention. |
| D4 | Keep the route a thin, stateless proxy with minimal server responsibility (forward JWT per ARC-ADR-002, nothing else). |
| D5 | Use a supported, documented CopilotKit configuration to minimize integration risk. |

Considered Options

  1. ExperimentalEmptyAdapter + remoteEndpoints to middle-core (chosen) — the route uses CopilotKit's no-op adapter and delegates all model work to the remote middle-core agent; the route holds no LLM key.
  2. Real service adapter in the Next.js route, server-only key — construct an OpenAI/Cerebras adapter in the route with a non-NEXT_PUBLIC_ server env var.
  3. Direct browser → Cerebras with a public key — call the LLM from the client.
  4. Real adapter, key fetched at runtime from a secrets manager — route pulls the key at request time and uses a real adapter.

Decision Outcome

Option 1 — ExperimentalEmptyAdapter with remoteEndpoints is adopted.

app/api/copilotkit/route.ts constructs:

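import { NextRequest } from "next/server";
import { CopilotRuntime, ExperimentalEmptyAdapter, copilotRuntimeNextJSAppRouterEndpoint } from "@copilotkit/runtime";

// MIDDLE_CORE_URL is required; fail fast instead of proxying to "undefined/copilotkit".
const middleCoreUrl = process.env.MIDDLE_CORE_URL;
if (!middleCoreUrl) throw new Error("MIDDLE_CORE_URL is required");

const runtime = new CopilotRuntime({
  remoteEndpoints: [{ url: `${middleCoreUrl}/copilotkit` }],
});

// No LLM key: the remote agent owns all model calls.
export const POST = (req: NextRequest) =>
  copilotRuntimeNextJSAppRouterEndpoint({
    runtime,
    serviceAdapter: new ExperimentalEmptyAdapter(),
    endpoint: "/api/copilotkit",
  }).handleRequest(req);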

The Empty adapter is a no-op service adapter: it satisfies CopilotKit's runtime contract without performing any LLM call, because the LangGraph agent behind remoteEndpoints does the inference. Consequently the route requires no LLM provider key (D2, D4). The only secret the route handles is the forwarded user JWT (ARC-ADR-002), which is server-side and never a provider key.

The boundary is made enforceable (D3): CEREBRAS_API_KEY (and any provider key) lives only in middle-core's environment, is never referenced in frontend-core code, and is never given a NEXT_PUBLIC_ name. CI runs a bundle/secret scan that fails the build if any provider key pattern or NEXT_PUBLIC_*_API_KEY appears in client output.
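The scan described above could look roughly like the sketch below. The rule names, the `Leak` shape, and the key-value prefixes (`csk-`, `sk-`) are assumptions for illustration; the real patterns should be tuned to the providers in use. Because Next.js substitutes the value for a referenced NEXT_PUBLIC_ variable at build time, the sketch matches key-shaped values as well as key names.

```typescript
interface Leak {
  file: string;
  rule: string;
  match: string;
}

// Illustrative rules only; the value prefixes are assumed, not authoritative.
const RULES: { rule: string; pattern: RegExp }[] = [
  { rule: "key-name", pattern: /\b(?:NEXT_PUBLIC_[A-Z0-9_]*API_KEY|CEREBRAS_API_KEY)\b/g },
  { rule: "key-value", pattern: /\b(?:csk|sk)-[A-Za-z0-9]{20,}/g },
];

// Scan the text of one built file and report every rule hit.
function scanText(file: string, text: string): Leak[] {
  const leaks: Leak[] = [];
  for (const { rule, pattern } of RULES) {
    const matches: string[] = text.match(pattern) || [];
    for (const match of matches) {
      leaks.push({ file, rule, match });
    }
  }
  return leaks;
}
```

In CI, a small driver would read each file under the client output directory (typically `.next/static` for a Next.js build), call `scanText`, and exit non-zero if any leak is reported.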

Confirmation Criteria

  • Client bundle analysis finds no LLM provider key and no NEXT_PUBLIC_*_API_KEY (CI gate).
  • The browser network panel shows no request to any LLM provider; the only AI traffic is the browser → same-origin /api/copilotkit call.
  • frontend-core source contains zero references to CEREBRAS_API_KEY or any provider SDK.
  • GET /api/copilotkit returns 200 and a POST streams from middle-core using the Empty adapter with no LLM key configured (verifiable against a mocked /copilotkit).
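The "no provider SDK" criterion can be checked mechanically against the package manifest. The following is a sketch under assumptions: the deny-list entries are examples of provider SDK package names and should be replaced with whatever list the team agrees on, and `findForbiddenDependencies` is a hypothetical helper.

```typescript
interface PackageManifest {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// Example deny-list (assumed); extend it to cover every provider SDK of concern.
const FORBIDDEN_PACKAGES = ["openai", "@anthropic-ai/sdk", "@cerebras/cerebras_cloud_sdk"];

// Return the forbidden packages declared anywhere in the manifest, sorted.
function findForbiddenDependencies(manifest: PackageManifest): string[] {
  const declared = Object.keys({
    ...manifest.dependencies,
    ...manifest.devDependencies,
  });
  return declared.filter((name) => FORBIDDEN_PACKAGES.indexOf(name) !== -1).sort();
}
```

A non-empty result would fail the check. This complements the bundle scan: it catches an SDK being added before any key is wired up.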

Pros and Cons

Option 1 — ExperimentalEmptyAdapter + remoteEndpoints (chosen)

Pros:

  • Structurally impossible to leak an LLM key from the route: there is no key to leak (D1, D2).
  • Keeps the route a thin proxy — its only job is forwarding (D4), aligning with ARC-ADR-002.
  • Matches CopilotKit's documented self-hosted "agent owns the LLM" pattern (D5).

Cons:

  • Marked experimental by CopilotKit; the adapter name/API may change across versions (pin the @copilotkit/runtime version and track upgrades).
  • All inference availability now depends on middle-core being reachable; the route has no local fallback (acceptable — that is the intended boundary).
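The pinning mitigation can be verified rather than assumed. As a rough sketch (helper names are hypothetical, and the version numbers in any usage are placeholders), a check can reject semver ranges such as `^` or `~` for packages that must be pinned:

```typescript
// True when the range is a single exact version such as "1.2.3" or "1.2.3-beta.1".
function isExactVersion(range: string): boolean {
  return /^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?$/.test(range);
}

// Return the packages from `mustPin` that are missing or not pinned exactly.
function unpinnedPackages(
  deps: Record<string, string>,
  mustPin: string[]
): string[] {
  return mustPin.filter((name) => {
    const range = deps[name];
    return range === undefined || !isExactVersion(range);
  });
}
```

Calling `unpinnedPackages(dependencies, ["@copilotkit/runtime"])` in CI and failing on a non-empty result keeps an accidental range specifier from pulling in a release where the experimental adapter has changed.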

Option 2 — Real adapter in the route, server-only key

Pros:

  • Keeps the key server-side if NEXT_PUBLIC_ is scrupulously avoided.

Cons:

  • Violates D2: the route would make LLM calls, duplicating model logic that belongs in middle-core and splitting the agent across two layers.
  • One mislabeled env var or one client-component import leaks the key — the failure mode this ADR exists to prevent (weakens D1/D3).

Option 3 — Direct browser → Cerebras with a public key

Pros:

  • Simplest possible wiring for a throwaway demo.

Cons:

  • Violates D1 catastrophically: the key is public by construction; immediate abuse/spend risk.
  • Bypasses middle-core entirely, discarding the whole agent/RBAC architecture.

Option 4 — Real adapter, key from a secrets manager at runtime

Pros:

  • Key never sits in env files; centralized rotation.

Cons:

  • Still violates D2 (route does LLM calls) and adds a secrets-manager dependency and latency to every request — complexity with no benefit over letting middle-core own the LLM.

Positive Consequences

  • The browser is provably free of LLM secrets, satisfying the epic's hard acceptance criterion ("No LLM API key appears in any browser-side bundle or network request").
  • A clean separation of duties: middle-core owns the model + key; frontend-core owns UI + JWT forwarding. Each layer's secret inventory is minimal and auditable.

Negative Consequences

  • Dependence on an experimental adapter requires version pinning and an upgrade watch.
  • No copilot feature works while middle-core is down; a degraded-mode UX (disable the sidebar, show a clear message) must be designed so the failure is not silent.
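One way to keep the degraded mode explicit is to derive the UI state from a health value instead of letting requests fail in place. This is a minimal sketch with hypothetical type and function names and a placeholder message; how health is probed (and how often) is left open.

```typescript
// Hypothetical names throughout; nothing here is a CopilotKit API.
type MiddleCoreHealth = "reachable" | "unreachable";

interface CopilotUiState {
  sidebarEnabled: boolean;
  message: string | null;
}

// Map middle-core health to an explicit UI state so failure is never silent.
function copilotUiState(health: MiddleCoreHealth): CopilotUiState {
  if (health === "reachable") {
    return { sidebarEnabled: true, message: null };
  }
  return {
    sidebarEnabled: false,
    message: "The assistant is temporarily unavailable. Please try again later.",
  };
}
```

Keeping the mapping pure makes it trivial to unit test and lets the React layer render the sidebar or the message from a single value.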

Implementation Notes

  • Pin @copilotkit/runtime and note ExperimentalEmptyAdapter in an upgrade checklist.
  • Add a CI step (e.g., grep the built client output) that fails on NEXT_PUBLIC_*_API_KEY or known provider key prefixes — this operationalizes D3 rather than relying on review.
  • MIDDLE_CORE_URL is required; a missing value throws at startup (issue #13). No provider key env var should exist in frontend-core at all.
  • Design a degraded state for "middle-core unreachable" (sidebar disabled + message) since the route has no fallback by design.
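The env var rules in these notes could be enforced in one place at startup. The sketch below is an assumption about how that might look, not the implementation from issue #13: `loadConfig`, `FrontendConfig`, and the `FORBIDDEN_ENV` list are hypothetical, and the list should cover whatever provider key names are relevant.

```typescript
interface FrontendConfig {
  middleCoreUrl: string;
}

// Hypothetical list of provider key names that must never be set in frontend-core.
const FORBIDDEN_ENV = ["CEREBRAS_API_KEY", "OPENAI_API_KEY", "ANTHROPIC_API_KEY"];

// Validate the environment once; throw on a missing URL or a present provider key.
function loadConfig(env: Record<string, string | undefined>): FrontendConfig {
  const middleCoreUrl = env["MIDDLE_CORE_URL"];
  if (!middleCoreUrl) {
    throw new Error("MIDDLE_CORE_URL is required");
  }
  const present = FORBIDDEN_ENV.filter((name) => env[name] !== undefined);
  if (present.length > 0) {
    throw new Error("Provider key(s) must not be set in frontend-core: " + present.join(", "));
  }
  // Strip trailing slashes so "/copilotkit" can be appended safely.
  return { middleCoreUrl: middleCoreUrl.replace(/\/+$/, "") };
}
```

On the server this would be called once as `loadConfig(process.env)`. Rejecting a present provider key turns the "no key should exist" rule into a hard failure rather than a convention.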

  • Depends on: ARC-ADR-007 (server route exists to host the runtime).
  • Pairs with: ARC-ADR-002 (the route's only secret is the forwarded JWT; the Empty adapter guarantees it handles no LLM key).
  • Relates to: epic #12 (out-of-scope item "LLM key in the browser — enforced by Empty adapter"); issue #13; hub plan docs/plans/copilotkit-generative-ui.md.

Caveats

  • "Experimental" status is upstream-owned; behavior could change. End-to-end streaming verification needs middle-core (private repo); frontend-core verifies against a mock.

Revision History

| Version | Date | Author | Change |
| --- | --- | --- | --- |
| 0.1 | 2026-05-25 | Architecture Review | Initial proposal (spoke draft) |