Skip to content

ADR-0014: ObjectiveSignals is extra="forbid", frozen=True; static-introspection CI test enforces ADR-0008

Status: Accepted Date: 2026-05-12 Tags: trust · enforcement · type-safety Related: ADR-0003, production ADR-0008

Context

Production ADR-0008 forbids LLM self-confidence as a trust-score input — the strict-AND consumes objective facts only, not the LLM's opinion of its own work. Three risks for Phase 5: (a) a future contributor adds a confidence field to a signal sub-model; (b) a details dict contains {"llm_self_assessment": "high"}; (c) a new signal kind sneaks in a hidden judgment field. Prose enforcement (the ADR) is too weak; the synthesis: enforce by code via Pydantic extra="forbid", frozen=True plus a CI introspection test that walks every field name reachable from ObjectiveSignals and rejects forbidden substrings. See phase-arch-design.md §Component design — SandboxSpec/SandboxRun/ObjectiveSignals and final-design.md §Load-bearing commitments §2.2.

Options considered

  • Prose ADR only — ADR-0008 is the enforcement. Trust contributors. Fails at the first PR that adds a confidence field with a "but it's just for logging" excuse.
  • Runtime check — At evaluation time, scan signal dicts for forbidden keys. Easy to bypass (skipped on test paths; performance cost).
  • Pydantic extra="forbid" + CI introspection — Compile-time enforcement (extra="forbid" rejects unknown fields); CI test walks every field name (recursive type walk through model_fields) and asserts no field name contains confidence, llm, self_reported, model_says.

Decision

Every Phase 5 signal sub-model and ObjectiveSignals itself carries model_config = ConfigDict(extra="forbid", frozen=True). details: dict[str, str | int | bool] — no nested dict, no float, no list. tests/sandbox/test_objective_signals_static.py walks every field reachable from ObjectiveSignals recursively (including dict value types) and asserts no field name contains any of the four forbidden substrings.

Tradeoffs

Gain Cost
ADR-0008 is enforced by code; a PR adding a confidence field fails CI loudly Adding a legitimate field whose name happens to contain a forbidden substring requires renaming (open Q9: coverage_evidence_strength instead of coverage_confidence)
Type-system rigor: extra="forbid" + frozen=True means signals cannot be mutated post-construction or carry hidden fields Pydantic's extra="forbid" is per-model; the test walks the full type tree to enforce transitively
details: dict[str, str | int | bool] prevents structural smuggling (e.g., {"meta": {"confidence": ...}} is rejected) Some legitimate details (durations as floats, lists of failing tests) must serialize to strings/ints
Introspection test is fast (<1 s) and runs every CI build Test must be kept in sync with the Pydantic model structure — adding a new sub-model adds a path to walk

Consequences

  • src/codegenie/sandbox/signals/models.py defines ObjectiveSignals and six sub-models, all with extra="forbid", frozen=True.
  • tests/sandbox/test_objective_signals_static.py is the load-bearing static test (walks recursively through pydantic.fields.FieldInfo).
  • The honest-confidence pattern (signal evidence weak) is expressed by details["coverage_evidence_strength"] = "low"not coverage_confidence (forbidden substring).
  • New signal kinds register via decorator and add a new optional field on ObjectiveSignals; their sub-model also gets extra="forbid", frozen=True.
  • The static test is part of the Phase 5 PR; it will block any Phase 7+ PR that adds a banned-substring field.
  • New invariant: a Phase 5+ contributor cannot smuggle an LLM self-assessment into a trust signal — by either field name or nesting depth.

Reversibility

Low. Relaxing extra="forbid" re-opens silent field addition. Relaxing the introspection test re-opens the confidence smuggle. The constraints are intentionally rigid and aligned with the load-bearing ADR-0008.

Evidence / sources