ADR-0014: ObjectiveSignals is extra="forbid", frozen=True; static-introspection CI test enforces ADR-0008
Status: Accepted
Date: 2026-05-12
Tags: trust · enforcement · type-safety
Related: ADR-0003, production ADR-0008
Context
Production ADR-0008 forbids LLM self-confidence as a trust-score input: the strict-AND consumes objective facts only, not the LLM's opinion of its own work. Phase 5 faces three risks: (a) a future contributor adds a `confidence` field to a signal sub-model; (b) a `details` dict contains `{"llm_self_assessment": "high"}`; (c) a new signal kind sneaks in a hidden judgment field. Prose enforcement (the ADR alone) is too weak. The synthesis is to enforce by code: Pydantic `extra="forbid", frozen=True` plus a CI introspection test that walks every field name reachable from `ObjectiveSignals` and rejects forbidden substrings. See phase-arch-design.md §Component design — SandboxSpec/SandboxRun/ObjectiveSignals and final-design.md §Load-bearing commitments §2.2.
Options considered
- Prose ADR only: ADR-0008 is the enforcement; trust contributors. Fails at the first PR that adds a `confidence` field with a "but it's just for logging" excuse.
- Runtime check: at evaluation time, scan signal dicts for forbidden keys. Easy to bypass (skipped on test paths) and carries a performance cost.
- Pydantic `extra="forbid"` + CI introspection: construction-time enforcement (`extra="forbid"` rejects unknown fields when a model is validated); the CI test walks every field name (recursive type walk through `model_fields`) and asserts that no field name contains `confidence`, `llm`, `self_reported`, or `model_says`.
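To make the third option concrete, here is a minimal sketch of how `extra="forbid"` behaves in Pydantic v2. `BuildSignal`, its fields, and the `try_build` helper are hypothetical names used only for illustration; they are not taken from the codebase.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class BuildSignal(BaseModel):
    """Hypothetical signal sub-model; field names are assumptions."""

    model_config = ConfigDict(extra="forbid", frozen=True)

    exit_code: int
    duration_ms: int


def try_build(**kwargs) -> str:
    """Return 'ok' if the model validates, else the first error type."""
    try:
        BuildSignal(**kwargs)
        return "ok"
    except ValidationError as exc:
        return exc.errors()[0]["type"]


# A well-formed signal validates.
print(try_build(exit_code=0, duration_ms=1200))

# An undeclared field is rejected at construction, not silently dropped.
print(try_build(exit_code=0, duration_ms=1200, confidence=0.9))
```

The rejection happens when the object is built, so it covers any code path that constructs a signal. It does not stop someone from declaring a new `confidence` field on the class, which is the gap the introspection test is meant to close.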
Decision
Every Phase 5 signal sub-model, and `ObjectiveSignals` itself, carries `model_config = ConfigDict(extra="forbid", frozen=True)`. The `details` field is typed `dict[str, str | int | bool]`: no nested dict, no float, no list. `tests/sandbox/test_objective_signals_static.py` recursively walks every field reachable from `ObjectiveSignals` (including dict value types) and asserts that no field name contains any of the four forbidden substrings (`confidence`, `llm`, `self_reported`, `model_says`).
Tradeoffs
| Gain | Cost |
|---|---|
| ADR-0008 is enforced by code; a PR adding a `confidence` field fails CI loudly | Adding a legitimate field whose name happens to contain a forbidden substring requires renaming (open Q9: `coverage_evidence_strength` instead of `coverage_confidence`) |
| Type-system rigor: `extra="forbid"` + `frozen=True` means signals cannot be mutated post-construction or carry hidden fields | Pydantic's `extra="forbid"` is per-model; the test walks the full type tree to enforce transitively |
| `details: dict[str, str \| int \| bool]` prevents structural smuggling (e.g., `{"meta": {"confidence": ...}}` is rejected) | Some legitimate details (durations as floats, lists of failing tests) must serialize to strings/ints |
| Introspection test is fast (<1 s) and runs every CI build | Test must be kept in sync with the Pydantic model structure — adding a new sub-model adds a path to walk |
Consequences
- `src/codegenie/sandbox/signals/models.py` defines `ObjectiveSignals` and six sub-models, all with `extra="forbid", frozen=True`.
- `tests/sandbox/test_objective_signals_static.py` is the load-bearing static test (walks recursively through `pydantic.fields.FieldInfo`).
- The honest-confidence pattern (signal evidence weak) is expressed by `details["coverage_evidence_strength"] = "low"`, not `coverage_confidence` (forbidden substring).
- New signal kinds register via decorator and add a new optional field on `ObjectiveSignals`; their sub-model also gets `extra="forbid", frozen=True`.
- The static test is part of the Phase 5 PR; it will block any Phase 7+ PR that adds a banned-substring field.
- New invariant: a Phase 5+ contributor cannot smuggle an LLM self-assessment into a trust signal, by either field name or nesting depth.
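The naming rule behind the `coverage_evidence_strength` choice can be sketched as a plain substring check. The helper name `forbidden_substrings_in` and the case-insensitive matching are assumptions of this sketch; `fulfillment_status` is an invented name used only to show a false positive.

```python
FORBIDDEN_SUBSTRINGS = ("confidence", "llm", "self_reported", "model_says")


def forbidden_substrings_in(name: str) -> list[str]:
    """Return the forbidden substrings that occur in a field or key name."""
    lowered = name.lower()  # case-insensitive matching is an assumption
    return [bad for bad in FORBIDDEN_SUBSTRINGS if bad in lowered]


# The banned spelling and its accepted replacement.
print(forbidden_substrings_in("coverage_confidence"))
print(forbidden_substrings_in("coverage_evidence_strength"))

# Substring matching is deliberately blunt: an unrelated name can trip it.
print(forbidden_substrings_in("fulfillment_status"))
```

The last call shows the renaming cost in practice: `fulfillment` happens to contain `llm`, so a name like that would need a different spelling even though it has nothing to do with model self-assessment.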
Reversibility
Low. Relaxing extra="forbid" re-opens silent field addition. Relaxing the introspection test re-opens the confidence smuggle. The constraints are intentionally rigid and aligned with the load-bearing ADR-0008.