ADR-0013: Layered additionalProperties schema policy (strict envelope, loose probes.*, per-probe sub-schemas)
Status: Accepted Date: 2026-05-11 Tags: schema · validation · extension · contract Related: ADR-0003, production ADR-0007
Context
The JSON Schema for repo-context.yaml has to commit to an `additionalProperties` policy, and the two design lenses disagreed on it. The security lens proposed `additionalProperties: false` at every level: strict structural validation that rejects unknown fields as buggy-probe output. The best-practices lens proposed `false` at the root only, leaving `probes.*` loose so that new probe types can drop in.
../critique.md §2.1.3 makes the conflict structural: production/design.md §2.5 requires "extension by addition — adding Java, Python, or a new task type must be new probes + new Skills, never edits to existing probes or the coordinator." If the schema rejects unknown fields under probes.*, adding a new probe in Phase 1 requires editing the envelope schema — violating the extension-by-addition commitment.
The two policies are incompatible at the seam. A layered solution is needed: structural strictness at the envelope (where buggy fields would be ambiguous metadata), looseness under probes.* (where new probes drop in), and per-probe sub-schemas constraining their own slice.
Options considered
- `additionalProperties: false` everywhere ([S]). Strict validation. Catches every buggy probe field, but violates extension by addition: Phase 1's `NodeBuildSystem` probe requires a schema edit, and Phase 7's distroless-migration probes require schema edits. The extension-by-addition seam is broken.
- `additionalProperties: true` everywhere (lens-design implicit). Maximum extensibility, zero validation strictness. Buggy probe fields land in the YAML and pollute downstream consumers. Phase 11 commits the pollution to PRs.
- `additionalProperties: false` at root only ([B]). Envelope strict; under `probes.*` anything goes. New probes drop in. No per-probe validation: a probe's typo in its own slice is silently accepted.
- Layered: root `false`, `probes.*` `true`, per-probe sub-schemas (synth). Strict where strictness is structural (envelope), loose where extension matters (probes namespace), per-probe sub-schemas at `src/codegenie/schema/probes/<name>.schema.json` composed via `$ref`. Adding a probe = adding a sub-schema file + one `$ref` line. Strict validation per probe.
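The four options can be compared on three inputs: a typo in a top-level key, a probe the envelope has never seen, and a typo inside a known probe's slice. The sketch below is a toy model, not the project's validator. `check` implements only the `properties`, `required`, and `additionalProperties` keywords, nested dicts stand in for `$ref`-composed files, and the probe and field names (`language_detection`, `primary`, `node_build_system`) are illustrative assumptions.

```python
def check(schema: dict, value: object) -> bool:
    """Toy validator covering properties / required / additionalProperties only."""
    if not isinstance(value, dict):
        return True  # scalars are left unconstrained in this sketch
    if any(key not in value for key in schema.get("required", [])):
        return False
    props = schema.get("properties", {})
    for key, item in value.items():
        if key in props:
            if not check(props[key], item):
                return False
        elif schema.get("additionalProperties", True) is False:
            return False  # unknown key under a strict object
    return True


def envelope(root_extra: bool, probes_extra: bool, probe_schema: dict) -> dict:
    """Build a reduced envelope with the given strictness at each layer."""
    return {
        "required": ["schema_version", "probes"],
        "properties": {
            "schema_version": {},
            "probes": {
                "properties": {"language_detection": probe_schema},
                "additionalProperties": probes_extra,
            },
        },
        "additionalProperties": root_extra,
    }


STRICT_PROBE = {"properties": {"primary": {}}, "additionalProperties": False}
ANYTHING = {}  # no per-probe sub-schema

OPTIONS = {
    "false everywhere": envelope(False, False, STRICT_PROBE),
    "true everywhere": envelope(True, True, ANYTHING),
    "false at root only": envelope(False, True, ANYTHING),
    "layered": envelope(False, True, STRICT_PROBE),
}

DOCS = {
    "top-level typo": {"schema_version": 1, "probes": {}, "generated_att": "x"},
    "new probe": {"schema_version": 1, "probes": {"node_build_system": {"tool": "npm"}}},
    "probe-slice typo": {"schema_version": 1, "probes": {"language_detection": {"primray": "go"}}},
}


def matrix() -> dict:
    """Map each option to whether it accepts each sample document."""
    return {
        option: {label: check(schema, doc) for label, doc in DOCS.items()}
        for option, schema in OPTIONS.items()
    }


for option, row in matrix().items():
    print(option, row)
```

In this model only the layered option accepts the new probe while still rejecting both typos, which is the property the decision relies on.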
Decision
The JSON Schema for repo-context.yaml uses a layered additionalProperties policy:
- `additionalProperties: false` at the top-level envelope (`schema_version`, `generated_at`, `repo`, `probes` required keys; no unknown top-level fields).
- `additionalProperties: true` under `probes.*` (the probes-namespace map; any probe-name key allowed).
- Each probe owns a sub-schema at `src/codegenie/schema/probes/<name>.schema.json` composed into the envelope via `$ref`. The sub-schema may itself be strict (`additionalProperties: false`) at the probe-slice level, at the probe author's discretion.
Adding a probe in Phase 1+ is one new file under schema/probes/ plus one $ref line in the envelope schema. No edit to existing probe sub-schemas.
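A minimal sketch of the composed envelope, written as a Python dict so its shape is easy to inspect. The helper name `build_envelope`, the `PROBE_REFS` mapping, the relative `$ref` paths, and the `node_build_system` key are assumptions for illustration. Only the four required keys, the two `additionalProperties` settings, and the `probes/<name>.schema.json` layout come from the decision above; the types of the envelope fields are left unconstrained because the ADR does not state them.

```python
import json

# Hypothetical registry: probe name -> relative path of its sub-schema file.
# Adding a probe means adding one entry here (the "one $ref line").
PROBE_REFS = {
    "language_detection": "probes/language_detection.schema.json",
}


def build_envelope(probe_refs: dict[str, str]) -> dict:
    """Return a layered envelope schema for repo-context.yaml (sketch)."""
    return {
        "type": "object",
        "required": ["schema_version", "generated_at", "repo", "probes"],
        "additionalProperties": False,  # strict envelope: no unknown top-level keys
        "properties": {
            "schema_version": {},  # field types are not specified in the ADR
            "generated_at": {},
            "repo": {},
            "probes": {
                "type": "object",
                "additionalProperties": True,  # any probe-name key is allowed
                "properties": {
                    name: {"$ref": ref} for name, ref in probe_refs.items()
                },
            },
        },
    }


# Adding a Phase 1 probe: one new sub-schema file plus one new mapping entry.
with_new_probe = build_envelope(
    {**PROBE_REFS, "node_build_system": "probes/node_build_system.schema.json"}
)
print(json.dumps(with_new_probe["properties"]["probes"]["properties"], indent=2))
```

The existing `language_detection` entry is untouched by the addition, which is the extension-by-addition property at the schema level.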
Tradeoffs
| Gain | Cost |
|---|---|
| Extension by addition (production/design.md §2.5) holds at the schema level: Phase 1's six probes land as six new sub-schema files, and Phase 7's distroless probes likewise | Two levels of strictness to reason about; documented in the probe-authoring guide |
| Structural validation strictness where it matters — envelope is metadata and must be exact; a typo in a top-level key is a real bug | Probe sub-schemas are optional in principle (the probe could ship without one) — the convention is "always ship a sub-schema with your probe" |
| Per-probe versioning works cleanly with ADR-0003's two-level cache key — bumping a probe's sub-schema invalidates only that probe's cache | More schema files to maintain (one per probe); compensated by each being small and probe-author-owned |
| The `$ref` composition pattern is standard JSON Schema: no custom code; the `jsonschema` library handles it natively | Schema authoring requires the probe author to understand `$ref` mechanics; mitigated by the Phase 1 probe-authoring guide and the `LanguageDetectionProbe` example |
| Adding a new envelope-level field (e.g., `generated_by` for tooling provenance) is a deliberate edit: `additionalProperties: false` makes the addition visible | Envelope changes carry slightly more friction than they would with `true`; the friction is the point |
Consequences
- `src/codegenie/schema/repo_context.schema.json` ships in Phase 0 with `additionalProperties: false` at top level and `true` under `probes.*`.
- `src/codegenie/schema/probes/language_detection.schema.json` is the first per-probe sub-schema; it establishes the convention and gives Phase 1's authors a working example.
- Phase 1's six new probes add six new sub-schema files. No edits to existing probe sub-schemas.
- The `Probe.schema_path` convention (or equivalent) lets `cache/keys.py`'s `per_probe_schema_version` (ADR-0003) read each probe's own `$id` for cache-key versioning.
- Validation runs in two layers naturally: `jsonschema` traverses the envelope checking root strictness, then descends into `probes.*` key by key, following `$ref` into the matching sub-schema.
- The "envelope is metadata, sub-schema is contract" architectural distinction is load-bearing for both schema validation and cache invalidation scope (ADR-0003).
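One way the `$id`-based versioning could work, as a hedged sketch. `per_probe_schema_version` is named in the consequences above, but its signature here, the `$id` URL format, the `cache_key` helper, and the way values are folded together with `hashlib` are all assumptions for illustration, not ADR-0003's actual cache-key definition.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def per_probe_schema_version(schema_path: Path) -> str:
    """Read the sub-schema's own $id and treat it as the probe's schema version."""
    schema = json.loads(schema_path.read_text(encoding="utf-8"))
    return schema["$id"]


def cache_key(probe_name: str, schema_path: Path, input_digest: str) -> str:
    """Hypothetical key: probe name + that probe's schema version + input digest."""
    material = "\n".join(
        [probe_name, per_probe_schema_version(schema_path), input_digest]
    )
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def demo() -> tuple[bool, bool]:
    """Bump one probe's $id and report (other key unchanged, bumped key changed)."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = {}
        for name in ("language_detection", "node_build_system"):
            paths[name] = Path(tmp) / f"{name}.schema.json"
            schema = {"$id": f"https://example.invalid/probes/{name}/v1"}
            paths[name].write_text(json.dumps(schema), encoding="utf-8")
        before = {n: cache_key(n, p, "abc123") for n, p in paths.items()}

        # Bump only the node_build_system sub-schema to v2.
        bumped = {"$id": "https://example.invalid/probes/node_build_system/v2"}
        paths["node_build_system"].write_text(json.dumps(bumped), encoding="utf-8")
        after = {n: cache_key(n, p, "abc123") for n, p in paths.items()}

    return (
        before["language_detection"] == after["language_detection"],
        before["node_build_system"] != after["node_build_system"],
    )


print(demo())
```

Because each key depends only on its own probe's `$id`, a bump to one sub-schema leaves the other probes' keys, and therefore their cached results, intact.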
Reversibility
Low. Tightening probes.* to additionalProperties: false later would require every probe to ship a sub-schema and to be enumerated as a property of probes, which breaks extension by addition. Loosening the envelope to additionalProperties: true invites the metadata-typo class of bug the strictness exists to catch. The layered policy is the design's structural commitment.
Evidence / sources
- `../final-design.md §2.9` (Schema validation: layered policy)
- `../final-design.md §L3 row 4` (Layered wins 12 vs `false`-everywhere's 4 vs `false`-at-root's 11)
- `../critique.md §2.1.3` (Critic establishes the incompatibility with extension by addition)
- `../phase-arch-design.md §Component design / Schema validator`
- `../phase-arch-design.md §Data model` (envelope JSON Schema example)
- production ADR-0007: `RepoContext` schema is over-the-wire format; layered policy lifts unchanged
- ADR-0003: cache invalidation scope depends on this layering