Story S8-01: ConfidenceSection renderer with exhaustive match + assert_never enforcement
Step: Step 8 — Confidence section renderer + CI ratchet + advisory benches + Phase-3 handoff
Status: Done — GREEN 2026-05-18 (phase-story-executor; see _attempts/S8-01.md for the per-AC evidence table + AC-3 ritual mypy stderr captures). AC-2's literal "single match statement" was relaxed to the codebase's nested-match convention (Fresh|Stale outer, StaleReason inner) because mypy does not fully narrow Pydantic nested discriminated unions in a single flat match; the producer (codegenie.probes.layer_b.index_health._derive_confidence) already uses the nested form for the same reason. Nested form gives strictly stronger assert_never enforcement (both levels). Conflict surfaced in the attempt log per CLAUDE.md Rule 7.
Effort: M
Depends on: S7-04 (tests/adv/phase02/test_phase3_handoff_smoke.py lands skipped + the in-memory secret-leak boundary test), S7-05 (portfolio integration sweep wired)
ADRs honored: 02-ADR-0006 (IndexFreshness sum-type location at codegenie.indices.freshness); 02-ADR-0009 (no pytest-xdist — serial); 02-ADR-0005 (no plaintext secret persistence — extends to renderer-constructed strings); 02-ADR-0010 (RedactedSlice smart constructor at writer boundary — renderer reads RedactedSlice.slice only); production ADR-0033 §3–4 (make illegal states unrepresentable; assert_never is the type-level enforcement)
Validation notes
Validated: 2026-05-18
Verdict: HARDENED
Findings addressed: 23 total: 4 blocks, 14 hardens, 5 nits (deferred to Notes-for-implementer)
Changes applied:
- AC-1 narrowed — ConfidenceSectionRenderer class dropped; only render_confidence_section exported. (Design-Patterns DP-3 + Test-Quality TQ-11 + Coverage COV-10 — premature class wrapper with no state, no precedent in output/writer.py for stateless renderer classes. Rule 2 / Rule 11.)
- AC-2 strengthened — exhaustiveness over the nested IndexFreshness | StaleReason shape (two assert_never sites) to match the established Phase-2 idiom in src/codegenie/probes/layer_b/index_health.py:239-279 (_derive_confidence, _last_indexed_at). Per-row negative-space assertion added — each row contains its own variant's marker AND no other variant's marker. (Design-Patterns DP-1 + Test-Quality TQ-1; Rule 11.)
- AC-3 rewritten — repo-wide [tool.mypy] warn_unreachable = true (set in Phase 0 S1-02, verified by S1-11's tests/unit/test_mypy_warn_unreachable_fixture.py) is the load-bearing setting; this story inherits it. The manual stderr-snapshot ritual is retained as a Step-8 PR-review checklist item BUT exercised at BOTH nesting levels (outer Fresh/Stale removal AND inner StaleReason removal — DP-7). (Consistency CON-1 + Test-Quality TQ-5 + Design-Patterns DP-7. The "S1-11 per-module override fires once code lands here" narrative was factually wrong — warn_unreachable is global, not per-module.)
- AC-4 strengthened — deterministic order is now byte-pinned via re.findall against the FULL emitted row sequence (not just position-comparisons on three ASCII-lowercase names). Three discriminating fixtures added covering ASCII-lex-vs-casefold, numeric-vs-lexicographic, and full-sequence order. Naive-datetime, long IndexerError.message, and last_indexed non-SHA cases pinned. (Test-Quality TQ-2 + Coverage COV-6.)
- AC-5 REWRITTEN (block) — IndexerError("slice_malformed:" + str(e)) removed. The renderer emits a sentinel IndexerError(message="slice_malformed") (stable identifier matching freshness.py:73-80's contract) and routes the ValidationError details to a structlog event report.confidence_section.slice_malformed with structured fields (index_name, error_count, first_loc). The Markdown row reads - [STALE] <name> · indexer_error · slice_malformed. Negative-space test added: an envelope whose malformed slice contains a secret-shaped value (e.g., AKIA…) must NOT produce that value in the rendered row. (Consistency CON-3 + Design-Patterns DP-2 + Coverage COV-3 — protects 02-ADR-0005 / 02-ADR-0010 plaintext-secret invariant; preserves IndexerError.message smart-constructor contract.)
- AC-6 rewritten — writer integration is already wired in src/codegenie/output/writer.py:138-156, 233-239 (_publish_context_report(envelope.slice, output_dir)). This story's AC-6 now verifies the existing call site exercises the renderer that lands here. The renderer's input type is pinned: Mapping[str, Any] (the post-redaction RedactedSlice.slice, NOT a raw envelope dict) — preserves 02-ADR-0010's chokepoint. Byte-identical-across-runs requires producer time-source freezing (sub-bullet added). Row-count assertion strengthened to len(rows) == len(envelope[..][index_health]) — kills the empty-renderer mutant. (Coverage COV-2 + Consistency CON-2 + CON-4 + Test-Quality TQ-6.)
- AC-7 narrative corrected — mypy --strict src/codegenie/report/ passes; warn_unreachable is repo-wide, not per-module. (Consistency CON-1.)
- AC-8 strengthened — denylist extended to {codegenie.probes, codegenie.coordinator, codegenie.cache, codegenie.adapters, codegenie.tccm}; subprocess.run(..., check=False) so import failure is visible (TQ-4). (Coverage COV-7 + Test-Quality TQ-4.)
- AC-9 ADDED — empty-envelope and zero-registered-indices paths emit ## Confidence\n\n_No index sources registered._\n (placeholder text); test pins the exact body. Closes the lazy-impl "always emit empty heading" gap. (Coverage COV-4 + Test-Quality TQ-6.)
- AC-10 ADDED — duplicate index_name upstream raises ValueError; writer catches via the existing try/except Exception in _publish_context_report (writer.py:148-156) and logs report.confidence_section.render_failed without aborting repo-context.yaml. Fail-loud per Rule 12. (Coverage COV-5.)
- AC-11 ADDED — renderer is pure: AST-walking test asserts no open, print, Path.write_text, no import logging, no import structlog, no import os, no import pathlib from the renderer module. Closes the silent-side-effect drift gap. (Design-Patterns DP-8 + CLAUDE.md "Functional core / imperative shell".)
- AC-12 ADDED — secrets in malformed-slice values do NOT leak. Negative-test: a slice containing IndexerError(message="AKIA1234567890ABCDEF") (test-fixture-shaped) renders the value verbatim (trusts the slice, AC-4 contract), BUT a malformed slice whose offending value is AWS-key-shaped renders ONLY slice_malformed — the offending input value never reaches the row. (Consistency CON-3 + Test-Quality TQ-10.)
- AC-13 ADDED — property-based metamorphic test (Hypothesis): adding/removing an index never alters the rendering of the other indices; each row's marker set isolates its variant. (Test-Quality TQ-7.)
- Notes-for-implementer extended — registry-of-formatters anti-pattern documented (DP-6); writer-vs-CLI wiring rationale pinned to writer.py (DP-5); newtype erasure at slice boundary noted (DP-9).
Full audit log: _validation/S8-01-confidence-section-renderer.md.
Context
IndexFreshness is the typed answer to commitment §2.3 — "silent staleness is the worst failure mode of the entire system" (CLAUDE.md, production/design.md §2.3). Phase 2's design ships one consumer of that sum type so the variant set is exercised from day 1 and a missed case becomes a build error rather than a runtime surprise: the Confidence section of CONTEXT_REPORT.md, rendered by src/codegenie/report/confidence_section.py. That module is intentionally outside probes/ so a CONTEXT_REPORT render does not pull in the probe registry; Phase 3 adapters and Phase 8 Bundle Builder will import it without circular-dependency risk (phase-arch-design.md §"Component design" #2 §"Why not co-located").
The renderer is what makes the discipline real: mypy warn_unreachable = true is set repo-wide in pyproject.toml [tool.mypy] (Phase 0 S1-02; verified by S1-11's tests/unit/test_mypy_warn_unreachable_fixture.py). A removed case arm against any IndexFreshness variant must produce a CI build error — verified at BOTH nesting levels (outer Fresh | Stale and inner StaleReason) in the Step 8 PR-review checklist (Implementation risk #4). The S1-11 automated mypy fixture test is the load-bearing evidence; the ritual in this story is the human-readable confirmation.
This story is the type-level enforcement of B2's load-bearing role. Without it, every other guardrail in this phase (the stale-scip adversarial, the freshness registry, repo-wide mypy warn_unreachable) is decoration around a sum type nobody pattern-matches on.
References: where to look
- Architecture:
  - ../phase-arch-design.md §"Component design" #2 (IndexFreshness sum type: variant set, __all__, smart constructor, "why not co-located").
  - ../phase-arch-design.md §"Logical view" (class diagram: ConfidenceSectionRenderer.render(slices) -> str with <<Phase 2 — only consumer of IndexFreshness in Phase 2>> annotation; ConfidenceSectionRenderer --> IndexFreshness : pattern-matches).
  - ../phase-arch-design.md §"Process view", sequence step 7: "CR-->>WR: CONTEXT_REPORT.md" (renderer runs after writer's atomic os.replace).
  - ../phase-arch-design.md §"Reading guide": "New types (IndexFreshness, ...) live in their own packages and are imported, not inherited from kernel ABCs."
- Phase ADRs:
  - ../ADRs/0006-index-freshness-sum-type-location.md: names codegenie.indices.freshness as the module; consumer is codegenie.report.confidence_section.
- Production ADRs:
  - ../../../production/adrs/0033-domain-modeling-discipline.md §3 ("make illegal states unrepresentable") + §4 (sum types + assert_never).
- Source design:
  - ../final-design.md §"Phase-2-internal consumer" (lines ~207): explicitly names this renderer as the closer for shared blind spot #1 (sum-type-without-a-consumer).
  - ../final-design.md §"Synthesis ledger" row "mypy --warn-unreachable rollout": per-module config on codegenie.{indices, probes/index_health.py, report, adapters, tccm}/** is the resolved decision.
- Existing code (Phase 2 contract from earlier steps; DO NOT WEAKEN):
  - src/codegenie/indices/freshness.py (S1-01): Fresh | Stale(reason: StaleReason); StaleReason = CommitsBehind | DigestMismatch | CoverageGap | IndexerError. The __all__ and Literal[...] kind discriminators are the only thing the renderer pattern-matches against. IndexerError.message is documented as "a stable identifier — not a free-form human string" (lines 73-80); AC-5 must preserve that.
  - src/codegenie/probes/layer_b/index_health.py (S4-01): emits one IndexFreshness per index source, serialized via model_dump(mode="json") (line ~370). The renderer receives a JSON dict at envelope.slice["probes"]["index_health"]["index_health"][<index_name>]["freshness"] and re-validates via TypeAdapter(IndexFreshness).validate_python(...). Established two-level match precedent: _derive_confidence (lines 239-279) and _last_indexed_at use outer match value: case Fresh() / case Stale(reason=r): then inner match r: over StaleReason; the renderer MUST mirror this shape per Rule 11.
  - src/codegenie/output/writer.py (Phase 0 + S3-03; renderer integration at lines 138-156 + 233-239): _publish_context_report(envelope.slice, output_dir) is the call site; it invokes codegenie.report.render_confidence_section on the post-redaction RedactedSlice.slice (a dict) and atomically publishes CONTEXT_REPORT.md via the same .tmp → fsync → os.replace discipline as repo-context.yaml. The renderer's input type is therefore Mapping[str, Any], NOT RedactedSlice. The writer's existing try/except Exception around the call (line 152) logs report.confidence_section.render_failed and continues; repo-context.yaml is unaffected if the renderer raises.
  - pyproject.toml (Phase 0 S1-02 + S1-11 verification): [tool.mypy] warn_unreachable = true is set repo-wide at the top-level [tool.mypy] block (line 172), NOT in a [[tool.mypy.overrides]] block. The renderer module is therefore covered by inheritance. S1-11's tests/unit/test_mypy_warn_unreachable_fixture.py automates the "incomplete match → mypy fails" invariant against an IndexFreshness fixture; this story does NOT duplicate that test.
  - tests/unit/test_mypy_warn_unreachable_fixture.py (S1-11 AC-5): already automates the AC-3 invariant against an incomplete match: IndexFreshness. AC-3's manual ritual is documentation of human-readable confirmation for the PR-review checklist, NOT a duplicate automation.
Goal
Implement src/codegenie/report/__init__.py and src/codegenie/report/confidence_section.py as the only Phase-2 consumer of IndexFreshness. The renderer pattern-matches on the typed sum using the established Phase-2 nested-match idiom (mirror _derive_confidence / _last_indexed_at in probes/layer_b/index_health.py:239-279): an outer match value: over Fresh | Stale(reason) with case _: assert_never(value), and an inner match reason: over CommitsBehind | DigestMismatch | CoverageGap | IndexerError with case _: assert_never(reason). The renderer accepts Mapping[str, Any] (the post-redaction RedactedSlice.slice produced by the writer's chokepoint per 02-ADR-0010), re-validates the per-index freshness JSON dicts via pydantic.TypeAdapter(IndexFreshness).validate_python(...), and produces a CONTEXT_REPORT.md string with a "Confidence" section whose row order is deterministic (ASCII-lex sorted by index_name), whose Fresh rows render as - [OK] <index_name> · indexed_at=<iso8601-UTC-Z>, and whose Stale rows render with a per-variant suffix. The writer is already wired (writer.py:138-156, 233-239) to call this renderer; this story implements the renderer module the writer imports.
Critically: with [tool.mypy] warn_unreachable = true set repo-wide (Phase 0 S1-02; pyproject.toml line 172), removing any case arm at either nesting level produces a [unreachable] build error at the corresponding assert_never(...) line in CI. This is the type-level enforcement of B2's load-bearing role. The Step 8 PR-review checklist requires deliberately removing a case arm at BOTH the outer level (one case from Fresh | Stale) AND the inner level (one case from StaleReason) and confirming CI fails each time (Implementation risk #4).
Acceptance criteria
- [ ] AC-1 (module surface). src/codegenie/report/__init__.py exports render_confidence_section only (closed __all__ = ["render_confidence_section"]). src/codegenie/report/confidence_section.py contains the function. No ConfidenceSectionRenderer class: Rule 2 (no abstractions for single-use code); no precedent in output/ for a stateless renderer class. Forbidden imports (asserted by AC-8): codegenie.probes.*, codegenie.coordinator.*, codegenie.cache.*, codegenie.adapters.*, codegenie.tccm.*. Permitted Phase-2 dependencies: codegenie.indices.freshness, pydantic, and stdlib (typing.assert_never, datetime, re). (validator: narrowed; class wrapper dropped per DP-3/TQ-11/COV-10; denylist tightened per COV-7.)
- [ ] AC-2 (exhaustive nested match over IndexFreshness). render_confidence_section invokes a two-level match mirroring probes/layer_b/index_health.py:239-279: outer match value: over Fresh and Stale(reason=r) with case _: assert_never(value); inner match r: over CommitsBehind, DigestMismatch, CoverageGap, IndexerError with case _: assert_never(r). Both assert_never arms are required. Tests assert: (a) every variant's marker appears on the row keyed by its own index name and not on any other row (per-row negative-space; kills the "single-row-with-all-markers" mutant); (b) row count equals the input dict size for a 5-variant fixture (out.count("- [OK]") == 1 AND out.count("- [STALE]") == 4). (validator: hardened; original AC-2 passed for a degenerate impl emitting all five markers in one row; DP-1 mandates the established nested idiom.)
- [ ] AC-3 (mypy warn_unreachable enforces exhaustiveness at both levels). Repo-wide [tool.mypy] warn_unreachable = true (pyproject.toml L172) covers codegenie.report.* by inheritance. There is NO per-module override; the global setting is the load-bearing one (verified by S1-11's tests/unit/test_mypy_warn_unreachable_fixture.py). The automated invariant is already covered by S1-11; this story adds a two-pass human-readable PR-review ritual:
  - (a) Delete one case arm from the outer match value: (e.g., case Fresh(indexed_at=ts):). Run mypy src/codegenie/report/. Confirm non-zero exit with [unreachable] at assert_never(value). Capture stderr to _attempts/S8-01.md. Revert.
  - (b) Delete one case arm from the inner match r: (e.g., case CommitsBehind(...):). Run mypy src/codegenie/report/. Confirm non-zero exit with [unreachable] at assert_never(reason). Capture stderr to _attempts/S8-01.md. Revert. (validator: rewritten; "per-module override" narrative was factually wrong per CON-1; nested-match ritual added per DP-7.)
- [ ] AC-4 (deterministic row order + per-variant format, byte-pinned). Rows are ASCII-lex sorted by index_name. Format pins (each test asserts exact substring match on a single line):
  - Fresh(indexed_at) → - [OK] <index_name> · indexed_at=<iso8601-UTC-Z>. Naive (timezone-unaware) datetime: renderer emits a slice_malformed row instead (AC-5 path), which preserves the invariant that the Z suffix is present iff the timestamp is UTC-aware.
  - Stale(CommitsBehind(n, last_indexed)) → - [STALE] <index_name> · commits_behind=<n> · last_indexed=<first-8-chars> (last_indexed[:8], rendered verbatim with no SHA validation; arbitrary strings pass through; test asserts a non-SHA string is rendered as last_indexed[:8] unmodified).
  - Stale(DigestMismatch(expected, actual)) → - [STALE] <index_name> · digest_mismatch · expected=<first-8-chars>… · actual=<first-8-chars>….
  - Stale(CoverageGap(files_indexed, files_in_repo)) → - [STALE] <index_name> · coverage_gap · indexed=<files_indexed>/<files_in_repo>.
  - Stale(IndexerError(message)) → - [STALE] <index_name> · indexer_error · <message>, where a message longer than 200 chars is truncated to message[:200] + "…".

  Determinism tests: (a) full row sequence pinned via re.findall(r"^- \[(?:OK|STALE)\] (\S+) ", out, re.M) == sorted(input.keys()), which catches reverse-sort, casefold-sort, and hash-sort mutants; (b) ASCII-lex fixture {"B": ..., "a": ..., "C": ...} proves code-point order, not case-insensitive; (c) {"idx10": ..., "idx2": ..., "idx1": ...} proves lex, not natural-sort. Output endings: out.endswith("\n") and "\n\n\n" not in out. Renderer does NOT re-sanitize (AC-12 verifies). (validator: hardened; original position-comparison test passed under three lowercase ASCII fixtures by coincidence; TQ-2 + COV-6.)
- [ ] AC-5 (defense-in-depth on malformed slice: sentinel only, no error-detail leak into typed slice). If a per-index slice's freshness field fails TypeAdapter(IndexFreshness).validate_python(...), the renderer constructs IndexerError(message="slice_malformed") (a stable identifier per freshness.py:73-80; NOT "slice_malformed:" + str(e)), routes it through the standard Stale(reason=IndexerError(...)) arm, and emits the row - [STALE] <index_name> · indexer_error · slice_malformed. The structured error detail (error_count, first_loc) is emitted to a structlog event report.confidence_section.slice_malformed with fields index_name, error_count, first_loc; it never goes into the Markdown row and is never part of the typed IndexerError.message. (validator: REWRITTEN, block-severity; original str(e) synthesis violated the IndexerError.message smart-constructor contract per DP-2/COV-3 and risked leaking unredacted offending-value content past 02-ADR-0005/02-ADR-0010 chokepoints per CON-3.)
- [ ] AC-6 (writer integration verified; renderer takes Mapping[str, Any], not RedactedSlice). The writer integration is already in place at src/codegenie/output/writer.py:138-156, 233-239: _publish_context_report(envelope.slice, output_dir) invokes codegenie.report.render_confidence_section on the post-redaction dict and atomically publishes CONTEXT_REPORT.md. Renderer signature: def render_confidence_section(envelope_slice: Mapping[str, Any]) -> str. It locates per-index freshness JSON dicts at envelope_slice["probes"]["index_health"]["index_health"][<index_name>]["freshness"] and re-validates each via TypeAdapter(IndexFreshness).validate_python(...). Integration test tests/integration/test_writer_renders_confidence_section.py asserts:
  - (a) CONTEXT_REPORT.md exists post-gather and output_dir / "CONTEXT_REPORT.md.tmp" does NOT exist (atomic publish).
  - (b) CONTEXT_REPORT.md starts with the exact line ## Confidence.
  - (c) Row count equals len(envelope_slice["probes"]["index_health"]["index_health"]) AND every input index_name appears in exactly one row. (Kills the empty-renderer mutant.)
  - (d) Two back-to-back runs against the same fixture produce byte-identical CONTEXT_REPORT.md. Precondition: the producer's Fresh.indexed_at source is deterministic for the fixture (either the fixture pre-seeds a stale-only state with no Fresh rows, or the integration test patches IndexHealthProbe's time source via monkeypatch). If determinism cannot be achieved, the byte-identical sub-assertion is replaced with a regex-mask comparison on indexed_at=\d{4}-\d{2}-\d{2}T.... (validator: rewritten; original AC was ambiguous on whether render_confidence_section(merged_envelope) took a dict or RedactedSlice; writer is already wired; row-count and time-source determinism pinned per COV-2/CON-2/CON-4/TQ-6.)
- [ ] AC-7 (mypy + ruff green). mypy --strict src/codegenie/report/ passes; repo-wide warn_unreachable = true is honored (no [[tool.mypy.overrides]] block silences it for this module). ruff check src/codegenie/report/ tests/unit/report/ and ruff format --check are both green. (validator: narrative corrected; warn_unreachable is repo-wide, not per-module.)
- [ ] AC-8 (renderer-import side-effect denylist: subprocess clean import). Importing codegenie.report.confidence_section in a fresh Python subprocess (subprocess.run([sys.executable, "-c", script], check=False)) loads NO module under any of {codegenie.probes, codegenie.coordinator, codegenie.cache, codegenie.adapters, codegenie.tccm, codegenie.output.sanitizer}. The subprocess test asserts proc.returncode == 0 first (so an ImportError surfaces as a meaningful failure, not a masked CalledProcessError), then runs the denylist check. Structural guarantee from phase-arch-design.md §"Component design" #2 §"Why not co-located"; extended per the Phase 2 commitment that the renderer composes by data, not by registry coupling. (validator: hardened; denylist tightened per COV-7; check=False per TQ-4.)
- [ ] AC-9 (empty / no-IndexHealth-slice path renders placeholder). When envelope_slice has no probes.index_health.index_health key, OR that key is an empty dict, OR every per-index slice is malformed, the renderer returns exactly "## Confidence\n\n_No index sources registered._\n" (with a trailing newline). Tests pin this body byte-for-byte for: (a) envelope_slice == {}, (b) envelope_slice == {"probes": {}}, (c) envelope_slice == {"probes": {"index_health": {"index_health": {}}}}. (validator: added; closes the "always-emit-empty-heading" mutant gap per COV-4 + TQ-6.)
- [ ] AC-10 (duplicate index_name upstream: fail loud, writer recovers). If the upstream slice contains structurally-duplicate index_name keys (impossible via dict but possible via merged structure or test fixture), the renderer raises ValueError("duplicate index_name: <name>"). The writer's existing try/except Exception in _publish_context_report (writer.py:148-156) catches it, logs report.confidence_section.render_failed with error=<exception type>, and does NOT raise; repo-context.yaml is unaffected. Unit test asserts the renderer raises; integration test asserts the writer continues. Rule 12: fail loud at the renderer, recover at the chokepoint. (validator: added per COV-5.)
- [ ] AC-11 (renderer purity: AST-walking guard). Test tests/unit/report/test_confidence_section_purity.py::test_renderer_has_no_side_effects parses src/codegenie/report/confidence_section.py via ast.parse and asserts: (a) no Call node whose target is open, print, or any Attribute ending in .write/.write_text/.write_bytes; (b) no top-level Import or ImportFrom for logging, structlog, pathlib, os, os.path, sys (except sys if needed for the assert_never import in older Pythons; exempt by name), subprocess, shutil, tempfile. Mirrors the precedent of other Phase 2 purity tests (grep -rn "ast.parse\|ast.walk" tests/unit/). Pure renderer ⇒ pure unit tests ⇒ no environmental flake. (validator: added per DP-8 + CLAUDE.md "Functional core / imperative shell".)
- [ ] AC-12 (no plaintext-secret leak via slice_malformed path: 02-ADR-0005 invariant). Test test_malformed_slice_does_not_leak_offending_value: construct a malformed slice whose offending value is an AWS-key-shaped fixture ({"freshness": {"kind": "bogus_kind", "leak_field": "AKIA1234567890ABCDEF"}}, a test fixture; gate with # noqa: S105 if the secret-pattern hook flags it). Assert that the rendered CONTEXT_REPORT.md does NOT contain AKIA1234567890ABCDEF anywhere, only the literal slice_malformed sentinel. Mirror test: a well-formed IndexerError(message="AKIA1234567890ABCDEF") slice DOES render the value verbatim (the renderer trusts the redactor's prior chokepoint per AC-4). The asymmetry is the load-bearing invariant: the renderer trusts redacted slice content; the renderer NEVER constructs new strings from offending-value inputs. (validator: added per CON-3 + TQ-10; protects 02-ADR-0005 / 02-ADR-0010.)
- [ ] AC-13 (metamorphic property: adding/removing an index leaves other rows unchanged). Property test test_each_row_isolates_its_variant (Hypothesis): generate slices: dict[str, IndexFreshness] from st.dictionaries(st.text(alphabet=printable_ascii, min_size=1, max_size=16), st.sampled_from(FRESHNESS_INSTANCES), min_size=1, max_size=8). Assert: (a) each rendered row contains exactly one variant's marker, with no cross-talk; (b) render(slices) == render(slices | extra_slice) for the rows that are not the extra (extract per-row substrings and compare); (c) row order is sorted(slices.keys()) for every generated input. Metamorphic invariant: rendering is purely per-row + sorting; no cross-row state. (validator: added per TQ-7.)
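The metamorphic check in AC-13 can be illustrated without Hypothesis. The sketch below is stdlib-only and assumes a toy renderer (render, rows_by_name, and unaffected_by_extra are hypothetical names) whose values are pre-formatted row suffixes rather than IndexFreshness instances; it only shows the shape of the "other rows stay byte-identical" comparison.

```python
import re

# Captures the status, the index name, and the rest of the row.
ROW = re.compile(r"^- \[(OK|STALE)\] (\S+) · (.*)$", re.M)


def render(slices: dict[str, str]) -> str:
    """Hypothetical stand-in renderer: each value is a pre-formatted row suffix."""
    rows = []
    for name, suffix in sorted(slices.items()):
        status = "OK" if suffix.startswith("indexed_at=") else "STALE"
        rows.append(f"- [{status}] {name} · {suffix}")
    return "## Confidence\n\n" + "\n".join(rows) + "\n"


def rows_by_name(out: str) -> dict[str, str]:
    """Map each index name to its full rendered row, in emission order."""
    return {m.group(2): m.group(0) for m in ROW.finditer(out)}


def unaffected_by_extra(slices: dict[str, str], extra: dict[str, str]) -> bool:
    """Metamorphic check: adding `extra` leaves every pre-existing row unchanged."""
    before = rows_by_name(render(slices))
    after = rows_by_name(render({**slices, **extra}))
    return all(after[name] == row for name, row in before.items() if name not in extra)
```

A Hypothesis version would presumably generate `slices` and `extra` instead of hard-coding them, with the same per-row comparison as the property body.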
Out of scope
- Rendering anything other than the Confidence section: CONTEXT_REPORT.md may have other sections in later phases; Phase 2 only commits to the Confidence section. The top-of-file # CONTEXT_REPORT — <repo_path> heading is a one-liner; deeper structure waits.
- Localization, emoji styling, terminal-color escape codes. The renderer outputs ASCII Markdown only.
- Re-redacting IndexerError.message: secret redaction is the writer chokepoint's job (S3-01/3-02/3-03). The renderer trusts the slice.
- Adding new IndexFreshness variants: the variant set is frozen by 02-ADR-0006 and was decided in S1-01. A fifth StaleReason requires an ADR amendment, not an edit here.
- Phase 3 plugin-side rendering. Phase 3 may layer AdapterConfidence over IndexFreshness in bundle metadata (phase-arch-design.md §"Integration with Phase 3"); that's a Phase 3 concern.
Files to touch
New:
- src/codegenie/report/__init__.py: re-exports render_confidence_section; closed __all__ = ["render_confidence_section"]. (No class wrapper; DP-3.)
- src/codegenie/report/confidence_section.py: the renderer. ~120 LOC (down from ~150; class removed).
- tests/unit/report/__init__.py: empty package init.
- tests/unit/report/test_confidence_section.py: unit tests AC-1, AC-2, AC-4, AC-5, AC-8, AC-9, AC-10 (renderer-side raise), AC-12, AC-13.
- tests/unit/report/test_confidence_section_purity.py: AST-walking purity test (AC-11).
- tests/integration/test_writer_renders_confidence_section.py: integration test AC-6 + AC-10 (writer recovers from ValueError).
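The AST-walking purity test listed above (AC-11) might take roughly the following shape. This is a sketch under assumptions: the helper name purity_violations and the violation-string format are hypothetical, it takes source text rather than a file path so it can be exercised in isolation, and it omits the conditional sys exemption that AC-11 describes.

```python
import ast

FORBIDDEN_CALLS = {"open", "print"}
FORBIDDEN_ATTRS = {"write", "write_text", "write_bytes"}
FORBIDDEN_IMPORTS = {"logging", "structlog", "pathlib", "os", "subprocess", "shutil", "tempfile"}


def purity_violations(source: str) -> list[str]:
    """Return human-readable violations found in a module's source text."""
    found: list[str] = []
    tree = ast.parse(source)

    # (a) Calls to open/print, or to any attribute named write/write_text/write_bytes.
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name) and func.id in FORBIDDEN_CALLS:
                found.append(f"call:{func.id}")
            elif isinstance(func, ast.Attribute) and func.attr in FORBIDDEN_ATTRS:
                found.append(f"call:.{func.attr}")

    # (b) Top-level imports only, matching the AC's wording.
    for node in tree.body:
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            if name.split(".")[0] in FORBIDDEN_IMPORTS:  # "os.path" is caught via its root "os"
                found.append(f"import:{name}")
    return found
```

In the real test the source would presumably be read from src/codegenie/report/confidence_section.py and the assertion would be that the returned list is empty.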
Verify (already wired; DO NOT MODIFY):
- src/codegenie/output/writer.py lines 138-156 + 233-239: _publish_context_report(envelope.slice, output_dir) invokes the renderer. The writer's RedactedSlice chokepoint per 02-ADR-0010 is unchanged. AC-6's integration test exercises this call site against the renderer this story implements.
Untouched (DO NOT EDIT):
- src/codegenie/indices/freshness.py: the variant set is frozen by ADR-0006; IndexerError.message is documented as "a stable identifier — not a free-form human string" (lines 73-80). AC-5 preserves that contract.
- src/codegenie/probes/layer_b/index_health.py: the producer; the consumer reads from ProbeOutput.schema_slice, never imports the probe.
- pyproject.toml: [tool.mypy] warn_unreachable = true is already global (Phase 0 S1-02); do NOT add [[tool.mypy.overrides]] blocks for the renderer module.
- Any src/codegenie/probes/**/*.py, src/codegenie/output/sanitizer.py, src/codegenie/adapters/**, or src/codegenie/tccm/** file. Renderer must not depend on any of these (AC-8 denylist).
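The AC-8 denylist check can be sketched as a subprocess probe. The helper name loaded_denied_modules is hypothetical, and because codegenie is not importable outside the repo the usage note exercises it against a stdlib module; the denylist tuple copies the prefixes from AC-8.

```python
import subprocess
import sys

DENYLIST = (
    "codegenie.probes",
    "codegenie.coordinator",
    "codegenie.cache",
    "codegenie.adapters",
    "codegenie.tccm",
    "codegenie.output.sanitizer",
)


def loaded_denied_modules(module_name: str, denylist: tuple[str, ...] = DENYLIST) -> list[str]:
    """Import `module_name` in a fresh interpreter; return loaded modules under the denylist."""
    script = (
        "import importlib, sys\n"
        f"importlib.import_module({module_name!r})\n"
        "print('LOADED:' + ','.join(sorted(sys.modules)))\n"
    )
    proc = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
        check=False,  # inspect returncode ourselves so an ImportError is reported with its stderr
    )
    assert proc.returncode == 0, proc.stderr
    line = next(ln for ln in proc.stdout.splitlines() if ln.startswith("LOADED:"))
    loaded = line[len("LOADED:"):].split(",")
    return [m for m in loaded if any(m == p or m.startswith(p + ".") for p in denylist)]
```

For example, `loaded_denied_modules("json")` returns an empty list, while passing `denylist=("json",)` returns the json package and its submodules. Matching on `p + "."` rather than a bare prefix avoids flagging a sibling package whose name merely starts with a denied prefix.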
TDD plan — red / green / refactor¶
RED (failing tests committed first):
test_module_surface_closed(AC-1) —set(codegenie.report.confidence_section.__all__) == {"render_confidence_section"}. NoConfidenceSectionRenderersymbol. Fails red — module does not exist.test_exhaustive_match_every_variant(AC-2) — input dict with 5 entries, one per variant, each keyed to a distinctindex_name. Assertions: (a)out.count("- [OK]") == 1; (b)out.count("- [STALE]") == 4; (c) for each(name, expected_marker)pair, the row whose first whitespace-separated token-after-[STALE]/[OK]equalsnamecontainsexpected_markerAND does NOT contain any other variant's marker (per-row negative-space). Built by parsingout.splitlines()into a{name: row}dict via regex^- \[(?:OK|STALE)\] (\S+) ·.test_row_format_per_variant_fresh|commits_behind|digest_mismatch|coverage_gap|indexer_error(AC-4) — one test per variant; each asserts the exact substring per the AC-4 format pins.test_row_format_indexer_error_message_truncated—IndexerError(message="x" * 300)renders withmessage[:200] + "…".test_row_order_full_sequence(AC-4) —re.findall(r"^- \[(?:OK|STALE)\] (\S+) ·", out, re.M) == sorted(input.keys())for three discriminating fixtures: uppercase/lowercase mix{"B": ..., "a": ..., "C": ...}proves code-point order; numeric{"idx10": ..., "idx2": ..., "idx1": ...}proves lex-not-natural. Mutation-resistance: would fail undersorted(reverse=True),sorted(key=str.casefold),sorted(key=hash).test_output_endings(AC-4) —out.endswith("\n")and"\n\n\n" not in out.test_ascii_only_no_emoji(AC-4) — every output codepoint is ASCII OR ∈{"·", "…"}.test_malformed_slice_emits_sentinel_only(AC-5) — slice{"freshness": {"kind": "not-a-known-kind"}}renders- [STALE] <name> · indexer_error · slice_malformedand NOTHING afterslice_malformed(regex^- \[STALE\] \S+ · indexer_error · slice_malformed$). 
Subsequent valid slice still renders correctly.
- `test_malformed_slice_emits_structlog_event` (AC-5): using `structlog.testing.capture_logs`, assert one `report.confidence_section.slice_malformed` event was emitted with fields `{index_name, error_count, first_loc}`. The event payload contains the diagnostic detail; the row does NOT.
- `test_empty_envelope_renders_placeholder` (AC-9): three sub-cases (`{}`, `{"probes": {}}`, fully-empty `index_health`) all return exactly `"## Confidence\n\n_No index sources registered._\n"` byte-for-byte.
- `test_duplicate_index_name_raises_value_error` (AC-10): construct an upstream-merged-shape slice with duplicate index_name keys (e.g., wrap a list-shaped sub-structure that decodes to two entries with the same key); assert `render_confidence_section(envelope)` raises `ValueError` with message matching `r"duplicate index_name: .+"`.
- `test_no_probe_registry_import` (AC-8): subprocess script imports `codegenie.report.confidence_section` then prints `LOADED:` + sorted modules. Parent asserts `proc.returncode == 0` first, then no module in `proc.stdout` starts with any prefix in `{"codegenie.probes", "codegenie.coordinator", "codegenie.cache", "codegenie.adapters", "codegenie.tccm", "codegenie.output.sanitizer"}`. The subprocess call uses `check=False`.
- `test_renderer_does_not_re_sanitize` (AC-12, AC-4 negative-space): a well-formed `IndexerError(message="AKIA1234567890ABCDEF")` renders verbatim (renderer trusts the redactor's prior chokepoint).
- `test_malformed_slice_does_not_leak_offending_value` (AC-12): malformed slice whose offending value matches AWS-key fixture; assert `"AKIA1234567890ABCDEF" not in out`.
- `test_each_row_isolates_its_variant` (AC-13, Hypothesis property-based): see AC-13. Hypothesis health-check `suppress_health_check=[HealthCheck.too_slow]` if needed; bounded `max_examples=200`.
- `test_renderer_has_no_side_effects` (AC-11): AST walk; lives in `test_confidence_section_purity.py`.
- Integration `tests/integration/test_writer_renders_confidence_section.py::test_context_report_md_atomic_and_complete` (AC-6): runs `codegenie gather` against `tests/fixtures/portfolio/minimal-ts`. Asserts:
  - `.codegenie/context/CONTEXT_REPORT.md` exists; no `.tmp` shadow.
  - `out.splitlines()[0] == "## Confidence"`.
  - `len(re.findall(r"^- \[", out, re.M)) == len(envelope_slice["probes"]["index_health"]["index_health"])`.
  - Every input `index_name` appears in exactly one row.
  - Two back-to-back runs produce byte-identical `CONTEXT_REPORT.md` (with `Fresh.indexed_at` patched to a deterministic value via `monkeypatch` on the producer's time source, OR fixture pre-seeded as stale-only).
- Integration `test_writer_recovers_from_renderer_value_error` (AC-10): patch `codegenie.report.render_confidence_section` to raise `ValueError`; assert `repo-context.yaml` still publishes successfully AND `report.confidence_section.render_failed` is logged.
All RED tests fail because codegenie.report.confidence_section does not yet exist.
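The AC-8 import-fence check described above can be sketched with the standard library alone. This is a minimal illustration, not the story's actual test: the helper names `loaded_modules_after_import` and `forbidden_imports` are hypothetical, and the usage example imports `json` as a stand-in because `codegenie` is not assumed to be installed here.

```python
import subprocess
import sys

# Prefixes taken from the AC-8 description above.
FORBIDDEN_PREFIXES = (
    "codegenie.probes",
    "codegenie.coordinator",
    "codegenie.cache",
    "codegenie.adapters",
    "codegenie.tccm",
    "codegenie.output.sanitizer",
)


def loaded_modules_after_import(module: str) -> list[str]:
    """Import `module` in a fresh interpreter; return the sorted sys.modules keys."""
    script = (
        "import importlib, sys\n"
        f"importlib.import_module({module!r})\n"
        "print('LOADED:' + ','.join(sorted(sys.modules)))\n"
    )
    proc = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
        check=False,  # inspect returncode ourselves, as the story prescribes
    )
    # Check the exit code first so an import failure is not mistaken for a pass.
    assert proc.returncode == 0, proc.stderr
    line = next(ln for ln in proc.stdout.splitlines() if ln.startswith("LOADED:"))
    return line.removeprefix("LOADED:").split(",")


def forbidden_imports(modules: list[str]) -> list[str]:
    """Return every module whose name starts with a forbidden prefix."""
    return [m for m in modules if m.startswith(FORBIDDEN_PREFIXES)]
```

In the real test the module under inspection would be `codegenie.report.confidence_section`, and the assertion would be that `forbidden_imports(...)` returns an empty list.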
GREEN (minimum code to pass):
- Create `src/codegenie/report/__init__.py` with `__all__ = ["render_confidence_section"]` and re-export.
- Implement `render_confidence_section(envelope_slice: Mapping[str, Any]) -> str`:
  - Locate `slices = envelope_slice.get("probes", {}).get("index_health", {}).get("index_health", {})` (empty dict if missing at any level).
  - If `slices` is empty (or non-Mapping), return the placeholder block from AC-9.
  - Detect duplicates upstream (the AC-10 fixture shapes); raise `ValueError("duplicate index_name: …")` before sorting.
  - For each `(name, slice_dict)` in `sorted(slices.items())`:
    - Try `freshness = TypeAdapter(IndexFreshness).validate_python(slice_dict.get("freshness"))`.
    - On `ValidationError as e`: emit `_log.warning("report.confidence_section.slice_malformed", index_name=name, error_count=len(e.errors()), first_loc=".".join(str(p) for p in e.errors()[0].get("loc", ())) or "<root>")`; set `freshness = Stale(reason=IndexerError(message="slice_malformed"))`.
    - Dispatch via outer `match value:` → inner `match value.reason:` (DP-1 / AC-2). Emit the row per AC-4 format pins.
  - Return `"## Confidence\n\n" + "\n".join(rows) + "\n"`.
- Do NOT introduce a `ConfidenceSectionRenderer` class (DP-3).
- Do NOT edit `pyproject.toml`'s mypy config (CON-1; `warn_unreachable` is already global).
- Do NOT edit `src/codegenie/output/writer.py` (CON-2; already wired).
REFACTOR:
- Extract per-variant row-format helpers (`_fresh_row`, `_commits_behind_row`, ...) only if the inline arms exceed 5 lines each. Keep the nested `match` shape; exhaustiveness at both levels is the point.
- Confirm `mypy --strict src/codegenie/report/` is clean. (`warn_unreachable` is honored repo-wide.)
- Run the AC-3 dual ritual:
  - (a) Delete `case Fresh(indexed_at=...):` from the outer `match value:`. Run `mypy src/codegenie/report/`. Confirm `[unreachable]` at `assert_never(value)`. Capture stderr → `_attempts/S8-01.md`. Revert.
  - (b) Delete `case CommitsBehind(...):` from the inner `match reason:`. Run `mypy src/codegenie/report/`. Confirm `[unreachable]` at `assert_never(reason)`. Capture stderr → `_attempts/S8-01.md`. Revert.
- `ruff format`; ensure no `# type: ignore` in this module.
Notes for the implementer¶
- Read S1-01 first. The `IndexFreshness` variant set + `Literal[...]` discriminators are non-negotiable; if you find yourself wanting to add a sixth `StaleReason`, stop: that requires an ADR amendment per phase-arch-design.md §"Integration with Phase 3" guarantee #2.
- Mirror the established nested-match idiom (DP-1). Read `src/codegenie/probes/layer_b/index_health.py:239-279`: the producer module already has two consumers of the same sum type (`_derive_confidence` and `_last_indexed_at`) using nested `match` with `assert_never` at BOTH levels. The renderer MUST mirror that shape. A flat 5-arm match (e.g., `case Stale(reason=CommitsBehind()):`) reduces the exhaustiveness signal: mypy may not always catch a missed inner reason through a single-level pattern over an `Annotated[Union[..], Field(discriminator=...)]` type. Two `assert_never` arms are the load-bearing structural enforcement.
- The `assert_never` arms are the proof. If during AC-3's ritual `mypy` does not fail on a removed `case`, the repo-wide `[tool.mypy] warn_unreachable = true` setting is broken: fix `pyproject.toml` (Phase 0 S1-02 territory) rather than weakening this story.
- `IndexerError.message` is a stable identifier, not a free-form string (DP-2). Read `src/codegenie/indices/freshness.py:73-80`. NEVER construct `IndexerError(message=f"prefix:{str(exception)}")` or similar; that erodes the smart-constructor discipline every other Phase-2/3 consumer relies on. The AC-5 path emits `IndexerError(message="slice_malformed")` (stable sentinel) and routes diagnostic detail to a structlog event. The same pattern applies to every future variant of `StaleReason`.
- The renderer must NOT introduce a `@register_freshness_row_formatter` decorator-registry (DP-6: anti-pattern alert). Even though `@register_index_freshness_check` (`src/codegenie/indices/registry.py`) and `@register_dep_graph_strategy` (`src/codegenie/depgraph/`) suggest "registry is the Open/Closed answer," that pattern is correct for producer extension (new index sources extend by addition), and wrong for consumer exhaustiveness here. A registry-of-formatters would make an unregistered sixth variant silently no-op; the explicit `match` + `assert_never` makes it a compile-time error. This asymmetry is intentional: the producer side is open, the consumer side is closed-by-design. If a reviewer proposes "make the renderer pluggable," point them at this paragraph and at 02-ADR-0006 §Consequences.
- The writer is already wired (CON-2). `src/codegenie/output/writer.py:138-156` defines `_publish_context_report(envelope.slice, output_dir)`; line 239 calls it inside `Writer.write`. The renderer's import path (`from codegenie.report import render_confidence_section`) is on line 148. You do NOT edit writer.py for this story; you implement the module the writer already imports. If the import fails at runtime, the writer's `try/except Exception` (line 152) logs `report.confidence_section.render_failed` and continues; `repo-context.yaml` is unaffected.
- Renderer input type (DP-4). The renderer accepts `Mapping[str, Any]` (`RedactedSlice.slice`, a dict). Do NOT widen to `RedactedSlice` (would couple `codegenie.report` to `codegenie.output.redacted_slice`); do NOT narrow to a typed view object (Rule 2: no single-use abstraction). Re-validate per-index `freshness` JSON dicts via `pydantic.TypeAdapter(IndexFreshness).validate_python(...)` at the renderer's entry; this is the typed boundary.
- Newtype erasure at the slice boundary (DP-9). Map keys are `str(IndexName)`; the newtype is erased at `index_health.py:369` (`results[str(name)] = ...`) deliberately. Do NOT re-promote `IndexName` inside the renderer; keep keys as `str`. (Per Rule 11; the slice boundary is the established type-erasure point.)
- CLI-vs-writer wiring (DP-5: historical note). The current implementation lifts the renderer call into `_publish_context_report` inside `writer.py`, accepting one extra responsibility on the writer in exchange for atomicity within a single chokepoint. An alternative shape (CLI-side wiring after `writer.write()` returns) was considered; the writer-side path was chosen to keep the atomicity discipline in a single module. Do NOT relocate the call site as part of this story; change-management for that lives in a future surgical-edit story if it ever becomes necessary.
- The renderer takes the in-memory slice, not the persisted file. Reading `repo-context.yaml` back from disk would re-parse YAML the writer just emitted; the in-memory slice is the source of truth (matches process view step 7). The writer passes it directly.
- Do not add `secrets_redacted_count` to this story. That is S8-02's territory (CLI summary line). This story's renderer concerns itself only with `IndexFreshness`.
- Do not parallelize the `CONTEXT_REPORT.md` write with the `repo-context.yaml` write. Writer atomicity is `repo-context.yaml` first (the canonical artifact), `CONTEXT_REPORT.md` second (the human-readable companion). If the renderer raises (it shouldn't except via AC-10's `ValueError`), the writer logs `report.confidence_section.render_failed` and continues; `repo-context.yaml` is intact.
- `assert_never` import: `from typing import assert_never` is Python 3.11+; both Python versions Phase 2 supports include it without `typing_extensions` (CI matrix Python 3.11 and 3.12 per High-level-impl.md Step 8 done criterion #3).
- No emoji in `CONTEXT_REPORT.md` per user convention (`CLAUDE.md` global Rule 11: match codebase conventions; the codebase is ASCII-only). Allowed non-ASCII codepoints: `·` (separator) and `…` (truncation suffix). The AC-4 ASCII test pins this set explicitly.
- Phase 0 fence stays green: the renderer imports nothing from `anthropic` / `openai` / `langgraph` / `httpx` / `requests` / `socket`. Trivially.
- CODEOWNERS: `src/codegenie/report/**` does NOT need CODEOWNERS gating; only `ProbeContext` (S1-09) is gated.