Story S8-02: CLI summary block on stdout (secrets_redacted_count, fingerprints, skill_shadowed)
Step: Step 8 — Confidence section renderer + CI ratchet + advisory benches + Phase-3 handoff
Status: Done — GREEN 2026-05-18 (phase-story-executor; see _attempts/S8-02.md for the per-AC evidence table + the Rule 7 conflict resolution print( → sys.stdout.write). The pure formatter (codegenie.cli_summary.summary_block) ships with an AST gate forbidding I/O imports; the impure shell is the three-call _emit_phase2_summary in codegenie.cli. RedactedSlice is not threaded into the new seam — primitives only — so the smart-constructor invariant (02-ADR-0010) stays intact at the closed two-site set. LoadOutcome + SkillsIndexSlice extended additively with shadowed_skills: tuple[ShadowedSkill, ...]; the existing skill_shadowed structlog event is preserved (no change to S2-01's emit-once contract). Zero new Phase-2 events (02-ADR-0008 stays clean). 39 new tests added; full pytest suite green (3588 passed); mypy --strict, ruff, lint-imports, make fence, pre-commit run --all-files all clean.
Effort: S
Depends on: S8-01 (CONTEXT_REPORT.md rendered alongside repo-context.yaml); S3-03 (writer chokepoint already emits envelope.written + secrets_redacted_count field — this story consumes that, does not add a new event); S2-01 (SkillsLoader — this story extends LoadOutcome additively to surface shadow records as data, not just structlog events)
ADRs honored: 02-ADR-0005 (no plaintext persistence — fingerprints only, 8-hex BLAKE3 only); 02-ADR-0008 (no new Phase-2 structlog events — this story adds zero new events; tests/unit/test_no_event_stream_in_phase_2.py discipline is preserved); 02-ADR-0010 (RedactedSlice.findings_count + RedactedSlice.fingerprints are the persisted-by-construction fields the CLI reads)
Validation notes (2026-05-18, phase-story-validator)
This story was hardened by phase-story-validator. The draft was substantially restructured because it contradicted three sources of truth in the codebase and the ADRs:
- ADR-0008 contradiction. Draft added three new structlog events (`secrets.summary`, `fingerprints.summary`, `skills.shadowed.summary`). 02-ADR-0008 forbids new Phase-2 events: "the discipline is 'no Phase-2 events'; the test enforcing this is `tests/unit/test_no_event_stream_in_phase_2.py`." Resolution: zero new events; consume the existing `envelope.written` event for `secrets_redacted_count`; the other two fields are stdout-only observability.
- Duplicate emission. Draft re-introduced `secrets_redacted_count` as a new event payload. `src/codegenie/output/writer.py:250-253` already emits it on `envelope.written` (per S3-03 / Phase-1). Resolution: AC-2 now asserts the existing event carries the count once per gather; the stdout line reflects the same in-memory value, asserted via equality.
- Fictional Phase-0 anchor. Draft AC-1 asserted "Phase 0 already prints a per-probe `Ran`/`CacheHit`/`Skipped` audit anchor" on stdout and required byte-for-byte preservation. `src/codegenie/cli.py` has zero `print()` calls; the audit anchor is the JSON `runs/<utc-iso>-<short>.json` file + the `coordinator.dispatch.order` structlog event. Resolution: this story introduces the first stdout surface in `cli.py`; AC-1 reframed as "no stdout regression vs. master baseline (assertion: the only lines on stdout during a clean gather are the three this story adds)."
- Wrong event name / field. Draft AC-4 referenced `probe.skill.shadowed` event and `shadowed_by_tier` field. Actual event in `src/codegenie/skills/loader.py:430-437` is `skill_shadowed` (constant `_EVENT_SHADOWED`) with fields `skill_id, winning_tier, shadowed_tier, winning_path, shadowed_path`. Resolution: import the constants by name; render `<skill_id>:<shadowed_tier>` using the real field name; assert names match the loader's module-level constants.
- Missing fixture. `tests/fixtures/portfolio/secret-seeded/` does not exist; S6-07's `tests/adv/phase02/test_secret_in_source.py` is the actual landing spot. Resolution: AC-2/AC-3 use `tmp_path`-seeded inline secrets (the original draft's hedged fallback); no new portfolio fixture introduced unless S6-07's adversarial fixture is already reachable from `tests/integration/cli/` via direct path import.
- Phantom writer return struct. Draft told the implementer to add `secret_findings: list[SecretFinding]` to "the writer's typed result struct." `Writer.write` returns `None`. The redactor's `RedactedSlice.fingerprints` (`list[str]`) and `RedactedSlice.findings_count` (`int`) are already in scope at `_seam_redact_envelope` in `cli.py:349`. Resolution: no writer-struct widening; consume the existing `RedactedSlice` fields directly.
- AC-4 plumbing precondition. `SkillsIndexProbe` (`src/codegenie/probes/layer_d/skills_index.py:197-234`) calls `SkillsLoader().load_all()` during gather but `LoadOutcome` does not surface shadowing as data; shadows are only emitted as structlog events. Resolution: this story extends `LoadOutcome` with `shadowed_skills: tuple[ShadowedSkill, ...]` (an additive change to S2-01's API) and threads it through `SkillsIndexSlice` so the CLI reads shadows from the same coordinator-merged envelope. No structlog-event-stream interception; data path only.
- `!r`-formatter bug. Draft GREEN code: `print(f"fingerprints={fps!r}")`. `repr(["aaaaaaaa"])` produces `['aaaaaaaa']` (single-quoted). The draft's AC-3 regex rejected quoted entries, so the prescribed implementation could not pass the prescribed test. Resolution: explicit `f"fingerprints=[{', '.join(fps)}]"`; AC-3 regex matches the unquoted shape.
- Plaintext-boundary holes. Draft AC-3 enumerated only `AKIA…` and `ghp_…`. The redactor's `_PATTERNS` catalog has six pattern classes (`aws_access_key`, `github_token`, `jwt`, `rsa_private_key`, `npm_token`, `anthropic_key`) plus entropy. Resolution: the test iterates `sanitizer._PATTERNS` so the boundary check co-evolves with the catalog: single source of truth, mutation-resistant.
- Missing determinism property. Draft mentioned determinism in implementer notes but had no AC. Resolution: AC-9 added, requiring that two consecutive gathers on the same fixture produce byte-identical summary blocks.
- Pure-impure tangle. Draft `_emit_phase2_summary(findings, shadowed)` mixed sort/dedup/format with `print`. Resolution: pure `summary_block(count, fingerprints, shadowed) -> SummaryBlock` (frozen dataclass with `as_lines() -> tuple[str, str, str]`) + impure `_emit_summary_block(block)` that calls `print`. Pure helper is unit-testable without log-capture, mirrors S8-01's `render_confidence_section` / writer split.
- Anaemic shadow strings. Draft passed shadows as `list[str]` already in `"<skill_id>:<tier>"` shape, which forced consumers to re-parse them. Resolution: typed `ShadowedSkill` frozen dataclass (already inferred in the loader's emit kwargs); formatting is the last step before `print`, not the carrier shape.
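
The `!r`-formatter finding above can be reproduced in a few lines. This is a minimal sketch using only the regex quoted in AC-3 and the two f-string shapes from the note; the sample fingerprints are made up for illustration.

```python
import re

# Regex quoted in AC-3: unquoted 8-hex entries separated by ", ".
FINGERPRINTS_RE = re.compile(r"^fingerprints=\[(?:[0-9a-f]{8}(?:, [0-9a-f]{8})*)?\]$")

fps = ["aaaaaaaa", "bbbbbbbb"]  # illustrative values, not real fingerprints

# Draft shape: !r calls repr() on the list, which single-quotes each element.
draft_line = f"fingerprints={fps!r}"

# Hardened shape: join the entries explicitly so no quoting is introduced.
fixed_line = f"fingerprints=[{', '.join(fps)}]"

print(draft_line)  # fingerprints=['aaaaaaaa', 'bbbbbbbb']
print(fixed_line)  # fingerprints=[aaaaaaaa, bbbbbbbb]
print(FINGERPRINTS_RE.match(draft_line) is None)      # the draft line fails the regex
print(FINGERPRINTS_RE.match(fixed_line) is not None)  # the joined line passes
```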
Full critic findings + decision rationale archived at _validation/S8-02-cli-summary-line.md. Verdict: HARDENED.
Context
Phase 2 commits to zero plaintext in any persisted file (G5; production ADR-0005; phase ADR-0005). The audit-trail for secret findings has to be observable somewhere. Three observable surfaces exist in this phase:
- `repo-context.yaml`: the artifact (persisted; secret findings appear as `<REDACTED:fingerprint=BLAKE3_8>` placeholders, and `RedactedSlice.fingerprints` carries the deduplicated 8-hex prefixes as a top-level field).
- `CONTEXT_REPORT.md`: the human-readable companion (S8-01; consumes `IndexFreshness`, not `SecretFinding`).
- `envelope.written` structured-log event: already emitted by `src/codegenie/output/writer.py:250-253` carrying `secrets_redacted_count: int` per S3-03. This is the only structured-log surface 02-ADR-0008 permits for Phase-2 secret-redaction observability.
Plus the operator-facing stdout surface, which is what this story introduces. The CLI summary block is three lines printed to stdout at the end of a clean gather, in this order:
secrets_redacted_count=<N>
fingerprints=[<8hex>, <8hex>, ...]
skill_shadowed=[<skill_id>:<shadowed_tier>, ...]
The values are computed from data already in scope at the end of the gather pipeline — no new structlog events, no widening of the writer's return type, no new persistence surfaces. The discipline 02-ADR-0008 enforces (tests/unit/test_no_event_stream_in_phase_2.py) is preserved.
The skill_shadowed line aggregates the three-tier-merge shadowing events that SkillsLoader currently emits only via structlog (src/codegenie/skills/loader.py:430-437). This story makes shadowing observable as data by extending LoadOutcome and SkillsIndexSlice with a shadowed_skills: tuple[ShadowedSkill, ...] field — the structlog event stays in place (operators who tail logs already see it), but the CLI now reads the aggregated list from the coordinator-merged envelope instead of intercepting events.
This is one of the smallest stories in Step 8 but load-bearing for the operator's ability to confirm that the redactor ran and that no skill silently shadowed another in a multi-tier deployment.
References: where to look
- Architecture:
  - `../phase-arch-design.md` §"Logging": "Phase 2 adds one log field at the writer: `secrets_redacted_count` (int), so a 0-count run is grep-able." Already shipped via S3-03.
  - `../phase-arch-design.md` §"Component design" #4 `SecretRedactor`: "The returned `list[SecretFinding]` is collected in-memory for the CLI summary line; it is NOT persisted to any file." Fingerprint = first 8 hex of BLAKE3.
  - `../phase-arch-design.md` §"Component design" §"SkillsLoader": "first-tier-wins; collisions emit a `skill_shadowed` warning in the CLI summary."
  - `../phase-arch-design.md` §"Process view" step 9: CLI exit; "ProbeOutput emitted; CONTEXT_REPORT.md printed". This story extends step 9 with the stdout block, after `CONTEXT_REPORT.md` is written and before the CLI returns.
- Phase ADRs:
  - `../ADRs/0005-secret-findings-no-plaintext-persistence.md`: fingerprints, no plaintext; 8-hex BLAKE3.
  - `../ADRs/0008-no-event-stream-in-phase-2.md`: no new Phase-2 events. This story honors that. The `tests/unit/test_no_event_stream_in_phase_2.py` discipline must stay green after this story lands.
  - `../ADRs/0010-redacted-slice-smart-constructor-at-writer-boundary.md`: `RedactedSlice.findings_count` + `RedactedSlice.fingerprints` are the existing typed fields the CLI reads. Both are persisted by construction; the CLI does not invent a parallel data path.
- Production ADRs:
  - `../../../production/adrs/0005-no-llm-in-gather.md`: Phase 0 fence; CLI summary block is plain Python `print`, no LLM.
  - `../../../production/adrs/0033-domain-modeling-discipline.md` §3: illegal-states-unrepresentable; `SecretFinding` and `ShadowedSkill` are Pydantic frozen models.
- Source design:
  - `../final-design.md` §"Component design" row 4 (`SecretRedactor`): CLI summary path returned separately by `redact_secrets`, not threaded into a `RedactedSlice`.
- Existing code (Phase 2 contract, DO NOT WEAKEN):
  - `src/codegenie/output/sanitizer.py:258`: `SecretFinding(probe_name, fingerprint, pattern_class, cleartext_len)` frozen model; `_PATTERNS` six-class catalog at module level (the test iterates this for the plaintext-boundary check).
  - `src/codegenie/output/redacted_slice.py`: `RedactedSlice(slice, findings_count, fingerprints)`; the CLI reads `fingerprints` and `findings_count` directly.
  - `src/codegenie/output/envelope_redactor.py:274`: `_redact_envelope(envelope) -> RedactedSlice`; called by `cli._seam_redact_envelope`. The returned `RedactedSlice.fingerprints` is deduplicated by insertion order.
  - `src/codegenie/output/writer.py:250-253`: already emits `envelope.written` with `secrets_redacted_count = envelope.findings_count`. DO NOT add a parallel event; consume the existing one.
  - `src/codegenie/logging.py`: `EVENT_ENVELOPE_WRITTEN` / `SECRETS_REDACTED_COUNT_FIELD` constants. Import by name; do not hardcode strings at the call site.
  - `src/codegenie/skills/loader.py:86,430-437`: `_EVENT_SHADOWED = "skill_shadowed"` constant; event payload has `skill_id, winning_tier, shadowed_tier, winning_path, shadowed_path`. `LoadOutcome` currently exposes `skills, per_file_errors` only; this story adds `shadowed_skills: tuple[ShadowedSkill, ...]` to `LoadOutcome` (additive).
  - `src/codegenie/probes/layer_d/skills_index.py:197-234`: `SkillsIndexProbe.run`; this story extends `SkillsIndexSlice` with `shadowed_skills: tuple[ShadowedSkill, ...]` and writes the field into `schema_slice`.
  - `src/codegenie/cli.py:349`: `_seam_redact_envelope`; the `RedactedSlice` is already in scope at the call site, so the CLI summary block consumes its `findings_count` and `fingerprints` directly with no further plumbing.
  - `src/codegenie/cli.py:434-639`: `_run_gather_pipeline` (11 steps); this story adds Step 11.5 (the stdout summary block) between the existing Step 11 (audit record) and CLI return.
  - `tests/smoke/conftest.py`: the `_seam_configure_logging` no-op fixture that keeps `structlog.testing.capture_logs()` working during `CliRunner.invoke`. Reuse this style for any new CLI integration test.
  - `tests/smoke/test_cli_end_to_end.py:39,234,256,294,315`: `from structlog.testing import capture_logs` precedent.
Goal
Extend src/codegenie/cli.py to emit a three-line summary block on stdout after Step 11 (audit record) succeeds and before the CLI returns. The block contains, in this exact order:
1. `secrets_redacted_count=<N>`: `N == redacted_envelope.findings_count` (the same value already emitted on the `envelope.written` structured-log event).
2. `fingerprints=[<8-hex>, <8-hex>, ...]`: `redacted_envelope.fingerprints` (already deduplicated by `_build_redacted_slice_pass`), re-sorted ASCII-lex for determinism. Empty list rendered as `fingerprints=[]`. Never the plaintext value, never a hash longer than 8 hex.
3. `skill_shadowed=[<skill_id>:<shadowed_tier>, ...]`: one entry per `ShadowedSkill` returned by `LoadOutcome.shadowed_skills` (read off the `SkillsIndexProbe`'s slice), ASCII-sorted by `(skill_id, shadowed_tier)`. Empty list rendered as `skill_shadowed=[]`.
Formatting discipline:
- Pure formatter: `summary_block(count: int, fingerprints: tuple[str, ...], shadowed: tuple[ShadowedSkill, ...]) -> SummaryBlock` lives in a new module `src/codegenie/cli_summary.py`. No I/O, no logger, no clock, no env reads.
- Impure shell: `_emit_summary_block(block: SummaryBlock)` calls `print` three times on stdout. This is the only new impure code.
- No new structlog events (02-ADR-0008). The existing `envelope.written` carries `secrets_redacted_count`; the other two stdout lines have no structured-log counterpart and that is intentional.
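
The pure-formatter / impure-shell split described above might look roughly like the following. This is a minimal sketch, not the shipped module: `ShadowedSkill` is modeled here as a stdlib frozen dataclass standing in for the Pydantic model the story places in `codegenie.skills.model`, and the internal field layout of `SummaryBlock` is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ShadowedSkill:
    """Stand-in (assumption) for codegenie.skills.model.ShadowedSkill."""
    skill_id: str
    shadowed_tier: str
    winning_tier: str
    shadowed_path: str
    winning_path: str


@dataclass(frozen=True)
class SummaryBlock:
    count: int
    fingerprints: tuple[str, ...]        # already sorted and deduplicated
    shadowed: tuple[ShadowedSkill, ...]  # already sorted by (skill_id, shadowed_tier)

    def as_lines(self) -> tuple[str, str, str]:
        # Formatting happens here, as the last step; no repr(), so no quoting.
        entries = (f"{s.skill_id}:{s.shadowed_tier}" for s in self.shadowed)
        return (
            f"secrets_redacted_count={self.count}",
            f"fingerprints=[{', '.join(self.fingerprints)}]",
            f"skill_shadowed=[{', '.join(entries)}]",
        )


def summary_block(
    count: int,
    fingerprints: tuple[str, ...],
    shadowed: tuple[ShadowedSkill, ...],
) -> SummaryBlock:
    """Pure: sorts and deduplicates, touches no I/O, clock, or environment."""
    fps = tuple(sorted(set(fingerprints)))
    ordered = tuple(sorted(shadowed, key=lambda s: (s.skill_id, s.shadowed_tier)))
    return SummaryBlock(count=count, fingerprints=fps, shadowed=ordered)


def _emit_summary_block(block: SummaryBlock) -> None:
    """Impure shell; in the story this lives in cli.py, not cli_summary.py."""
    for line in block.as_lines():
        print(line)
```

Keeping the typed `ShadowedSkill` records inside `SummaryBlock` and rendering `<skill_id>:<shadowed_tier>` only in `as_lines()` follows the validation note that formatting is the last step before output, not the carrier shape.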
Acceptance criteria
- [ ] AC-1 (No regression on existing CLI observability; stdout introduced cleanly). Before this story, `src/codegenie/cli.py` produces zero stdout during a clean gather (verified by `grep -rn "print(\|click.echo" src/codegenie/cli.py` returning empty on master). After this story, stdout contains exactly the three lines defined in the Goal, in order, separated by single newlines, with no leading/trailing blank lines. `tests/integration/cli/test_summary_stdout_shape.py::test_only_three_lines` runs a gather against `tests/fixtures/portfolio/minimal-ts` and asserts `len(stdout.strip().split("\n")) == 3`.
- [ ] AC-2 (`secrets_redacted_count` line and event share the same value). Stdout contains a line matching `^secrets_redacted_count=\d+$`. The structured-log event captured by `structlog.testing.capture_logs()` shows `Counter(e["event"] for e in captured)["envelope.written"] == 1` and the event's `secrets_redacted_count` field equals the integer parsed from the stdout line. `tests/integration/cli/test_summary_count_matches_event.py::test_count_equals_envelope_written_field` asserts this against `minimal-ts` (count == 0) and against a `tmp_path` fixture seeded with one `AKIA[0-9A-Z]{16}` plaintext in a tracked file (count == 1). No new `secrets.summary` event is emitted (asserted by `Counter(...)["secrets.summary"] == 0`).
- [ ] AC-3 (`fingerprints` line: 8-hex only, sorted, deduplicated, no plaintext; mutation-resistant against the full pattern catalog). Stdout contains a line matching `^fingerprints=\[(?:[0-9a-f]{8}(?:, [0-9a-f]{8})*)?\]$`. The list is ASCII-sorted (`sorted(set(redacted_envelope.fingerprints))`). The plaintext-boundary assertion iterates `codegenie.output.sanitizer._PATTERNS` and seeds one example per pattern class via `tmp_path`; for each pattern, the captured stdout (and the captured structured-log payload of `envelope.written`) is asserted to NOT contain the seeded plaintext. This is the boundary test for 02-ADR-0005; weakening either side (the iteration over `_PATTERNS` or the assertion) is a build break. `tests/unit/cli/test_summary_fingerprints_format.py` covers the format regex + sort + dedup (with property-based generation via `hypothesis.strategies.text(alphabet="0123456789abcdef", min_size=8, max_size=8)`); `tests/integration/cli/test_summary_plaintext_boundary.py` covers the per-pattern boundary check.
- [ ] AC-4 (`skill_shadowed` line aggregated from `LoadOutcome.shadowed_skills`: data path, not event interception). The `SkillsLoader.load_all()` `LoadOutcome` carries a new `shadowed_skills: tuple[ShadowedSkill, ...]` field (additive to S2-01). `ShadowedSkill` is a frozen Pydantic model with fields `skill_id: SkillId, shadowed_tier: Tier, winning_tier: Tier, shadowed_path: str, winning_path: str`, the same fields the existing structlog event already populates. `SkillsIndexProbe` projects this tuple into `SkillsIndexSlice.shadowed_skills` (additive to S2-01's sub-schema with `additionalProperties: false`). The CLI reads it off `gather_result.outputs["skills_index"].schema_slice["shadowed_skills"]`, sorts by `(skill_id, shadowed_tier)` ASCII-lex, formats one entry per shadow as `f"{skill_id}:{shadowed_tier}"`, and renders `skill_shadowed=[entry, entry, ...]`. The existing `skill_shadowed` structlog event continues to fire once per collision (no change to S2-01's emit-once contract). `tests/integration/cli/test_summary_skill_shadowed_data_path.py` builds a `tmp_path`-rooted fixture with two tiers defining the same `skill_id` (one repo, one org), runs a gather, and asserts the stdout line matches the data path AND that `Counter(...)["skill_shadowed"] == 1` (the per-collision event still fires once). Zero collisions → `skill_shadowed=[]`.
- [ ] AC-5 (summary block emits after Step 11 audit record write succeeds, before CLI exit code 0). `tests/integration/cli/test_summary_order_after_audit.py` runs a gather and asserts (a) stdout sequence: the three summary lines appear after the captured-log event `envelope.written` (which is itself emitted after the writer's `_atomic_write_bytes` returns, so the audit anchor is on disk by the time stdout's first byte is written); (b) the CLI exit code is 0 on a clean gather, irrespective of the `secrets_redacted_count` value. The implementation places the summary emission inside the `_run_gather_pipeline` body after the Step 11 `_seam_audit_record` call returns.
- [ ] AC-6 (no new Phase-2 structlog events introduced; ADR-0008 discipline preserved). `tests/unit/test_no_event_stream_in_phase_2.py` remains green after this story lands. Additionally, `tests/unit/cli/test_summary_no_new_events.py` runs a gather under `capture_logs()` and asserts `Counter(e["event"] for e in captured)` contains exactly the event names present in the master baseline (a frozen set captured as `_EVENTS_BASELINE` at the top of the test file, sourced from a recent clean gather); no key like `secrets.summary`, `fingerprints.summary`, `skills.shadowed.summary`, or any new event name appears. A mutation that introduces a new event fails this test.
- [ ] AC-7 (zero-state grep-ability: all three lines always present on a clean gather). Per phase-arch-design.md §"Logging": "a 0-count run is grep-able." `tests/integration/cli/test_summary_zero_state.py` runs against `minimal-ts` (no seeded secrets, no skill collisions) and asserts stdout contains the literal substrings `secrets_redacted_count=0`, `fingerprints=[]`, and `skill_shadowed=[]`.
- [ ] AC-8 (pure formatter / impure shell split: `summary_block` is testable without log capture). `src/codegenie/cli_summary.py` exposes a pure function `summary_block(count: int, fingerprints: tuple[str, ...], shadowed: tuple[ShadowedSkill, ...]) -> SummaryBlock`. `SummaryBlock` is a frozen `@dataclass` with one method `as_lines() -> tuple[str, str, str]`. `tests/unit/cli/test_summary_block_pure.py` constructs `SummaryBlock` instances directly (no gather, no logger, no `tmp_path`) and asserts: (a) format regexes for each line; (b) idempotence: `summary_block(*args) == summary_block(*args)`; (c) sortedness: `block.as_lines()[1]` lists fingerprints in ASCII-lex order; (d) dedup: supplying duplicate fingerprints yields the same output as supplying the deduplicated set. A mutation that makes any field impure (reads clock / env / I/O) is caught by static AST inspection in `test_summary_block_pure.py::test_pure_no_io_imports` (greps `cli_summary.py` for `import os|import time|open(|print(|logger`).
- [ ] AC-9 (determinism: two consecutive gathers produce byte-identical summary blocks). `tests/integration/cli/test_summary_determinism.py` runs two gathers back-to-back against `minimal-ts` (same source tree, same cache), captures stdout from each, and asserts `stdout_1 == stdout_2` byte-for-byte. The same check is also run against a fixture with three distinct seeded secrets, to exercise the sort+dedup branch on a non-empty list.
- [ ] AC-10 (`mypy --strict` + `ruff` + `lint-imports` + `fence` green). `mypy --strict src/codegenie/cli_summary.py src/codegenie/cli.py src/codegenie/skills/loader.py src/codegenie/skills/model.py src/codegenie/probes/layer_d/skills_index.py` passes. `ruff check` + `ruff format --check` green on all touched files. `make lint-imports` green (no new cross-package edges; `cli_summary` is a leaf module that imports only `dataclasses`, `codegenie.skills.model.ShadowedSkill`, and stdlib). `make fence` green (no LLM/network imports introduced).
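
The static purity gate named in AC-8 could be approximated with the stdlib `ast` module. The sketch below is an assumption about how such a check might be written, not the test that shipped: the forbidden-name sets are derived from the names listed in AC-8 and the RED plan, and `purity_violations` is a hypothetical helper.

```python
import ast

# Assumed deny-lists, derived from the names this story lists for the gate.
FORBIDDEN_MODULES = {"os", "time", "logging", "structlog"}
FORBIDDEN_CALLS = {"open", "print"}


def purity_violations(source: str) -> list[str]:
    """Return a sorted list of I/O-ish imports and calls found in `source`."""
    found: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in FORBIDDEN_MODULES:
                    found.append(f"import {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports such as "from . import x".
            if (node.module or "").split(".")[0] in FORBIDDEN_MODULES:
                found.append(f"from {node.module} import ...")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in FORBIDDEN_CALLS:
                found.append(f"call {node.func.id}()")
    return sorted(found)


clean = "from dataclasses import dataclass\nx = 1\n"
dirty = "import os\nprint('leak')\n"
print(purity_violations(clean))  # []
print(purity_violations(dirty))  # ['call print()', 'import os']
```

A test would read `src/codegenie/cli_summary.py` as text and assert the returned list is empty. Walking the AST rather than grepping avoids false positives from the forbidden words appearing in comments or docstrings.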
Out of scope
- Adding any new structlog event variant. 02-ADR-0008 forbids it; `tests/unit/test_no_event_stream_in_phase_2.py` enforces it. The existing `envelope.written` and `skill_shadowed` events are the only structured-log surfaces for this story's concerns.
- Persisting the `list[SecretFinding]` (with `pattern_class` / `cleartext_len`) to any file. 02-ADR-0005 forbids this; fingerprints + count already live in `RedactedSlice` and are persisted-by-construction (per ADR-0010).
- Rendering fingerprints or shadowed skills in `CONTEXT_REPORT.md`. The renderer (S8-01) consumes `IndexFreshness`. A future story can extend the renderer; this one does not.
- A `--json` summary-block mode. The stdout block is human-readable; structured-log events ARE the machine-readable surface; future task classes consume the YAML envelope.
- Pagination of `fingerprints` or `skill_shadowed` when the lists are huge. A 1000-fingerprint line is a signal worth surfacing, not hiding.
- Exit-code change on `secrets_redacted_count > 0`. The CLI exits 0; the operator decides if a non-zero count is actionable.
- Removing or relocating the existing `skill_shadowed` per-collision structlog event. It stays; the CLI is now an additional observer of the same data via the probe-slice data path.
- A `ShadowedSkill` model in `src/codegenie/types/identifiers.py`. The model lives in `src/codegenie/skills/model.py` next to `Skill` and `Tier`.
Files to touch
New:
- `src/codegenie/cli_summary.py`: pure `SummaryBlock` frozen dataclass + `summary_block(...)` factory + `as_lines()` formatter. ~40 LOC. No I/O.
- `tests/unit/cli/test_summary_block_pure.py`: AC-8 (pure formatter unit tests, hypothesis property tests for sort+dedup).
- `tests/unit/cli/test_summary_no_new_events.py`: AC-6 (baseline-vs-current event-name diff).
- `tests/unit/cli/test_summary_fingerprints_format.py`: AC-3 format regex + property-based dedup/sort.
- `tests/integration/cli/test_summary_stdout_shape.py`: AC-1 (exactly three lines, correct order).
- `tests/integration/cli/test_summary_count_matches_event.py`: AC-2 (stdout int == `envelope.written` field).
- `tests/integration/cli/test_summary_plaintext_boundary.py`: AC-3 (iterates `sanitizer._PATTERNS`; tmp_path-seeded plaintext per pattern; asserts none of the plaintexts appear in stdout or in the `envelope.written` event payload).
- `tests/integration/cli/test_summary_skill_shadowed_data_path.py`: AC-4 (tmp_path two-tier collision fixture; asserts stdout line + existing event still fires once).
- `tests/integration/cli/test_summary_order_after_audit.py`: AC-5.
- `tests/integration/cli/test_summary_zero_state.py`: AC-7.
- `tests/integration/cli/test_summary_determinism.py`: AC-9 (two consecutive gathers, byte-identical stdout).
Modified:
- `src/codegenie/cli.py`: append a `_emit_phase2_summary(redacted_envelope, skills_slice)` helper invoked at the end of `_run_gather_pipeline` after `_seam_audit_record` returns. ~20 LOC delta. Reads `redacted_envelope.findings_count`, `redacted_envelope.fingerprints` (both already in scope at line 614+), and `gather_result.outputs.get("skills_index", ...).schema_slice.get("shadowed_skills", [])`. Calls `summary_block(...)` then `_emit_summary_block(block)`. Does NOT thread through the writer.
- `src/codegenie/skills/model.py`: add `ShadowedSkill` frozen Pydantic model (`skill_id: SkillId, shadowed_tier: Tier, winning_tier: Tier, shadowed_path: str, winning_path: str`). ~15 LOC.
- `src/codegenie/skills/loader.py`: extend `LoadOutcome` with `shadowed_skills: tuple[ShadowedSkill, ...]` (additive, default empty tuple). Append a `ShadowedSkill(...)` to a local accumulator in the collision branch at line 428-438 in addition to the existing `_logger.warning(_EVENT_SHADOWED, ...)` call. Return the accumulator in the `Ok(LoadOutcome(...))` at line 449-454. ~10 LOC delta.
- `src/codegenie/probes/layer_d/skills_index.py`: extend `SkillsIndexSlice` (in the same file or its model module) with `shadowed_skills: tuple[ShadowedSkill, ...]` (default empty). Pass `outcome.shadowed_skills` into the slice constructor at line 218-222. ~5 LOC delta.
- `src/codegenie/schema/probes/skills_index.schema.json`: add a `shadowed_skills` array property; per-item schema mirrors `ShadowedSkill` fields. Maintain `additionalProperties: false`. (Per the project's Phase 1 ADR-0004 sub-schema discipline.)
Untouched (DO NOT EDIT):
- `src/codegenie/output/writer.py`: already emits `envelope.written` with `secrets_redacted_count`; do not add another field, do not widen `Writer.write`'s return.
- `src/codegenie/output/sanitizer.py` (`redact_secrets` signature is frozen by S3-01/S3-02).
- `src/codegenie/output/envelope_redactor.py` (`RedactedSlice` shape frozen by 02-ADR-0010).
- `src/codegenie/logging.py`: `EVENT_ENVELOPE_WRITTEN` / `SECRETS_REDACTED_COUNT_FIELD` constants stay as the single source of truth; the new tests import them by name.
- `CONTEXT_REPORT.md` renderer (S8-01).
- `tests/unit/test_no_event_stream_in_phase_2.py`: must stay green after this story.
TDD plan: red / green / refactor
RED (failing tests committed first):
- `test_summary_block_pure.py::test_format_regex_each_line`: constructs `SummaryBlock(count=0, fingerprints=(), shadowed=())` and asserts the three lines match their format regexes. Fails red: `cli_summary.py` does not exist yet.
- `test_summary_block_pure.py::test_sort_and_dedup_property`: hypothesis property test generating `list[str]` of 8-hex strings (with deliberate duplicates and unsorted order); asserts `block.as_lines()[1]` parses to a sorted unique list. Fails red.
- `test_summary_block_pure.py::test_pure_no_io_imports`: AST/grep assertion that `cli_summary.py` does not `import os | time | open | print | logger | structlog`. Fails red once the module exists if any I/O imports leak in.
- `test_summary_stdout_shape.py::test_only_three_lines`: runs `codegenie gather minimal-ts` via `CliRunner`; asserts `len(stdout.strip().split("\n")) == 3` and the lines start with `secrets_redacted_count=`, `fingerprints=`, `skill_shadowed=` in order. Fails red: the CLI does not print yet.
- `test_summary_count_matches_event.py::test_count_equals_envelope_written_field`: same gather under `capture_logs()`; asserts `int(stdout_count_line.split("=")[1]) == captured_event["envelope.written"]["secrets_redacted_count"]`. Fails red.
- `test_summary_fingerprints_format.py::test_fingerprints_format_regex`: constructs a `SummaryBlock` with three known 8-hex fingerprints; asserts stdout line matches the format regex. Fails red.
- `test_summary_plaintext_boundary.py::test_no_pattern_class_plaintext_in_stdout`: parameterized over `codegenie.output.sanitizer._PATTERNS`; for each pattern, seeds an example plaintext in a `tmp_path` tracked file, runs gather, asserts the plaintext is NOT in `result.stdout` AND not in any captured-log event payload (serialized to string). Fails red.
- `test_summary_skill_shadowed_data_path.py::test_collision_renders_stdout_entry`: builds a `tmp_path` fixture with `~/.codegenie/skills-org/foo/SKILL.md` and `<repo>/.codegenie/skills/foo/SKILL.md` (same `skill_id`); runs gather; asserts `skill_shadowed=[foo:org]` (or whatever the actual `shadowed_tier` string is; assert against the loader's `_EVENT_SHADOWED` payload, single source of truth). Asserts `Counter(captured)["skill_shadowed"] == 1`. Fails red: `LoadOutcome.shadowed_skills` does not exist yet.
- `test_summary_order_after_audit.py::test_stdout_after_envelope_written`: captures stdout + log event timeline; asserts first stdout byte appears after the `envelope.written` event's timestamp. Fails red.
- `test_summary_no_new_events.py::test_event_names_match_baseline`: runs a clean gather, captures all event names, asserts `set(event_names) == _EVENTS_BASELINE` (a constant defined at the top of the test, sourced from a fresh master-branch gather output). Fails red the moment a new event is introduced.
- `test_summary_zero_state.py::test_three_zero_lines_present`: runs gather on `minimal-ts`; asserts `secrets_redacted_count=0`, `fingerprints=[]`, `skill_shadowed=[]` all literally in stdout. Fails red.
- `test_summary_determinism.py::test_byte_identical_across_two_runs`: runs gather twice on the same fixture; asserts byte equality. Fails red.
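
The core comparison in `test_count_equals_envelope_written_field` can be sketched without running a gather. The helper names below are hypothetical; the only assumption about `structlog.testing.capture_logs()` is that it yields a list of dicts carrying an `"event"` key, and the sample `stdout` and `captured` values are invented for illustration.

```python
from collections import Counter


def parse_count_line(stdout: str) -> int:
    """Parse <N> from the first summary line, `secrets_redacted_count=<N>`."""
    first_line = stdout.strip().split("\n")[0]
    key, sep, value = first_line.partition("=")
    if key != "secrets_redacted_count" or not sep:
        raise ValueError(f"unexpected first summary line: {first_line!r}")
    return int(value)


def assert_count_matches_event(stdout: str, captured: list[dict]) -> None:
    """Assert the stdout count equals the field on the single envelope.written event."""
    names = Counter(entry["event"] for entry in captured)
    assert names["envelope.written"] == 1   # emitted exactly once per gather
    assert names["secrets.summary"] == 0    # no parallel event (02-ADR-0008)
    (written,) = (e for e in captured if e["event"] == "envelope.written")
    assert written["secrets_redacted_count"] == parse_count_line(stdout)


# Illustrative inputs; a real test would take these from CliRunner and capture_logs().
stdout = "secrets_redacted_count=1\nfingerprints=[0a1b2c3d]\nskill_shadowed=[]\n"
captured = [{"event": "envelope.written", "log_level": "info", "secrets_redacted_count": 1}]
assert_count_matches_event(stdout, captured)
```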
GREEN (minimum code to pass):
- Create `src/codegenie/cli_summary.py` with the pure `SummaryBlock` dataclass + `summary_block()` factory + `as_lines()` formatter. Use `f"fingerprints=[{', '.join(sorted(set(fingerprints)))}]"` (no `!r`).
- Add `ShadowedSkill` to `src/codegenie/skills/model.py` (Pydantic frozen model).
- Extend `LoadOutcome` in `src/codegenie/skills/loader.py` with `shadowed_skills: tuple[ShadowedSkill, ...]`; in the collision branch, append a `ShadowedSkill(skill_id=skill.id, shadowed_tier=tier, winning_tier=prior_tier, shadowed_path=str(skill_md), winning_path=str(prior_path))` to a local accumulator; return it in the `Ok(LoadOutcome(...))`.
- Extend `SkillsIndexSlice` (in `probes/layer_d/skills_index.py` or its model module) with `shadowed_skills: tuple[ShadowedSkill, ...]`; pass `outcome.shadowed_skills` into the constructor.
- Add `shadowed_skills` array to `src/codegenie/schema/probes/skills_index.schema.json` (per-item schema; `additionalProperties: false`).
- In `src/codegenie/cli.py`, at the end of `_run_gather_pipeline` (after `_seam_audit_record` returns), read `redacted_envelope.findings_count`, `redacted_envelope.fingerprints`, and `gather_result.outputs.get("skills_index", ProbeOutput()).schema_slice.get("shadowed_skills", [])`; construct a `SummaryBlock` via `summary_block(...)`; call `_emit_summary_block(block)` which calls `print(line)` for each of `block.as_lines()`.
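
The loader change in the GREEN steps above (record each collision as data while still emitting the existing event once) might be shaped like this. It is a simplified sketch under stated assumptions: `merge_tiers` and its `emit` callback are hypothetical stand-ins for the real `SkillsLoader` internals and `_logger.warning`, the tier precedence order passed in is invented, and `ShadowedSkill` is a stdlib dataclass rather than the Pydantic model.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable

_EVENT_SHADOWED = "skill_shadowed"  # event name as quoted in this story


@dataclass(frozen=True)
class ShadowedSkill:
    """Stand-in (assumption) for the frozen model in codegenie.skills.model."""
    skill_id: str
    shadowed_tier: str
    winning_tier: str
    shadowed_path: str
    winning_path: str


def merge_tiers(
    tiers: Iterable[tuple[str, dict[str, str]]],
    emit: Callable[..., None],
) -> tuple[dict[str, tuple[str, str]], tuple[ShadowedSkill, ...]]:
    """First-tier-wins merge; `tiers` is (tier_name, {skill_id: path}) in precedence order."""
    winners: dict[str, tuple[str, str]] = {}
    shadowed: list[ShadowedSkill] = []
    for tier, skills in tiers:
        for skill_id, path in skills.items():
            if skill_id not in winners:
                winners[skill_id] = (tier, path)
                continue
            prior_tier, prior_path = winners[skill_id]
            # Existing behaviour: one event per collision, same five fields.
            emit(
                _EVENT_SHADOWED,
                skill_id=skill_id,
                winning_tier=prior_tier,
                shadowed_tier=tier,
                winning_path=prior_path,
                shadowed_path=path,
            )
            # New behaviour: the same facts accumulated as data for LoadOutcome.
            shadowed.append(ShadowedSkill(skill_id, tier, prior_tier, path, prior_path))
    return winners, tuple(shadowed)
```

Because the record and the event are built from the same local variables in the same branch, the stdout line and the structlog event cannot drift apart, which is what AC-4 checks from the outside.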
REFACTOR:
- Confirm the pure / impure split holds: `cli_summary.py` has no `import os | time | open | print | logger`; the only import beyond stdlib is `codegenie.skills.model.ShadowedSkill`.
- Confirm `make lint-imports` does not flag a new cross-package edge.
- Confirm `tests/unit/test_no_event_stream_in_phase_2.py` is still green.
- `mypy --strict` across touched files; `ruff format`, `ruff check` clean.
- Manual sanity check: run `codegenie gather <minimal-ts>` and `grep secrets_redacted_count=0` on stdout; both succeed.
Notes for the implementer
- No new structlog events. 02-ADR-0008 is binding. If you find yourself reaching for `_log.info("secrets.summary", ...)` or `_log.info("fingerprints.summary", ...)`, stop. The existing `envelope.written` event carries `secrets_redacted_count`; the other two stdout fields have NO structured-log counterpart, and that is the deliberate decision per ADR-0008 §Decision and `tests/unit/test_no_event_stream_in_phase_2.py`.
- ADR-0005 boundary test is AC-3's plaintext regex, iterated over `_PATTERNS`. Non-negotiable. If a future contributor adds a 7th pattern class to `sanitizer._PATTERNS`, the test automatically exercises it. Do NOT enumerate the pattern shapes in the test file; read them off the module-level constant.
- Fingerprint truncation is 8 hex chars, period. Not 16, not the full BLAKE3. ADR-0005 chose 8 to make brute-force fingerprint→plaintext infeasible while keeping the line scannable. `RedactedSlice.fingerprints` already enforces 8-hex; do not re-truncate or re-hash.
- Zero-state grep-ability matters for ops. `grep secrets_redacted_count=0 <gather.stdout>` should always succeed on a clean gather. Always print all three lines, even when empty.
- `SkillsIndexProbe` may not run (the probe registry filters by task type). If `gather_result.outputs.get("skills_index")` is `None`, render `skill_shadowed=[]`: the operator-visible signal is still "no shadowing observed in this gather," even though the underlying reason is "the probe didn't run." A future story can split this signal if the ambiguity becomes load-bearing; today it does not.
- Determinism: sort fingerprints and shadowed entries ASCII-lex. The dedup in `RedactedSlice.fingerprints` is insertion-order; this story's `summary_block` does the lex sort. The same gather on the same fixture must produce a byte-identical summary block across runs (AC-9 verifies this).
- `SkillId` and `Tier` are domain primitives. If they are not already newtypes in `src/codegenie/types/identifiers.py` (check `git grep "SkillId = NewType"`), `ShadowedSkill` should still use the existing strings from the loader's emit kwargs; do not invent a newtype as a side-effect of this story. Surface it as a follow-on if the loader exposes raw `str`.
- Rule 2: no registry for three fields. Three stdout lines with three lines of code each is below the abstraction threshold. If a 4th summary field lands in Phase 3 (e.g., per-plugin metrics from the first plugin), introduce `@register_summary_field(name)` then; not now. This pre-empts premature plugin-ification while documenting the OCP escape hatch.
- Functional core, imperative shell. `summary_block` is pure (the `test_pure_no_io_imports` AST test enforces this). `_emit_summary_block` is the only impure new code: three `print` calls. The pure helper is unit-testable without any gather setup; the impure shell is integration-tested via `CliRunner`.
- No new dependency. `structlog` is Phase 0 baseline; `print` is stdlib. Phase 0 `fence` stays green trivially.
- Re-use S6-07's adversarial fixture only if it is reachable. `tests/adv/phase02/test_secret_in_source.py` lands a seeded `AKIA…` somewhere; if its fixture is importable from `tests/integration/cli/`, reuse it. Otherwise use `tmp_path`-seeded inline secrets; do not duplicate fixture files in a new portfolio entry.
- Re-entrancy: each `codegenie gather` invocation is a fresh process. In production the CLI runs once per gather; in tests `CliRunner` simulates this. There is no need to engineer for "two gathers in one process"; the structlog `capture_logs` context manager wraps a single invocation and that is the unit of analysis.
- Plumbing path is `cli.py` → `RedactedSlice` (already in scope at `_seam_redact_envelope`). Do NOT widen `Writer.write`'s return type. Do NOT thread `list[SecretFinding]` anywhere. The redactor's full findings list lives only inside `envelope_redactor._PassState` and is not surfaced; the CLI consumes `RedactedSlice.findings_count` (`int`) and `RedactedSlice.fingerprints` (`list[str]`) directly.
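
The "probe may not run" note above reduces to a small defensive read. The following is a minimal sketch: `ProbeOutput` here is a stand-in dataclass modeling only `schema_slice`, and `shadowed_entries` is a hypothetical helper name, not something the story prescribes.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class ProbeOutput:
    """Stand-in (assumption) for the real ProbeOutput; only schema_slice is modeled."""
    schema_slice: dict[str, Any] = field(default_factory=dict)


def shadowed_entries(outputs: dict[str, ProbeOutput]) -> tuple[Any, ...]:
    """Read shadow records off the skills_index slice; empty if the probe did not run."""
    probe_output = outputs.get("skills_index")
    if probe_output is None:
        # The probe registry filtered the probe out: render as "no shadowing observed".
        return ()
    return tuple(probe_output.schema_slice.get("shadowed_skills", []))


print(shadowed_entries({}))                               # ()
print(shadowed_entries({"skills_index": ProbeOutput()}))  # ()
```

Both the missing-probe case and the probe-ran-with-no-collisions case collapse to an empty tuple, which matches the note's decision to render `skill_shadowed=[]` for either.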