Attempt log: S8-02 — CLI summary block on stdout¶

Append-only journal. Each attempt records what was tried, what worked, what didn't, and the lesson for the next attempt or story.

Attempt 1 — 2026-05-18 (phase-story-executor)¶

Outcome: GREEN. All 10 ACs verified; 39 new tests added (29 unit + 8 parametrized integration + 2 ordering); full pytest suite stays green (3588 passed, 33 skipped, 2 xfailed); mypy --strict, ruff check/ruff format, lint-imports, make fence, and the full pre-commit run --all-files all clean.

Code shipped¶

New:
src/codegenie/cli_summary.py — pure SummaryBlock frozen dataclass + summary_block(...) factory + as_lines() formatter. ~95 LOC. No I/O — AST gate enforces it.
tests/unit/cli/__init__.py + tests/integration/cli/__init__.py
- tests/integration/cli/conftest.py (autouse no-op of _seam_configure_logging, mirroring smoke conftest).
tests/unit/cli/test_summary_block_pure.py — 9 tests covering AC-8 (purity, regex, idempotence, sort+dedup, AST gate, hypothesis property).
tests/unit/cli/test_summary_fingerprints_format.py — 4 tests for AC-3 format regex + hypothesis property.
tests/unit/cli/test_summary_no_new_events.py — 2 tests for AC-6 (runtime event-name guard + source-grep interdict on the new helper body).
tests/unit/skills/test_loader_shadowed_skills.py — 2 tests for the loader's additive shadowed_skills field (zero-collision / cross-tier collision).
tests/integration/cli/test_summary_stdout_shape.py — AC-1.
tests/integration/cli/test_summary_count_matches_event.py — AC-2 (stdout count == envelope.written field; both clean and secret-seeded cases).
tests/integration/cli/test_summary_plaintext_boundary.py — AC-3 boundary, parametrized over sanitizer._PATTERNS + entropy (7 parameters; auto-extends when a 7th class lands).
tests/integration/cli/test_summary_skill_shadowed_data_path.py — AC-4 wiring (calls _emit_phase2_summary directly with synthetic schema_slice, plus a live zero-collision smoke).
tests/integration/cli/test_summary_order_after_audit.py — AC-5.
tests/integration/cli/test_summary_zero_state.py — AC-7.
tests/integration/cli/test_summary_determinism.py — AC-9 (two consecutive gathers, byte-identical stdout; zero-state + three seeded secrets).
Modified:
src/codegenie/skills/model.py — added ShadowedSkill frozen Pydantic model (skill_id, shadowed_tier, winning_tier, shadowed_path, winning_path — the same five fields the loader's existing structlog event already populates).
src/codegenie/skills/loader.py — extended LoadOutcome with shadowed_skills: tuple[ShadowedSkill, ...] (default empty); collision branch appends a ShadowedSkill row in addition to the existing _logger.warning(_EVENT_SHADOWED, ...) call.
src/codegenie/probes/layer_d/skills_index.py — extended SkillsIndexSlice with shadowed_skills; passed outcome.shadowed_skills into the constructor.
src/codegenie/schema/probes/skills_index.schema.json — added the shadowed_skills array property + ShadowedSkill $defs entry (additionalProperties: false).
src/codegenie/cli.py — added _emit_phase2_summary(findings_count, fingerprints, skills_slice) helper; invoked at the end of _run_gather_pipeline after _seam_audit_record returns.

Per-AC evidence¶

AC	Where verified	Status
AC-1 — exactly three lines, in order, no extras	`test_only_three_lines`	PASS
AC-2 — stdout count == `envelope.written` field (both surfaces)	`test_count_zero_on_clean_minimal_ts`, `test_count_one_when_secret_seeded` (asserts equality, plus zero `secrets.summary` emission)	PASS
AC-3 — fingerprints format + plaintext boundary over `_PATTERNS`	`test_summary_fingerprints_format` (regex + hypothesis), `test_summary_plaintext_boundary` (parametrized over `_PATTERNS` + entropy = 7 cases)	PASS
AC-4 — `LoadOutcome.shadowed_skills` data path, not events	`test_loader_shadowed_skills::test_cross_tier_collision_yields_one_shadowed_skill_row`, `test_emit_phase2_summary_reads_shadowed_skills_from_slice`, `test_shadowed_skill_model_uses_real_loader_field_names`	PASS
AC-5 — summary emits after audit record, exit 0 irrespective of count	`test_audit_record_exists_before_stdout`, `test_exit_code_zero_irrespective_of_count`	PASS
AC-6 — no new Phase-2 structlog events	`test_no_forbidden_event_names_in_clean_gather`, `test_no_new_event_emission_in_cli_summary_source`	PASS
AC-7 — zero-state grep-ability	`test_three_zero_lines_present`	PASS
AC-8 — pure formatter / AST gate	`test_summary_block_pure.py` (9 tests including `test_pure_no_io_imports` AST walk)	PASS
AC-9 — determinism (byte-identical across two runs)	`test_byte_identical_across_two_runs_zero_state`, `test_byte_identical_with_three_seeded_secrets`	PASS
AC-10 — mypy + ruff + lint-imports + fence green	`mypy --strict src/` → `Success: no issues found in 133 source files`; `ruff check` → `All checks passed!`; `lint-imports` → `Contracts: 2 kept, 0 broken.`; `make fence` → 9 tests pass; pre-commit all-files → clean	PASS

Conflict surfaced + resolution (CLAUDE.md Rule 7)¶

Story Goal text: "Impure shell: _emit_summary_block(block: SummaryBlock) calls print three times on stdout."

Reality: the repo bans print( in src/ via two stacked structural defenses: ruff's T201 rule (pyproject.toml [tool.ruff.lint]) and the forbidden-patterns pre-commit hook (scripts/check_forbidden_patterns.py). Both fire on the literal print( substring. Operator-facing stdout in this codebase flows through sys.stdout.write; logging flows through structlog.

Resolution chosen: sys.stdout.write(line + "\n") (one newline per line, three calls). Semantically identical to print for this use case, respects both structural defenses, and matches the codebase's existing convention (Rule 11). The story's "calls print" was casual phrasing for "writes three lines on stdout" — the intent is the operator-visible stdout surface, not the specific stdlib helper.

The comment block explaining the choice deliberately does NOT contain the literal token print( because the forbidden-patterns regex matches anywhere in the file, even inside comments. First attempt's comment ("sys.stdout.write (not print()…") tripped the hook; the shipped comment reframes without the forbidden substring.

Out-of-scope finding (Rule 3 — surgical changes)¶

SkillsIndexProbe.run (src/codegenie/probes/layer_d/skills_index.py) reads ctx.config and ctx.output_dir, but the coordinator constructs BudgetingContext (which has only workspace, raw_artifact_mb, bytes_written, parsed_manifest, input_snapshot). The probe therefore fails on every gather with AttributeError: 'BudgetingContext' object has no attribute 'config' and is failure-isolated to an empty schema_slice.

This is pre-existing (not introduced by S8-02) and out of scope for this story. My code handles the failure-isolated output gracefully: the .get("shadowed_skills", []) default renders skill_shadowed=[], and AC-4's data-path is validated at the CLI seam (with a synthetic schema_slice) rather than through the broken upstream probe. The zero-collision smoke test_no_collisions_renders_empty_skill_shadowed_on_live_gather exercises the live coordinator output for the case the live pipeline can actually produce today.

Surfacing this as a separate follow-up so a future story owns the BudgetingContext ↔ ProbeContext API drift — at least four other probes (runtime_trace, ast_grep, tree_sitter_import_graph, semgrep) hit the same wall today.

Refactor decisions (design-patterns lens)¶

Pattern	Decision
Functional core / imperative shell	Reinforced — `cli_summary.summary_block` is pure (AST gate); `_emit_phase2_summary` is the only impure shell (three `sys.stdout.write` calls).
Smart constructor	Preserved — `_emit_phase2_summary` takes primitives (`findings_count: int`, `fingerprints: list[str]`) instead of `RedactedSlice` so test code never constructs `RedactedSlice` outside the documented two-site closed set (02-ADR-0010). Refactored from a `RedactedSlice` parameter mid-implementation after the structural-defense test caught the leakage.
Newtype discipline	`ShadowedSkill.skill_id: SkillId` — newtype preserved end-to-end; no primitive obsession on the carrier shape.
Tagged union / sum type	Not applicable — `ShadowedSkill` is a flat record, not a sum type.
Open/Closed at the file boundary	`test_summary_plaintext_boundary` iterates `sanitizer._PATTERNS` so a 7th pattern auto-exercises the boundary check (no per-pattern test file edits).
Strategy pattern	Considered + rejected for `summary_block` formatting — three rendered lines with three trivial formats; a strategy dict would be premature abstraction (Rule 2).
Lazy import	Applied at the impure shell — `_emit_phase2_summary` imports `cli_summary` and `skills.model` via `importlib.import_module` to keep `codegenie.cli`'s static import graph minimal (the `import-linter` cli contract stays kept).
Capability / chokepoint	The redactor remains the only construction site for `RedactedSlice`; the smart-constructor invariant is preserved by the primitive-passing seam in `_emit_phase2_summary`.

No anti-patterns introduced (no Any in new signatures, no if/elif ladder over a closed variant set, no broad except Exception, no mutable module-level state, no two-level concrete inheritance).

Gates¶

make lint — clean (ruff check + ruff format).
lint-imports --config pyproject.toml --no-cache → Contracts: 2 kept, 0 broken.
mypy --strict src/ → Success: no issues found in 133 source files.
pre-commit run --all-files → all hooks pass.
make fence → 9 fence tests pass.
Full pytest suite — 3588 passed, 33 skipped, 3 deselected, 2 xfailed.

Lessons for follow-on stories¶

L1. The repo's print( ban is two-layer structural (ruff T201
forbidden-patterns hook). Any new operator-facing stdout in src/ must use sys.stdout.write and comments must avoid the literal substring print( because the forbidden-patterns regex doesn't respect Python comment boundaries.
L2. The RedactedSlice smart-constructor invariant is enforced by AST scan of every src/ and tests/ file. A new seam that takes RedactedSlice as a parameter forces test code to construct one, tripping the closed two-site allowlist. Prefer primitive parameters at consumer seams — the wrapper carries the invariant at the producer side, not at every consumer signature.
L3. BudgetingContext vs ProbeContext API drift — several Layer-B/C/D probes (runtime_trace, ast_grep, tree_sitter_import_graph, semgrep, skills_index) read attributes that exist on ProbeContext but not on the coordinator-constructed BudgetingContext. They are all failure-isolated to empty slices on every gather. This is a Phase-2 follow-up worth a dedicated story; the affected probes are visible by grepping ctx.config\|ctx.output_dir\|ctx.cache_dir\|ctx.logger under src/codegenie/probes/.
L4. structlog.testing.capture_logs() swallows print-to-stdout events — wrapping CliRunner.invoke(...) in capture_logs() intercepts the structlog chain so events don't pollute stdout. This is the pattern any future test that needs a "clean stdout" assertion through a Click runner should follow (the _seam_configure_logging no-op alone isn't enough — structlog's default PrintLogger writes to stdout when unconfigured).