Skip to content

Attempt log — S3-02 (RedactedSlice smart constructor private to redact_secrets)

2026-05-16 — Attempt 1 — DONE (paired with S3-01)

Summary

Landed src/codegenie/output/redacted_slice.py — a pure-functional-core domain model with three frozen fields (slice, findings_count, fingerprints), enforcing the 8-hex fingerprint format and the findings_count >= len(fingerprints) invariant at construction time. Shipped in the same PR as S3-01 per the validator's tight-coupling guidance — S3-01 imports RedactedSlice from this module.

Stage 2 — Implementer (TDD)

  1. RED. tests/unit/output/test_redacted_slice.py covers all 26 ACs from the validator-hardened story (AC-1..AC-19 + AC-7b, AC-8b, AC-10b, AC-10c, AC-11b, AC-12b, AC-15b). Tests collected red against a not-yet-existent module.
  2. GREEN.
  3. RedactedSlice(BaseModel) with model_config = ConfigDict(frozen=True, extra="forbid").
  4. Field order pinned: slice: dict[str, JSONValue], findings_count: int = Field(ge=0), fingerprints: list[str]. Pydantic preserves declaration order in model_dump + model_dump_json (AC-11b).
  5. @field_validator("fingerprints") rejects anything not matching ^[0-9a-f]{8}$ (lowercase, exactly 8 chars).
  6. @model_validator(mode="after") enforces findings_count >= len(fingerprints) (a deduplicated count contract — same secret appearing twice has findings_count == 2, len(fingerprints) == 1).
  7. Refactor. Pure module: no I/O, no logging, no clock, no subprocess. _FP_PATTERN = re.compile(r"^[0-9a-f]{8}$") at module scope so the regex is compiled once. Self returned from the model_validator for type-checker happiness.

Stage 3 — Validator

  • 54 tests in test_redacted_slice.py green (the parametrize over fingerprint validators + boundary cases + hypothesis property tests expands the AC list).
  • Cross-story integration with S3-01 (AC-15b) covers four canonical redaction shapes: zero secrets, one secret, three distinct, same-twice.
  • model_construct ban verified end-to-end via subprocess invocation of scripts/check_forbidden_patterns.py against synthetic offending content in both tmp_path/src/codegenie/output/synth.py (must fire) and tmp_path/src/codegenie/parsers/synth.py (must NOT fire — surgical predicate negative path, AC-12b).
  • AC-14: re.compile(r"\.model_construct\s*\(|\bmodel_construct\s*=") walks src/codegenie/output/ and asserts zero call-form hits — passes (the module docstring mentions model_construct in prose form only, which the structural regex correctly skips).
  • ruff check, ruff format, mypy --strict, pre-commit run --all-files: green. Coverage: 93.35%.

Refactor decisions

  • Schema before consumer — closed product type. Three persisted fields is the 02-ADR-0010 §Decision contract. Adding a fourth (e.g., pattern_class_counts: dict[Literal[...], int] for telemetry) would be an ADR amendment, mirroring S3-01 Notes #12 and S1-01 (IndexFreshness) / S1-03 (AdapterConfidence).
  • Pure module discipline. No logging, no structlog, no os.environ, no Path reads. The validators are pure functions over their arguments. Closes the toolkit's "Smart constructor" pattern at the I/O boundary without dragging IO concerns into the domain model.
  • Fingerprint newtype DEFERRED. Production ADR-0033 §3 flags primitive obsession on cross-module identifiers as a review-blocker. S3-02 is the second consumer (S3-01 produces; S3-02 carries via RedactedSlice.fingerprints); S3-03 will be the third (writer reads slice_.fingerprints). The rule-of-three threshold crosses at S3-03; surface the Fingerprint = NewType("Fingerprint", str) opportunity in S3-03's prose. S3-02 closes the format invariant by validator; the origin invariant (only _fingerprint(...) produces a Fingerprint) is S3-03's surface.

Files touched

  • src/codegenie/output/redacted_slice.py — new.
  • tests/unit/output/test_redacted_slice.py — new (54 tests).
  • src/codegenie/parsers/__init__.pyJSONValue redefined via TypeAliasType (see S3-01 attempt log L11 for the rationale).
  • docs/phases/02-context-gather-layers-b-g/stories/S3-02-redacted-slice-smart-constructor.md — status flipped to Done.

Lessons (carry forward)

See S3-01.md for the shared lessons L10 (entropy double-fire) and L11 (TypeAliasType for Pydantic-compatible recursive JSONValue) and L12 (mutation regex must be unable-to-match, not just weaker). The single new lesson unique to S3-02:

L13 — Field validators that target list[T] get the full list, not per-element

  • Symptom: Drafted @field_validator("fingerprints") to validate a single string; mypy strict flagged def _validate_fingerprints(cls, v: str) -> str because the field type is list[str] and Pydantic v2 passes the whole list to mode="after" validators (mode="before" could in principle preprocess; mode="after" gets the post-coerced type).
  • Fix: Type the validator as def _validate_fingerprints(cls, v: list[str]) -> list[str] and iterate inside. isinstance(fp, str) is defensive against Pydantic's coercion of non-str list elements.
  • Why it matters: Any future field validator targeting a list[T] field across Phase-2 / Phase-3 (the secret-fingerprint family expanding to RAG ingest, audit-anchor, etc.) inherits the discipline.