Attempt log — S3-01 (SecretRedactor pattern classes + entropy threshold + BLAKE3 fingerprint)

2026-05-16 — Attempt 1 — DONE (paired with S3-02)

Summary

Landed redact_secrets in src/codegenie/output/sanitizer.py (the existing Phase 0 chokepoint module — extended, not replaced) alongside the RedactedSlice smart constructor from S3-02. The two stories were merged into one attempt per the validator's tight-coupling guidance (S3-01 imports RedactedSlice from S3-02; the import path is correct in either ordering when both ship together). All 28 ACs (originals + harden-tier AC-26..AC-34) pass; mutation-test discipline against six pattern classes + entropy floor is load-bearing in CI.

Stage 2 — Implementer (red → green → refactor)

  1. RED — RedactedSlice tests first. Wrote tests/unit/output/test_redacted_slice.py covering all 26 ACs against a not-yet-existent codegenie.output.redacted_slice. Tests collected without errors after the PEP-695 ↔ TypeAliasType workaround (see L11 below).
  2. GREEN — RedactedSlice model. Added src/codegenie/output/redacted_slice.py with frozen=True, extra="forbid", the field-declaration order pinned by AC-11b, the per-element 8-hex fingerprint validator, and a model_validator(mode="after") for the findings_count >= len(fingerprints) invariant. 54 tests green.
  3. RED — SecretRedactor tests. Wrote tests/unit/output/test_secret_redactor.py covering all 28 ACs (AC-1..AC-25, AC-26..AC-34 after hardening). Each named pattern has a mutation test that swaps in a deliberately-unable-to-match regex via monkeypatch.setattr(sanitizer, "_PATTERNS", ...) and asserts the canonical example survives unredacted. Entropy-floor mutation patches the threshold to 6.0 (above the canonical fixture's ~5.58 measured entropy).
  4. GREEN — redact_secrets body. Added module-level _PATTERNS table, _ENTROPY_THRESHOLD_BITS_PER_CHAR, _ENTROPY_MIN_LEN, _shannon_entropy, _fingerprint, _make_repl, _redact_string, _walk, and the public redact_secrets. The entropy fallback fires only when no named pattern matched the leaf — see L10 below.
  5. Refactor. Composition: _walk recurses into dict values and list items; dict keys are not walked. The closure-based _make_repl confines the cleartext's lifetime to the regex callback. No mutable global state. The re.compile calls live at module level, so the pattern table is a single stable chokepoint.
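
The walk described in step 5 can be sketched roughly as below. This is an illustration under stated assumptions, not the code in sanitizer.py: the names walk, redact_leaf, and toy_redact_leaf are hypothetical, and the toy leaf redactor handles only an AWS-style key so the recursion stays in focus.

```python
import re
from typing import Any, Callable

# Hypothetical stand-in for the real leaf redactor; only an AWS-style key is handled.
_AWS = re.compile(r"AKIA[0-9A-Z]{16}")


def toy_redact_leaf(text: str) -> str:
    return _AWS.sub("<REDACTED>", text)


def walk(node: Any, redact_leaf: Callable[[str], str]) -> Any:
    """Return a redacted copy of node; the input is never mutated."""
    if isinstance(node, str):
        return redact_leaf(node)
    if isinstance(node, dict):
        # Values are walked; keys are copied through untouched.
        return {key: walk(value, redact_leaf) for key, value in node.items()}
    if isinstance(node, list):
        return [walk(item, redact_leaf) for item in node]
    # Scalars (bool, int, float, None) pass through unchanged.
    return node


original = {
    "AKIAIOSFODNN7EXAMPLE": "this key is not walked",
    "env": ["AWS_KEY=AKIAIOSFODNN7EXAMPLE", 42, None],
}
redacted = walk(original, toy_redact_leaf)
```

Because every container is rebuilt rather than edited in place, the input-not-mutated property falls out of the structure instead of needing a defensive deep copy.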

Stage 3 — Validator

  • 99 unit tests across test_redacted_slice.py + test_secret_redactor.py green.
  • Full suite: 2062 passed, 5 skipped, 3 deselected, 2 xfailed.
  • ruff check ., ruff format --check ., mypy --strict src/, pre-commit run --all-files: all green.
  • Coverage: 93.35% (above the 85% gate).

Evidence for all 28 ACs:

AC Evidence
AC-1 redact_secrets exported from sanitizer.py:__all__; signature (slice_, probe_name) -> tuple[RedactedSlice, list[SecretFinding]] (test_ac1_redact_secrets_is_exported)
AC-2 Module docstring carries 02-ADR-0005, 02-ADR-0010, 4.5, 32 (test_ac2_module_docstring_contains_required_anchors)
AC-3 SecretFinding frozen + extra=forbid + 4 fields (test_ac3_secret_finding_model_shape)
AC-4..AC-9 Per-pattern canonical-example redaction + pattern_class assertion
AC-10..AC-12 Entropy threshold + length floor + low-entropy skip
AC-13 / AC-14 / AC-32 Fingerprint format, dedup-order, prefix-strip regression guard
AC-15..AC-17 Recursive walk, dict-keys-not-walked, scalar passthrough
AC-18 Six per-pattern mutation tests via monkeypatch.setattr(sanitizer, "_PATTERNS", ...)
AC-19 Entropy-floor mutation (threshold → 6.0)
AC-20..AC-22 Findings count == total replacements, no cleartext on finding, stateless across calls
AC-24 model_construct absence in sanitizer.py source (positive lint of self)
AC-26..AC-29 Same-secret-twice dedup, two-distinct-classes-in-one-string, byte-length cleartext, input-not-mutated
AC-30 _PATTERNS + _ENTROPY_* are module-level; monkeypatch.setattr genuinely disables redaction
AC-31 _shannon_entropy is total over str: "", "a", "a"*100, and multi-byte inputs all return finite floats
AC-33 RedactedSlice.model_validate(model_dump()) round-trips across 5 fixture shapes
AC-34 Inline-substring replacement preserves prefix/suffix across AWS / GitHub / NPM / Anthropic + entropy
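
The entropy rows (AC-10..AC-12, AC-31) rest on a per-character Shannon entropy helper that never raises. A minimal sketch follows. The threshold of 4.5 bits per character and the 32-character floor are assumptions read off the docstring anchors in AC-2, and the function names are hypothetical rather than the private helpers in sanitizer.py.

```python
import math
from collections import Counter

ENTROPY_THRESHOLD_BITS_PER_CHAR = 4.5  # assumed value
ENTROPY_MIN_LEN = 32                   # assumed value


def shannon_entropy(text: str) -> float:
    """Bits per character; defined as 0.0 for the empty string."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    # Sum p * log2(1/p) over the distinct characters.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def looks_like_secret(text: str) -> bool:
    """Length floor first, then the entropy threshold."""
    return (
        len(text) >= ENTROPY_MIN_LEN
        and shannon_entropy(text) >= ENTROPY_THRESHOLD_BITS_PER_CHAR
    )
```

A string of 32 distinct characters has exactly 5.0 bits per character, so it clears the assumed threshold, while "a" * 100 has 0.0 and is skipped regardless of length.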

Refactor decisions

  • Entropy fallback only fires on strings unchanged by named patterns. Original implementer note suggested "the entropy rule sees post-regex strings; <REDACTED:...> tokens are below the 32-char floor so they cannot retrigger". That reasoning is wrong — a string like "Authorization: token ghp_<36>" is ~57 chars; after GitHub redaction it's still ~50 chars and its entropy rises (the 8-hex fingerprint adds high-entropy bits). The post-replacement string then triggers a second entropy finding, silently double-counting one cleartext credential. The fix is structural: track whether any named pattern matched; if so, skip the entropy pass. Closes AC-15b / AC-27 / AC-34 inline-substring semantics without sacrificing AC-10's entropy fallback for genuinely unnamed credential shapes. Documented in module docstring.
  • Closure-based _make_repl keeps cleartext lifetime tight. The cleartext appears as a re.Match.group(0) inside the closure, is fingerprinted, then discarded. No print, no log, no field on SecretFinding. Matches 02-ADR-0005 §Decision.
  • Module-level pattern table is non-negotiable. Each mutation test monkeypatches _PATTERNS; moving it function-local would silently no-op the harness. AC-30c is the positive control.
  • pattern_class is Literal[...], not Enum. Cheaper at the Pydantic boundary; mypy --strict enforces exhaustiveness when a future story uses match over the literal. Closed-set extension framing per Notes-for-implementer #12.
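
The closure decision above can be illustrated with a short sketch. Several things here are assumptions: the Finding dataclass is a two-field stand-in for SecretFinding, the replacement token layout is hypothetical, and stdlib BLAKE2b truncated to 4 bytes stands in for the BLAKE3 fingerprint so the example has no third-party dependency.

```python
import hashlib
import re
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Finding:
    # Hypothetical shape: there is no cleartext field by construction.
    pattern_class: str
    fingerprint: str


def fingerprint(cleartext: str) -> str:
    # Stand-in hash: BLAKE2b with a 4-byte digest gives 8 hex chars.
    return hashlib.blake2b(cleartext.encode("utf-8"), digest_size=4).hexdigest()


def make_repl(
    pattern_class: str, findings: list[Finding]
) -> Callable[[re.Match[str]], str]:
    def repl(match: re.Match[str]) -> str:
        # The cleartext exists only as match.group(0) inside this call.
        fp = fingerprint(match.group(0))
        findings.append(Finding(pattern_class, fp))
        return f"<REDACTED:{pattern_class}:{fp}>"

    return repl


GITHUB = re.compile(r"ghp_[A-Za-z0-9]{36}")
findings: list[Finding] = []
secret = "ghp_" + "a1B2" * 9
redacted = GITHUB.sub(make_repl("github", findings), f"Authorization: token {secret}")
```

The callback never binds the cleartext to a name that outlives the call, which is what keeps it off the finding and out of any log line.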

Files touched

  • src/codegenie/output/sanitizer.py — extended Phase 0's existing module with SecretFinding, redact_secrets, _PATTERNS, helpers. The existing OutputSanitizer.scrub is untouched (composition is S3-03's job).
  • src/codegenie/output/redacted_slice.py — new (S3-02).
  • src/codegenie/parsers/__init__.py — JSONValue changed from a plain union assignment to TypeAliasType("JSONValue", ...) (see L11 below).
  • tests/unit/output/test_redacted_slice.py — new (S3-02).
  • tests/unit/output/test_secret_redactor.py — new.
  • docs/phases/02-context-gather-layers-b-g/stories/S3-01-secret-redactor.md — status flipped to Done.

Lessons (carry forward)

L10 — Entropy fallback double-fires on post-redacted strings (S3-01 / S3-02)

  • Symptom: Tests AC-15b / AC-27 / AC-34 fail with findings_count off by 1 (extra entropy finding) when a long string carries a named secret. Surface: "Authorization: token ghp_<36>" → expected 1 finding, got 2 (GitHub + entropy on the redacted ~50-char post-string).
  • Fix: In _redact_string, track matched_any_named across the pattern-table pass. If any named pattern matched, skip the entropy fallback entirely. The named pattern is the more-specific signal; entropy is the catch-all for unknown shapes.
  • Why it matters: S3-03 (writer signature tightening) reads findings_count for the structured-event field. A double-count here propagates as secrets_redacted_count: 2 for a single AWS key inside a long string — the CLI summary would misreport.
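
The L10 fix can be sketched as follows. This is a simplified illustration, not the body of _redact_string: the one-entry pattern table, the token text, the 4.5 / 32 constants, and the choice to replace the whole leaf on an entropy hit are all assumptions.

```python
import math
import re
from collections import Counter

# Assumed values and a one-entry table; the real table has six pattern classes.
PATTERNS = {"github": re.compile(r"ghp_[A-Za-z0-9]{36}")}
THRESHOLD, MIN_LEN = 4.5, 32


def entropy(text: str) -> float:
    if not text:
        return 0.0
    total = len(text)
    return sum((n / total) * math.log2(total / n) for n in Counter(text).values())


def redact_string(text: str) -> tuple[str, list[str]]:
    """Return (redacted text, pattern classes), one class per replacement."""
    classes: list[str] = []
    matched_any_named = False
    for name, pattern in PATTERNS.items():
        text, hits = pattern.subn(f"<REDACTED:{name}>", text)
        if hits:
            matched_any_named = True
            classes.extend([name] * hits)
    # Entropy is the catch-all: it runs only if no named pattern touched the leaf.
    if not matched_any_named and len(text) >= MIN_LEN and entropy(text) >= THRESHOLD:
        text = "<REDACTED:entropy>"
        classes.append("entropy")
    return text, classes
```

With the guard in place, a long string carrying one GitHub token yields exactly one finding, and a 32-character string of distinct characters with no named match still reaches the entropy fallback.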

L11 — Pydantic v2 + recursive JSONValue requires TypeAliasType, not plain union

  • Symptom: Defining a Pydantic field as dict[str, JSONValue] where JSONValue = bool | int | float | str | None | list["JSONValue"] | dict[str, "JSONValue"] (the Phase 1 form) raises RecursionError: maximum recursion depth exceeded inside Pydantic's _generate_schema._union_schema — the forward-string "JSONValue" references are not resolved to a named alias, so the schema generator expands the recursion eagerly until the stack runs out.
  • Fix: Define the alias via typing_extensions.TypeAliasType (the runtime form of PEP 695's type statement, available on Python 3.11):
    from typing_extensions import TypeAliasType
    JSONValue = TypeAliasType(
        "JSONValue",
        "bool | int | float | str | None | list[JSONValue] | dict[str, JSONValue]",
    )
    
    Static typing semantics are identical (every existing dict[str, JSONValue] return-type annotation still type-checks). PEP 695's type statement (3.12+) is not an option because requires-python = ">=3.11" and target-version = "py311".
  • Why it matters: Any future Phase-2 / Phase-3 Pydantic model that carries a JSONValue field (RAG ingest, audit-anchor, etc.) hits the same wall. The fix is structural; the validator missed it for S3-02 because the prior art (safe_json.load() -> dict[str, JSONValue]) was a return-type, not a field annotation.

L12 — Mutation regexes need to be UNABLE to match the canonical, not just "weaker"

  • Symptom: The story's prescribed AWS mutation AKIA[0-9A-Z]{15} is more permissive than the production regex (one fewer required char). Against the 20-char canonical AKIAIOSFODNN7EXAMPLE, the weakened regex still matches the 19-char prefix and re.sub redacts it — failing the mutation-test assertion that the cleartext survives.
  • Fix: Mutate to a pattern that cannot match the canonical:
      • Length-quantified named patterns → use {N+1} (require one more char than the canonical has): AWS {17}, GitHub {37}, NPM {37}, Anthropic {60,} against a 50-char canonical.
      • Literal-prefix patterns (JWT) → change the literal: eyJXXX.
      • Multi-line block (RSA) → constrain to [^\n] between BEGIN/END.
  • Why it matters: Every Phase-2 mutation-test story (S6-07 gitleaks AKIA fixtures, S5-04 SBOM patterns, …) inherits the discipline: "weaken" is ambiguous; "make unable-to-match" is precise.
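
The L12 symptom is easy to reproduce with the standard re module. In this sketch the production pattern AKIA[0-9A-Z]{16} is inferred from the lesson's "one fewer required char" wording rather than copied from sanitizer.py, and the survives helper and token text are hypothetical.

```python
import re

CANONICAL = "AKIAIOSFODNN7EXAMPLE"  # 20 chars: "AKIA" plus 16

production = re.compile(r"AKIA[0-9A-Z]{16}")  # inferred production shape
weakened = re.compile(r"AKIA[0-9A-Z]{15}")    # the story's prescribed mutation
unable = re.compile(r"AKIA[0-9A-Z]{17}")      # needs one char more than exists


def survives(pattern: re.Pattern[str], text: str) -> bool:
    """True when the full cleartext is still present after substitution."""
    return CANONICAL in pattern.sub("<REDACTED>", text)


# The weakened pattern still consumes the 19-char prefix of the canonical.
prefix_hit = weakened.search(CANONICAL)
```

Only the {17} mutation lets the canonical survive, which is the condition the mutation test actually asserts; the {15} mutation redacts the prefix and leaves a single trailing character behind.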