
S3-05 — attempt log (Bundle cache key + Store + GC + codegenie cache prune CLI)

Attempt 1 — 2026-05-19 (GREEN, single-pass)

Outcome

GREEN — all 50 ACs met; full suite 4,873 passed, 0 failures excluding the two known-local-env canaries (test_lint_imports_canary.py PATH-resolution; test_pre_commit_run_all_files_exits_zero) which pass in CI per S3-03 attempt-log precedent.

Per-AC evidence table

AC Evidence (test file::test or file:line) Notes
AC-1 src/codegenie/plugins/cache.py:61-69 (__all__ = sorted([...]) set equality) + module docstring lines 1-50 cite ADR-0008 §Decision, ADR-0001 chokepoint, and Gap 4 origin.
AC-2 src/codegenie/plugins/cache_gc.py:56-62 __all__ + docstring lines 1-38 explicitly cite "This module IS the Gap 4 fix".
AC-3 tests/unit/plugins/test_bundle_cache_key.py::TestBundleCacheErrorModel (2 tests, frozen + unknown-reason). BundleCacheRaise(model=...) shape pinned by 17 separator-poisoning / TTL-env / invalid-key tests reading exc.value.model.reason.
AC-4 src/codegenie/types/identifiers.py:121-127 + tests/unit/types/test_identifiers_phase3.py::test_newtype_names_pinned (parametrised over PHASE3_NAMES + {BundleCacheKey}).
AC-5 src/codegenie/plugins/cache.py:130 plugin_version: str (NOT SemverVersion). DP-B follow-up: SemverVersion newtype now exists (S3-03 landed it 2026-05-19); story-AC still pins str to avoid widening the AC contract mid-execution. Future S7-02 caller could elevate.
AC-6 test_kwargs_only_signature — positional call raises TypeError.
AC-7 test_declared_order_byte_layout — composer output equals content_hash_bytes(unit_sep.join(parts).encode()).
AC-8 test_each_input_participates_mutation_resistant — 8-row parametrize, same-length distinct-class mutations. The single most important test in the story (ADR-0008 correctness).
AC-9 test_boundary_shift_collisions_blocked: ("ab","c") vs ("a","bc") produce different keys.
AC-10 test_separator_poisoning_rejected — 8-row parametrize, each input carries embedded \x1f.
AC-11 test_determinism_n_100: len({compose_bundle_cache_key(**kw) for _ in range(100)}) == 1.
AC-12 test_args_canonical_passthrough_verbatim: '{"a":1}' vs '{"a": 1}' produce different keys.
AC-13 tests/unit/plugins/test_cache_no_blake3_import.py — AST walk of cache.py + cache_gc.py finds zero import blake3 / from blake3 import ….
AC-14 test_cache_dir_annotation_is_sandboxed_path — name-pinned to source-text "SandboxedPath" (matches S3-04 precedent). from __future__ import annotations keeps the annotation as a string.
AC-15 TestKeyValidation × 9 bad keys × 2 methods (put + get) = 18 parametrize rows. Path-traversal, uppercase, length variants, wrong algorithm prefix, empty all rejected.
AC-16 test_put_writes_atomically_no_residual_tmp + test_put_file_mode_0600_and_dir_mode_0700 — Phase-0 ADR-0011 modes; no *.tmp residue.
AC-17 test_idempotent_put_same_bundle + test_overwrite_with_different_bundle — byte-identical re-put; clean overwrite different bundle.
AC-18 test_corrupt_file_returns_none_and_file_survives + test_partial_json_returns_none + test_get_missing_returns_none + test_get_on_missing_cache_dir_returns_none — get returns None; corrupt file NOT deleted (corrupt.exists() asserted).
AC-19 TestBundleJsonRoundtripCanary::test_bundle_json_round_trip: Bundle.model_validate_json(b.model_dump_json()) == b over the populated tuple[BundleEntry, ...] field. PASSES without an S3-04 normalisation.
AC-20 test_result_serialises_cache_dir_as_string: model_dump(mode="json")["cache_dir"] == str(tmp_path).
AC-21 test_from_result_classmethod_constructs_event — full field round-trip + event_type == "cache_gc_completed".
AC-22 test_result_to_event_field_overlap_canary: set(CacheGcResult.model_fields) - {"cache_dir"} ⊆ set(CacheGcCompletedEvent.model_fields).
AC-23 docs/phases/03-vuln-deterministic-recipe/phase-arch-design.md line ~872 — Literal[...] tuple includes "cache_gc_completed"; make docs builds clean.
AC-24 _parse_ttl_seconds, _is_evictable, _should_run_amortized — Hypothesis-property tested with max_examples=100 each.
AC-25 tests/unit/plugins/test_cache_gc_purity.py::test_pure_helpers_have_no_io_or_clock_references — AST walk fences _FORBIDDEN_ROOTS = {os, Path, time, structlog} and function-scope imports.
AC-26 test_constructor_does_no_io: BundleCacheGc(...) succeeds with malformed env; only gc.run() raises.
AC-27 test_evicts_only_files_older_than_ttl + test_run_skips_non_hex_files_and_special_paths + test_run_skips_symlinks — only files matching ^[0-9a-f]{64}\.json$ AND not symlinks AND older-than-TTL are unlinked. Path.stat().st_size is captured BEFORE Path.unlink() (impl cache_gc.py:266-274).
AC-28 test_run_on_missing_bundles_dir_returns_zero + test_run_on_empty_bundles_dir_returns_zero.
AC-29 test_is_evictable_strict_boundary — exactly-TTL-old kept, one-second-older evicted.
AC-30 test_evicts_only_files_older_than_ttl::assert result.bytes_reclaimed == size_old — exact equality, not >= n.
AC-31 test_parse_ttl_accepts 6 rows + test_parse_ttl_reject_corpus 9 rows.
AC-32 test_event_emitter_called_exactly_once + test_event_emitter_none_emits_zero + test_within_24h_is_noop_and_does_not_emit.
AC-33 test_emitter_exceptions_propagate — RuntimeError from emitter propagates out of run().
AC-34 test_first_call_writes_stamp::assert stamp_path.exists() — stamp lands at <cache_dir>/.gc-stamp.
AC-35 test_stamp_atomic_no_tmp_residue — no .gc-stamp.*.tmp after run; content parseable as float.
AC-36 test_first_call_writes_stamp — first invocation with no stamp succeeds.
AC-37 test_corrupt_gc_stamp_fails_loud — non-float content raises BundleCacheRaise(reason="corrupt_gc_stamp").
AC-38 test_future_dated_stamp_treated_as_stale: _should_run_amortized returns True for a future-dated stamp; the newly written stamp is earlier than the future-dated one.
AC-39 test_concurrent_callers_serialized — 2 threads racing run_amortized; exactly one runs, one no-ops. fcntl.flock(LOCK_EX) on <cache_dir>/.gc-stamp.lock.
AC-40 test_24h_elapsed_runs_again — monkeypatch cg.time.time (module-bound, NOT global time.time); stamp advances. Uses a counter-gated fake to avoid recursion into the monkeypatched function during _now_iso()'s datetime.now() resolution.
AC-41 test_within_24h_is_noop_and_does_not_emit — second call returns None; emitter call-count stays 1.
AC-42 tests/integration/cli/test_cache_prune.py::test_cache_gc_stub_preserved: runner.invoke(cli, ["cache", "gc"]) exits 0 and structlog.testing.capture_logs sees event == "cache.gc.stub". Existing cli.py:911-913 stub bytes-for-bytes untouched.
AC-43 test_cache_prune_help_exit_zero: --help exits 0 and output contains --cache-dir.
AC-44 test_cache_prune_emits_exactly_one_event[seed_stale-True/False] — both rows: 1 event, trigger == "operator_cli", correct (evicted, reclaimed) counts.
AC-45 test_cache_prune_event_file_mode_0600: append.jsonl created at mode 0o600; events dir at 0o700. JSON-lines format, one event per line.
AC-46 tests/integration/cli/conftest.py::capture_spanning_events fixture — reads <cache_dir>/../events/spanning/append.jsonl, decodes via CacheGcCompletedEvent.model_validate_json. Used by AC-44.
AC-47 docs/phases/03-vuln-deterministic-recipe/ADRs/0008-...md §Tradeoffs postscript — three env vars enumerated. make docs clean.
AC-48 First test-only commit ran RED (modules absent); subsequent code commits brought them GREEN. Single-pass — no Stage 2 ↔ Stage 3 retry loop required.
AC-49 ruff check + ruff format --check over the 12 touched files: clean. mypy --strict src/: success, no issues in 173 files.
AC-50 lint-imports --config pyproject.toml --no-cache: 4 contracts kept, 0 broken (run via .venv/bin/lint-imports since lint-imports console script is not on $PATH in this local environment — same condition documented by S3-03 attempt log).
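
The composer properties pinned by AC-6 through AC-12 can be illustrated with a minimal sketch. This is not the shipped compose_bundle_cache_key: the three field names are illustrative (the 8-row parametrizes suggest the real composer takes more inputs), hashlib.sha256 stands in for the project's content_hash_bytes chokepoint, and ValueError stands in for BundleCacheRaise.

```python
import hashlib
from typing import NewType

# Hypothetical stand-in: the real BundleCacheKey lives in codegenie.types.identifiers.
BundleCacheKey = NewType("BundleCacheKey", str)

UNIT_SEP = "\x1f"  # ASCII unit separator, the byte the poisoning tests embed


def compose_key_sketch(
    *, plugin_name: str, plugin_version: str, args_canonical: str
) -> BundleCacheKey:
    """Keyword-only composer: join the parts in declared order, then hash."""
    parts = (plugin_name, plugin_version, args_canonical)
    for part in parts:
        if UNIT_SEP in part:
            # An embedded separator would let one input imitate two,
            # so it is rejected rather than escaped.
            raise ValueError("input contains the unit separator")
    # Assumption: sha256 replaces content_hash_bytes for this sketch only.
    digest = hashlib.sha256(UNIT_SEP.join(parts).encode()).hexdigest()
    return BundleCacheKey(digest)


same_args = {"plugin_version": "1.0.0", "args_canonical": "{}"}
key_a = compose_key_sketch(plugin_name="ab", **same_args)
key_b = compose_key_sketch(plugin_name="ab", **same_args)
```

Because the separator sits between the parts, shifting a character across a field boundary ("ab","c" versus "a","bc") changes the hashed bytes, and because args_canonical is passed through verbatim, whitespace differences in the JSON text yield different keys.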

Files touched

  • src/codegenie/types/identifiers.py (modified) — BundleCacheKey = NewType(...) + __all__ + _NEWTYPE_REGISTRY row.
  • src/codegenie/types/__init__.py (modified) — re-export BundleCacheKey for the identity-passthrough test.
  • src/codegenie/plugins/cache.py (new) — compose_bundle_cache_key, BundleCacheStore, BundleCacheErrorModel, BundleCacheRaise, _validate_key, _atomic_write_bytes.
  • src/codegenie/plugins/cache_gc.py (new) — BundleCacheGc, CacheGcResult, CacheGcCompletedEvent, three pure helpers (_parse_ttl_seconds, _is_evictable, _should_run_amortized), _atomic_write_text, _now_iso.
  • src/codegenie/cli.py (modified) — @cache.command(name="prune") with inline JSON-lines emitter; the existing cache_gc stub at lines 911-913 is preserved bytes-for-bytes.
  • tests/unit/plugins/test_bundle_cache_key.py (new) — 24 composer tests (AC-3 + AC-6..AC-12).
  • tests/unit/plugins/test_bundle_cache_store.py (new) — 29 store tests (AC-14..AC-19).
  • tests/unit/plugins/test_bundle_cache_gc.py (new) — 39 GC tests (AC-20..AC-22, AC-24..AC-41).
  • tests/unit/plugins/test_cache_gc_purity.py (new) — AST purity fence (AC-25).
  • tests/unit/plugins/test_cache_no_blake3_import.py (new) — AST chokepoint fence (AC-13).
  • tests/integration/cli/test_cache_prune.py (new) — 5 CLI integration tests (AC-42..AC-45).
  • tests/integration/cli/conftest.py (modified) — added the capture_spanning_events fixture (AC-46). The autouse _disable_cli_configure_logging is unchanged.
  • tests/unit/types/test_identifiers_phase3.py (modified) — PHASE3_NAMES set extended with BundleCacheKey.
  • docs/phases/03-vuln-deterministic-recipe/phase-arch-design.md (modified) — Literal[...] additive edit (AC-23).
  • docs/phases/03-vuln-deterministic-recipe/ADRs/0008-...md (modified) — postscript paragraph (AC-47).

Design-pattern notes

  • Newtype + smart constructor (DP-D). BundleCacheKey = NewType(...) lives in codegenie.types.identifiers; the only sanctioned constructor is compose_bundle_cache_key. Three call sites (composer + BundleCacheStore.put + BundleCacheStore.get) — rule-of-three for the newtype itself is satisfied. A future AST chokepoint test on direct BundleCacheKey(...) calls outside the composer is NOT shipped (story explicitly defers — single chokepoint not yet justified).
  • Functional core / imperative shell (DP-A). The three pure helpers live above the constants block; the AST fence at test_cache_gc_purity.py is what holds the line — a careless refactor that pulls time.time() into _is_evictable would fail loud. Mirrors S3-04's _compose_entry discipline.
  • Markers-only Exception + frozen Pydantic value (S3-01/S3-04 lineage). BundleCacheRaise(model=BundleCacheErrorModel(reason=...)). The story specifies model= kwarg (NOT S3-04's error=); attempt-log readers should note this drift — when (if) the third call site appears we should standardise the kwarg name across all three sites.
  • Atomic write inlined twice (DP-G). _atomic_write_bytes in plugins/cache.py and _atomic_write_text in plugins/cache_gc.py. Phase-0 cache/store.py:118-135 is the third site by spirit — extraction to codegenie._fs_atomic.atomic_write_{bytes,text} is genuinely justified now. Surfacing as a follow-up against Phase 6's recipe-cache writer.
  • Open/Closed seam on event-type literals. The arch §C9 WorkflowSpanningEvent.event_type Literal grew additively; no consumer dispatch on trigger exists yet, so no match + assert_never site needed. When S6-04's orchestrator reads trigger, that site MUST use the discriminator-on-kind pattern.
  • Newtype now exists for SemverVersion (S3-03 added it 2026-05-19), but the AC-5 contract was written before it did. The AC was honoured literally (plugin_version stays str) and the elevation opportunity is surfaced as a Phase-3 cleanup task. The cost of widening the AC mid-execution is higher than the cost of one str field.
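
The purity fence described under DP-A can be approximated as follows. This is a sketch under assumptions, not the contents of test_cache_gc_purity.py: the forbidden-root set is taken from the AC-25 row, while the function purity_violations, the _is_evictable signature, and the two source snippets are hypothetical.

```python
import ast

# Forbidden root names, as listed in the AC-25 evidence row.
FORBIDDEN_ROOTS = {"os", "Path", "time", "structlog"}


def purity_violations(source: str, function_names: set[str]) -> list[str]:
    """Report forbidden name references and function-scope imports."""
    violations: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.FunctionDef) or node.name not in function_names:
            continue
        for inner in ast.walk(node):
            if isinstance(inner, ast.Name) and inner.id in FORBIDDEN_ROOTS:
                violations.append(f"{node.name}: {inner.id}")
            elif isinstance(inner, (ast.Import, ast.ImportFrom)):
                violations.append(f"{node.name}: function-scope import")
    return violations


# Hypothetical pure helper: the clock value arrives as an argument.
PURE = '''
def _is_evictable(mtime: float, now: float, ttl_seconds: int) -> bool:
    return (now - mtime) > ttl_seconds
'''

# The careless refactor the fence is meant to catch: reading the clock inside.
IMPURE = '''
def _is_evictable(mtime: float, ttl_seconds: int) -> bool:
    return (time.time() - mtime) > ttl_seconds
'''

pure_report = purity_violations(PURE, {"_is_evictable"})
impure_report = purity_violations(IMPURE, {"_is_evictable"})
```

The strict `>` in the pure variant matches the AC-29 boundary: a file exactly TTL seconds old is kept, and one a second older is evicted.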

Refactor decisions

  • Did not extract codegenie._fs_atomic yet. Justified by story §Notes DP-G ("Rule 2 trumps DRY at two callers"); recommended for the next atomic-write call site.
  • Did not standardise BundleCacheRaise(model=...) vs BundleBuilderRaise(error=...). Both are local raise wrappers around frozen-Pydantic payloads; the kwarg-name drift is the only difference. Rename can land in a sweep when a third local-raise class appears.
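
For reference when the extraction does happen, a shared helper might look roughly like the sketch below. The names atomic_write_bytes and atomic_write_text follow the codegenie._fs_atomic proposal above, but the body is an assumption about a conventional temp-file-plus-rename approach, not a copy of _atomic_write_bytes or _atomic_write_text as shipped.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_bytes(path: Path, data: bytes, *, mode: int = 0o600) -> None:
    """Write data to a sibling temp file, then rename it over the target."""
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=f".{path.name}.", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())
        os.chmod(tmp_name, mode)
        # os.replace is atomic when source and target share a filesystem,
        # which holds here because the temp file is a sibling of the target.
        os.replace(tmp_name, path)
    except BaseException:
        # Leave no *.tmp residue behind on failure.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise


def atomic_write_text(path: Path, text: str, *, mode: int = 0o600) -> None:
    """Text variant: encode as UTF-8 and delegate to the bytes writer."""
    atomic_write_bytes(path, text.encode("utf-8"), mode=mode)
```

A single bytes-level implementation with a thin text wrapper would let both plugins/cache.py and plugins/cache_gc.py share the residue and file-mode guarantees that AC-16 and AC-35 currently test separately.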

Deferred / out-of-scope (intentional)

  • EventLog infrastructure / BLAKE3 chain / zstd compression — S6-01 absorbs the interim append.jsonl substrate additively.
  • BundleBuilder cache lookup integration — S7-02 wires BundleBuilder.build to consult BundleCacheStore.get / .put.
  • CanonicalArgsJson newtype — DP-C still at two callers (S3-04 + this story); deferred to the third.
  • Cancellation / metrics from the CLI — S6-04 ports the orchestrator emitter here.

Gate log

  • .venv/bin/pytest tests/unit/plugins/test_bundle_cache_{key,store,gc}.py tests/unit/plugins/test_cache_{gc_purity,no_blake3_import}.py tests/integration/cli/test_cache_prune.py --no-cov: 99 passed.
  • .venv/bin/pytest --no-cov -q --deselect tests/unit/test_lint_imports_canary.py --deselect tests/unit/test_precommit_and_docs_config.py::test_pre_commit_run_all_files_exits_zero: 4,873 passed, 62 skipped, 6 xfailed.
  • .venv/bin/ruff check (12 touched files): clean. ruff format --check: clean.
  • .venv/bin/mypy --strict src/: success, no issues in 173 files.
  • .venv/bin/lint-imports --config pyproject.toml --no-cache: 4 kept, 0 broken.
  • .venv/bin/mkdocs build --strict: docs built clean in 22.4 s.

Cross-story lessons

  • Story precondition can become stale. AC-5 says "SemverVersion newtype does not yet exist"; by execution time S3-03 had already landed it. The right move is to honour the AC as written and surface the elevation opportunity in the attempt log — widening an AC during execution erodes the validator's contract.
  • from __future__ import annotations defeats __annotations__ is ClassName. S3-04 hit this; S3-05 hit it again. Pin the source-text spelling instead (assert annotation == "SandboxedPath"). The S3-04 precedent at test_bundle_builder.py:181-189 is the canonical pattern.
  • Adding a new CLI subcommand under an existing group is purely additive. The Phase-1+ cache gc stub at cli.py:911-913 stays bytes-for-bytes intact; cache prune lands as a sibling under the same @cli.group("cache"). The autouse _disable_cli_configure_logging fixture in tests/integration/cli/conftest.py covers both subcommands without modification because they both share the same lazy-logging seam.
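
The annotation lesson above can be reproduced in a few lines. The classes here are hypothetical stand-ins (the real SandboxedPath and BundleCacheStore are not shown in this log); the point is only that, under the future import, the stored annotation is the source text rather than the class object.

```python
from __future__ import annotations


class SandboxedPath:
    """Hypothetical stand-in for the real SandboxedPath type."""


class StoreSketch:
    """Hypothetical stand-in for a class whose cache_dir annotation is pinned."""

    def __init__(self, cache_dir: SandboxedPath) -> None:
        self.cache_dir = cache_dir


# With postponed evaluation, the annotation is kept as a string.
annotation = StoreSketch.__init__.__annotations__["cache_dir"]

# An identity check against the class therefore fails...
identity_check = annotation is SandboxedPath
# ...while pinning the source-text spelling succeeds.
source_text_check = annotation == "SandboxedPath"
```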