S5-05 — Attempt log

Attempt 1 — 2026-05-17 — phase-story-executor

Outcome: GREEN (all 18 ACs satisfied; full suite + lint + mypy --strict + forbidden-patterns green).

Inputs read

Story S5-05-runtime-trace-freshness-and-drift.md (HARDENED — 18 ACs incl. the validator's 6 consistency-fixes against on-disk shapes).
scip_freshness precedent at src/codegenie/probes/layer_b/index_health.py:143-204 — the canonical "freshness function" shape.
FreshnessRegistry contract at src/codegenie/indices/registry.py:67 — Callable[[dict[str, object], str], IndexFreshness].
IndexFreshness discriminated union at codegenie.indices.freshness — Fresh | Stale(reason: DigestMismatch | IndexerError | ...).
S5-04 integration test for cache-key invalidation as the template for Scenario A.
tests/adv/phase02/test_stale_scip_fixture.py outer-key invariant — the only existing test that pinned the registry size at 1.
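The registry contract listed above can be pictured with a small stand-in. This is a minimal sketch, not the codegenie implementation: the `FreshnessRegistry` class body, its `register`, `names`, and `check` methods, and the simplified `Fresh` / `Stale` dataclasses are all assumptions made for illustration. Only the callable shape `Callable[[dict[str, object], str], IndexFreshness]` comes from the log.

```python
from collections.abc import Callable
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Fresh:
    indexed_at: str


@dataclass(frozen=True)
class Stale:
    reason: str  # simplified; the log describes a richer reason union


IndexFreshness = Union[Fresh, Stale]
# The callable shape named in the log: (slice, head) -> IndexFreshness.
FreshnessCheck = Callable[[dict[str, object], str], IndexFreshness]


class FreshnessRegistry:
    """Hypothetical stand-in: maps an index name to exactly one check."""

    def __init__(self) -> None:
        self._checks: dict[str, FreshnessCheck] = {}

    def register(self, name: str) -> Callable[[FreshnessCheck], FreshnessCheck]:
        def decorator(fn: FreshnessCheck) -> FreshnessCheck:
            # Reject a second registration under the same name.
            if name in self._checks:
                raise ValueError(f"freshness check already registered: {name}")
            self._checks[name] = fn
            return fn

        return decorator

    def names(self) -> set[str]:
        return set(self._checks)

    def check(self, name: str, slice_: dict[str, object], head: str) -> IndexFreshness:
        return self._checks[name](slice_, head)


registry = FreshnessRegistry()


@registry.register("scip")
def scip_check(slice_: dict[str, object], head: str) -> IndexFreshness:
    return Fresh(indexed_at="2026-05-17T00:00:00Z")


@registry.register("runtime_trace")
def runtime_trace_check(slice_: dict[str, object], head: str) -> IndexFreshness:
    return Stale(reason="illustrative only")
```

In this sketch a new index is added by decorating one more function, which is why a test that pins `registry.names()` has to change whenever a registration is added.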

Upstream-AC patch to S5-02 (one new field)

The story's "Notes for implementer" anticipated this: S5-02's slice did NOT carry last_traced_at, so the freshness function's Fresh(indexed_at=...) branch had no source. Per the inline-patch instructions:

Added last_traced_at to _EXPECTED_SLICE_KEYS (the snapshot pin).
Added a _now_utc_iso() helper module-local seam.
Threaded last_traced_at: str | None = None through _empty_slice and _slice_from_aggregate — each defaults to _now_utc_iso() when None (the probe DID run; the timestamp is honest).
Updated the inline _build_envelope_build_failed to stamp last_traced_at too.
Existing slice-key snapshot tests at tests/unit/probes/layer_c/test_runtime_trace.py:317,324 absorbed the addition because they read from _EXPECTED_SLICE_KEYS (no test edit needed — the constant is the contract).

The widening is additive; no existing tests broke. Branch (b) of the freshness function (trace_coverage_confidence == "unavailable") fires before branch (c) of type-validation, so failure-path slices with last_traced_at=None would never reach the type check anyway — but the timestamp is stamped in all paths for honest-confidence rendering downstream.

Discoveries that mattered

Pydantic model_dump(mode="json") renders UTC datetimes with the Z suffix. The B2 integration test's freshness["indexed_at"] is the wire-shape string after model_dump, not the source ISO string. Initial assertion against "2026-05-17T00:00:00+00:00" failed; corrected to "2026-05-17T00:00:00Z". Pin the wire shape, not the source.
IndexHealthProbe imports codegenie.exec as _exec. The HEAD-resolver monkeypatch must target codegenie.exec.run_allowlisted (the imported module attribute), NOT ih.run_allowlisted (which doesn't exist on the index_health module). Mirrors the pattern that other index_health tests use indirectly via fixtures.
tests/adv/phase02/test_stale_scip_fixture.py had a load-bearing set(...) == {"scip"} assertion that S5-05 widens. Updated to {"scip", "runtime_trace"} with a comment naming the future-widening trigger (S6-08).
Ruff's UP031 flagged %-formatting in adversarial assertion messages; converted to f-strings. F811 flagged the pytest-fixture import that also names a test parameter — applied a targeted noqa: F811 (the pytest pattern requires both the import and the parameter).
No edits to index_health.py. AC-16's structural promise held: git diff origin/master..HEAD --name-only does NOT include src/codegenie/probes/layer_b/index_health.py. The registry decorator pattern + read_raw_slices kernel is the Open/Closed seam working as designed.
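The monkeypatch-target discovery above can be reproduced with two throwaway modules. This sketch uses `unittest.mock` from the standard library rather than the pytest `monkeypatch` fixture, and the module names `fake_exec` and `fake_consumer` are hypothetical stand-ins for `codegenie.exec` and the `index_health` module.

```python
import sys
import types
from unittest import mock

# Hypothetical stand-in for the module that owns run_allowlisted.
fake_exec = types.ModuleType("fake_exec")


def _real_run_allowlisted(argv):
    raise RuntimeError("a real subprocess would run here")


fake_exec.run_allowlisted = _real_run_allowlisted
sys.modules["fake_exec"] = fake_exec

# Hypothetical stand-in for the importing module: it binds the *module*
# under an alias and looks the function up at call time.
consumer_src = """
import fake_exec as _exec

def resolve_head():
    return _exec.run_allowlisted(["git", "rev-parse", "HEAD"])
"""
fake_consumer = types.ModuleType("fake_consumer")
exec(consumer_src, fake_consumer.__dict__)

# The importing module has no run_allowlisted attribute of its own,
# so there is nothing to patch there.
consumer_has_attr = hasattr(fake_consumer, "run_allowlisted")

# Patching the attribute on the imported module is what takes effect.
with mock.patch.object(fake_exec, "run_allowlisted", return_value="abc123"):
    patched_head = fake_consumer.resolve_head()
```

Because the consumer resolves `_exec.run_allowlisted` at call time, replacing the attribute on the owning module is visible to it without touching the consumer.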

Refactor decisions

Single _now_utc_iso() seam instead of three duplicated _dt.datetime.now(_dt.UTC).isoformat() call-sites — keeps the I/O surface narrow (one impure line in S5-02's pure-helper layer) and gives the AST-purity audit a stable boundary to assert against.
Six-branch isinstance discipline for the freshness function — mirrors scip_freshness lines 168-184 verbatim. The branch order is load-bearing per the story validator's consistency block; (b) catches failure paths before (c) type-validates the optional last_traced_at.
Hypothesis property under tests/property/ — the same directory where the existing test_index_freshness_roundtrip.py lives. The property strategies are intentionally broad (None / int / bool / list / str) to exercise every isinstance arm including the "weird object" defensive paths.
Adversarial helpers in tests/adv/phase02/_helpers.py — second helper file in the adv corpus (after tests/adv/_helpers.py); the rule-of-three threshold is not yet reached so no kernel extraction. Documented for the next consumer.
No shared _FreshnessHelpers base. Story Notes explicitly defer this to S6-08 (the rule-of-three trigger fires at the 3rd consumer + the 4th & 5th together). Followed Rule 2 / Rule 11 — the duplication is fine; the trigger is recorded.

Acceptance criteria — evidence

AC	Evidence
AC-1 — function placement + decorator + signature + `__all__`	`test_function_signature_matches_registry_contract`, `test_function_exported_in_all`
AC-2 — branch table over six cases, total	`test_branch_{a,b,c,d,e,f,g}_*` (8 tests) + `test_function_never_raises_on_arbitrary_object_values`
AC-3 — `Final[str]` message constants	`test_all_message_constants_annotated_final_str`, `test_message_constants_values_are_unique`, `test_message_constants_match_id_pattern`
AC-4 — purity AST-walk audit	`test_function_body_has_no_clock_or_io_calls`, `test_function_body_has_no_await_or_subprocess`, `test_function_body_is_pure_no_assignments_to_outer_state`
AC-5 — registry membership + identity	`test_runtime_trace_registered_in_default_registry`
AC-6 — B2 drift end-to-end (four-part)	`test_b2_emits_drift_for_runtime_trace`
AC-7 — B2 clean = Fresh	`test_b2_emits_fresh_for_runtime_trace`
AC-8 — B2 absent slice → upstream_unavailable	`test_b2_emits_stale_for_absent_runtime_trace_slice`
AC-9 — mutation-resistance table (5 stubs)	`test_mutant_fails_at_least_one_named_check` parametrized over 5 mutants — every one fails AC-6, AC-7, or AC-8
AC-10 — Hypothesis totality + purity	`tests/property/test_runtime_trace_freshness_purity.py::test_totality_and_purity`, `test_wall_clock_under_soft_budget`
AC-11 — argument-order canary	`test_arg_order_is_slice_then_head`
AC-12 — adversarial three scenarios	`test_image_digest_change_changes_cache_key` + `test_drift_detected_through_b2` + `test_clean_run_emits_fresh`
AC-13 — no real subprocess in adv	`forbid_real_subprocess` fixture + `test_no_real_subprocess_in_adv_layer` smoke
AC-14 — ADR refs in adv assertion messages	`test_assertion_messages_carry_adr_refs` (AST-introspect)
AC-15 — duplicate-registration smoke	`test_runtime_trace_duplicate_registration_rejected`
AC-16 — no edits to `index_health.py`	`test_no_edit_to_index_health_module` (git diff audit)
AC-17 — `mypy --strict` clean	`mypy --strict src/codegenie` → 109 files, 0 errors
AC-18 — `forbidden-patterns` green	`python scripts/check_forbidden_patterns.py` exit 0
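An AST-walk purity audit of the kind AC-4 refers to can be approximated in a few lines. This is a sketch under assumptions: the `FORBIDDEN_CALLS` set and the `impure_calls` helper are invented for illustration, and matching on bare call names is deliberately coarse. The story's actual audit rules are not reproduced in this log.

```python
import ast

# Assumed deny-list of call names that suggest clock reads, I/O, or subprocesses.
FORBIDDEN_CALLS = {"now", "utcnow", "today", "open", "run", "Popen", "system"}


def impure_calls(source: str, func_name: str) -> list[str]:
    """Return the sorted forbidden names found inside one function's body."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == func_name:
            found: set[str] = set()
            for inner in ast.walk(node):
                if isinstance(inner, ast.Await):
                    found.add("await")
                elif isinstance(inner, ast.Call):
                    callee = inner.func
                    # x.now() is an Attribute call; open() is a Name call.
                    if isinstance(callee, ast.Attribute):
                        name = callee.attr
                    elif isinstance(callee, ast.Name):
                        name = callee.id
                    else:
                        continue
                    if name in FORBIDDEN_CALLS:
                        found.add(name)
            return sorted(found)
    raise LookupError(f"function not found: {func_name}")


PURE_SRC = '''
def check(slice_, head):
    if not isinstance(slice_, dict):
        return "stale"
    return "fresh"
'''

IMPURE_SRC = '''
import datetime

def check(slice_, head):
    stamp = datetime.datetime.now()
    with open("/tmp/x") as fh:
        fh.read()
    return stamp
'''
```

A test built on this helper would assert that the audited function yields an empty list. Keeping the clock read in a separate helper such as `_now_utc_iso()` is what lets the freshness function itself pass that kind of check.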

Gates

ruff check (src + tests) — clean
ruff format --check — 330 files already formatted
mypy --strict src/codegenie — 109 files, 0 errors
python scripts/check_forbidden_patterns.py — exit 0
Full suite — 2669 passed, 15 skipped, 3 deselected, 2 xfailed (test_stale_scip_regenerate_guard.py flaked once on the first run and passed on re-run and in isolation; the cause is order pollution from a pre-existing test, not a change introduced by this story)

Files touched

Extended (S5-02 inline-patch): src/codegenie/probes/layer_c/runtime_trace.py — added _dt, IndexFreshness, IndexerError, Fresh, Stale, DigestMismatch, register_index_freshness_check, IndexName imports; added _MSG_* Final[str] constants; added last_traced_at to _EXPECTED_SLICE_KEYS; added _now_utc_iso(); threaded last_traced_at through _empty_slice, _slice_from_aggregate, and the inline _build_envelope_build_failed; added _check_runtime_trace_freshness + the @register_index_freshness_check decorator; updated __all__.
Updated test: tests/adv/phase02/test_stale_scip_fixture.py — outer-key set widened to include runtime_trace.
New tests:
tests/unit/probes/layer_c/test_runtime_trace_freshness.py (21 tests: signature, registry, branch table, B2 integration, arg-order, duplicate, no-edit-to-B2)
tests/unit/probes/layer_c/test_runtime_trace_freshness_purity.py (6 tests: Final[str] audit + AST-walk purity audit)
tests/unit/probes/layer_c/test_runtime_trace_freshness_mutation.py (5 mutants killed)
tests/property/test_runtime_trace_freshness_purity.py (2 Hypothesis properties)
tests/adv/phase02/test_image_digest_drift.py (5 tests: cache-key, drift, clean, no-real-subproc, ADR-message audit)
tests/adv/phase02/_helpers.py (shared build_drift_slice + forbid_real_subprocess + clean_freshness_registry)

Lessons for future Phase 2 stories

Pydantic JSON-mode renders UTC as Z. When a story asserts the wire shape of a datetime, pin against "...Z", not "...+00:00". The source datetime is built with fromisoformat(...), and its isoformat() rendering uses +00:00, but model_dump(mode="json") is the wire serializer and it emits Z.
Outer-key invariants widen with each freshness-check registration. The tests/adv/phase02/test_stale_scip_fixture.py test pinned the registry to a single name; S5-05 widens it to 2; S6-08 will widen it to 5. Each new registration must update this assertion at the same time — leave a comment naming the next-widening story (S6-08).
scip_freshness is the load-bearing template. Every future @register_index_freshness_check candidate should clone its shape: pure function, isinstance-discipline, try/except ValueError for timestamp parsing, Stale(IndexerError(_MSG_*)) for every failure return. Deviation from the template is a smell.
The HEAD-resolver monkeypatch target is codegenie.exec.run_allowlisted (via the module attribute), NOT index_health.run_allowlisted (which doesn't exist). The pattern is consistent — patch the imported module, not the importing module.
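The first lesson, that the source string and the wire string differ for the same instant, can be seen with the standard library alone. In this sketch `to_wire` is a hypothetical stand-in that mimics a serializer rendering UTC as `Z`. It is not Pydantic's code, and it is only here so the example runs without third-party packages.

```python
import datetime as _dt


def to_wire(ts: _dt.datetime) -> str:
    """Hypothetical stand-in for a JSON-mode serializer that renders UTC as 'Z'."""
    text = ts.isoformat()
    if text.endswith("+00:00"):
        return text[: -len("+00:00")] + "Z"
    return text


source_text = "2026-05-17T00:00:00+00:00"
parsed = _dt.datetime.fromisoformat(source_text)

# Round-tripping through isoformat() keeps the +00:00 spelling.
round_trip = parsed.isoformat()

# The wire rendering names the same instant with a different suffix, so a
# string assertion written against the source spelling will not match it.
wire_text = to_wire(parsed)
strings_differ = wire_text != source_text
```

An assertion on serialized output therefore needs the `Z` spelling, even when the fixture was built from a `+00:00` string.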