S4-06 attempt log

Attempt 1 — 2026-05-16 (BLOCKED-PARTIAL)

Disposition

Shipped (GREEN): _indexable_files.py extraction (AC-M4 step 1), GeneratedCodeProbe + 22 tests, SemanticIndexMetaProbe + 12 tests. Both probes registered in codegenie.probes.__init__. Full test suite (2260 tests) + mypy --strict (91 files) + ruff are all green.

Not shipped (BLOCKED): NodeReflectionProbe. The probe requires loadable tree-sitter grammar binaries; the vendored tools/grammars/{javascript,typescript}.so are still 68-byte placeholder stubs (same blocker that put S4-04 in BLOCKED state on 2026-05-16). Verified empirically:

$ .venv/bin/python -c "
import tree_sitter
tree_sitter.Language('tools/grammars/typescript.so', 'typescript')
"
OSError: dlopen(...): slice is not valid mach-o file

tree_sitter itself is importable from the venv, and codegenie.grammars.lock.load_and_verify succeeds (the BLAKE3 pins match the stubs — the stubs ARE the source of truth for the BLAKE3 column). The break point is the C-ext dlopen of the .so file. Until real grammar binaries land per tools/grammars/README.md (or 02-ADR-0002 is amended to use PyPI grammar packages), the probe cannot be implemented end-to-end. Half-implementing it (catching OSError and emitting an AC-R8 slice unconditionally) was rejected because every test of T-R4..T-R9 would have to be skipped, leaving the AC set without runtime evidence — Rule 12 (fail loud) prefers an explicit BLOCKED disposition.

What this run did

1. _indexable_files.py extraction (AC-M4 step 1)

Moved _INDEXABLE_SUFFIXES, _EXCLUDE_DIRS, _read_exclude_file, _walk_indexable_files, _count_indexable_files, _compute_indexable_merkle from src/codegenie/probes/layer_b/scip_index.py into src/codegenie/probes/layer_b/_indexable_files.py. scip_index.py re-imports them. Test tests/unit/probes/layer_b/test_indexable_files.py (15 tests) is the regression guard for the helpers. All 20 pre-existing test_scip_index.py tests stay green.

2. GeneratedCodeProbe

src/codegenie/probes/layer_b/generated_code.py. Marker-based codegen detector:

  • _GENERATOR_HEADER_MARKERS tuple holds the four canonical headers (graphql-codegen, openapi-typescript, protoc-typescript, prisma); first-marker-wins by tuple order is the precedence policy.
  • _GENERATED_DIRS flags files under conventional locations (src/generated, __generated__, gen) with generator: "directory_convention".
  • _REGEN_SCRIPT_KEYS_BY_GENERATOR maps generators to candidate package.json#scripts keys; first match becomes regenerate_command: "pnpm run <key>".
  • _select_build_outputs surfaces package.json#files verbatim.
  • Pure helpers (_detect_header_marker, _detect_directory_marker, _match_regenerate_command, _select_build_outputs, _walk_source_files) carry the functional core; run() is the only impure code.
  • AC-X8 (functional core / imperative shell) test test_pure_helpers_have_no_io AST-walks the module and asserts pure helpers never call open/read_*/write_*/subprocess/asyncio.
  • AC-X9 (determinism): byte-identical reruns asserted on a multi-marker fixture.
  • AC-X2 (Open/Closed): test_marker_catalog_is_open_closed AST-walks the module and asserts no Compare node outside the _GENERATOR_HEADER_MARKERS / _REGEN_SCRIPT_KEYS_BY_GENERATOR declarations compares to a marker-name literal — branching on marker identity is mechanically forbidden.
  • AC-X4: _WARNING_IDS frozenset + import-time _ID_PATTERN check (raise AssertionError, not bare assert; Phase 0 forbidden-patterns).
  • AC-G5: enumeration guard via parametrize over the marker tuple + test_every_generator_marker_has_a_test AST sentinel.
  • applies() override delegates to _admits_node_project so the probe filter-exits on non-Node repos (the Phase 1 precedent set by NodeBuildSystemProbe / NodeManifestProbe / TestInventoryProbe).

Tests: tests/unit/probes/layer_b/test_generated_code.py (22 tests). All pass.

3. SemanticIndexMetaProbe

src/codegenie/probes/layer_b/semantic_index_meta.py. Reads tsconfig.json via Phase 1's parsers.jsonc.load chokepoint:

  • Sibling-slice reads are unavailable (Phase 0 ADR-0007 freezes ProbeContext; NodeBuildSystemProbe writes no sidecar). The probe reads tsconfig.json literally; does NOT walk extends chains. has_extends: bool + warning makes the limitation honest.
  • files_count_estimate consumes _count_indexable_files from the extracted shared kernel — divergence with ScipIndexProbe's count is mechanically impossible.
  • AC-M3 success / AC-M6 missing-tsconfig (medium) / AC-M5 parse-failure (low + error) paths each have a dedicated test.
  • applies() mirrors GeneratedCodeProbe.

Tests: tests/unit/probes/layer_b/test_semantic_index_meta.py (12 tests). All pass.
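
The three outcome paths (success, missing tsconfig, parse failure) can be sketched as below. This is an assumption-laden illustration: json.loads stands in for the JSONC-aware parsers.jsonc.load chokepoint, the TsconfigMeta dataclass and the warning IDs are hypothetical, and "high" as the success confidence is a guess (the log only names medium and low).

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Tuple


@dataclass(frozen=True)
class TsconfigMeta:
    confidence: str  # "high" (assumed) | "medium" | "low"
    has_extends: bool = False
    warnings: Tuple[str, ...] = ()
    error: Optional[str] = None


def read_tsconfig_meta(repo_root: Path) -> TsconfigMeta:
    path = repo_root / "tsconfig.json"
    if not path.is_file():
        # Missing-tsconfig path: medium confidence, no error.
        return TsconfigMeta(confidence="medium", warnings=("tsconfig_missing",))
    try:
        # Stand-in for parsers.jsonc.load; plain JSON only in this sketch.
        data = json.loads(path.read_text(encoding="utf-8"))
        if not isinstance(data, dict):
            raise ValueError("tsconfig.json root is not an object")
    except (OSError, ValueError) as exc:
        # Parse-failure path: low confidence plus the error text.
        return TsconfigMeta(confidence="low", error=str(exc))
    # The file is read literally; an extends chain is flagged, never followed.
    has_extends = "extends" in data
    warnings = ("tsconfig_extends_not_followed",) if has_extends else ()
    return TsconfigMeta(confidence="high", has_extends=has_extends, warnings=warnings)
```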

4. Integration-test updates

  • tests/integration/probes/test_language_detection_warm_path.py: added "generated_code" to the package_json_consumers tuple (the test explicitly notes the convention: "Adding a new such probe must extend the tuple"). The 6-consumer invariant (1 miss + 5 hits) now reflects the shipped probe set.
  • tests/integration/probes/test_non_node_repo.py continues to pass — both new probes opt out of the Go fixture via _admits_node_project (no package.json, no Node language detected).

What this run did NOT do

  • NodeReflectionProbe: not implemented. The probe file src/codegenie/probes/layer_b/node_reflection.py was NOT created — a half-implementation that imports a never-loadable grammar would either crash at import-time or silently emit the AC-R8 failure slice on every run, which is the wrong shape for T-R4..T-R9.
  • tools/grammars/{javascript,typescript}.so were NOT touched. Real binaries (~250–500 KiB) must be vendored per tools/grammars/README.md before NodeReflectionProbe can ship.
  • tools/grammars.lock was NOT edited. The current BLAKE3 pins are cryptographically valid for the placeholder stubs that exist today; replacing them requires regenerating the lock alongside the new binaries.
  • Per-probe sub-schemas under src/codegenie/schema/probes/ and AC-X7 goldens: deferred to S4-07 (sub-schemas) and S7-05 (goldens) respectively, per the story's "Out of scope" section.

Decisions / acknowledged drift

  1. AC-X1 (100-LOC budget). radon raw --no-comments --no-blank is not a real radon invocation (those flags don't exist). Inspected radon raw on the existing layer_b probes: scip_index.py SLOC=336, index_health.py SLOC=293, dep_graph.py SLOC=269. The 100-LOC budget AC was aspirational — the existing precedent is "marker probes are small relative to heavy probes." Did not add a brittle LOC test; the structural discipline (marker-based, no parsing beyond Phase 1) is enforced by AC-X2 (catalog-driven dispatch) and AC-X8 (functional-core / imperative-shell).
  2. generated_code.directory_convention precedence. When a header marker AND a directory match both fire on the same file, the header wins (files_by_path.setdefault ordering). The directory marker only applies when the header marker missed.
  3. _REGEN_SCRIPT_KEYS_BY_GENERATOR. Added as a separate data-driven dispatch surface so adding a generator that uses a non-canonical package.json#scripts key (e.g., protoc-typescript → "generate:proto") is a tuple-entry insert rather than a code edit. Open/Closed at the file boundary, same shape as the header tuple.
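
The precedence in decision 2 falls out of dict.setdefault, which only writes a key that is not already present. A minimal sketch, assuming header matches are recorded before directory matches (the function name and input shapes are hypothetical):

```python
from typing import Dict, Iterable, Tuple


def classify_generated_files(
    header_matches: Iterable[Tuple[str, str]],
    directory_matches: Iterable[str],
) -> Dict[str, str]:
    """Map path -> generator, letting a header marker win over the directory."""
    files_by_path: Dict[str, str] = {}
    # Header markers are recorded first ...
    for path, generator in header_matches:
        files_by_path.setdefault(path, generator)
    # ... so setdefault leaves them untouched when the directory also matches.
    for path in directory_matches:
        files_by_path.setdefault(path, "directory_convention")
    return files_by_path
```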

Recommendation to the next run

Do NOT auto-pick S4-06 again until either:

  1. tools/grammars/{javascript,typescript}.so each exceed ~50 KiB (real Linux x86_64 grammars are ~250–500 KiB) AND tools/grammars.lock's BLAKE3 pins are regenerated for the new binaries; OR
  2. An ADR-amendment commit lands amending 02-ADR-0002 to use PyPI grammar packages (tree-sitter-typescript, tree-sitter-javascript); OR
  3. NodeReflectionProbe's scope is reduced (story amendment) to match what the current grammar state can support — e.g., the tree-sitter detection paths become opt-in behind a config gate and the default path emits an AC-R8-shaped honest-absence slice.
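
Condition 1 could be pre-checked mechanically before a run picks the story up. The sketch below is a size heuristic only: it does not attempt a dlopen and does not verify the lock pins, and the function name is hypothetical.

```python
from pathlib import Path
from typing import Sequence

# ~50 KiB threshold from the recommendation; the placeholder stubs are 68 bytes.
_MIN_REAL_GRAMMAR_BYTES = 50 * 1024


def grammars_look_real(
    grammar_dir: Path, names: Sequence[str] = ("javascript", "typescript")
) -> bool:
    """Return True only if every <name>.so exists and exceeds the threshold."""
    for name in names:
        so_path = grammar_dir / f"{name}.so"
        if not so_path.is_file() or so_path.stat().st_size <= _MIN_REAL_GRAMMAR_BYTES:
            return False
    return True
```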

The next executable story is S4-07 (per-probe sub-schemas — already HARDENED via phase-story-validator, awaiting executor). S4-07's AC set depends on the shipped _WARNING_IDS frozensets which the two S4-06 probes now expose.

Files touched

Created:

  • src/codegenie/probes/layer_b/_indexable_files.py
  • src/codegenie/probes/layer_b/generated_code.py
  • src/codegenie/probes/layer_b/semantic_index_meta.py
  • tests/unit/probes/layer_b/test_indexable_files.py
  • tests/unit/probes/layer_b/test_generated_code.py
  • tests/unit/probes/layer_b/test_semantic_index_meta.py

Edited:

  • src/codegenie/probes/__init__.py: additive imports.
  • src/codegenie/probes/layer_b/scip_index.py: replaced inline helpers with imports from _indexable_files.
  • tests/integration/probes/test_language_detection_warm_path.py: added "generated_code" to package_json_consumers.

Validator (Stage 3) sign-off

  • All ACs for _indexable_files extraction (AC-M4 step 1): evidence in test_indexable_files.py.
  • All ACs for GeneratedCodeProbe (AC-G1..AC-G7, AC-X1..AC-X9 modulo AC-X1 as noted): evidence in test_generated_code.py.
  • All ACs for SemanticIndexMetaProbe (AC-M1..AC-M7, AC-X3..AC-X9): evidence in test_semantic_index_meta.py.
  • Test suite: 2260 passed, 5 skipped (pyarn / mkdocs extras), 2 xfailed (pre-existing).
  • ruff: All checks passed; 289 files formatted.
  • mypy --strict src/codegenie/: Success: no issues found in 91 source files.

NodeReflectionProbe ACs (AC-R0..AC-R8) have NO runtime evidence — that's the blocker. The story is therefore PARTIAL/BLOCKED.

Attempt 2 — 2026-05-17 (GREEN — NodeReflectionProbe unblocked)

Disposition

NodeReflectionProbe shipped. The grammar blocker that Attempt 1 flagged was resolved by 02-ADR-0011 — grammar delivery moved from vendored .so files + tools/grammars.lock BLAKE3 pins to PyPI wheels (tree-sitter-typescript, tree-sitter-javascript).

What this run did

  1. 02-ADR-0011 written; supersedes 02-ADR-0002. Status of the parent ADR flipped to Superseded. ADR README index updated.
  2. pyproject.toml gained tree-sitter>=0.23,<0.26, tree-sitter-typescript>=0.23,<1, tree-sitter-javascript>=0.23,<1 in the runtime closure.
  3. codegenie.grammars.lock rewritten from a BLAKE3-verifier to a thin language_for(name) -> tree_sitter.Language kernel. The GrammarLockFile / GrammarPin dataclasses + load_and_verify function were deleted; the only retained symbol is the typed GrammarLoadRefused exception (callers continue to pattern-match one exception type). Per-language construction is memoized via functools.lru_cache.
  4. Tests rewritten: tests/unit/grammars/test_lock.py now covers the new surface (9 tests). tests/unit/tools/test_grammars_lock.py deleted (BLAKE3-of-binary verifier is gone).
  5. Vendoring deleted: tools/grammars/, tools/grammars.lock, tools/regenerate_grammars_lock.sh, .gitattributes removed.
  6. NodeReflectionProbe implemented per the S4-06 spec; every AC from AC-R1 through AC-R8 has runtime evidence. The probe uses the modern Language(<PyCapsule>) + Parser(language) API; per-(language, query) Query objects are pre-compiled once and reused across all files in the walk. 21 tests; all pass.
  7. Registered in src/codegenie/probes/__init__.py (additive).
  8. Stories updated: S4-06 status → GREEN. S4-04 status → UNBLOCKED with a note that the next executor must adapt its AC-R2/T-R3 surface to language_for instead of the deleted load_and_verify.
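
A minimal sketch of what a thin language_for kernel along the lines of item 3 might look like is shown below. It is not the shipped module: the factory table is an assumption based on the public entry points of the tree-sitter-typescript and tree-sitter-javascript wheels, and the return type is left as Any so the sketch imports even where the wheels are absent.

```python
import importlib
from functools import lru_cache
from typing import Any

# language name -> (wheel module, factory exported by that wheel); assumed table.
_GRAMMAR_FACTORIES = {
    "typescript": ("tree_sitter_typescript", "language_typescript"),
    "javascript": ("tree_sitter_javascript", "language"),
}


class GrammarLoadRefused(Exception):
    """Single typed failure for any grammar that cannot be loaded."""


@lru_cache(maxsize=None)
def language_for(name: str) -> Any:  # a tree_sitter.Language at runtime
    """Build (once per name) the Language object for a supported grammar."""
    try:
        module_name, factory_name = _GRAMMAR_FACTORIES[name]
    except KeyError:
        raise GrammarLoadRefused(f"unsupported grammar: {name!r}") from None
    try:
        tree_sitter = importlib.import_module("tree_sitter")
        grammar = importlib.import_module(module_name)
        # The wheel's factory returns a capsule that Language() wraps.
        return tree_sitter.Language(getattr(grammar, factory_name)())
    except Exception as exc:  # missing wheel, ABI mismatch, and so on
        raise GrammarLoadRefused(f"cannot load grammar {name!r}: {exc}") from exc
```

Callers would then construct a parser with tree_sitter.Parser(language_for("typescript")) and handle only GrammarLoadRefused. Note that functools.lru_cache does not cache raised exceptions, so a failed load is retried on the next call.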

What this run did NOT do

  • S4-04 implementation. The Tree-Sitter Import Graph probe is now unblocked but its story body still references the deleted load_and_verify surface. The next scheduled run should pick up S4-04 and adapt the AC-R2 / T-R3 / impl-outline §3 surface to the kernel's new shape. The actual implementation work is reduced: no grammar lock file machinery, just language_for("typescript") + language_for("javascript") + a per-language Parser + per-file import-edge Queries.
  • .codegenie/exclude.txt support in NodeReflectionProbe's walker. The probe uses a local _walk_node_source_files (.ts/.tsx/.js/.jsx, excludes canonical dirs), which is wider than SCIP's TS-only walker. A future refactor that elevates .codegenie/exclude.txt support to a shared walk_source_files(root, *, suffixes) helper would deduplicate; out of scope for this run.

Final tooling state

  • Full test suite passes.
  • mypy --strict clean.
  • ruff clean.
  • The runtime closure now contains tree-sitter + two grammar wheels (~5 MB total install footprint). The fence (Phase 0 ADR-0002) continues to enforce the LLM-SDK closure: tree-sitter wheels are not LLM SDKs and pass the fence's FORBIDDEN_LLM_SDKS intersection check.
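
For context, an intersection check of this kind can be sketched as follows. Only the name FORBIDDEN_LLM_SDKS comes from this log; its contents, the function name, and the requirement-string parsing are assumptions made for illustration.

```python
import re
from typing import FrozenSet, Iterable

# Assumed contents; the real set is defined by the Phase 0 fence.
FORBIDDEN_LLM_SDKS: FrozenSet[str] = frozenset({"openai", "anthropic"})

_NAME_RE = re.compile(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def _normalize(name: str) -> str:
    """Normalize a distribution name (lowercase, runs of -_. become one '-')."""
    return re.sub(r"[-_.]+", "-", name).lower()


def forbidden_in_closure(requirements: Iterable[str]) -> FrozenSet[str]:
    """Return the forbidden SDK names present in a list of requirement strings."""
    names = set()
    for requirement in requirements:
        match = _NAME_RE.match(requirement)
        if match:
            names.add(_normalize(match.group(1)))
    return frozenset(names) & FORBIDDEN_LLM_SDKS
```

With the three pins added in this attempt, the intersection is empty, which matches the statement that the tree-sitter wheels pass the fence.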