S4-06 attempt log
Attempt 1: 2026-05-16 (BLOCKED-PARTIAL)
Disposition
Shipped (GREEN): _indexable_files.py extraction (AC-M4 step 1),
GeneratedCodeProbe + 22 tests, SemanticIndexMetaProbe + 12 tests.
Both probes registered in codegenie.probes.__init__. Full test
suite (2260 tests) + mypy --strict (91 files) + ruff are all green.
Not shipped (BLOCKED): NodeReflectionProbe. The probe requires
loadable tree-sitter grammar binaries; the vendored
tools/grammars/{javascript,typescript}.so are still 68-byte
placeholder stubs (same blocker that put S4-04 in BLOCKED state
on 2026-05-16). Verified empirically:
$ .venv/bin/python -c "
import tree_sitter
tree_sitter.Language('tools/grammars/typescript.so', 'typescript')
"
OSError: dlopen(...): slice is not valid mach-o file
tree_sitter itself is importable from the venv, and
codegenie.grammars.lock.load_and_verify succeeds (the BLAKE3 pins
match the stubs — the stubs ARE the source of truth for the BLAKE3
column). The break point is the C-ext dlopen of the .so file. Until
real grammar binaries land per tools/grammars/README.md (or
02-ADR-0002 is amended to use PyPI grammar packages), the probe
cannot be implemented end-to-end. Half-implementing it (catching
OSError and emitting an AC-R8 slice unconditionally) was rejected
because every test of T-R4..T-R9 would have to be skipped, leaving
the AC set without runtime evidence — Rule 12 (fail loud) prefers an
explicit BLOCKED disposition.
What this run did
1. _indexable_files.py extraction (AC-M4 step 1)
Moved _INDEXABLE_SUFFIXES, _EXCLUDE_DIRS, _read_exclude_file,
_walk_indexable_files, _count_indexable_files,
_compute_indexable_merkle from
src/codegenie/probes/layer_b/scip_index.py into
src/codegenie/probes/layer_b/_indexable_files.py.
scip_index.py re-imports them. Test
tests/unit/probes/layer_b/test_indexable_files.py (15 tests) is the
regression guard for the helpers. All 20 pre-existing
test_scip_index.py tests stay green.
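A minimal sketch of what such a shared kernel could look like, so that a count and a content digest are derived from the same walk. The un-prefixed names, the suffix and exclude-dir values, and the use of SHA-256 are all assumptions for illustration; the real signatures in _indexable_files.py are not shown in this log.

```python
import hashlib
import tempfile
from pathlib import Path

# Assumed values; the real constants live in _indexable_files.py.
INDEXABLE_SUFFIXES = frozenset({".ts", ".tsx"})
EXCLUDE_DIRS = frozenset({"node_modules", "dist", ".git"})


def walk_indexable_files(root: Path) -> list[Path]:
    """Return indexable files under root, relative to root, in sorted order."""
    found = []
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        if any(part in EXCLUDE_DIRS for part in rel.parts[:-1]):
            continue
        if path.is_file() and path.suffix in INDEXABLE_SUFFIXES:
            found.append(rel)
    return sorted(found)


def count_indexable_files(root: Path) -> int:
    """Count via the same walk the digest uses, so the two cannot diverge."""
    return len(walk_indexable_files(root))


def compute_indexable_digest(root: Path) -> str:
    """Hash (relative path, content hash) pairs in sorted order.

    SHA-256 is a stand-in here; the log does not say which hash the
    real helper uses.
    """
    outer = hashlib.sha256()
    for rel in walk_indexable_files(root):
        leaf = hashlib.sha256((root / rel).read_bytes()).hexdigest()
        outer.update(f"{rel.as_posix()}\0{leaf}\n".encode())
    return outer.hexdigest()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo_root = Path(tmp)
        (demo_root / "src").mkdir()
        (demo_root / "src" / "a.ts").write_text("export const a = 1;\n")
        print(count_indexable_files(demo_root), compute_indexable_digest(demo_root))
```

Because both public helpers route through one walker, any probe that imports them sees the same file set by construction.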
2. GeneratedCodeProbe
src/codegenie/probes/layer_b/generated_code.py. Marker-based
codegen detector:
- _GENERATOR_HEADER_MARKERS tuple holds the four canonical headers (graphql-codegen, openapi-typescript, protoc-typescript, prisma); first-marker-wins by tuple order is the precedence policy.
- _GENERATED_DIRS flags files under conventional locations (src/generated, __generated__, gen) with generator: "directory_convention".
- _REGEN_SCRIPT_KEYS_BY_GENERATOR maps generators to candidate package.json#scripts keys; first match becomes regenerate_command: "pnpm run <key>".
- _select_build_outputs surfaces package.json#files verbatim.
- Pure helpers (_detect_header_marker, _detect_directory_marker, _match_regenerate_command, _select_build_outputs, _walk_source_files) carry the functional core; run() is the only impure code.
- AC-X8 (functional core / imperative shell): test test_pure_helpers_have_no_io AST-walks the module and asserts pure helpers never call open / read_* / write_* / subprocess / asyncio.
- AC-X9 (determinism): byte-identical reruns asserted on a multi-marker fixture.
- AC-X2 (Open/Closed): test_marker_catalog_is_open_closed AST-walks the module and asserts no Compare node outside the _GENERATOR_HEADER_MARKERS / _REGEN_SCRIPT_KEYS_BY_GENERATOR declarations compares to a marker-name literal; branching on marker identity is mechanically forbidden.
- AC-X4: _WARNING_IDS frozenset + import-time _ID_PATTERN check (raise AssertionError, not bare assert; Phase 0 forbidden-patterns).
- AC-G5: enumeration guard via parametrize over the marker tuple and the test_every_generator_marker_has_a_test AST sentinel.
- applies() override delegates to _admits_node_project so the probe filter-exits on non-Node repos (the Phase 1 precedent set by NodeBuildSystemProbe / NodeManifestProbe / TestInventoryProbe).
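The catalog-driven dispatch described in the bullets above can be sketched roughly as follows. The header substrings and most script keys are hypothetical (the log names the generators and the "generate:proto" key, not the marker text), and a dict is assumed for the regen mapping.

```python
from typing import Optional

# (generator, header substring) pairs; tuple order is the precedence policy.
# The substrings are hypothetical placeholders.
GENERATOR_HEADER_MARKERS: tuple[tuple[str, str], ...] = (
    ("graphql-codegen", "generated by graphql-codegen"),
    ("openapi-typescript", "auto-generated by openapi-typescript"),
    ("protoc-typescript", "generated by the protocol buffer compiler"),
    ("prisma", "generated by prisma"),
)
GENERATED_DIRS: tuple[str, ...] = ("src/generated", "__generated__", "gen")
# Candidate package.json#scripts keys per generator (mostly assumed values).
REGEN_SCRIPT_KEYS_BY_GENERATOR: dict[str, tuple[str, ...]] = {
    "graphql-codegen": ("codegen", "generate"),
    "openapi-typescript": ("generate:api", "generate"),
    "protoc-typescript": ("generate:proto",),
    "prisma": ("prisma:generate", "generate"),
}


def detect_header_marker(head: str) -> Optional[str]:
    """Return the first generator whose marker appears in the file head."""
    lowered = head.lower()
    for generator, needle in GENERATOR_HEADER_MARKERS:
        if needle in lowered:
            return generator
    return None


def detect_directory_marker(rel_path: str) -> Optional[str]:
    """Flag files whose directory segments contain a conventional location."""
    parts = rel_path.split("/")
    for directory in GENERATED_DIRS:
        dir_parts = directory.split("/")
        width = len(dir_parts)
        # Slide over directory segments only; the last part is the file name.
        if any(parts[i:i + width] == dir_parts for i in range(len(parts) - width)):
            return "directory_convention"
    return None


def match_regenerate_command(generator: str, scripts: dict[str, str]) -> Optional[str]:
    """Return "pnpm run <key>" for the first candidate key present in scripts."""
    for key in REGEN_SCRIPT_KEYS_BY_GENERATOR.get(generator, ()):
        if key in scripts:
            return f"pnpm run {key}"
    return None
```

Note that no function compares against a generator name literal; supporting another generator means adding catalog entries only.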
Tests: tests/unit/probes/layer_b/test_generated_code.py (22
tests). All pass.
3. SemanticIndexMetaProbe
src/codegenie/probes/layer_b/semantic_index_meta.py. Reads
tsconfig.json via Phase 1's parsers.jsonc.load chokepoint:
- Sibling-slice reads are unavailable (Phase 0 ADR-0007 freezes ProbeContext; NodeBuildSystemProbe writes no sidecar). The probe reads tsconfig.json literally; does NOT walk extends chains. has_extends: bool + warning makes the limitation honest.
- files_count_estimate consumes _count_indexable_files from the extracted shared kernel; divergence with ScipIndexProbe's count is mechanically impossible.
- AC-M3 success / AC-M6 missing-tsconfig (medium) / AC-M5 parse-failure (low + error) paths each have a dedicated test.
- applies() mirrors GeneratedCodeProbe.
Tests: tests/unit/probes/layer_b/test_semantic_index_meta.py (12
tests). All pass.
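The three outcome paths (success, missing tsconfig, parse failure) might look something like the sketch below. Several things are assumed: the TsconfigSummary shape, the warning IDs, "high" as the success confidence, and plain json.loads standing in for the parsers.jsonc.load chokepoint (so comments in tsconfig.json would fail here but not in the real probe).

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class TsconfigSummary:
    confidence: str  # "high", "medium", or "low"
    has_extends: bool = False
    warnings: tuple[str, ...] = ()
    error: Optional[str] = None


def summarize_tsconfig(repo_root: Path) -> TsconfigSummary:
    """Read tsconfig.json literally; never follow the extends chain."""
    path = repo_root / "tsconfig.json"
    if not path.is_file():
        # Missing tsconfig: medium confidence, no error.
        return TsconfigSummary(confidence="medium", warnings=("tsconfig_missing",))
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
        if not isinstance(data, dict):
            raise ValueError("tsconfig.json is not a JSON object")
    except (ValueError, UnicodeDecodeError) as exc:
        # Parse failure: low confidence plus a recorded error.
        return TsconfigSummary(
            confidence="low",
            warnings=("tsconfig_parse_failed",),
            error=str(exc),
        )
    has_extends = "extends" in data
    warnings = ("tsconfig_extends_not_followed",) if has_extends else ()
    return TsconfigSummary(confidence="high", has_extends=has_extends, warnings=warnings)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(summarize_tsconfig(Path(tmp)))
```

Surfacing has_extends together with a warning keeps the slice honest about what was not resolved, rather than silently reporting a partial view.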
4. Integration-test updates
- tests/integration/probes/test_language_detection_warm_path.py: added "generated_code" to the package_json_consumers tuple (the test explicitly notes the convention: "Adding a new such probe must extend the tuple"). The 6-consumer invariant (1 miss + 5 hits) now reflects the shipped probe set.
- tests/integration/probes/test_non_node_repo.py continues to pass: both new probes opt out of the Go fixture via _admits_node_project (no package.json, no Node language detected).
What this run did NOT do
- NodeReflectionProbe: not implemented. The probe file src/codegenie/probes/layer_b/node_reflection.py was NOT created. A half-implementation that imports a never-loadable grammar would either crash at import-time or silently emit the AC-R8 failure slice on every run, which is the wrong shape for T-R4..T-R9.
- tools/grammars/{javascript,typescript}.so were NOT touched. Real binaries (~250–500 KiB) must be vendored per tools/grammars/README.md before NodeReflectionProbe can ship.
- tools/grammars.lock was NOT edited. The current BLAKE3 pins are cryptographically valid for the placeholder stubs that exist today; replacing them requires regenerating the lock alongside the new binaries.
- Per-probe sub-schemas under src/codegenie/schema/probes/ and AC-X7 goldens. Deferred to S4-07 (sub-schemas) and S7-05 (goldens) respectively, per the story's "Out of scope" section.
Decisions / acknowledged drift
- AC-X1 (100-LOC budget). radon raw --no-comments --no-blank is not a real radon invocation (those flags don't exist). Inspected radon raw on the existing layer_b probes: scip_index.py SLOC=336, index_health.py SLOC=293, dep_graph.py SLOC=269. The 100-LOC budget AC was aspirational; the existing precedent is "marker probes are small relative to heavy probes." Did not add a brittle LOC test; the structural discipline (marker-based, no parsing beyond Phase 1) is enforced by AC-X2 (catalog-driven dispatch) and AC-X8 (functional-core / imperative-shell).
- generated_code.directory_convention precedence. When a header marker AND a directory match both fire on the same file, the header wins (files_by_path.setdefault ordering). The directory marker only applies when the header marker missed.
- _REGEN_SCRIPT_KEYS_BY_GENERATOR. Added as a separate data-driven dispatch surface so adding a generator that uses a non-canonical package.json#scripts key (e.g., protoc-typescript → "generate:proto") is a tuple-entry insert rather than a code edit. Open/Closed at the file boundary, same shape as the header tuple.
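The header-over-directory precedence falls out of dict.setdefault when header hits are recorded first. A small illustration follows; merge_detections and its argument shapes are hypothetical, only the setdefault ordering idea comes from the note above.

```python
def merge_detections(
    header_hits: list[tuple[str, str]],
    directory_hits: list[str],
) -> dict[str, str]:
    """Combine detections so a header hit always wins over a directory hit."""
    files_by_path: dict[str, str] = {}
    # Header markers are recorded first ...
    for path, generator in header_hits:
        files_by_path.setdefault(path, generator)
    # ... so setdefault leaves them untouched when the directory also matches.
    for path in directory_hits:
        files_by_path.setdefault(path, "directory_convention")
    # Sorted keys keep the output deterministic across runs.
    return dict(sorted(files_by_path.items()))
```

With this shape, the precedence rule is a property of insertion order rather than of any branch on generator identity.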
Recommendation to the next run
Do NOT auto-pick S4-06 again until either:
- tools/grammars/{javascript,typescript}.so exceeds ~50 KiB (real Linux x86_64 grammars are ~250–500 KiB) AND tools/grammars.lock's BLAKE3 pins reflect the new sizes; OR
- An ADR-amendment commit lands amending 02-ADR-0002 to use PyPI grammar packages (tree-sitter-typescript, tree-sitter-javascript); OR
- NodeReflectionProbe's scope is reduced (story amendment) to match what the current grammar state can support, e.g., the tree-sitter detection paths become opt-in behind a config gate and the default path emits an AC-R8-shaped honest-absence slice.
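The first condition could be checked mechanically before a run is scheduled. The sketch below is an assumption about how one might do that; placeholder_grammars and the exact threshold constant are not part of the codebase described here.

```python
import tempfile
from pathlib import Path

# Threshold taken from the "~50 KiB" guidance above; treat it as approximate.
MIN_REAL_GRAMMAR_BYTES = 50 * 1024


def placeholder_grammars(
    grammar_dir: Path,
    names: tuple[str, ...] = ("javascript", "typescript"),
) -> list[str]:
    """Return grammar names whose .so is missing or too small to be real."""
    stubs = []
    for name in names:
        path = grammar_dir / f"{name}.so"
        if not path.is_file() or path.stat().st_size <= MIN_REAL_GRAMMAR_BYTES:
            stubs.append(name)
    return stubs


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # An empty directory reports both grammars as unavailable.
        print(placeholder_grammars(Path(tmp)))
```

A size check is only a cheap pre-filter; it does not prove the binary loads, which still requires the dlopen attempt shown in Attempt 1's disposition.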
The next executable story is S4-07 (per-probe sub-schemas — already
HARDENED via phase-story-validator, awaiting executor). S4-07's AC
set depends on the shipped _WARNING_IDS frozensets which the two
S4-06 probes now expose.
Files touched
Created:
- src/codegenie/probes/layer_b/_indexable_files.py
- src/codegenie/probes/layer_b/generated_code.py
- src/codegenie/probes/layer_b/semantic_index_meta.py
- tests/unit/probes/layer_b/test_indexable_files.py
- tests/unit/probes/layer_b/test_generated_code.py
- tests/unit/probes/layer_b/test_semantic_index_meta.py
Edited:
- src/codegenie/probes/__init__.py — additive imports.
- src/codegenie/probes/layer_b/scip_index.py — replaced inline
helpers with imports from _indexable_files.
- tests/integration/probes/test_language_detection_warm_path.py
— added "generated_code" to package_json_consumers.
Validator (Stage 3) sign-off
- All ACs for _indexable_files extraction (AC-M4 step 1): evidence in test_indexable_files.py.
- All ACs for GeneratedCodeProbe (AC-G1..AC-G7, AC-X1..AC-X9 modulo AC-X1 as noted): evidence in test_generated_code.py.
- All ACs for SemanticIndexMetaProbe (AC-M1..AC-M7, AC-X3..AC-X9): evidence in test_semantic_index_meta.py.
- Test suite: 2260 passed, 5 skipped (pyarn / mkdocs extras), 2 xfailed (pre-existing).
- ruff: All checks passed; 289 files formatted.
- mypy --strict src/codegenie/: Success: no issues found in 91 source files.
NodeReflectionProbe ACs (AC-R0..AC-R8) have NO runtime evidence — that's the blocker. The story is therefore PARTIAL/BLOCKED.
Attempt 2: 2026-05-17 (GREEN, NodeReflectionProbe unblocked)
Disposition
NodeReflectionProbe shipped. The grammar blocker that Attempt 1
flagged was resolved by 02-ADR-0011
— grammar delivery moved from vendored .so files +
tools/grammars.lock BLAKE3 pins to PyPI wheels
(tree-sitter-typescript, tree-sitter-javascript).
What this run did
- 02-ADR-0011 written; supersedes 02-ADR-0002. Status of the parent ADR flipped to Superseded. ADR README index updated.
- pyproject.toml gained tree-sitter>=0.23,<0.26, tree-sitter-typescript>=0.23,<1, tree-sitter-javascript>=0.23,<1 in the runtime closure.
- codegenie.grammars.lock rewritten from a BLAKE3-verifier to a thin language_for(name) -> tree_sitter.Language kernel. The GrammarLockFile / GrammarPin dataclasses + load_and_verify function were deleted; the only retained symbol is the typed GrammarLoadRefused exception (callers continue to pattern-match one exception type). Per-language construction is memoized via functools.lru_cache.
- Tests rewritten: tests/unit/grammars/test_lock.py now covers the new surface (9 tests). tests/unit/tools/test_grammars_lock.py deleted (BLAKE3-of-binary verifier is gone).
- Vendoring deleted: tools/grammars/, tools/grammars.lock, tools/regenerate_grammars_lock.sh, .gitattributes removed.
- NodeReflectionProbe implemented per the S4-06 spec; every AC from AC-R1 through AC-R8 has runtime evidence. The probe uses the modern Language(<PyCapsule>) + Parser(language) API; per-(language, query) Query objects are pre-compiled once and reused across all files in the walk. 21 tests; all pass.
- Registered in src/codegenie/probes/__init__.py (additive).
- Stories updated: S4-06 status → GREEN. S4-04 status → UNBLOCKED with a note that the next executor must adapt its AC-R2/T-R3 surface to language_for instead of the deleted load_and_verify.
What this run did NOT do
- S4-04 implementation. The Tree-Sitter Import Graph probe is now unblocked but its story body still references the deleted load_and_verify surface. The next scheduled run should pick up S4-04 and adapt the AC-R2 / T-R3 / impl-outline §3 surface to the kernel's new shape. The actual implementation work is reduced: no grammar lock file machinery, just language_for("typescript") / language_for("javascript") + a per-language Parser + per-file import-edge Queries.
- .codegenie/exclude.txt support in NodeReflectionProbe's walker. The probe uses a local _walk_node_source_files (.ts / .tsx / .js / .jsx, excludes canonical dirs), wider than SCIP's TS-only walker. A future refactor that elevates .codegenie/exclude.txt support to a shared walk_source_files(root, *, suffixes) helper would deduplicate; out of scope for this run.
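One possible shape for the shared walk_source_files(root, *, suffixes) helper mentioned in the last bullet is sketched below. The exclude-file format (one relative directory prefix per line, with # comments) and the canonical directory set are assumptions; the log does not specify either.

```python
import tempfile
from pathlib import Path

# Assumed canonical exclusions; the real set is not shown in this log.
CANONICAL_EXCLUDE_DIRS = frozenset({"node_modules", ".git", "dist"})


def read_exclude_prefixes(root: Path) -> frozenset[str]:
    """Read .codegenie/exclude.txt as relative directory prefixes (assumed format)."""
    exclude_file = root / ".codegenie" / "exclude.txt"
    if not exclude_file.is_file():
        return frozenset()
    entries = set()
    for line in exclude_file.read_text(encoding="utf-8").splitlines():
        entry = line.strip().rstrip("/")
        if entry and not entry.startswith("#"):
            entries.add(entry)
    return frozenset(entries)


def walk_source_files(root: Path, *, suffixes: frozenset[str]) -> list[Path]:
    """Return matching files relative to root, sorted, honoring both exclusions."""
    prefixes = read_exclude_prefixes(root)
    selected = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        rel = path.relative_to(root)
        if any(part in CANONICAL_EXCLUDE_DIRS for part in rel.parts[:-1]):
            continue
        rel_posix = rel.as_posix()
        if any(rel_posix.startswith(prefix + "/") for prefix in prefixes):
            continue
        selected.append(rel)
    return selected


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(walk_source_files(Path(tmp), suffixes=frozenset({".ts"})))
```

A TS-only caller and a wider Node caller would then differ only in the suffixes argument they pass.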
Final tooling state
- Full test suite passes.
- mypy --strict clean.
- ruff clean.
- The runtime closure now contains tree-sitter + two grammar wheels (~5 MB total install footprint). The fence (Phase 0 ADR-0002) continues to enforce the LLM-SDK closure; tree-sitter wheels are not LLM SDKs and pass the fence's FORBIDDEN_LLM_SDKS intersection check.
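The intersection check can be pictured as a set operation over normalized distribution names. This is an illustrative sketch: the members of FORBIDDEN_LLM_SDKS shown here are hypothetical, and fence_violations is not the fence's real entry point.

```python
import re
from typing import Iterable

# Hypothetical members; the real FORBIDDEN_LLM_SDKS set is defined by the fence.
FORBIDDEN_LLM_SDKS = frozenset({"openai", "anthropic"})


def normalize(name: str) -> str:
    """Normalize a distribution name the way PEP 503 prescribes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def fence_violations(runtime_closure: Iterable[str]) -> frozenset[str]:
    """Return forbidden distributions present in the runtime closure."""
    return frozenset(normalize(name) for name in runtime_closure) & FORBIDDEN_LLM_SDKS
```

An empty result for the tree-sitter wheels is what "passes the fence" means in this picture; any non-empty result would name the offending distributions.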