Story S7-02: Fixtures batch 2 (monorepo-pnpm + load-bearing stale-scip full materialization)
Step: Step 7 — Plant five-repo fixture portfolio + per-probe golden files + remaining adversarial corpus
Status: Done — GREEN 2026-05-18 (phase-story-executor; see _attempts/S7-02.md for the per-AC evidence table + gate log)
Effort: M
Depends on: S7-01 HARDENED (fixtures batch 1 — patterns + shape-test conventions + the shared tests/unit/_fixture_regen_allowlist.py module transfer wholesale; the "hand-author the lockfile; do NOT run pnpm install at regen" pattern is the explicit S7-01 precedent for monorepo-pnpm), S4-02 (stale-scip STUB at tests/fixtures/portfolio/stale-scip/ + test_stale_scip_fixture.py CI-gating adversarial — this story is the FULL materialization of the _seed/scip-index.scip.placeholder binary, NOT a wholesale stub replacement).
ADRs honored: ADR-0001 (allowlisted binaries — regenerate.sh for both fixtures invokes only binaries in ALLOWED_BINARIES ∪ _SHELL_COREUTILS_ALLOWLIST; scip-typescript is in the allowlist but is invoked ONLY out-of-band by the contributor producing the seed binary, NOT inside regenerate.sh; pnpm/npm/node-gyp are NOT allowlisted), ADR-0006 (IndexFreshness location — CommitsBehind is the structural assertion the fixture's adversarial test reads), ADR-0007 (no plugin loader — neither fixture seeds plugins/), ADR-0009 (pytest-xdist veto — closed-set fixture trees, regen-script-only mutation surface).
Validation notes (2026-05-18)
Hardened by phase-story-validator (scheduled task: story-validation-corrector). Verdict: HARDENED.
Summary of changes (full audit log in _validation/S7-02-fixtures-batch-two.md):
- Block-tier: `pnpm` is NOT in `ALLOWED_BINARIES` (verified at `src/codegenie/exec/__init__.py:96-111`; the closed Phase-2 set is `{git, node, semgrep, syft, grype, gitleaks, scip-typescript, ast-grep, ripgrep, tree-sitter, docker, strace}`). Original AC-10 + Implementation Outline §1 said `monorepo-pnpm`'s `regenerate.sh` runs `pnpm install --frozen-lockfile`; that would either fail S7-01's AC-31 static check or force a silent ADR-0001 expansion. Fix: AC-10 rewritten to mirror S7-01 `native-modules`' HARDENED precedent: hand-authored `pnpm-lock.yaml` bytes are committed to the fixture; `regenerate.sh` does NOT invoke `pnpm install` (regen is `mkdir`/coreutils-only). Implementation Outline §1 rewritten accordingly.
- Block-tier: bash can't call `run_allowlisted`. Original AC-21 step (b) said `regenerate.sh` "runs `scip-typescript` (via `run_allowlisted`)". `run_allowlisted` is a Python function in `src/codegenie/exec/__init__.py`; bash cannot call it. This is the same architectural mismatch S7-01 AC-22 had. Fix: AC-21 split into AC-21a ("seed-build ritual": one-time, contributor's local box, invokes `scip-typescript` directly to produce `_seed/scip-index.scip`; NOT inside `regenerate.sh`) and AC-21b ("regen-runtime": deterministic, just commits + template-materialize + seed-copy; no `scip-typescript` invocation at regen time).
- Block-tier: the story prescribed a `last-indexed-commit.txt` mechanism that contradicts the actual S4-02 stub. Original AC-19, AC-20, and the content predicates (`_last_indexed_not_equal_to_current_head`, `_regen_refuses_current_head`) all referenced a `last-indexed-commit.txt` file. The actual S4-02 stub at `tests/fixtures/portfolio/stale-scip/` uses a `_seed/scip-slice.template.json` template with a `PARENT_COMMIT` placeholder substituted at regen time into `.codegenie/context/raw/scip.json`. AC-18's "S4-02's stub already chose one path, this story honors it" made the contradiction internal. Fix: AC-18 rewritten to explicitly name the actual mechanism; AC-19 rewritten to describe the seed template; AC-20 rewritten to point at the existing `regenerate.sh` guard (`LAST_INDEXED="${LAST_INDEXED:-$(git rev-parse HEAD~1)}"` + the `[[ "$LAST_INDEXED" == "$(git rev-parse HEAD)" ]]; exit 1` block); content predicates rewritten accordingly.
- Block-tier: AC-17's committed `.codegenie/context/raw/scip-index.scip` conflicts with the existing stub. The actual S4-02 stub gitignores `.codegenie/` (defense-in-depth) AND S7-01's central no-committed-cache guard asserts no `.codegenie/cache/` under any portfolio fixture. The story's `.gitignore` carve-out would either weaken the central guard or contradict the stub's pattern. Fix: AC-17 rewritten so that the real binary SCIP lives in `_seed/scip-index.scip` (committed; replaces `_seed/scip-index.scip.placeholder`); `.codegenie/` stays gitignored; `regenerate.sh` copies the seed blob to `.codegenie/context/raw/scip-index.scip` at runtime. AC-29 + Implementation Outline §4 amended: the existing S7-01 central no-cache-committed guard passes unchanged; no edit needed.
- Block-tier: AC-15's "wholesale replacement of the S4-02 stub" destroys the working seed-template mechanism. The adversarial test reads `.codegenie/context/raw/scip.json` (materialized from `_seed/scip-slice.template.json`), NOT the binary `.scip` blob. Wholesale replacement would break this. Fix: AC-15 rewritten so that replacement is restricted to the `_seed/scip-index.scip.placeholder` empty blob (substituted with a real `scip-typescript`-built blob) plus an expanded `src/` tree; the seed-template + `regenerate.sh` mechanism is preserved.
- Harden-tier: adversarial-still-passes framing. Original AC-32/AC-33 implied "full materialization makes the adversarial assertion non-trivially true", but the adversarial has been passing since S4-02 against the seed-templated `scip.json` (the template carries `PARENT_COMMIT`, NOT current HEAD, by construction). S7-02's actual contribution is the real binary SCIP for S4-03's future `ScipIndexProbe`, not a change to what S4-02's adversarial asserts. Fix: AC-32/AC-33 reworded; a footnote acknowledges the widened outer-key set (`{"scip", "runtime_trace"}` after S5-05; future S6-08 registrations may widen it further) so a future contributor doesn't think a materialization PR broke an unrelated invariant.
- Harden-tier: `_ProbeName` Literal subset semantics inherited from S7-01. Original AC-26 said "runtime-equals the documented one", i.e. runtime equality. S7-01's HARDENED AC-37 uses subset semantics (`set(registered_phase_2_names) ⊆ set(get_args(_ProbeName))`) so Phase-3+ probes added later don't retroactively break Phase-2 fixtures. Fix: AC-26 rewritten with subset semantics, matching S7-01.
- Harden-tier: kernel location across the fixture-namespace boundary. Original AC-23 placed the kernel at `tests/fixtures/portfolio/_shape_test_kernel.py`. AC-25 says Phase 1's `test_fixture_node_typescript_helm_shape.py` migrates to consume the kernel; Phase 1's fixture lives at `tests/fixtures/node_typescript_helm/` (NOT under `portfolio/`). Importing a `portfolio/`-namespaced kernel from outside the portfolio subdirectory is awkward. Fix: kernel relocated to `tests/fixtures/_shape_test_kernel.py` (above the `portfolio/` subdirectory) so all six consumers import cleanly.
- Harden-tier: kernel `__all__` runtime check. AC-23 required `mypy --strict`, but a silent removal of `enumerate_tracked` or one of the `make_*` factories would still pass mypy if consumers were updated concurrently. Fix: AC-23 amended so that the kernel exposes a documented `__all__` set; a test at `tests/unit/test_shape_test_kernel.py` asserts the export set matches the documented contract.
- Harden-tier: `_fixture_regen_allowlist.py` consumer tests for the two new fixtures. S7-01 lifted the shared module to `tests/unit/_fixture_regen_allowlist.py`; the two new fixtures need their own consumer tests (`tests/unit/test_fixture_monorepo_pnpm_regenerate_allowlist.py` + `tests/unit/test_fixture_stale_scip_regenerate_allowlist.py`). Fix: AC-31 amended to reference the shared module; "Files to touch" extended.
- Harden-tier: TDD-plan `_FILE_SPECS` for stale-scip matched the wrong mechanism. Original `_FILE_SPECS` listed `last-indexed-commit.txt` and `.codegenie/context/raw/scip-index.scip` as committed entries. Fix: `_FILE_SPECS` rewritten to list the actual committed files (`_seed/scip-slice.template.json`, `_seed/scip-index.scip` (committed real binary), `regenerate.sh`, `README.md`, the source tree); content predicates rewritten for the corrected mechanism (`_last_indexed_not_equal_to_current_head` reads `regenerate.sh` for the `LAST_INDEXED="${LAST_INDEXED:-$(git rev-parse HEAD~1)}"` line; `_regen_refuses_current_head` greps the guard; `_scip_blob_metadata_records_prior_commit` checks that `_seed/scip-index.scip` is non-empty).
- Harden-tier: seed-template counters drift. The story expands the `stale-scip` source tree to ≤ 50 `.ts` files; the existing `_seed/scip-slice.template.json` has `"files_indexed": 1, "files_in_repo": 1`. If the source tree grows without updating these counters, B2 may surface `CoverageGap` instead of `CommitsBehind`. Fix: new predicate `_seed_template_counters_match_source_tree` asserts the seed template's counts equal the count of `*.ts` files under `src/`.
- Design-pattern: kernel factory pattern. `make_*_test` returning pytest functions for module-level assignment is awkward for pytest's natural module-level `@pytest.mark.parametrize` discovery. Flatter alternative: the kernel exposes pure helper functions (`assert_file_exists(fixture, spec)`, `assert_file_parses(fixture, spec)`, etc.); consumers write minimal `@pytest.mark.parametrize` test bodies. Documented as Notes-for-implementer (not promoted to AC: pattern advice is contextual and the choice belongs to the consumer). The functional-core / imperative-shell shape of the kernel makes this the more natural fit.
- Design-pattern: `enumerate_tracked` as the kernel's port for `git ls-files`. The kernel's port-and-adapter discipline: `git ls-files <fixture-path>` is invoked from exactly one place in the kernel; consumers receive a `tuple[str, ...]` of relpaths. Documented as Notes-for-implementer.
Full audit log: _validation/S7-02-fixtures-batch-two.md.
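The subset-versus-equality distinction in the `_ProbeName` note can be illustrated with a minimal sketch. The Literal members and the registered-name sets below are hypothetical stand-ins chosen for illustration; the real Literal lives in the kernel and the real names come from the probe registry.

```python
from typing import Literal, get_args

# Hypothetical stand-in for the kernel's Literal; the real member list is an assumption here.
_ProbeName = Literal["dep_graph", "index_health", "tree_sitter_import_graph"]


def missing_from_literal(registered: set[str]) -> set[str]:
    """Return registered probe names that the Literal does not list.

    Subset semantics: the check passes when this set is empty, even if the
    Literal lists names that are not registered.
    """
    return registered - set(get_args(_ProbeName))


# Passes: every registered name is in the Literal (the Literal may list more).
assert missing_from_literal({"dep_graph", "index_health"}) == set()

# Fails the subset check: a renamed probe is not reflected in the Literal.
assert missing_from_literal({"dep_graph", "renamed_probe"}) == {"renamed_probe"}
```

An equality check would additionally fail whenever the Literal lists a name that is not registered, which is the coupling the validator rejected.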
Context
This story lands the remaining two of the five fixture repos:
- `monorepo-pnpm/`: exercises `DepGraphProbe` cross-package edges via a real pnpm workspace. Three packages (`packages/lib-a/`, `packages/lib-b/`, `packages/app/`) with `app` depending on both libs and `lib-b` depending on `lib-a`. The `dep_graph` slice for this fixture contains real inter-package edges; `tree_sitter_import_graph` records the `import` adjacency between the workspace packages.
- `stale-scip/`: the load-bearing roadmap exit-criterion fixture. It holds a pre-populated SCIP index from a prior commit; HEAD has moved since; `IndexHealthProbe` (S4-01) must catch the staleness in CI (`test_stale_scip_fixture.py` from S4-02). S4-02 landed a STUB directory + minimal SCIP blob + `README.md` policy so the adversarial test could run during Step 4; this story produces the full materialization: populated `.ts` files, a real SCIP index built from a prior commit, and two commits documented in the fixture so the staleness path is real.
The synthesis ledger pins three Step-7 implementation risks to this story:
- Risk #3 (`stale-scip` regeneration silently breaks the load-bearing exit). A future contributor regenerates the SCIP fixture against current HEAD; the test still passes (because `CommitsBehind.n >= 0` is trivially satisfied) but no longer exercises staleness. Defense: `regenerate.sh` for `stale-scip` MUST error out if invoked against current HEAD; `README.md` documents the structural assertion (`CommitsBehind.n >= 1` and `last_indexed != current_HEAD`); the S4-02 adversarial asserts both inequalities, but the fixture's `regenerate.sh` is the front-line guard.
- Risk #5 (golden-file non-determinism). Inherited from S7-01; this story compounds it because `monorepo-pnpm`'s `pnpm install` against the public registry may produce slightly different lockfile bytes across runs. The discipline: pin the lockfile bytes at fixture creation time and never re-run `pnpm install` in `regenerate.sh` (the lockfile is committed; the regen script asserts it has not drifted).
- Risk #8 (Phase 3 protocol drift). `monorepo-pnpm` is one of the two fixtures Phase 3's first plugin author will use as a target (per the "Next-phase integration" table in `phase-arch-design.md`). The dep-graph evidence this fixture produces is what Phase 3's `DepGraphAdapter` will consume; the fixture's shape is part of the Protocol contract. Document this in the fixture's `README.md` so Phase 3's author sees the explicit handoff.
This story is also the natural landing point for the shared `_shape_test_kernel.py` that the Rule-of-Three guard in S7-01 deferred. With six fixtures (Phase 1's `node_typescript_helm/` + S7-01's three + this story's two), the kernel earns its keep.
References: where to look
- Architecture:
  - `../phase-arch-design.md §"Testing strategy" → "Fixture portfolio"`: `monorepo-pnpm` + `stale-scip` rows.
  - `../phase-arch-design.md §"Component design" #1` (`IndexHealthProbe`, the `stale-scip` adversarial consumer).
  - `../phase-arch-design.md §"Component design" #11` (`DepGraphProbe`, `monorepo-pnpm`'s primary exerciser).
  - `../phase-arch-design.md §"Edge cases"` row 11 (stale-scip fixture in CI; deliberate seed; the table row this story implements).
  - `../phase-arch-design.md §"Implementation risks"` #3, #5, #8.
- Phase ADRs: ADR-0006 (`IndexFreshness` sum type; the `CommitsBehind` variant is the structural assertion), ADR-0007 (no plugin loader; `monorepo-pnpm` ships zero `plugins/`).
- Implementation plan: `../High-level-impl.md §"Step 7"`: `monorepo-pnpm` + `stale-scip` bullets.
- Source design: `../final-design.md §"Open questions"` #7 (`stale-scip` regeneration policy; this story implements the named documentation discipline).
- Existing code:
  - `tests/adv/phase02/test_stale_scip_fixture.py` (S4-02; the adversarial this story's fixture must satisfy).
  - `tests/fixtures/portfolio/stale-scip/README.md` (S4-02 stub; this story extends it).
  - `tests/fixtures/portfolio/minimal-ts/` + `native-modules/` + `distroless-target/` (S7-01; shape conventions transfer).
Goal
Two fixtures exist under tests/fixtures/portfolio/:
- `monorepo-pnpm/`: pnpm workspace with three packages; root `pnpm-workspace.yaml`; `packages/lib-a/{package.json,src/index.ts}`, `packages/lib-b/{package.json,src/index.ts}` (imports `lib-a`), `packages/app/{package.json,src/index.ts}` (imports both); a single root `pnpm-lock.yaml` resolving all internal + minimal external deps; root `Dockerfile`, `.github/workflows/ci.yml`, `tsconfig.json` at each package level; shape test (`tests/unit/test_fixture_monorepo_pnpm_shape.py`).
- `stale-scip/`: full materialization, additive over the existing S4-02 stub at `tests/fixtures/portfolio/stale-scip/`. The existing stub mechanism is preserved: gitignored `.git/` (regenerated by `regenerate.sh`); gitignored `.codegenie/` (regenerated by `regenerate.sh`); committed seeds under `_seed/`; `regenerate.sh` initializes `.git/`, commits v0 (parent / `last_indexed_commit`) then v1 (HEAD), materializes `_seed/scip-slice.template.json` → `.codegenie/context/raw/scip.json` (substituting `PARENT_COMMIT`), copies `_seed/scip-index.scip` → `.codegenie/context/raw/scip-index.scip`, and refuses to set `LAST_INDEXED == HEAD`. S7-02's contributions are additive: (a) replace the empty `_seed/scip-index.scip.placeholder` with a real `scip-typescript`-built binary blob (produced OUT-OF-BAND by the contributor on their local box; the seed binary is committed); (b) expand the source tree to ≤ 50 `.ts` files; (c) update `_seed/scip-slice.template.json`'s `files_indexed` / `files_in_repo` counters to match the seeded source-tree footprint; (d) extend `README.md` with the Phase-3 entry-gate handoff note + the seed-build ritual section. `.codegenie/` stays gitignored; no `.gitignore` carve-out for `.codegenie/context/raw/scip-index.scip` is needed. The S4-02 adversarial test at `tests/adv/phase02/test_stale_scip_fixture.py` continues to read `.codegenie/context/raw/scip.json` (materialized from the seed template), and the binary `_seed/scip-index.scip` is the forward-looking contract surface for S4-03's `ScipIndexProbe`.
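Contribution (c), keeping the template counters in step with the source tree, could be checked with something like the sketch below. It assumes the two counters are top-level keys of the template JSON and counts `*.ts` files recursively under `src/`; the real template layout and predicate body are not shown in this story, so treat the function as illustrative only.

```python
import json
from pathlib import Path


def seed_template_counters_match_source_tree(fixture: Path) -> bool:
    """True when both template counters equal the number of .ts files under src/."""
    template_path = fixture / "_seed" / "scip-slice.template.json"
    template = json.loads(template_path.read_text(encoding="utf-8"))
    # Assumption: files_indexed / files_in_repo are top-level keys of the template.
    ts_count = sum(1 for _ in (fixture / "src").rglob("*.ts"))
    return (
        template["files_indexed"] == ts_count
        and template["files_in_repo"] == ts_count
    )
```

A shape test would call this with the fixture path and assert the result, so that adding a source file without bumping the counters fails fast.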
The shared _shape_test_kernel.py is extracted to tests/fixtures/_shape_test_kernel.py (above the portfolio/ subdirectory so Phase 1's tests/fixtures/node_typescript_helm/ shape test can import it cleanly) and consumed by all five S7-01/S7-02 portfolio fixtures' shape tests + Phase 1's node_typescript_helm/ shape test (sixth consumer; conclusively past Rule of Three).
Acceptance criteria
monorepo-pnpm/ fixture tree shape
- [ ] AC-1: `tests/fixtures/portfolio/monorepo-pnpm/` directory exists.
- [ ] AC-2: `pnpm-workspace.yaml` declares `packages: ["packages/*"]`; parses via `safe_yaml.load`.
- [ ] AC-3: `package.json` at root declares `"name": "monorepo-pnpm-fixture"`, `"private": true`, `"workspaces": ["packages/*"]` (redundant with `pnpm-workspace.yaml`: pnpm resolves the workspace from `pnpm-workspace.yaml`, so this field only serves tooling that reads the npm/yarn-style `workspaces` key); `"devDependencies": {"typescript": "^5.3.0"}`; no `dependencies`. Parses via `safe_json.load`.
- [ ] AC-4: `packages/lib-a/package.json` declares `"name": "@monorepo-pnpm/lib-a"`, `"version": "0.0.1"`, `"main": "src/index.ts"`, no dependencies. Parses.
- [ ] AC-5: `packages/lib-a/src/index.ts` exports a single function `add(a: number, b: number): number`.
- [ ] AC-6: `packages/lib-b/package.json` declares `"name": "@monorepo-pnpm/lib-b"`, `"version": "0.0.1"`, `"main": "src/index.ts"`, `"dependencies": {"@monorepo-pnpm/lib-a": "workspace:*"}` (the load-bearing pnpm workspace-protocol marker `DepGraphProbe` exercises). Parses.
- [ ] AC-7: `packages/lib-b/src/index.ts` imports from `@monorepo-pnpm/lib-a` and exports a derived function. The `import` statement is the load-bearing edge `tree_sitter_import_graph` records.
- [ ] AC-8: `packages/app/package.json` declares `"name": "@monorepo-pnpm/app"`, `"version": "0.0.1"`, `"main": "src/index.ts"`, `"dependencies": {"@monorepo-pnpm/lib-a": "workspace:*", "@monorepo-pnpm/lib-b": "workspace:*", "express": "^4.18.2"}`. Parses.
- [ ] AC-9: `packages/app/src/index.ts` imports from both internal packages + `express`; declares a trivial Express handler. The two internal imports are what the `dep_graph` slice records as cross-package edges.
- [ ] AC-10: root `pnpm-lock.yaml` is committed as hand-authored bytes (S7-01 `native-modules` HARDENED precedent: `pnpm` is NOT in `ALLOWED_BINARIES` per ADR-0001 + S1-06 AC-10; `regenerate.sh` MUST NOT invoke `pnpm install` / `pnpm install --frozen-lockfile` / any `pnpm` subcommand). Body: `lockfileVersion: '6.0'` header; resolves all three internal packages via the `workspace:*` protocol; resolves `express` and its transitive deps to pinned versions. Parses via `safe_yaml.load`. Generation path (out-of-band, contributor's local box, one-time per dep-version bump): run `pnpm install` once in a scratch directory exactly matching the fixture manifest; copy the resulting `pnpm-lock.yaml` into the fixture; commit. `regenerate.sh` is `mkdir`/coreutils-only (per AC-31's static check + the `tests/unit/_fixture_regen_allowlist.py` shared module), with no install commands. Defense-in-depth: a fixture-local `.npmrc` with `ignore-scripts=true` ships alongside (mirrors S7-01 `native-modules` AC-16) so any operator who later runs `pnpm install` locally doesn't trigger lifecycle scripts.
- [ ] AC-11: `tsconfig.json` at each package level; root `tsconfig.json` with `"references"` declaring all three packages (TS project-references shape; exercises `tsconfig`-walk paths).
- [ ] AC-12: root `Dockerfile` is multi-stage; `FROM node:20-slim AS build` builds the app; final stage `FROM node:20-slim`; `USER node`; `CMD ["node", "packages/app/dist/index.js"]`. Parsed by the Phase-2 Dockerfile probe.
- [ ] AC-13: root `.github/workflows/ci.yml` declares one job `build` with `run: pnpm install --frozen-lockfile && pnpm -r build && pnpm -r test`. Parses via `safe_yaml.load`.
- [ ] AC-14: `README.md` lists every file by relpath, names every probe in `consumers`, AND explicitly documents (in prose) "Phase 3 entry-gate target: `DepGraphAdapter`'s first plugin will produce cross-package edges from this fixture." This is the Risk-#8 named handoff.
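The cross-package edges that AC-6 and AC-8 set up can be sketched as follows. This is an illustrative reading of the three manifests, not `DepGraphProbe`'s implementation; the `workspace_edges` helper and the in-memory `MANIFESTS` dict are hypothetical names introduced for the example.

```python
from typing import Any

# In-memory copies of the three package manifests described in AC-4, AC-6 and AC-8.
MANIFESTS: dict[str, dict[str, Any]] = {
    "packages/lib-a/package.json": {"name": "@monorepo-pnpm/lib-a"},
    "packages/lib-b/package.json": {
        "name": "@monorepo-pnpm/lib-b",
        "dependencies": {"@monorepo-pnpm/lib-a": "workspace:*"},
    },
    "packages/app/package.json": {
        "name": "@monorepo-pnpm/app",
        "dependencies": {
            "@monorepo-pnpm/lib-a": "workspace:*",
            "@monorepo-pnpm/lib-b": "workspace:*",
            "express": "^4.18.2",
        },
    },
}


def workspace_edges(manifests: dict[str, dict[str, Any]]) -> set[tuple[str, str]]:
    """Collect (dependent, dependency) pairs declared via the workspace: protocol."""
    internal = {manifest["name"] for manifest in manifests.values()}
    edges: set[tuple[str, str]] = set()
    for manifest in manifests.values():
        for dep, spec in manifest.get("dependencies", {}).items():
            # External deps such as express are not workspace edges.
            if dep in internal and spec.startswith("workspace:"):
                edges.add((manifest["name"], dep))
    return edges


EDGES = workspace_edges(MANIFESTS)
```

For this fixture the result is three edges: `lib-b → lib-a`, `app → lib-a`, and `app → lib-b`; `express` is excluded because it is not a workspace package.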
stale-scip/ fixture full materialization
- [ ] AC-15: additive materialization, NOT wholesale replacement. The existing S4-02 stub at `tests/fixtures/portfolio/stale-scip/` ships: gitignored `.git/` (regenerated by `regenerate.sh`); gitignored `.codegenie/` (regenerated by `regenerate.sh`); committed seeds under `_seed/`; committed `package.json` + `main.ts` + `regenerate.sh` + `README.md` + `.gitignore` + `.gitattributes`. The seed-template + regenerate-script mechanism is PRESERVED. S7-02's contribution is restricted to: (a) replace the empty `_seed/scip-index.scip.placeholder` with a real (binary) SCIP blob produced OUT-OF-BAND by `scip-typescript` against the v0 commit tree (seed-build ritual per AC-21a; the resulting binary is committed at `_seed/scip-index.scip`); (b) expand the source tree to ≤ 50 `.ts` files; (c) update `_seed/scip-slice.template.json`'s `files_indexed` / `files_in_repo` counters to match the seeded source-tree footprint; (d) extend `README.md` per AC-22. No wholesale stub replacement. A future contributor must NOT `rm -rf` the stub directory before applying S7-02's changes.
- [ ] AC-16: expanded source tree. `src/` contains at least 5 `.ts` files with real `export` / `import` statements (e.g., `src/a.ts` exports a function; `src/b.ts` imports `a`'s function + exports its own; chained through `src/e.ts`). Each file ≤ 30 lines. `package.json` declares the `typescript` devDependency at a version compatible with the `scip-typescript` version pinned in `README.md` (per AC-22). `tsconfig.json` is valid JSONC; emits to `dist/` (which is gitignored and never built at regen time).
- [ ] AC-17: real binary SCIP committed at `_seed/scip-index.scip` (replaces `_seed/scip-index.scip.placeholder`). The binary is the output of `scip-typescript` invoked against the v0 commit tree per AC-21a's seed-build ritual. `.codegenie/` STAYS gitignored (the fixture-local `.gitignore` continues to list `.codegenie/`); the `regenerate.sh` script copies `_seed/scip-index.scip` to `.codegenie/context/raw/scip-index.scip` at runtime. No `.gitignore` carve-out for `.codegenie/` is needed; S7-01's central no-committed-cache guard (`tests/unit/test_no_committed_codegenie_cache_under_portfolio_fixtures.py`) passes unchanged.
- [ ] AC-18: fixture mechanism (preserved from S4-02 stub). The fixture's "two-commits" history lives inside the fixture's OWN micro-git-repo at `tests/fixtures/portfolio/stale-scip/.git/`, which is gitignored (regenerated by `regenerate.sh`). Mechanism (codified in the existing `regenerate.sh`): `rm -rf .git .codegenie`; `git init -q -b main`; `git add package.json && git commit -m "v0 — seeded last_indexed_commit"` (the `last_indexed_commit` target); capture `PARENT_COMMIT=$(git rev-parse HEAD)`; `git add main.ts <other src files> && git commit -m "v1 — HEAD moves forward"` (HEAD is now ahead by ≥ 1); materialize `.codegenie/context/raw/scip.json` from `_seed/scip-slice.template.json` by substituting `PARENT_COMMIT`; copy `_seed/scip-index.scip` to `.codegenie/context/raw/scip-index.scip`. The `LAST_INDEXED` and `HEAD` SHAs are genuinely different by construction. The story honors this mechanism wholesale; it is NOT to be replaced with a `last-indexed-commit.txt`-based scheme.
- [ ] AC-19: `last_indexed_commit` lives in `_seed/scip-slice.template.json` as the placeholder string `"PARENT_COMMIT"`, substituted by `regenerate.sh` at runtime into `.codegenie/context/raw/scip.json` (gitignored). The S4-02 adversarial reads the materialized `scip.json` to assert `freshness.reason.last_indexed != current_HEAD`. There is NO `last-indexed-commit.txt` file; the prior-commit SHA is not separately persisted on disk because `regenerate.sh` captures it locally as the `PARENT_COMMIT` shell variable and substitutes it into the materialized slice. (The committed seed template stores the placeholder; the runtime materialized slice stores the actual prior-commit SHA.)
- [ ] AC-20: `regenerate.sh`'s "refuse-against-current-HEAD" guard. The existing stub's `regenerate.sh` already implements the guard via `LAST_INDEXED="${LAST_INDEXED:-$(git rev-parse HEAD~1)}"` (defaults `last_indexed` to HEAD~1; never HEAD) followed by `if [[ "$LAST_INDEXED" == "$(git rev-parse HEAD)" ]]; then echo "ERROR: regenerate.sh refuses to set last_indexed_commit == HEAD" >&2; exit 1; fi`. The story PRESERVES this guard; it is NOT to be replaced. Verification of the guard's correctness lives in the existing `tests/unit/test_stale_scip_regenerate_sh_guard.py` (or whatever the S4-02 story landed for it; confirm at land-time, and if missing, this story adds it). The test runs `regenerate.sh` with `LAST_INDEXED=$(git rev-parse HEAD)` forced via env override and asserts exit code 1 + a stderr message containing "refuses to set last_indexed_commit". Skipped unless `CODEGENIE_REGEN_FIXTURES=1`.
- [ ] AC-21a: seed-build ritual (one-time per `scip-typescript` version bump; OUT-OF-BAND). The contributor producing `_seed/scip-index.scip` does so on their local box, NOT inside `regenerate.sh`. Sequence: (1) check out a clean copy of the v0 tree (just `package.json`) into a scratch directory; (2) invoke `scip-typescript` against the scratch dir (`scip-typescript` is in `ALLOWED_BINARIES` and bash can call it directly, but the seed-build is contributor-side, not regen-side); (3) copy the resulting `.scip` blob to `_seed/scip-index.scip` in the fixture; (4) commit. Pin the `scip-typescript` version in `README.md` so future contributors regenerate against the same tool version when the binary is updated. `regenerate.sh` does NOT invoke `scip-typescript`; the seed binary is committed bytes (same discipline as `pnpm-lock.yaml` for `monorepo-pnpm`: generated once out-of-band, committed, treated as fixture bytes thereafter).
- [ ] AC-21b: `regenerate.sh` runtime behavior (preserved from S4-02 stub). The script's full behavior is: `rm -rf .git .codegenie`; `git init -q -b main` with the fixture-local user.email/user.name; commit v0 (`package.json`) then v1 (`main.ts` + any expanded source files); `mkdir -p .codegenie/context/raw`; `sed "s|PARENT_COMMIT|${PARENT_COMMIT}|g" _seed/scip-slice.template.json > .codegenie/context/raw/scip.json`; `cp _seed/scip-index.scip .codegenie/context/raw/scip-index.scip`; AC-20's guard. No `scip-typescript` invocation at regen time; no `pnpm` / `npm` / `node-gyp` invocation. The script invokes only `git`, `mkdir`, `rm`, `cp`, `sed`, `echo` (all in `ALLOWED_BINARIES ∪ _SHELL_COREUTILS_ALLOWLIST` per S7-01's tokenizer spec). AC-31's static check passes.
- [ ] AC-22: `README.md` documents the regeneration ritual explicitly, additive over the existing stub's prose. Required sections: "Why this fixture exists" (preserved); "Structural assertion (CommitsBehind.n >= 1 AND last_indexed != current_HEAD; tool-version-agnostic)" (preserved + extended with the rationale of both inequalities); "Regeneration policy: DO NOT retarget against current HEAD" (preserved + extended); "Seed-build ritual (one-time per `scip-typescript` version bump)" (NEW; the AC-21a out-of-band ritual); "How to add a new commit (and the SCIP-vs-HEAD invariant that survives)" (NEW); "Pinned `scip-typescript` version" (NEW; records the tool version used to build `_seed/scip-index.scip`). The README is the Risk-#3 front-line guard.
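The materialize-and-guard step from AC-19 through AC-21b is a shell pipeline in the real `regenerate.sh`; the same logic can be sketched in Python to make the invariant explicit. The template shape below (a top-level `last_indexed_commit` key plus the two counters) is an assumption for illustration, and `materialize_slice` is a hypothetical name, not a function in the codebase.

```python
import json


def materialize_slice(template_text: str, parent_commit: str, head_commit: str) -> str:
    """Substitute the PARENT_COMMIT placeholder, refusing a last_indexed == HEAD result."""
    if parent_commit == head_commit:
        # Mirrors the guard's stderr message quoted in AC-20.
        raise ValueError("regenerate.sh refuses to set last_indexed_commit == HEAD")
    # Equivalent in spirit to: sed "s|PARENT_COMMIT|${PARENT_COMMIT}|g"
    return template_text.replace("PARENT_COMMIT", parent_commit)


# Hypothetical template content; the real seed template's layout is not shown in this story.
TEMPLATE = '{"last_indexed_commit": "PARENT_COMMIT", "files_indexed": 5, "files_in_repo": 5}'

materialized = json.loads(materialize_slice(TEMPLATE, parent_commit="aaa111", head_commit="bbb222"))
```

Because the placeholder is replaced with the v0 SHA while HEAD points at v1, the materialized slice can never report the current HEAD as its last indexed commit unless the guard is bypassed.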
Shared _shape_test_kernel.py extraction
- [ ] AC-23: `tests/fixtures/_shape_test_kernel.py` (above the `portfolio/` subdirectory; chosen so Phase 1's `tests/fixtures/node_typescript_helm/`-targeted shape test can import the kernel without crossing the `portfolio/`-namespace boundary) is extracted with: the `_FileSpec` (frozen `NamedTuple`) + `_ProbeName` (`Literal`) + `_ParserKind` (`Literal`) types; the `enumerate_tracked(fixture_path) -> tuple[str, ...]` port (the only call site for `git ls-files <fixture-path>`, invoked through `run_allowlisted("git", "ls-files", str(fixture_path))`; consumers receive a tuple of relpaths and never shell out themselves); the `_FIXTURE_NOISE_NAMES` defense-in-depth frozenset; the parametrized-test machinery (see AC-24 for the choice of test-factory vs. flat-helper shape; both are acceptable as long as `mypy --strict` passes and consumers don't duplicate the structural logic). The kernel passes `mypy --strict`. The kernel declares a module-level `__all__: Final[tuple[str, ...]] = (...)`; the test at `tests/unit/test_shape_test_kernel.py` asserts the runtime export set equals the documented contract (so a silent removal of `enumerate_tracked` or any factory becomes a build error).
- [ ] AC-24: every fixture's shape test consumes the kernel. `tests/unit/test_fixture_{minimal_ts,native_modules,distroless_target,monorepo_pnpm,stale_scip}_shape.py` import the kernel; each declares only its `_FIXTURE` path + its `_FILE_SPECS` tuple + its content-check predicate functions. The structural parametrized-test logic lives in the kernel. Implementer's choice between two acceptable shapes for the kernel's parametrized-test surface: (a) test-factory pattern (`make_existence_test`, `make_parses_test`, … returning pytest-decorated test functions for module-level assignment); (b) flat-helper pattern (`assert_file_exists(fixture, spec)`, `assert_file_parses(fixture, spec)`, … as pure helpers; each consumer writes minimal `@pytest.mark.parametrize("spec", _FILE_SPECS, ids=lambda s: s.relpath) def test_fixture_file_exists(spec): assert_file_exists(_FIXTURE, spec)`). The validator recommends (b): it is more pytest-natural for module-level discovery, `mypy --strict`-clean without ergonomic dance, and keeps the kernel as a functional core. Pick one and apply consistently; the AC's requirement is "structural logic lives in the kernel; consumers declare only data", not the specific implementation shape.
- [ ] AC-25: Phase 1's `test_fixture_node_typescript_helm_shape.py` also migrates to the kernel. This is the sixth consumer and is the final demonstration that the kernel pays off (Rule of Three conclusively past). The migration preserves every existing AC from Phase 1 S2-03 (the original story at `docs/phases/01-context-gather-layer-a-node/stories/S2-03-fixture-node-typescript-helm.md`, ACs 1–23 and the hardened AC-37 + AC-38); all S2-03 tests still pass after the kernel migration. Verification ritual: run the full Phase 1 test suite before and after the migration; the diff is non-test-file (just the import-rewrite of the existing test); test counts and pass/fail results unchanged.
- [ ] AC-26: kernel exposes `_ProbeName` as the live Phase-1 + Phase-2 probe-name superset; runtime check uses subset semantics (matching S7-01 AC-37). The Literal lists the full Phase-1 + Phase-2 probe names. A test at `tests/unit/test_shape_test_kernel.py` (alongside the `__all__` test from AC-23) asserts `set(p.name for p in default_registry.all()) ⊆ set(get_args(_ProbeName))` (subset, NOT equality): Phase-3+ probes added later don't retroactively break Phase-2 fixtures, but a renamed/added Phase-2 probe whose name isn't reflected in the Literal IS a test failure. Equality semantics are explicitly REJECTED here because they would force every Phase-3+ probe addition to also edit the fixture kernel, which is the wrong direction.
Closed-set + forbidden-subpath + line-ending invariants per new fixture
- [ ] AC-27: `monorepo-pnpm/` closed-set complement. `test_fixture_monorepo_pnpm_tree_is_closed_set` enumerates tracked files via `enumerate_tracked` (kernel port → `git ls-files`) and asserts the set equals `{spec.relpath for spec in _FILE_SPECS}`. `node_modules/` MUST NOT be present in tracked files (gitignored; the install never happens in regen since `pnpm` is not allowlisted, per AC-10, so `node_modules/` doesn't exist in working trees either, but the gitignore defense covers operator-side `pnpm install` invocations).
- [ ] AC-28: `stale-scip/` closed-set complement. `test_fixture_stale_scip_tree_is_closed_set` enumerates tracked files via `enumerate_tracked` (kernel port → `git ls-files <fixture-path>` from the parent codewizard-sherpa repo). Gitignored `.git/` and `.codegenie/` do NOT appear in the enumeration; the closed set is exactly `_FILE_SPECS` (which includes `_seed/scip-slice.template.json`, `_seed/scip-index.scip`, `regenerate.sh`, `README.md`, `package.json`, `tsconfig.json`, the `src/*.ts` files, `.gitignore`, `.gitattributes`). No `include_paths` carve-out for `.codegenie/`: the real binary lives in `_seed/`, not under `.codegenie/`, and the kernel's default exclusion of `.codegenie/` continues to apply.
- [ ] AC-29: S7-01's central no-committed-cache guard passes unchanged. `tests/unit/test_no_committed_codegenie_cache_under_portfolio_fixtures.py` walks `tests/fixtures/portfolio/` and asserts no `.codegenie/cache/` directory or `.codegenie/` content exists in committed (tracked) files. Both new fixtures honor the invariant: `monorepo-pnpm/` does not produce any `.codegenie/` content at all (no cache, no committed slice); `stale-scip/` produces `.codegenie/` content only at runtime (gitignored). No edit to the central guard test is needed; if the test currently has an explicit allowlist of zero entries, it stays at zero.
- [ ] AC-30: line endings per file for every file in both new fixtures (the kernel-provided test). Binary files (the seed `_seed/scip-index.scip` blob) are explicitly excluded from the LF check via `_FILE_SPECS` carrying a `parser=None` marker that the kernel treats as "skip line-ending check", the same convention S7-01 used for the placeholder.
- [ ] AC-31: `regenerate.sh` invokes only allowlisted binaries per fixture, verified by `tests/unit/test_fixture_monorepo_pnpm_regenerate_allowlist.py` + `tests/unit/test_fixture_stale_scip_regenerate_allowlist.py` (one per new fixture, both consuming `tests/unit/_fixture_regen_allowlist.py`, the shared module S7-01 lifted; reused unchanged here). The tokenizer per S7-01's AC-31: each non-blank, non-comment line's first non-builtin/non-control-flow token must be in `ALLOWED_BINARIES ∪ _SHELL_COREUTILS_ALLOWLIST`. For `monorepo-pnpm`: the non-builtin / non-coreutil set must contain only `git` (if at all). For `stale-scip`: the set contains only `git` + `sed` (`sed` is in `_SHELL_COREUTILS_ALLOWLIST`). Explicit fails: `pnpm`, `npm`, `node-gyp`, `scip-typescript` (at runtime; the seed-build ritual is OUT-OF-BAND), `curl`, `wget`, `eval`. The story explicitly asserts `pnpm` ∉ invoked set for `monorepo-pnpm` and `scip-typescript` ∉ invoked set for `stale-scip`'s `regenerate.sh`, since the seed-build ritual invokes `scip-typescript` on the contributor's local box, not inside `regenerate.sh`.
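A rough idea of the AC-31 static check is sketched below. It is deliberately simplified and is not S7-01's tokenizer: it splits lines on `&&`, `||` and `;` only, ignores pipes and commands nested inside `$(...)`, and the allowlist and builtin sets are hypothetical subsets chosen for the example.

```python
import re

# Hypothetical subsets for illustration; the real sets live in the shared allowlist module.
ALLOWED = frozenset({"git", "mkdir", "rm", "cp", "sed", "echo"})
_CONTROL = frozenset({"if", "then", "else", "elif", "fi", "for", "while", "do", "done"})
_BUILTINS = frozenset({"set", "cd", "exit", "export", "[[", "["})


def invoked_commands(script: str) -> set[str]:
    """Return the first command token of each statement, skipping shell syntax."""
    commands: set[str] = set()
    for raw in script.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # blank lines, comments, shebang
            continue
        for segment in re.split(r"&&|\|\||;", line):
            tokens = segment.split()
            while tokens and tokens[0] in _CONTROL:
                tokens.pop(0)
            if not tokens or tokens[0] in _BUILTINS or re.match(r"^\w+=", tokens[0]):
                continue  # empty, builtin, or a variable assignment
            commands.add(tokens[0])
    return commands


GOOD_SCRIPT = """#!/usr/bin/env bash
set -euo pipefail
rm -rf .git .codegenie
git init -q -b main
mkdir -p .codegenie/context/raw
cp _seed/scip-index.scip .codegenie/context/raw/scip-index.scip
"""
BAD_SCRIPT = GOOD_SCRIPT + "pnpm install --frozen-lockfile\n"

good_violations = invoked_commands(GOOD_SCRIPT) - ALLOWED
bad_violations = invoked_commands(BAD_SCRIPT) - ALLOWED
```

Here `good_violations` is empty and `bad_violations` contains only `pnpm`, which is the failure mode the consumer tests are meant to catch.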
stale-scip structural assertion survives regeneration (Risk #3 defense)
- [ ] AC-32 — adversarial test from S4-02 continues to pass against the materialized fixture. `tests/adv/phase02/test_stale_scip_fixture.py` (landed in S4-02; this story does NOT edit it) reads `.codegenie/context/raw/scip.json` (materialized at regen time from `_seed/scip-slice.template.json` with the `PARENT_COMMIT` substitution) and asserts: (1) `set(index_health.keys()) == {"scip", "runtime_trace"}` (the widened outer-key set; was `{"scip"}` at S4-02 land time, widened by S5-05 — future S6-08 registrations may widen further; this story does NOT cause the set to change); (2) `isinstance(slice.freshness, Stale)`; (3) `isinstance(slice.freshness.reason, CommitsBehind)`; (4) `slice.freshness.reason.n >= 1`; (5) `slice.freshness.reason.last_indexed != current_HEAD`; (6) `index_health["scip"]["confidence"] == "medium"`. The assertion has been passing since S4-02 against the stub-templated `scip.json` (the template carries `PARENT_COMMIT`, NOT current HEAD, by construction). S7-02's contribution to the assertion-passing claim is: the assertion CONTINUES to pass after the source tree expansion + seed-counter updates (AC-15 + AC-16). The real binary SCIP at `_seed/scip-index.scip` is forward-looking for S4-03's `ScipIndexProbe` consumer, not a determinant of S4-02's current assertion. Pre-flight check the implementer runs: `pytest tests/adv/phase02/test_stale_scip_fixture.py` after the source-tree expansion — observe green.
- [ ] AC-33 — both inequalities, `n >= 1` and `last_indexed != current_HEAD`, form the structural assertion in the adversarial. `n >= 1` alone is not sufficient, and a weaker `n >= 0` would be trivially satisfied. The S4-02 file already encodes this; the existing `regenerate.sh` already enforces `LAST_INDEXED != HEAD` by construction (`LAST_INDEXED` defaults to `HEAD~1`, plus the guard against operator override to HEAD). This story's contribution is preserving the invariant after the source-tree expansion: ensure the v0/v1 split + the seed-template's `last_indexed_commit=PARENT_COMMIT` substitution mechanics survive.
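To illustrate why AC-33 insists on both inequalities, here is a small sketch. The `Stale` and `CommitsBehind` classes below are hypothetical stand-ins with only the fields the ACs mention (`reason`, `n`, `last_indexed`); the project's real types are not reproduced here, and `is_structurally_stale` is a name invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommitsBehind:
    """Hypothetical stand-in: how far the index lags and where it was built."""
    n: int
    last_indexed: str


@dataclass(frozen=True)
class Stale:
    """Hypothetical stand-in for the stale-freshness variant."""
    reason: object


def is_structurally_stale(freshness: object, current_head: str) -> bool:
    """True only when both AC-33 inequalities hold."""
    if not isinstance(freshness, Stale):
        return False
    reason = freshness.reason
    if not isinstance(reason, CommitsBehind):
        return False
    # n >= 1 alone could be satisfied by a miscounted slice that was in fact
    # indexed at HEAD; the SHA inequality rules that case out.
    return reason.n >= 1 and reason.last_indexed != current_head
```

A slice that reports `n=1` but carries `last_indexed == current_head` fails the check, which is the regression the guard in `regenerate.sh` is meant to prevent.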
Determinism, audit hygiene, type cleanliness
- [ ] AC-34 — `regenerate.sh` byte-identical-twice scope is the tracked-files scope (matching S7-01 AC-30's hardened convention; gitignored artifacts — `.git/`, `.codegenie/`, `dist/`, `node_modules/` — are out of scope by design). For `monorepo-pnpm/`: tracked-files SHA equality across two consecutive invocations (manual local verification; documented in PR). `stale-scip/`'s scope is narrower still: only the committed `_seed/` blobs + manifest files + `regenerate.sh` + `README.md` + `.gitignore` + `.gitattributes` are in scope — the regenerated `.git/` and `.codegenie/` legitimately re-derive distinct ephemeral SHAs across invocations (each `git init` produces fresh object SHAs because the commit timestamps and committer identity may differ across runs even with the fixture-local `user.email` pin), and that's intentional — only the COMMITTED bytes are part of the fixture contract.
- [ ] AC-35 — every new shape-test + kernel + the `_seed/scip-index.scip` binary's existence assertion passes `mypy --strict`. No `Any` outside the explicit `payload: Any` parser-dispatch lines (Phase 1 convention).
- [ ] AC-36 — Phase 1's `test_fixture_node_typescript_helm_shape.py` still passes after the kernel migration (AC-25). Mandatory: run the existing test suite, observe green; the migration is refactor-by-extraction, not behavior change. Concretely: the diff of the test file is just the import change (`from tests.fixtures._shape_test_kernel import ...`) + the removal of the duplicated structural-test code (now imported from the kernel) — no logic edit, no behavior change.
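The "byte-identical-twice" check in AC-34 is described as a manual local verification. One way to make that comparison concrete is to reduce the tracked-files scope to a single digest before and after the second `regenerate.sh` run. The sketch below assumes the caller already has the list of tracked relpaths (for example from `git ls-files`); `tracked_digest` is a hypothetical helper, not something the story defines.

```python
import hashlib
from pathlib import Path
from typing import Iterable


def tracked_digest(root: Path, relpaths: Iterable[str]) -> str:
    """Digest over (relpath, file bytes) pairs, independent of input order."""
    outer = hashlib.sha256()
    for rel in sorted(relpaths):
        outer.update(rel.encode("utf-8"))
        outer.update(b"\0")  # separator so path and content cannot run together
        outer.update(hashlib.sha256((root / rel).read_bytes()).digest())
    return outer.hexdigest()
```

Computing the digest after the first regen and again after the second, then comparing the two strings, covers only committed bytes, so the regenerated `.git/` and `.codegenie/` directories never enter the comparison.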
Implementation outline
1. Plant `monorepo-pnpm/` first (no risky surface).
   - `mkdir -p tests/fixtures/portfolio/monorepo-pnpm/{packages/lib-a/src,packages/lib-b/src,packages/app/src,.github/workflows}`.
   - Write the shape test (`tests/unit/test_fixture_monorepo_pnpm_shape.py`) — TDD red, modeled on S7-01's three fixtures (still using inlined parametrized-test bodies; the kernel extraction comes in step 3 below).
   - Plant each file per AC-2..AC-14.
   - Generate the `pnpm-lock.yaml` ONCE, OUT-OF-BAND, on the contributor's local box: run `pnpm install` in a scratch directory that exactly matches the fixture manifest; copy the resulting `pnpm-lock.yaml` into the fixture; commit. `regenerate.sh` does NOT invoke `pnpm` (per AC-10 + ADR-0001 — `pnpm` ∉ `ALLOWED_BINARIES`); the regen script is `mkdir`/coreutils-only (it materializes any tree skeleton that is regenerated and asserts invariants — but the lockfile is committed bytes treated as fixture contract).
   - Plant `.npmrc` with `ignore-scripts=true` (defense-in-depth; mirrors S7-01 `native-modules` AC-16).
   - Write `tests/unit/test_fixture_monorepo_pnpm_regenerate_allowlist.py` consuming `tests/unit/_fixture_regen_allowlist.py` (S7-01's shared module — reused unchanged); the test explicitly asserts `pnpm` ∉ invoked-binary set per AC-31.
   - Run shape test + allowlist test. Green.
2. Materialize `stale-scip/` additively over the existing S4-02 stub.
   - READ the existing stub first. `tests/fixtures/portfolio/stale-scip/{README.md, regenerate.sh, .gitignore, package.json, main.ts, _seed/scip-slice.template.json, _seed/scip-index.scip.placeholder, .gitattributes}` codify the seed-template + gitignored-`.git/` + gitignored-`.codegenie/` mechanism. Do NOT `rm -rf` the stub directory. Read the existing `regenerate.sh` end-to-end so you understand the v0/v1 commit sequence, the `PARENT_COMMIT` substitution, and the `LAST_INDEXED` guard.
   - Write the shape test (`tests/unit/test_fixture_stale_scip_shape.py`) — TDD red, with `_FILE_SPECS` declaring the committed files only (`_seed/scip-slice.template.json`, `_seed/scip-index.scip`, `package.json`, `tsconfig.json`, `src/*.ts`, `regenerate.sh`, `README.md`, `.gitignore`, `.gitattributes`).
   - Expand the source tree to ≤ 50 `.ts` files: add `src/a.ts` through `src/e.ts` (or more) with chained `export`/`import` statements. The expanded tree gives `scip-typescript` more to index than the stub's single `main.ts`.
   - Update `_seed/scip-slice.template.json`'s `files_indexed`/`files_in_repo` to match the count of `.ts` files in the new `src/` tree (or whatever subset the seeded SCIP actually covers — pick deliberately, document in `README.md`'s "Seed-build ritual" section).
   - Seed-build ritual (one-time; OUT-OF-BAND on contributor's local box): on a scratch directory, check out only `package.json` (the v0 tree from the existing `regenerate.sh`'s perspective); run `scip-typescript .` against the scratch directory; copy the resulting `.scip` blob to `_seed/scip-index.scip` (replacing `_seed/scip-index.scip.placeholder`). Pin the `scip-typescript` version used in `README.md`. Do NOT touch `regenerate.sh` to invoke `scip-typescript` — it stays as a committed seed bytes step.
   - Extend `regenerate.sh` ONLY for the source-tree expansion: the existing script commits `main.ts` for v1; widen this to commit all expanded `src/*.ts` files for v1. All other lines of the existing `regenerate.sh` are preserved, including the `LAST_INDEXED="${LAST_INDEXED:-$(git rev-parse HEAD~1)}"` + `if [[ "$LAST_INDEXED" == "$(git rev-parse HEAD)" ]]; then exit 1; fi` guard (AC-20). The `cp _seed/scip-index.scip.placeholder ...` line in the existing script becomes `cp _seed/scip-index.scip ...` (the seed file now has real content).
   - Extend `README.md` per AC-22 — preserve existing sections (existing structural-assertion and regeneration-policy sections); add: "Seed-build ritual (one-time per `scip-typescript` version bump)" + "How to add a new commit" + "Pinned `scip-typescript` version".
   - Verify: run `bash tests/fixtures/portfolio/stale-scip/regenerate.sh`; observe the v0/v1 commits + materialized `.codegenie/context/raw/scip.json` + copied `.codegenie/context/raw/scip-index.scip` (real binary now). Run `pytest tests/adv/phase02/test_stale_scip_fixture.py` — green (AC-32). Run `LAST_INDEXED=$(cd tests/fixtures/portfolio/stale-scip && git rev-parse HEAD) bash tests/fixtures/portfolio/stale-scip/regenerate.sh`; observe exit code 1 + stderr "refuses to set last_indexed_commit == HEAD" (AC-20).
   - Write `tests/unit/test_fixture_stale_scip_regenerate_allowlist.py` consuming `tests/unit/_fixture_regen_allowlist.py`; explicitly assert `scip-typescript` ∉ invoked-binary set (the seed-build ritual is out-of-band).
   - Run shape test + allowlist test + adversarial. All green.
3. Extract the shared kernel at `tests/fixtures/_shape_test_kernel.py`.
   - Compare the three S7-01 shape-test files + the two new shape-test files + Phase 1's `tests/unit/test_fixture_node_typescript_helm_shape.py`. The duplicated machinery is the parametrized-test bodies + `enumerate_tracked` + `_FIXTURE_NOISE_NAMES`. The variable parts are `_FIXTURE`, `_FILE_SPECS`, the content-check predicates.
   - Write `tests/fixtures/_shape_test_kernel.py` (above the `portfolio/` subdirectory) with: the `_FileSpec` frozen `NamedTuple` + `_ProbeName` Literal + `_ParserKind` Literal types; the `enumerate_tracked(fixture_path) -> tuple[str, ...]` port (only call site for `git ls-files`); the `_FIXTURE_NOISE_NAMES` frozenset; the parametrized-test helpers per AC-24 (validator-recommended: flat helper functions like `assert_file_exists`, `assert_file_parses`, etc. — but factory-based pattern is also acceptable if `mypy --strict`-clean).
   - Add `__all__: Final[tuple[str, ...]] = (...)` to the kernel; write `tests/unit/test_shape_test_kernel.py` asserting the runtime export set + the `_ProbeName` subset semantics check per AC-26.
   - One at a time: migrate `tests/unit/test_fixture_minimal_ts_shape.py` → kernel-consumer; run; observe green. Same for `native_modules`, `distroless_target`, the two new fixtures, AND Phase 1's `tests/unit/test_fixture_node_typescript_helm_shape.py` (sixth consumer).
   - Verify all six shape tests still pass; AC-25 + AC-36 require Phase 1's existing tests pass identically post-migration.
4. Verify S7-01's central no-committed-cache guard passes unchanged. Run `tests/unit/test_no_committed_codegenie_cache_under_portfolio_fixtures.py` (S7-01's guard). No edit to this test is needed — `stale-scip`'s real binary SCIP lives at `_seed/scip-index.scip`, NOT under `.codegenie/`; the test's invariant ("no `.codegenie/` content under `tests/fixtures/portfolio/` in committed tracked files") passes unchanged. If the test would fail because some other adjacent change leaked a `.codegenie/` path, that's a separate bug — fix it in place, not by allowlisting.
5. Final pass: `mypy --strict`, `ruff`, `ruff format --check`. Run the full Phase 2 test suite (`pytest -q` minus advisory benches). Green.
TDD plan — red / green / refactor
Red — failing shape tests first
For monorepo-pnpm, the shape test mirrors S7-01:
# tests/unit/test_fixture_monorepo_pnpm_shape.py (excerpt)
_FILE_SPECS: tuple[_FileSpec, ...] = (
_FileSpec("pnpm-workspace.yaml", ("node_build_system", "dep_graph"), "safe_yaml", (_workspace_declares_packages,)),
_FileSpec("package.json", ("node_build_system", "node_manifest"), "safe_json", (_root_pkg_shape,)),
_FileSpec("packages/lib-a/package.json", ("node_manifest", "dep_graph"), "safe_json", (_lib_a_pkg_shape,)),
_FileSpec("packages/lib-a/src/index.ts", ("language_detection", "tree_sitter_import_graph"), "text", (_lib_a_exports_add,)),
_FileSpec("packages/lib-b/package.json", ("node_manifest", "dep_graph"), "safe_json",
(_lib_b_pkg_shape, _lib_b_declares_workspace_dep_on_lib_a)),
_FileSpec("packages/lib-b/src/index.ts", ("language_detection", "tree_sitter_import_graph"),
"text", (_lib_b_imports_from_lib_a,)),
_FileSpec("packages/app/package.json", ("node_manifest", "dep_graph"), "safe_json",
(_app_pkg_shape, _app_declares_workspace_deps_on_both_libs)),
_FileSpec("packages/app/src/index.ts", ("language_detection", "tree_sitter_import_graph"),
"text", (_app_imports_from_both_libs,)),
_FileSpec("pnpm-lock.yaml", ("node_build_system", "node_manifest", "dep_graph"),
"safe_yaml", (_lock_v6_header,)),
_FileSpec("tsconfig.json", ("node_build_system",), "jsonc", (_tsconfig_root_references_all_three,)),
_FileSpec("Dockerfile", ("dockerfile", "runtime_trace", "entrypoint"), "text",
(_dockerfile_multistage, _dockerfile_uses_node_slim, _dockerfile_runs_as_node_user)),
_FileSpec(".github/workflows/ci.yml", ("ci",), "safe_yaml", (_ci_runs_recursive_build,)),
_FileSpec("README.md", (), "text", (_readme_documents_phase3_entry_gate_target,)),
)
The load-bearing content predicates for monorepo-pnpm:
- `_lib_b_declares_workspace_dep_on_lib_a(pkg)` — asserts `pkg["dependencies"]["@monorepo-pnpm/lib-a"] == "workspace:*"`. Mutation: drop the dep → fails.
- `_lib_b_imports_from_lib_a(raw_bytes)` — asserts `'from "@monorepo-pnpm/lib-a"'` is in the source. Mutation: remove the import → fails.
- `_app_declares_workspace_deps_on_both_libs(pkg)` — asserts both `workspace:*` deps. Mutation: drop either → fails.
- `_app_imports_from_both_libs(raw_bytes)` — asserts both `from "@monorepo-pnpm/lib-a"` AND `from "@monorepo-pnpm/lib-b"`. Mutation: drop either → fails. (This is the load-bearing pair the `tree_sitter_import_graph` golden depends on.)
- `_readme_documents_phase3_entry_gate_target(raw_bytes)` — asserts the literal phrase `"Phase 3 entry-gate target"` appears in `README.md`. The phrase is the Risk-#8 named handoff.
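Two of the `monorepo-pnpm` predicates above could look roughly like this. The sketch assumes a bool-returning style and drops the leading underscore for readability; whether the real predicates return a bool or raise `AssertionError` is not stated in the story, so treat the signatures as illustrative only.

```python
from typing import Any, Mapping

LIB_A = "@monorepo-pnpm/lib-a"
LIB_B = "@monorepo-pnpm/lib-b"


def lib_b_declares_workspace_dep_on_lib_a(pkg: Mapping[str, Any]) -> bool:
    """lib-b's parsed package.json must depend on lib-a via the workspace protocol."""
    return pkg.get("dependencies", {}).get(LIB_A) == "workspace:*"


def app_imports_from_both_libs(raw_bytes: bytes) -> bool:
    """app/src/index.ts must import from both workspace libraries."""
    source = raw_bytes.decode("utf-8")
    return all(f'from "{name}"' in source for name in (LIB_A, LIB_B))
```

Each predicate reads only committed bytes or the parsed manifest, which is what lets the mutation cases listed above fail deterministically.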
For stale-scip (the committed fixture surface; runtime-materialized .codegenie/ content is NOT in _FILE_SPECS because it's gitignored):
_FILE_SPECS: tuple[_FileSpec, ...] = (
_FileSpec("package.json", ("node_build_system", "node_manifest"), "safe_json", (_pkg_declares_typescript,)),
_FileSpec("tsconfig.json", ("node_build_system",), "jsonc", (_tsconfig_shape,)),
_FileSpec("src/a.ts", ("language_detection",), "text", (_a_ts_exports,)),
_FileSpec("src/b.ts", ("language_detection",), "text", (_b_ts_imports_a,)),
_FileSpec("src/c.ts", ("language_detection",), "text", (_c_ts_imports_b,)),
_FileSpec("src/d.ts", ("language_detection",), "text", (_d_ts_imports_c,)),
_FileSpec("src/e.ts", ("language_detection",), "text", (_e_ts_imports_d,)),
_FileSpec("main.ts", ("language_detection",), "text", (_main_ts_imports_e,)), # preserved from S4-02 stub
_FileSpec("_seed/scip-slice.template.json", ("scip_index", "index_health"), "safe_json",
(_template_carries_parent_commit_placeholder,
_seed_template_counters_match_source_tree,)),
_FileSpec("_seed/scip-index.scip", ("scip_index",), None,
(_scip_blob_non_empty, _scip_blob_smoke_shape,)),
_FileSpec("regenerate.sh", (), "text",
(_regen_initializes_git_and_commits_two_commits,
_regen_substitutes_parent_commit_into_template,
_regen_copies_seed_scip_to_runtime_path,
_last_indexed_defaults_to_head_tilde_one,
_regen_refuses_current_head,
_regen_invokes_only_allowlisted_binaries,)),
_FileSpec("README.md", (), "text",
(_readme_documents_structural_assertion,
_readme_documents_regen_ritual,
_readme_documents_seed_build_ritual,
_readme_pins_scip_typescript_version,)),
_FileSpec(".gitignore", (), "text", (_gitignore_excludes_git_and_codegenie,)),
_FileSpec(".gitattributes", (), "text", ()),
)
The load-bearing content predicates for stale-scip (all reading committed bytes — never runtime-materialized state):
- `_last_indexed_defaults_to_head_tilde_one(raw_bytes)` — greps `regenerate.sh` for the line `LAST_INDEXED="${LAST_INDEXED:-$(git rev-parse HEAD~1)}"` (or its semantic equivalent — `HEAD~1` is the structural guarantee that `last_indexed` is the PARENT of HEAD, never HEAD itself). This is the Risk-#3 front-line invariant. Mutation: a contributor "fixes" `regenerate.sh` to default `LAST_INDEXED` to `HEAD` → this predicate fails. Pure-string grep; no subprocess invocation (the predicate is called against the static script bytes by the kernel's content-invariants test, which is itself pure).
- `_regen_refuses_current_head(raw_bytes)` — greps `regenerate.sh` for the explicit check `if [[ "$LAST_INDEXED" == "$(git rev-parse HEAD)" ]]` (or its semantic equivalent) + the `exit 1` branch. Pins the load-bearing guard at the script-text level.
- `_regen_substitutes_parent_commit_into_template(raw_bytes)` — greps `regenerate.sh` for the `sed "s|PARENT_COMMIT|...|g" _seed/scip-slice.template.json > .codegenie/context/raw/scip.json` line (or its semantic equivalent). Pins the template-substitution step; mutation: a contributor "tidies up" the regen script by hardcoding the materialized `scip.json` → predicate fails.
- `_regen_copies_seed_scip_to_runtime_path(raw_bytes)` — greps `regenerate.sh` for `cp _seed/scip-index.scip .codegenie/context/raw/scip-index.scip` (or equivalent). Pins the seed-binary-copy step; mutation: a contributor adds `scip-typescript` invocation inside `regen.sh` instead of the cp-from-seed → predicate fails AND the AC-31 allowlist test also fails.
- `_template_carries_parent_commit_placeholder(parsed_json)` — asserts `parsed_json["last_indexed_commit"] == "PARENT_COMMIT"` (the placeholder string, NOT a real SHA — the substitution happens at regen runtime). Mutation: a contributor "tidies up" the template by replacing the placeholder with the actual prior commit SHA at fixture creation → predicate fails (and the `regenerate.sh` substitution would no-op silently).
- `_seed_template_counters_match_source_tree(parsed_json)` — counts `*.ts` files under `src/` (and `main.ts` at root if present); asserts `parsed_json["files_indexed"] == parsed_json["files_in_repo"] == <count>` (or the deliberately-pinned subset count, per AC-15 + AC-16 + the README's "Seed-build ritual" section). Mutation: a contributor grows the source tree without updating the seed template → `IndexHealthProbe` surfaces `CoverageGap` instead of `CommitsBehind` and the adversarial fails for the wrong reason.
- `_scip_blob_non_empty(raw_bytes)` — asserts `len(raw_bytes) > 0` (the placeholder was 0 bytes; the real binary is non-empty). The first sanity check that the seed-build ritual actually ran.
- `_scip_blob_smoke_shape(raw_bytes)` — asserts the blob is parseable as a SCIP index (the wire-format is a protobuf-serialized `Index` message; smoke check: the first bytes are a valid SCIP magic / varint prefix per the SCIP spec, OR a minimum-size check of ≥ 200 bytes which any real `scip-typescript` output exceeds). NOT a deep structural assertion; the placeholder is 0 bytes so the minimum-size check alone catches "seed-build ritual didn't run".
- `_readme_documents_structural_assertion(raw_bytes)` — asserts the README contains both `"CommitsBehind.n >= 1"` AND `"last_indexed != current_HEAD"` phrases verbatim.
- `_readme_documents_seed_build_ritual(raw_bytes)` — asserts the README has a "Seed-build ritual" section.
- `_readme_pins_scip_typescript_version(raw_bytes)` — asserts the README pins the `scip-typescript` version used to produce `_seed/scip-index.scip` (regex `scip-typescript\s+v?\d+\.\d+\.\d+` or equivalent).
- `_gitignore_excludes_git_and_codegenie(raw_bytes)` — asserts `.gitignore` contains lines `.git/` AND `.codegenie/`. Mutation: a contributor adds an `!.codegenie/...` allowlist carve-out → predicate fails (and S7-01's central no-committed-cache guard would also catch a leaked path).
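As a rough illustration of the pure-string style these `stale-scip` predicates take, the sketch below implements four of them as bool-returning functions. It uses exact-substring matching, which is stricter than the "or its semantic equivalent" latitude the story allows, and it uses only the minimum-size branch of the smoke check; both choices are simplifications made for the example.

```python
import re

DEFAULT_LINE = 'LAST_INDEXED="${LAST_INDEXED:-$(git rev-parse HEAD~1)}"'
GUARD_LINE = 'if [[ "$LAST_INDEXED" == "$(git rev-parse HEAD)" ]]'
VERSION_PIN = re.compile(r"scip-typescript\s+v?\d+\.\d+\.\d+")
MIN_SCIP_BYTES = 200  # minimum-size branch of the smoke check


def last_indexed_defaults_to_head_tilde_one(raw_bytes: bytes) -> bool:
    """The regen script must default LAST_INDEXED to the parent of HEAD."""
    return DEFAULT_LINE in raw_bytes.decode("utf-8")


def regen_refuses_current_head(raw_bytes: bytes) -> bool:
    """The guard must be present and followed by an exit 1 branch."""
    text = raw_bytes.decode("utf-8")
    start = text.find(GUARD_LINE)
    return start != -1 and "exit 1" in text[start:]


def scip_blob_smoke_shape(raw_bytes: bytes) -> bool:
    """A 0-byte placeholder fails; this is not a structural parse of the blob."""
    return len(raw_bytes) >= MIN_SCIP_BYTES


def readme_pins_scip_typescript_version(raw_bytes: bytes) -> bool:
    """The README must name a concrete scip-typescript version."""
    return VERSION_PIN.search(raw_bytes.decode("utf-8")) is not None
```

None of these functions spawns a subprocess or touches runtime-materialized state, matching the requirement that the content-invariants test stays pure.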
Green — make it pass
Plant the trees. Run the shape tests. Green. Then extract the kernel.
Mutation-resistance witness table
| Mutation | Test that catches it |
|---|---|
| Drop `"@monorepo-pnpm/lib-a": "workspace:*"` from `lib-b/package.json` | `test_fixture_monorepo_pnpm_content_invariants[packages/lib-b/package.json]` via `_lib_b_declares_workspace_dep_on_lib_a` |
| Remove the import from `app/src/index.ts` (silently breaks the `tree_sitter_import_graph` golden) | `_app_imports_from_both_libs` |
| `monorepo-pnpm/regenerate.sh` invokes `pnpm install --frozen-lockfile` (or any `pnpm` subcommand) | `tests/unit/test_fixture_monorepo_pnpm_regenerate_allowlist.py` (consuming `_fixture_regen_allowlist.py`) — `pnpm` ∉ `ALLOWED_BINARIES ∪ _SHELL_COREUTILS_ALLOWLIST` |
| `monorepo-pnpm/regenerate.sh` invokes `npm install` or `node-gyp rebuild` | Same allowlist test — neither binary is in `ALLOWED_BINARIES` |
| Contributor "tidies up" `stale-scip/regenerate.sh` by defaulting `LAST_INDEXED` to `HEAD` instead of `HEAD~1` | `_last_indexed_defaults_to_head_tilde_one` grep predicate AND `tests/adv/phase02/test_stale_scip_fixture.py` (from S4-02) BOTH fail (the materialized `scip.json` carries `last_indexed == HEAD`; the adversarial's `last_indexed != current_HEAD` assertion fails) |
| Contributor "fixes" `regenerate.sh` to allow regen with operator-forced `LAST_INDEXED=$(git rev-parse HEAD)` | `_regen_refuses_current_head` grep predicate fails (the guard's `if` block is gone) |
| Contributor adds `scip-typescript .` invocation inside `regenerate.sh` (instead of the cp-from-seed pattern) | `tests/unit/test_fixture_stale_scip_regenerate_allowlist.py` — `scip-typescript` shows up in the invoked-binary set, which is asserted-absent at regen-time (seed-build is OUT-OF-BAND) |
| Contributor "tidies up" the seed template by replacing `"PARENT_COMMIT"` placeholder with a real SHA | `_template_carries_parent_commit_placeholder` predicate fails (the placeholder string is gone) |
| Contributor grows the source tree to 12 `.ts` files but forgets to update `_seed/scip-slice.template.json` counters | `_seed_template_counters_match_source_tree` predicate fails (counts mismatch) — pre-empts the worse failure mode of `IndexHealthProbe` surfacing `CoverageGap` instead of `CommitsBehind` and the adversarial failing for the wrong reason |
| Contributor leaves `_seed/scip-index.scip.placeholder` (0 bytes) in place instead of replacing with a real binary | `_scip_blob_non_empty` + `_scip_blob_smoke_shape` predicates fail (placeholder is 0 bytes; real binary is ≥ 200 bytes) |
| Contributor adds `!.codegenie/context/raw/scip-index.scip` carve-out to `stale-scip/.gitignore` | `_gitignore_excludes_git_and_codegenie` predicate fails (the carve-out introduces extra non-`.git/`-non-`.codegenie/` lines that flunk the strict-equality check) AND S7-01's central no-committed-cache guard catches the leaked `.codegenie/` content |
| Stray `node_modules/` force-added to `monorepo-pnpm` | `test_fixture_monorepo_pnpm_tree_is_closed_set` (extra tracked file outside `_FILE_SPECS`) |
| Stray `.codegenie/cache/blobs/x` committed under any portfolio fixture | `tests/unit/test_no_committed_codegenie_cache_under_portfolio_fixtures.py` (S7-01) |
| README drops the "Phase 3 entry-gate target" phrase from `monorepo-pnpm/README.md` | `_readme_documents_phase3_entry_gate_target` |
| README drops the structural-assertion phrasing from `stale-scip/README.md` | `_readme_documents_structural_assertion` |
| README forgets to pin the `scip-typescript` version | `_readme_pins_scip_typescript_version` predicate fails |
| Kernel extraction silently changes behavior (e.g., `enumerate_tracked` excludes a different default) | Phase 1's `test_fixture_node_typescript_helm_shape.py` regression (still passing is the proof) + `tests/unit/test_shape_test_kernel.py` `__all__` runtime check |
| `_ProbeName` Literal in the kernel falls out of sync with the live probe registry (e.g., Phase 2 probe renamed) | `tests/unit/test_shape_test_kernel.py` subset-semantics check (`set(p.name for p in default_registry.all()) ⊆ set(get_args(_ProbeName))` — the renamed probe's new name is not in the Literal) |
Refactor — clean up
- The kernel extraction is the refactor. The pre-existing five shape tests + Phase 1's `node_typescript_helm` shape test all migrate to consume the kernel; the kernel itself is mypy-strict, no `Any` outside `payload: Any`, no untyped helpers.
- `_ProbeName` in the kernel is the Phase-1 + Phase-2 probe-name superset. The kernel-side test (`tests/unit/test_shape_test_kernel.py`) asserts subset semantics (`set(p.name for p in default_registry.all()) ⊆ set(get_args(_ProbeName))`) per AC-26 — matching S7-01's hardened AC-37. Phase-3+ probes added later do NOT retroactively break Phase-2 fixtures.
- The kernel's `__all__` is a separate runtime check (also in `tests/unit/test_shape_test_kernel.py`) — silent export removal becomes a build error.
- `regenerate.sh` byte-identical-twice scope per AC-34:
  - `monorepo-pnpm` — `git ls-files`-tracked files only; gitignored artifacts out of scope.
  - `stale-scip` — the committed bytes only (the `_seed/` directory + manifest files + `regenerate.sh` + `README.md` + `.gitignore` + `.gitattributes`); the regenerated `.git/` and `.codegenie/` legitimately differ across runs (fresh `git init` produces fresh object SHAs).
- No edit to `tests/unit/test_no_committed_codegenie_cache_under_portfolio_fixtures.py` — the existing S7-01 guard passes unchanged because `stale-scip`'s real binary SCIP lives at `_seed/scip-index.scip` (NOT under `.codegenie/`).
Files to touch
| Path | Why |
|---|---|
| `tests/fixtures/portfolio/monorepo-pnpm/` (tree per AC-2..AC-14; lockfile hand-authored OUT-OF-BAND; `.npmrc` with `ignore-scripts=true`) | pnpm workspace; `DepGraphProbe` cross-package edges; Phase-3 entry-gate target |
| `tests/fixtures/portfolio/stale-scip/` (additive materialization: expand `src/` + replace `_seed/scip-index.scip.placeholder` with real binary + extend `regenerate.sh` for the wider commit-set + extend `README.md`) | Load-bearing. The roadmap exit-criterion fixture; existing stub mechanism preserved |
| `tests/fixtures/_shape_test_kernel.py` (NOTE: above `portfolio/` subdirectory so Phase 1's `tests/fixtures/node_typescript_helm/` test can consume it cleanly) | Shared `_FileSpec` + parametrized-test surface (Rule of Three conclusively past — 6 consumers including Phase 1) |
| `tests/unit/test_fixture_monorepo_pnpm_shape.py` | Shape test |
| `tests/unit/test_fixture_stale_scip_shape.py` | Shape test (regen-script grep predicates + seed-template counter invariant + seed-binary smoke check) |
| `tests/unit/test_fixture_monorepo_pnpm_regenerate_allowlist.py` | AC-31 — explicit assertion `pnpm` ∉ invoked binaries; consumes `_fixture_regen_allowlist.py` from S7-01 |
| `tests/unit/test_fixture_stale_scip_regenerate_allowlist.py` | AC-31 — explicit assertion `scip-typescript` ∉ invoked binaries at regen-time (seed-build is OUT-OF-BAND); consumes `_fixture_regen_allowlist.py` from S7-01 |
| `tests/unit/test_fixture_minimal_ts_shape.py` (migrate to kernel) | Was direct-pattern in S7-01; now consumes kernel |
| `tests/unit/test_fixture_native_modules_shape.py` (migrate to kernel) | Same |
| `tests/unit/test_fixture_distroless_target_shape.py` (migrate to kernel) | Same |
| `tests/unit/test_fixture_node_typescript_helm_shape.py` (Phase 1; migrate to kernel) | Phase-1 fixture consumes the kernel — the sixth consumer demonstrates the kernel pays off |
| `tests/unit/test_shape_test_kernel.py` | Asserts kernel's `__all__` matches documented contract + subset-semantics check for `_ProbeName` Literal vs. live probe registry |
| NOT TOUCHED — `tests/unit/test_no_committed_codegenie_cache_under_portfolio_fixtures.py` | S7-01's central guard passes unchanged because `stale-scip`'s real binary SCIP lives at `_seed/scip-index.scip`, NOT under `.codegenie/` |
| NOT TOUCHED — `tests/unit/_fixture_regen_allowlist.py` | Reused unchanged from S7-01 |
Out of scope
- Golden file regeneration + ~70 goldens — S7-03.
- Adversarial corpus (`hostile_skills_yaml`, `concurrent_gather_race`, `no_inmemory_secret_leak`, `phase3_handoff_smoke`) — S7-04.
- Property tests + portfolio sweep integration — S7-05.
- CI wiring (`portfolio` job, `adv-phase02` job) — S8-03.
- `stale-scip` adversarial test itself (`tests/adv/phase02/test_stale_scip_fixture.py`) — already lives in S4-02; this story only ensures it passes against the full materialization (AC-32 + AC-33), does not edit it.
- Pre-built `monorepo-pnpm/node_modules/` cache for CI speedup — explicitly out. The regen-each-run policy is what Phase 2 ships; the escape valve lives in `final-design.md §"Open questions"` #6 and triggers only on hosted-runner bench failure.
Notes for the implementer
- Risk #3 is the load-bearing risk this story defends. If a future contributor regenerates the `stale-scip` SCIP against current HEAD, the load-bearing exit criterion silently stops exercising staleness. Three layers of defense, all in this story (or inherited from S4-02):
  1. `regenerate.sh` `LAST_INDEXED` defaults to `HEAD~1` (the parent of HEAD; NEVER HEAD). The shape test's `_last_indexed_defaults_to_head_tilde_one` predicate pins this at the script-text level (AC-20).
  2. `regenerate.sh` has an explicit guard against operator-forced `LAST_INDEXED=$(git rev-parse HEAD)` — the script exits 1 with a clear error. The shape test's `_regen_refuses_current_head` predicate pins this (AC-20).
  3. The S4-02 adversarial asserts both `n >= 1` AND `last_indexed != current_HEAD` (already coded; this story's source-tree expansion preserves the non-trivial truth of both inequalities by leaving the v0/v1 commit-split mechanism intact).

  Document all three layers in `stale-scip/README.md` (AC-22). Test the regen-script refusal before opening the PR: `LAST_INDEXED=$(cd tests/fixtures/portfolio/stale-scip && bash regenerate.sh >/dev/null && git rev-parse HEAD) bash tests/fixtures/portfolio/stale-scip/regenerate.sh` and observe exit code 1. The `>/dev/null` redirect keeps any output from the first regen run out of the captured SHA.
- `monorepo-pnpm`'s `pnpm-lock.yaml` byte-stability matters for golden determinism. Pin the lockfile bytes at fixture creation: run `pnpm install` once in a scratch directory matching the manifest exactly, copy the lockfile in, commit it, and never invoke `pnpm` in `regenerate.sh` (per AC-10 + ADR-0001 — `pnpm` is NOT in `ALLOWED_BINARIES`; S7-01's `native-modules` HARDENED precedent is the model). If the public registry repushes any of `monorepo-pnpm`'s deps, the OUT-OF-BAND `pnpm install` (run in a deliberate fixture-update PR) would observe a mismatch and the contributor would re-pin the lockfile then — never silently.
- The kernel extraction in this story has been deferred from S7-01 deliberately. S7-01 had three consumers (Rule of Three boundary, not past); this story brings the count to five new + one Phase-1 = six. Six is conclusively past the rule. The kernel is the natural landing point — extract once, migrate all six consumers in one PR, observe Phase-1 regressions stay green (AC-25 + AC-36).
- Kernel location at `tests/fixtures/_shape_test_kernel.py` (above the `portfolio/` subdirectory). Phase 1's `tests/fixtures/node_typescript_helm/` fixture is OUTSIDE `portfolio/`; placing the kernel at `tests/fixtures/portfolio/_shape_test_kernel.py` would force Phase 1's shape test to import from a "portfolio" namespace it isn't part of, which is structurally awkward. The above-`portfolio/` location lets all six consumers import `from tests.fixtures._shape_test_kernel import ...` symmetrically.
- Kernel pattern choice — flat helpers vs. test factories. Two acceptable shapes:
  - Test factories (`make_existence_test`, `make_parses_test`, …): the kernel returns pytest-decorated test functions for module-level assignment in each consumer. Compact but unusual; pytest's natural module-level `@pytest.mark.parametrize` discovery is inverted.
  - Flat helpers (`assert_file_exists(fixture, spec)`, `assert_file_parses(fixture, spec)`, …): the kernel exposes pure helper functions; each consumer writes minimal `@pytest.mark.parametrize("spec", _FILE_SPECS, ids=lambda s: s.relpath) def test_fixture_file_exists(spec): assert_file_exists(_FIXTURE, spec)`. More pytest-natural; `mypy --strict`-clean without ergonomic dance; the kernel is a clean functional core.

  Validator recommends flat helpers — but factory-based is acceptable if cleaner per the implementer's read. Pick one and apply consistently across all six consumers; the AC's requirement is "structural logic lives in the kernel; consumers declare only data".
- `enumerate_tracked` is the kernel's port for `git ls-files`. Hexagonal discipline: subprocess invocation is encapsulated; consumers receive `tuple[str, ...]` of relpaths. The kernel is the ONLY call site for `run_allowlisted("git", "ls-files", str(fixture_path))` — no consumer shells out itself. This makes the kernel's I/O surface auditable in one place.
- `_FileSpec` is a frozen NamedTuple. Immutability by construction (S2-03 precedent). Don't switch to `dataclass(frozen=True)` — the existing S7-01 shape tests are NamedTuple and the migration should be import-rewrite, not constructor-rewrite.
- `_fixture_regen_allowlist.py` (S7-01) and `_shape_test_kernel.py` (this story) are SEPARATE flat modules. Different responsibilities — closed-set discovery + parametrized-test structure (kernel) vs. allowlist policy ownership for `regenerate.sh` invocations (regen-allowlist). Subsuming one into the other would conflate two cohesive responsibilities; keep them flat.
- Why no `node_modules/` under `monorepo-pnpm/`. Phase 2's `node_build_system` probe (Phase 1) reads `pnpm-lock.yaml`; it does NOT read `node_modules/`. Committing `node_modules/` would bloat the fixture by an order of magnitude AND introduce non-determinism (transitive-dep version-resolution drift). The probes that need the resolved tree (Phase 3+ adapters) reach through their adapters, not through the file system.
- `scip-typescript` version pin matters for `_seed/scip-index.scip` reproducibility. Pin the tool version used to build the seed binary (record in `stale-scip/README.md` per AC-22 + `_readme_pins_scip_typescript_version` predicate). When the production tool version updates (S4-03 records the production `scip-typescript` version pin), the seed binary may need a deliberate regen via the AC-21a seed-build ritual. The structural assertion (`CommitsBehind.n >= 1`) survives tool-version drift; the seed binary's bytes do not.
- Why the binary SCIP lives in `_seed/`, not `.codegenie/`. The existing S4-02 stub treats `.codegenie/` as a runtime-only directory (gitignored; regenerated). The seed bytes (template + binary) live in `_seed/` (committed). This split keeps the "committed contract surface" cleanly separated from "runtime materialization output". S7-01's central no-committed-cache guard rests on this split.
- Why the adversarial test does NOT consume the binary SCIP today. `tests/adv/phase02/test_stale_scip_fixture.py` reads `.codegenie/context/raw/scip.json` (materialized from `_seed/scip-slice.template.json`). The binary `_seed/scip-index.scip` is forward-looking for S4-03's `ScipIndexProbe` consumer. S7-02's binary-SCIP contribution is therefore NOT load-bearing for the current adversarial — it's load-bearing for the next-phase consumer. Document this carefully in `README.md` so a future maintainer doesn't conclude "the placeholder is fine because the adversarial passes against it."
- Phase-3 handoff note (Risk #8). `monorepo-pnpm/README.md` explicitly names this as the Phase-3 entry-gate target. When Phase 3's author lands the first `DepGraphAdapter` implementation, they will smoke against this fixture's `dep_graph` slice. Any Protocol drift between Phase 2's `Protocol` shape and Phase 3's first implementation surfaces here (in addition to S7-04's `test_phase3_handoff_smoke.py` skip-and-unskip ritual).
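The kernel notes above (frozen `_FileSpec`, a single `git ls-files` call site, flat helpers) can be sketched together as follows. This is an assumption-laden illustration, not the kernel itself: only `relpath` is a field name the story confirms, the other `FileSpec` fields are guesses, plain `subprocess.run` stands in for the project's `run_allowlisted`, and the injectable `run` parameter exists only so the sketch can be exercised without a git checkout.

```python
import subprocess
from pathlib import Path
from typing import Callable, NamedTuple, Optional, Sequence


class FileSpec(NamedTuple):
    """Immutable per-file declaration; field names other than relpath are assumed."""
    relpath: str
    probes: tuple
    parser: Optional[str]  # None marks a binary file
    predicates: tuple = ()


Runner = Callable[[Sequence[str]], str]


def _git_runner(argv: Sequence[str]) -> str:
    # Stand-in for the project's run_allowlisted wrapper.
    return subprocess.run(list(argv), check=True, capture_output=True, text=True).stdout


def enumerate_tracked(fixture_path: Path, run: Runner = _git_runner) -> tuple:
    """Single call site for git ls-files; path normalisation is left out here."""
    output = run(("git", "ls-files", str(fixture_path)))
    return tuple(sorted(line for line in output.splitlines() if line))


def assert_file_exists(fixture: Path, spec: FileSpec) -> None:
    """Flat helper: the consumer supplies data, the kernel holds the logic."""
    assert (fixture / spec.relpath).is_file(), f"missing fixture file: {spec.relpath}"


def assert_tree_is_closed_set(tracked: Sequence[str], specs: Sequence[FileSpec]) -> None:
    """The tracked files must equal the declared specs exactly."""
    declared = {spec.relpath for spec in specs}
    assert set(tracked) == declared, f"closed-set mismatch: {set(tracked) ^ declared}"
```

With flat helpers like these, each consumer module keeps only `_FIXTURE`, `_FILE_SPECS` and its predicates, plus one-line parametrized tests that delegate to the kernel.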
Patterns DELIBERATELY deferred
- Pre-built fixture caches under `tests/fixtures/portfolio/_cache/` — out of scope; regen-each-run policy is what Phase 2 ships.
- A YAML-based `MANIFEST.yaml` SSoT inside each fixture — Python-as-SSoT continues to work; lift only if a fourth consumer of the manifest appears (e.g., a build-system probe needing it at runtime).
- A second SCIP indexer (e.g., `scip-go`) for the `stale-scip` fixture — out; Phase 2 fixtures are TypeScript-only. Phase 6+ may introduce a polyglot variant.
- A `git` history visualization committed alongside the fixture — out; the README's prose is enough.