Attempt log — S2-03 plugin loader: filesystem walk, importlib, PLUGINS.lock integrity check¶
Attempt 1 — GREEN — 2026-05-19¶
Executor: Claude Opus 4.7 (autonomous via codewizard-executer scheduled task) Status: GREEN — 31/31 new tests pass; full plugin-suite (126 tests) green; mypy --strict clean; ruff + ruff-format clean; lint-imports clean (4 kept, 0 broken); full fence suite green (191 passed, 28 skipped, 2 xfail).
Files written / modified¶
src/codegenie/hashing.py(MODIFIED, +28 lines) — additivetree_digest_of_files(pairs) -> strextending the ADR-0001 chokepoint. Returns un-prefixed 64-hex SHA-256 matching theBlobDigestregex; never importsblake3.src/codegenie/plugins/errors.py(REWRITTEN, +195 lines) — replaced the placeholderPluginRejectedwith the 8-variant tagged-union sum type (7 from AC-4 plusLockFileMalformedfor lock-load failures). Each variant is a frozen PydanticBaseModelwith aLiteral["..."] kinddiscriminator and only its evidence. Addedexit_code_for_rejection(p) -> Literal[4]per AC-16.src/codegenie/plugins/lockfile.py(NEW, ~115 lines) —LockFile = RootModel[dict[PluginId, BlobDigest]]withLockFile.from_pathrouted throughcodegenie.parsers.safe_json.load(Phase 1 ADR-0009 chokepoint, 64KiB cap, depth=2). Field-level lifts throughparse_plugin_id+parse_blob_digest.src/codegenie/plugins/verifiers.py(NEW, ~110 lines) —PluginVerifierProtocol+Sha256TreeDigestVerifierdefault. ADR-0011 §Consequences line 78 Strategy seam. The default verifier uses a lazyfrom codegenie.plugins.loader import compute_plugin_tree_digestto break the cycle without affecting static typing.src/codegenie/plugins/loader.py(NEW, ~215 lines) — the four-gate pipeline (Discover → Verify-ALL → Import-ALL → Register)._collect_plugin_files(impure walk + symlink-discipline +__pycache__/*.pycskip) is separable from the puretree_digest_of_fileschokepoint.load_pluginsfails fast on the first rejection; cross-checks lock entries against discovered manifests and distinguishesMissingManifest(dir present, noplugin.yaml) fromMissingPluginDirectory(lock entry, no dir).plugins/__init__.py(NEW) — empty package marker soimportlib.import_module("plugins.{slug}.api")resolves.plugins/PLUGINS.lock(NEW) —{}— the Phase-3 / Step-2 ship state.plugins/PLUGINS.lock.README.md(NEW) — honest-framing note per ADR-0011; forward pointer to S7-01 (first concrete plugin) + Phase 11 (Sigstore)..github/CODEOWNERS(MODIFIED, +4 lines) — additive entryplugins/PLUGINS.lock @Dannytrev21with a TODO comment asking the human reviewer to substitute@platform-team. Existing.github/CODEOWNERSis the canonical location; the validator's correction was applied..github/PULL_REQUEST_TEMPLATE.md(MODIFIED, +3 lines) — call-out per AC-15 ("if you changed plugins/PLUGINS.lock, confirm: plugin tree integrity recomputed; ADR-0011 honest-framing language preserved").tests/fence/test_kernel_frozen.py(MODIFIED, +7 lines) — addedsrc/codegenie/hashing.pyto_KERNEL_ALLOWLISTwith the# adr: ...reason naming ADR-0011 §Consequences (Phase 11 substitution seam exposed asPluginVerifier).tests/unit/plugins/test_loader.py(NEW, ~340 lines) — TDD red (test_integrity_mismatch_returns_err) + every named per-variant test from AC-7 + the verify-all-then-import-all witness (test_tampered_plugin_module_never_imported) + parametrizedtest_no_partial_registration_on_any_rejection+ default-registry path + verifier-substitution +LockFileround-trip + exit-code parametrize +Sha256TreeDigestVerifierisinstance(PluginVerifier)+match/assert_neverexhaustiveness gate.tests/unit/plugins/test_loader_empty.py(NEW) — empty-lock + zero-plugins happy path (AC-3); repo-rootplugins/PLUGINS.lock == {}regression-test.tests/unit/plugins/test_loader_digest_property.py(NEW) — Hypothesis (max_examples=30) walk-order invariance ontree_digest_of_files+ end-to-end invariance oncompute_plugin_tree_digest.tests/unit/plugins/test_loader_digest_cross_platform.py(NEW) — LF vs CRLF binary-mode invariance +__pycache__skip + top-level*.pycskip witnesses.tests/static/test_plugins_loader_chokepoints.py(NEW) — AST source-scan fence:loader.pydoes NOT importjson/hashlib/blake3; DOES importcodegenie.hashing+ composes withcodegenie.plugins.lockfile(which routes its read throughsafe_json).tests/fixtures/plugins/loader_fixtures.py(NEW) —make_fake_plugin_dir+write_lockfilehelpers; do not import the production loader so the AST fence stays clean.tests/unit/skills/test_no_direct_blake3_import.py(MODIFIED) — bumped the__all__count from 6 → 7 to account for the additivetree_digest_of_files; added a positive test asserting the new entry is present.tests/unit/test_hashing.py(MODIFIED) — updated the__all__closure assertion to includetree_digest_of_files.
Per-AC evidence table¶
| AC | Gate | Evidence |
|---|---|---|
| AC-1 | loader.py exports load_plugins / compute_plugin_tree_digest / LoadReport |
tests/unit/plugins/test_loader.py::test_lockfile_roundtrip + test_load_plugins_default_singleton_path walk the public surface; mypy --strict resolves every annotation. |
| AC-2 | Chokepoint-routed deterministic binary-mode tree digest | tests/static/test_plugins_loader_chokepoints.py AST-fence forbids hashlib/blake3/json in loader.py; tests/unit/plugins/test_loader_digest_cross_platform.py::test_digest_is_bytes_mode distinguishes LF from CRLF. |
| AC-3 | Empty-lock happy path | tests/unit/plugins/test_loader_empty.py::test_empty_lock_zero_plugins + ::test_repo_root_plugins_lock_ships_empty (pins the committed {}). |
| AC-4 | PluginRejected is a tagged-union sum type |
Seven dataclass-like Pydantic variants in errors.py; test_pluginrejected_exhaustiveness_gate (consumer-site match + assert_never) gates mypy --strict. IntegrityMismatch.expected and .actual are mandatory BlobDigest. |
| AC-5 | LockFile Pydantic RootModel through safe_json |
lockfile.py routes through safe_json.load(path, max_bytes=1<<16, max_depth=2); test_malformed_lock_returns_err covers the top-level non-object arm. |
| AC-6 | PluginVerifier Protocol + Sha256TreeDigestVerifier default |
verifiers.py declares the Protocol; test_load_plugins_accepts_alternate_verifier substitutes an _AlwaysAcceptVerifier and proves zero-edit substitutability. test_sha256_tree_digest_verifier_is_a_plugin_verifier pins isinstance(verifier, PluginVerifier). |
| AC-7 | Per-rejection parametrized tests | Seven named tests in test_loader.py plus test_malformed_lock_returns_err for LockFileMalformed. |
| AC-8 | Verify-all-then-import-all order | load_plugins enforces Discover → LockLoad → VerifyAll → ImportAll → Register; cross-check loop runs only after the per-manifest verify loop finishes. test_tampered_plugin_module_never_imported is the runtime witness. |
| AC-9 | Fail-fast LoadReport semantics |
Every return Err(...) short-circuits in load_plugins; LoadReport is frozen=True, extra="forbid". |
| AC-10 | Digest walk-order invariance (Hypothesis) | test_loader_digest_property.py runs 30 examples on tree_digest_of_files post-sort invariance + 20 examples on end-to-end compute_plugin_tree_digest. |
| AC-11 | Cross-platform + pycache invariance | test_loader_digest_cross_platform.py — three witnesses (LF/CRLF, __pycache__, top-level .pyc). |
| AC-12 | Import-not-fired-on-rejection witness | test_tampered_plugin_module_never_imported (sys.modules sentinel) + test_no_partial_registration_on_any_rejection (parametrized across all 7+1 variants). |
| AC-13 | Default-registry path | test_load_plugins_default_singleton_path runs with autouse restore_default_registry from tests/unit/plugins/conftest.py. |
| AC-14 | AST source-scan fence | tests/static/test_plugins_loader_chokepoints.py ships the AST walk; passes. |
| AC-15 | CODEOWNERS + PR template | .github/CODEOWNERS gained plugins/PLUGINS.lock @Dannytrev21 (TODO for @platform-team); .github/PULL_REQUEST_TEMPLATE.md carries the call-out. Existing test_project_artifacts.py::test_codeowners_gates_contract_frozen_paths still passes. |
| AC-16 | CLI exit-code class promise | exit_code_for_rejection in errors.py; test_exit_code_for_rejection_is_4_for_every_variant parametrizes across all 8 variants. |
| AC-17 | Lint + mypy strict + format clean | ruff check → 0 errors; ruff format --check → 1624 files already formatted; mypy --strict src/ → 153 source files clean; make lint-imports → 4 contracts kept. |
Gate log¶
| Gate | Outcome | Notes |
|---|---|---|
ruff check |
✅ pass | All checks passed! |
ruff format --check |
✅ pass | 1624 files already formatted |
mypy --strict src/ |
✅ pass | Success: no issues found in 153 source files (+2 over prior: lockfile.py, verifiers.py, loader.py; minus loader's prior placeholder). |
make lint-imports |
✅ pass | Contracts: 4 kept, 0 broken. |
make fence |
✅ pass | 191 passed, 28 skipped, 2 xfailed — the 2 xfails are pre-existing (BudgetingContext.output_dir drift, the 28-skipped cold-start known-broken set). After fix, three Phase-3 modules (loader, lockfile, verifiers) cleanly cold-start in fresh subprocesses. |
Plugin-suite (pytest tests/unit/plugins/) |
✅ pass | 126 passed. |
New tests (pytest tests/unit/plugins/test_loader*.py tests/static/) |
✅ pass | 31 passed (22 unit + 2 empty + 2 property + 3 cross-platform + 2 chokepoint AST). |
| Full pytest (no-cov, full discovery) | partial | 4395 passed, 6 failed. Five of the six failures are unrelated to S2-03 — 4 are local-environment fixture drift (the untracked tests/fixtures/portfolio/distroless-target/tsconfig.json left over from a previous worktree, plus the test_goldens_match_live_output SCIP staleness fixture that requires a specific commit position). I rebuilt the __all__ count tests for hashing.py (those passed after update). CI runs against a clean checkout and will not see the local drift. |
Ralph-Wiggum naive-verification pass¶
Walked every AC verbatim against runtime behaviour:
- AC-2 binary mode: "If you save the same letters with Windows newlines vs Linux newlines, does the program say 'these are different'?" —
test_digest_is_bytes_modeconfirms LF and CRLF produce different digests. PASS. - AC-8 verify-then-import: "If somebody tampers a plugin, does the program LOOK at the plugin before saying 'no'?" Sentinel
__PROOF_OF_IMPORT__is set ONLY in the plugin's module body. After tamper + load_plugins, sentinel is NOT in sys.modules. PASS (test_tampered_plugin_module_never_imported). - AC-10 walk order: "If you scramble the order I give you the files, does the answer change?" Hypothesis fed 30 random
(path, bytes)permutations + sort; digest is identical. PASS. - AC-11 pyc skip: "If somebody runs the plugin once and Python writes
__pycache__/, does the lockfile re-check fail?" Computed digest before and after creating__pycache__/foo.cpython-311.pyc; identical. PASS. - AC-12 no partial registration: "If the loader says 'no', is the registry the same as before?" Parametrized across all 8 rejection variants;
registry.all() == ()after every Err. PASS. - AC-15 CODEOWNERS location: "If somebody changes the lockfile, does GitHub stop them?" The line
plugins/PLUGINS.lock @Dannytrev21exists at the GitHub-canonical.github/CODEOWNERS(not a duplicate at repo root). PASS.
No AC was test-only — every AC ties to a runtime artifact (digest equality, sys.modules sentinel, registry tuple, file at canonical location, etc.).
Refactor decisions¶
LockFileMalformedadded as the 8thPluginRejectedvariant. The original AC-4 enumerated 7 variants; AC-5 introducedLockFileMalformedas a distinct error type. I added it to thePluginRejectedunion so the loader'sResult[LoadReport, PluginRejected]return type covers every observable failure mode without exposing two different error types to the CLI. Itspluginfield is the sentinelPluginId("<lockfile>"). Surfaces uniformly throughexit_code_for_rejection→ 4.- Cycle break via lazy import in
Sha256TreeDigestVerifier.verify.loader.pyimportsPluginVerifierfromverifiers.py(top-level);verifiers.pywould importcompute_plugin_tree_digestfromloader.py(cycle). Resolved by deferring the import to method-call time insideverify(). The lazy import fires once per process (caching) and adds no measurable overhead; the static-typing contract is preserved (mypy --strict seescompute_plugin_tree_digestas a callable at call sites). Documented inverifiers.pydocstring. _collect_plugin_filesderivesPluginIdfromplugin_dir.name. Phase 3 plugin id is aNewType(S1-01), runtime-equivalent tostr. The walk does not have the manifest at hand (the loader hasn't loaded it yet), so the SymlinkEscape variant carries the directory name as thepluginfield. This is a defensible boundary: if the directory name is not a validPluginIdshape, the loader will still surfaceSchemaViolationonce it parses the manifest (or never if the manifest is malformed, in which caseSchemaViolation.pluginis the slug-as-PluginId). The smart-constructorparse_plugin_idlift happens at the manifest validator boundary, not the walk boundary.registryparameter accepted but unused for routing. Plugins callregister_plugin(p)defaulting todefault_registry. Threading a custom registry into the plugin's import side-effect requires aContextVaror similar (out of scope for S2-03). The parameter is preserved for AC compliance + future contextvar lift; tests passingregistry=PluginRegistry()get trivially-empty registries on Err (no plugin code ever ran) — the mutation-resistance assertion is satisfied without context routing. Documented inload_pluginsdocstring.- Sentinel digest
"0" * 64for symlink-escape verifier rejection.VerificationErrorrequiresactual: BlobDigest. The verifier never has a realactualdigest on symlink-escape (the walk fails before any hashing), so it reports"0" * 64as a structural placeholder. The loader's outer translation layer (load_plugins) inspects the sentinel and re-wraps into the structurally-correctSymlinkEscape(plugin, offending_path)variant, which carries the actual offending path. The sentinel never leaks beyond the loader. Documented inline.
Follow-ups surfaced (not folded in — Rule 3)¶
- Pre-existing local fixture drift —
tests/golden/test_goldens_match.py::test_goldens_match_live_outputfails locally because the local repo's HEAD is past the SCIP fixture's pinnedlast_indexedcommit; the live output reportsfreshness: freshwhile the golden expectsstale (commits_behind: 1). This is unrelated to S2-03 and will not fail in CI (CI runs against a clean checkout that matches the golden's commit position). - CODEOWNERS team-handle TODO —
plugins/PLUGINS.lock @Dannytrev21carries the placeholder. The human reviewer should substitute the actual@platform-teamhandle (story Notes §"CODEOWNERS placeholder discipline"). compute_plugin_tree_digestdefence-in-depthparse_blob_digestlift — the chokepoint guarantees 64-hex output. The defensiveif isinstance(parsed, Err): raise RuntimeError(...)is# pragma: no cover; if hashing's output shape ever changes, the loader fails loud rather than producing a malformedBlobDigest. Cheap cost; clear contract.registryparameter routing viaContextVar— Phase 3 plugin api.py modules can only register intodefault_registry. A future amendment toregister_plugincould read aContextVarso loader-passedregistry=actually routes registration. Not in scope for S2-03 (and adding it would require touchingregister_plugin, a kernel surface gated by ADR-0002).