S2-04: attempt log
Append-only journal. Newest attempt at the bottom.
Attempt 1: 2026-05-19 (phase-story-executor)
Outcome: GREEN. All 19 ACs covered with runtime evidence; full
plugin + fence + static test suite green; ruff, mypy --strict,
lint-imports, mkdocs build --strict clean.
What shipped
- `src/codegenie/plugins/resolver.py` (NEW): `UNIVERSAL_FALLBACK_ID`, `_MAX_EXTENDS_DEPTH`, `ScopedCandidate`, `ComposedTccm`, `ConcreteResolution`, `UniversalFallbackResolution`, `PluginResolution` discriminated-union alias, `lift_manifest_scope`, `_lift_dim`, `_lift_candidates`, `_unpack`, `_filter_matches`, `_sort_key`, `_candidates_considered`, `compose_extends_chain`, `_plugin_tccm`, `_merge_tccm`, `_merge_adapters`, `resolve`, `_universal_registered`. Functional-core/imperative-shell split per the hardening note.
- `src/codegenie/plugins/registry.py`: replaced the S2-01 `NotImplementedError("S2-04 …")` stub with delegation to `resolver.resolve`. The resolver import is module-level (not lazy); see Refactor decisions §"Module-level resolver import" for why.
- `src/codegenie/plugins/resolution.py`: collapsed to a one-line re-export shim per the story's "Module placement" choice (a).
- `src/codegenie/plugins/errors.py`: added `chain` payload + exit code to `PluginExtendsCycle`; new `PluginExtendsDepthExceeded` exception (with `chain` + `reason` ClassVar); new `PluginRegistryCorrupted(reason: Literal["missing_universal", "empty_registry"])`.
- `src/codegenie/plugins/protocols.py`: replaced the stub `PluginManifest` forward-ref with a real `TYPE_CHECKING` import of the S2-02 model. Required to satisfy `mypy --strict` on the resolver's `plugin.manifest.{scope,extends,precedence}` reads.
- `tests/fixtures/plugins/fake_plugin.py`: extended with `extends`, `precedence`, `manifest_scope_kwargs`, `adapters_map` kwargs. Default manifest scope changed to `(vuln, node, npm)` so resolver tests can call `make_fake_plugin(name="...")` without repeating the kwargs. Uses `PluginManifest.model_construct` to skip the production `parse_plugin_id` regex (test fakes use names like `a-plugin` that don't match the three-segment production format).
- `tests/fixtures/plugins/universal_fallback_fixture.py` (NEW): `make_universal_fallback()`. Uses `PluginManifest.model_construct` to skip the regex that rejects `*` in plugin ids; the literal `"universal--*--*"` lives only in this fixture and in `resolver.py`'s `UNIVERSAL_FALLBACK_ID` constant.
- `tests/unit/plugins/test_resolver.py`: 20 tests covering AC-15 enumeration (13 named cases) + AC-1 `__all__` + AC-3 frozen dataclass + AC-12 registry delegation + empty-registry defence.
- `tests/unit/plugins/test_resolver_property.py`: Hypothesis property `test_resolve_is_total` (200 examples, `deadline=None`) + meta-property `test_property_strategy_never_generates_universal_id`.
- `tests/unit/plugins/test_resolver_purity.py`: AC-13 module-purity AST scan + AC-17 single-source-of-truth scan for integer literal `4` (mutation M18).
- `tests/unit/plugins/test_resolver_exhaustiveness.py`: AC-14 AST scan for the `case _: assert_never(...)` arm + mypy-checked `_dispatch_example(resolution: PluginResolution) -> str` helper.
- `tests/static/test_universal_fallback_id_single_source.py`: AC-2 AST/grep scan for the `"universal--*--*"` literal across `src/codegenie/` and `tests/fixtures/plugins/`.
- `tests/static/test_no_notimplemented_in_registry.py`: AC-12 static scan that the S2-01 stub message is gone.
- `tests/unit/plugins/test_registry.py`: flipped `test_resolve_stub_names_s2_04` (which forward-referenced the S2-01 stub) into `test_resolve_delegates_to_resolver`, proving the delegation now surfaces `PluginRegistryCorrupted("empty_registry")` on an empty registry.
Gate log
| Gate | Outcome | Notes |
|---|---|---|
| `ruff check .` | ✅ pass | All checks passed! |
| `ruff format --check .` | ✅ pass | 1643 files already formatted |
| `mypy --strict src/` | ✅ pass | Success: no issues found in 154 source files (+1 over prior: `resolver.py`; plus `resolution.py` collapsed to a shim, `protocols.py` import widened). |
| `lint-imports` | ✅ pass | Contracts: 4 kept, 0 broken. |
| `make fence` equivalent | ✅ pass (modulo pre-existing xfails) | 191 passed, 28 skipped, 2 xfailed. `test_no_any_in_plugin_surface` initially failed because the resolver used `dict[PrimitiveName, Any]`; fixed by switching to the real `Adapter` Protocol (now part of the Phase-3 surface contract). |
| `mkdocs build --strict` | ✅ pass | Documentation built in 22.36 seconds. |
| Plugin tests | ✅ pass | `tests/unit/plugins/` + `tests/static/test_*plugin*`: 155 passed. |
| Full pytest (no-cov) | partial | 4427 passed, 2 failed, 62 skipped, 5 xfailed. The 2 failures are the pre-existing `test_lint_imports_canary` tests: they look for `lint-imports` on the global PATH (not `.venv/bin`), while CI runs `make lint-imports` via the venv binary, so they are clean in CI. Documented as unrelated in the S2-03 attempt log. |
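The `Any` to `Adapter` switch noted in the fence row can be sketched as follows. This is a rough illustration only: `PrimitiveName` is assumed to be a plain `str` here, `FakeAdapter` stands in for the test-local `_FakeAdapter`, and `index_adapters` is a hypothetical helper that does not exist in the codebase.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Dict, Iterable, Protocol, runtime_checkable

PrimitiveName = str  # assumption: the real type may be stricter than str


@runtime_checkable
class Adapter(Protocol):
    # The one attribute the log says the Protocol requires.
    primitive: PrimitiveName


@dataclass
class FakeAdapter:
    # Structural match: no inheritance from Adapter is needed.
    primitive: PrimitiveName


def index_adapters(adapters: Iterable[Adapter]) -> Dict[PrimitiveName, Adapter]:
    """Hypothetical helper: key each adapter by its primitive name."""
    return {adapter.primitive: adapter for adapter in adapters}
```

Typing the map as `Dict[PrimitiveName, Adapter]` keeps `Any` out of the surface while still letting test fakes participate structurally.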
Ralph-Wiggum naive-verification pass
Walked every AC verbatim against runtime behaviour:
- AC-2 single source: "If somebody writes the literal anywhere else in src or fixtures, does the scan catch them?" Planted a stray `# "universal--*--*"` in `errors.py` initially (in a docstring); the scan flagged it. Rewrote the docstring to reference `UNIVERSAL_FALLBACK_ID`. PASS.
- AC-7 fan-out: "If I give you `languages=['node', 'python']` and `build_systems='*'`, do you give me exactly 2 PluginScopes with the right shapes?" Parametrized table with 4 cases including the universal `(*, *, *) → 1` corner. PASS.
- AC-9 step 4 missing-universal: "If no plugin matches and the universal isn't registered, do you fail loud?" `PluginRegistryCorrupted(reason="missing_universal")` raises with the typed reason. PASS.
- AC-9 step 6 head==universal: "If the universal is the only thing in the registry, do you correctly return the fallback (not raise corrupted)?" `test_only_universal_registered_returns_universal_fallback` exercises step 6 specifically (mutation M8 defence). PASS.
- AC-11 cycle: "If A extends B extends A, do you tell me the cycle path with A repeated at the tail?" `chain == (A, B, A)` verified by full-tuple equality. PASS.
- AC-11 depth-cap: "If A → B → C → D → E, do you refuse at the point of the 5th level?" `chain == (A, B, C, D, E)`; depth-4 variant passes. PASS. Mutation M7 (`>` vs `>=`) would let depth-5 through; the test catches it.
- AC-14 exhaustiveness: "If somebody adds a third PluginResolution variant tomorrow, will mypy yell?" Yes: the `_dispatch_example` helper's `case _: assert_never(resolution)` arm forces a type-check failure on any new variant that isn't added to the dispatch table. PASS.
- AC-16 totality: "For 200 random registries + random incoming scopes, does the resolver always return one of the two real variants?" Exhaustive `match` over `PluginResolution` with `assert_never` proves it at the type AND the runtime layer. PASS.
- AC-17 magic-number removal: "If somebody bumps the cap to 10 in one place but leaves a stray `4` in another, does the scan catch it?" Planted `if x == 4:` in `resolver.py` temporarily; the scan flagged 2 occurrences (the constant + the planted use). PASS after removing the planted code.
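The AC-11 cycle and depth-cap checks can be reproduced with a small sketch. It assumes the cap is 4 (consistent with the depth-4 variant passing and the 5-level chain being refused) and models `extends` as a plain name-to-parent mapping; the exception classes are simplified stand-ins for the real ones in `errors.py`, and a parent that is absent from the mapping is treated as a chain root, which is a simplification.

```python
from __future__ import annotations

from typing import Mapping, Optional, Tuple

_MAX_EXTENDS_DEPTH = 4  # assumption: single source of truth for the cap


class PluginExtendsCycle(Exception):
    """Stand-in: carries the cycle path with the repeated name at the tail."""

    def __init__(self, chain: Tuple[str, ...]) -> None:
        super().__init__(" -> ".join(chain))
        self.chain = chain


class PluginExtendsDepthExceeded(Exception):
    """Stand-in: carries the chain up to and including the offending level."""

    def __init__(self, chain: Tuple[str, ...]) -> None:
        super().__init__(" -> ".join(chain))
        self.chain = chain


def compose_extends_chain(
    start: str, extends: Mapping[str, Optional[str]]
) -> Tuple[str, ...]:
    """Walk `extends` links from `start`, refusing cycles and over-deep chains."""
    chain = [start]
    current = start
    while (parent := extends.get(current)) is not None:
        if parent in chain:
            # Report the full path with the repeated plugin appended.
            raise PluginExtendsCycle(tuple(chain) + (parent,))
        chain.append(parent)
        # Strict comparison: exactly _MAX_EXTENDS_DEPTH levels are allowed.
        if len(chain) > _MAX_EXTENDS_DEPTH:
            raise PluginExtendsDepthExceeded(tuple(chain))
        current = parent
    return tuple(chain)
```

In this sketch the boundary sits in the single `len(chain) > _MAX_EXTENDS_DEPTH` comparison, so an off-by-one there is exactly what a paired depth-4 and depth-5 test would expose.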
Refactor decisions (Rule 3: surgical)
- `PluginExtendsDepthExceeded` is a `CodegenieError`, NOT a `PluginRejected` BaseModel variant. Story AC-11 wording says "raise `PluginRejected(reason="extends_depth_exceeded", ...)`", but `PluginRejected` is a `TypeAlias` for a discriminated Pydantic union, and BaseModels cannot be raised. I read the contradiction as: the resolver needs a distinct exception class whose payload (`reason`, `chain`) matches the proposed variant shape. A future loader-time pre-check (S2-03 successor) can add the equivalent BaseModel variant to `PluginRejected` for its own `Result[X, PluginRejected]` return type. Documented in the `PluginExtendsDepthExceeded` class docstring; tests assert `raises(PluginExtendsDepthExceeded)` + payload.
- Module-level resolver import in `registry.py`. The original cycle (registry ↔ resolver at module load) is already broken by the `TYPE_CHECKING` import of `PluginRegistry` inside `resolver.py`. At runtime `resolver.py` imports zero of `registry.py`'s members. Resolution order: `registry.py` → `resolution.py` → `resolver.py` → done. The `from codegenie.plugins.resolver import resolve as _resolver_resolve` lives at module level and is bound at registry-load time. Why this matters: the fence test `tests/fence/test_no_llm_in_transforms.py` pops `codegenie.plugins.*` from `sys.modules` and walks the package to re-import every submodule. A lazy `from codegenie.plugins import resolver as _resolver` inside `resolve()` would fetch the new (C2) module after the fence pop, while the test's `Concrete()` bindings still hold the old (C1) class, leading to `assert_never(dim)` on the C1/C2 mismatch. The module-level bind freezes the lookup at `registry.py` load time, so `_resolver_resolve` and the test's `Concrete` are class-identity consistent. Documented in the `PluginRegistry.resolve` docstring.
- `tests/fixtures/plugins/fake_plugin.py` default scope changed to `(vulnerability-remediation, node, npm)`. Previously `(vulnerability-remediation, javascript, npm)`. The resolver tests use `(vuln, node, npm)` as the canonical incoming scope; matching the default avoids forcing every test that wants a matching plugin to repeat the kwargs. Documented inline.
- `make_fake_plugin` uses `PluginManifest.model_construct`. The S2-02 `parse_plugin_id` validator requires three `--`-separated segments (e.g., `vulnerability-remediation--node--npm`); resolver tests use semantic names like `a-plugin` for sort tie-breakers. `model_construct` skips the validator. The manifest loader's own tests (`test_manifest.py`) exercise the production regex; resolver tests should not be coupled to it.
- `tests/unit/plugins/test_registry.py` imports moved to module-level. The S2-04 update of `test_resolve_stub_names_s2_04` → `test_resolve_delegates_to_resolver` originally used function-body local imports. Local imports re-fetch from `sys.modules` at the time of the call; after the fence test pops modules, local imports return the C2 classes while the test's existing module globals hold C1. Moving `Concrete`, `PluginScope`, and `PluginRegistryCorrupted` to module-level bindings makes them consistent with the rest of the file.
- `Adapter` Protocol used as the runtime adapter-value type, not `Any`. The fence `test_no_any_in_plugin_surface` forbids `Any` in Phase-3 surface. The `_FakeAdapter` test-local dataclass satisfies the Protocol's one attribute (`primitive: PrimitiveName`); the resolver's composed map is now `dict[PrimitiveName, Adapter]`.
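The C1/C2 class-identity drift behind the two import decisions above can be reproduced in isolation. The sketch below is a standalone demonstration using a throwaway module; the module name `demo_scope_mod` and the function name are hypothetical and nothing here touches `codegenie` itself.

```python
import importlib
import sys
import tempfile
from pathlib import Path
from typing import Tuple


def demo_class_identity_drift() -> Tuple[bool, bool]:
    """Return (same_class, isinstance_crosses) after a sys.modules pop."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "demo_scope_mod.py").write_text(
            "class Concrete:\n    pass\n", encoding="utf-8"
        )
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            first = importlib.import_module("demo_scope_mod")
            concrete_c1 = first.Concrete  # binding held from before the pop

            # Simulate a fence-style purge, then a lazy import that re-fetches.
            sys.modules.pop("demo_scope_mod")
            second = importlib.import_module("demo_scope_mod")
            concrete_c2 = second.Concrete  # fresh class object, same source

            same_class = concrete_c1 is concrete_c2
            isinstance_crosses = isinstance(concrete_c1(), concrete_c2)
            return same_class, isinstance_crosses
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("demo_scope_mod", None)
```

Both values come back `False`: the re-imported module defines a new class object, so an instance built from the old binding fails `isinstance` checks and class patterns against the new one. That is the kind of mismatch that would land in a catch-all `assert_never` arm.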
Follow-ups surfaced (not folded in, per Rule 3)
- `ComposedTccm` real shape: S3-01 lands the real `TCCM` Pydantic per Phase-3 ADR-0004. The substitution point is documented on `ComposedTccm`'s docstring and the resolver's `_plugin_tccm` hook (reads `plugin._composed_tccm` if present, else empty placeholder).
- `PluginRegistryCorrupted` event-log emission: S6-01 wires the spanning-event. This story raises the typed exception only.
- Loader-time `extends` cycle / depth / target-missing pre-check: additive to S2-03; the resolver's per-resolve checks are the contract. The pre-check is a fail-fast optimization.
- Universal-fallback `parse_plugin_id` allowance: the regex rejects `universal--*--*`. The S7-03 real fallback plugin will need either (a) a regex amendment + ADR, or (b) a separate loader path. Out of scope here; tests use `model_construct`.
- `tests/unit/test_lint_imports_canary` env drift: these tests look for `lint-imports` on the global PATH. They were pre-existing local failures named in the S2-03 attempt log; CI runs the venv-binary'd `make lint-imports`, which is clean. The canary tests could be hardened to look in `.venv/bin` first.
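One possible shape for the canary hardening in the last follow-up is sketched below. It is an assumption-laden illustration: `find_tool` is a hypothetical helper, the `.venv/bin` layout is taken from this log and assumes a POSIX-style virtualenv, and the real canary tests may resolve the binary differently.

```python
import shutil
from pathlib import Path
from typing import Optional


def find_tool(name: str, project_root: Path) -> Optional[str]:
    """Prefer the project's venv binary, then fall back to the global PATH."""
    venv_bin = project_root / ".venv" / "bin"
    # shutil.which accepts an explicit search path string via `path=`.
    in_venv = shutil.which(name, path=str(venv_bin))
    if in_venv is not None:
        return in_venv
    return shutil.which(name)
```

A canary test could then call `find_tool("lint-imports", repo_root)` and skip, rather than fail, when the result is `None`.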