Story S2-04 — Plugin resolver: (specificity, precedence, name) ordering + extends walker + UniversalFallbackResolution
Step: Step 2 — Plugin Registry kernel, manifest schema, loader, resolver
Status: Done — GREEN 2026-05-19 (phase-story-executor; see _attempts/S2-04.md for the per-AC evidence table + gate log)
Effort: M
Depends on: S1-02 (PluginScope), S1-03 (tagged-union discipline), S2-01 (kernel + resolve NotImplementedError stub), S2-02 (ManifestScope shape — pre-lift), S2-03 (loader + PluginRejected)
ADRs honored: ADR-0002, ADR-0003, ADR-0010, production ADR-0031, production ADR-0009
Validation notes (2026-05-18)
Hardened by the phase-story-validator skill (autonomous run via the story-validation-corrector scheduled task). Full audit at _validation/S2-04-plugin-resolver-extends.md. Substantive changes:
- `ManifestScope → PluginScope` lift added as a load-bearing first step. The original story called `plugin.scope.matches(scope)` directly. Per S2-02's hardened output (and the arch follow-up logged in `_validation/S2-02-plugin-manifest-pydantic.md §Arch amendments`), the load-time `PluginManifest.scope` is a `ManifestScope` whose dims are `str | list[str]`. The resolver is the lift point: a new `lift_manifest_scope(manifest_scope) -> tuple[PluginScope, ...]` fans out lists into the cross-product of single-dim `PluginScope` instances. The sort and filter operate on a typed `ScopedCandidate` (plugin + one lifted `PluginScope`), not on raw `Plugin`.
- `matches` signature aligned with S1-02. S1-02 ships `PluginScope.matches(*, task: str, language: str, build: str) -> bool` (keyword-only). The resolver receives a `PluginScope` (from the orchestrator's repo-context-derived scope) and queries each lifted candidate via `candidate.lifted_scope.matches(task=..., language=..., build=...)` after unpacking the incoming scope's Concrete dims. Wildcards on the incoming scope are treated as "operator did not specify" — see Notes §"Incoming scope discipline".
- `UNIVERSAL_FALLBACK_ID: Final[PluginId]` sentinel. ADR-0003 §Decision step 3 says "if the head plugin's id is `universal--*--*`". Story originally inlined the literal; hardened to a module-level `Final[PluginId]` so that S7-03's real fallback plugin, the `make_universal_fallback` fixture, and the loader's startup check all reference one symbol. (Notes §"Why not parameterize" preserves the no-config-knob discipline.)
- `ConcreteResolution.matched_scope: PluginScope` added. Arch line 779 names this field; original AC omitted it. The matched scope is the single lifted `PluginScope` that filtered through — useful for audit / event payloads and for the property test's "concrete resolution's `matched_scope.matches(...)`" totality check.
- `candidates_considered` semantics pinned. ADR-0003 §Consequences row 5: "concrete plugins were filtered out". Story originally said "filtered-but-not-chosen concrete plugins" with ambiguous semantics. Hardened: `candidates_considered` is the alphabetized `tuple[PluginId, ...]` of every concrete `plugin.manifest.name` that was in `registry.all()` but none of whose lifted scopes matched the incoming scope. Excludes the universal plugin itself; tuple (not list) so the field is hashable + immutable.
- TDD plan parametrized. Original story had one red test. Hardened: parametrized table with eleven named tests covering specificity ordering, precedence tie-break, name tie-break, fan-out, extends composition (TCCM + adapters left-to-right merge), cycle detection, depth-cap, universal fallback (no concrete match), universal-only registry, missing-universal corruption, and missing-extends-target error. Mutation kill-list enumerated.
- Property test pinned. Strategy must NOT generate `UNIVERSAL_FALLBACK_ID` as a concrete plugin name; ≥200 examples; `deadline=None` to avoid CI flakes; `assert_never` exhaustive `match` over `PluginResolution` at the assertion site (mutation-resistance for "added a third variant" refactor).
- Functional core / imperative shell split. `resolve` is decomposed into pure helpers: `_lift_candidates(plugins) -> tuple[ScopedCandidate, ...]`, `_filter_matches(candidates, scope) -> tuple[ScopedCandidate, ...]`, `_sort_by_keys(matches) -> tuple[ScopedCandidate, ...]`, `_compose_extends_chain(head, registry) -> ConcreteResolution`. Public `resolve(registry, scope)` composes them; the only I/O is whatever the registry holds (in-memory, deterministic).
- `PluginRegistryCorrupted` clarified. ADR-0003 §Consequences line 50 names a spanning event; phase-arch §C2 line 483 lists the exit code 4 family. This story raises a typed exception `PluginRegistryCorrupted(reason: Literal["missing_universal", "empty_registry"])`; the event emission is S6-01's concern (Out-of-scope expanded).
- Magic-number removal. `max_depth=4` becomes `_MAX_EXTENDS_DEPTH: Final[int] = 4` at module level, with an ADR-0003 §Tradeoffs comment naming the empirical basis.
The original story's substance (algorithm, totality property, cycle detection, universal-fallback-as-plugin discipline) is preserved. The validator only tightened ACs into individually verifiable assertions, pinned the ManifestScope → PluginScope lift seam that the rest of Phase 3 already implicitly committed to, added missing-edge-case ACs (universal-only, missing-extends-target, fan-out, no-incoming-wildcard), rewrote the TDD plan into a parametrized table with eleven concrete tests, and refactored the implementation outline into a functional-core / imperative-shell split. No scope change.
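The fan-out described in the first validation note can be illustrated with a minimal sketch. It assumes stand-in dataclasses for `ManifestScope`, `PluginScope`, `Concrete`, and `Wildcard` (the real types live in S1-02 and S2-02 and may differ in shape); only the cross-product idea is taken from the story.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import product
from typing import Union


@dataclass(frozen=True)
class Wildcard:
    """Stand-in for S1-02's Wildcard dim (assumed shape)."""


@dataclass(frozen=True)
class Concrete:
    """Stand-in for S1-02's Concrete dim (assumed shape)."""
    value: str


ScopeDim = Union[Concrete, Wildcard]


@dataclass(frozen=True)
class PluginScope:
    """Stand-in: exactly one dim per axis."""
    task_class: ScopeDim
    language: ScopeDim
    build_system: ScopeDim


@dataclass(frozen=True)
class ManifestScope:
    """Stand-in for the raw manifest form: each dim is str | list[str]."""
    task_class: str | list[str]
    languages: str | list[str]
    build_systems: str | list[str]


def _lift_dim(raw: str) -> ScopeDim:
    # "*" becomes Wildcard(); anything else is a Concrete value.
    return Wildcard() if raw == "*" else Concrete(value=raw)


def _as_list(raw: str | list[str]) -> list[str]:
    # Normalize the scalar form so every dim can be iterated.
    return [raw] if isinstance(raw, str) else list(raw)


def lift_manifest_scope(ms: ManifestScope) -> tuple[PluginScope, ...]:
    # Cross-product over the three dims: one PluginScope per combination.
    combos = product(
        _as_list(ms.task_class), _as_list(ms.languages), _as_list(ms.build_systems)
    )
    return tuple(
        PluginScope(_lift_dim(t), _lift_dim(lang), _lift_dim(b))
        for t, lang, b in combos
    )
```

A manifest with two languages and a wildcard build system lifts to two scopes; two task classes times two build systems lift to four, which lines up with the examples the story lists under AC-7.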
Context
This is Step 2's payoff story: with the kernel (S2-01), the manifest model (S2-02), and the loader (S2-03) in place, the resolver implements the ADR-0003 algorithm that maps a PluginScope (the orchestrator's intent — derived from repo-context.yaml) to a typed PluginResolution. Four structural commitments are load-bearing:
- Universal fallback is a registered plugin, not a code path. When no concrete plugin matches a scope, the resolver returns `UniversalFallbackResolution`. The universal plugin (`universal--*--*`) is loaded by the same machinery as any other plugin (ADR-0031 §No-match fallback). The kernel has no `if plugin.id == UNIVERSAL_FALLBACK_ID:` branch outside the resolver itself.
- The return type is a tagged union, not `Plugin | None`. `PluginResolution = ConcreteResolution | UniversalFallbackResolution`; every dispatch site `match`es with `assert_never`. This is the structural enforcement of production ADR-0009 (humans always merge): the "no concrete match" path is type-impossible to silently drop.
- The resolver is the `ManifestScope → PluginScope` lift point. S2-02 ships `PluginManifest.scope: ManifestScope` with raw `str | list[str]` per dim (per production ADR-0031's canonical YAML). This story is the first place those raw scopes are lifted into S1-02's typed `PluginScope` sum. A plugin whose `manifest.scope.languages == ["node", "python"]` fans out into two lifted `PluginScope` candidates; the resolver scores each candidate independently.
- The `extends` walker composes TCCM and adapter maps left-to-right. Later wins per ADR-0031 §Inheritance and override. The walk is capped at depth 4 and carries a visited-set check for cycles; `PluginExtendsCycle(chain)` is raised on cycle and `PluginRejected(reason="extends_depth_exceeded", chain=...)` on over-depth.
A Hypothesis property test is non-optional here: for any randomly-generated set of PluginScopes registered alongside a universal--*--* fallback, resolver.resolve(scope) is total — it returns ConcreteResolution whose matched_scope.matches(...) is True, OR returns UniversalFallbackResolution. It never raises (cycle is a startup-time concern, not a per-resolve concern), never returns None.
References — where to look
- Architecture:
  - `../phase-arch-design.md §Component design C2 (line 479)` — `resolve()` algorithm; `(specificity desc, precedence desc, name asc)` ordering; cycle check; max depth 4.
  - `../phase-arch-design.md §Component design C3 (lines 486–514)` — `PluginScope` + `Concrete | Wildcard` sum type from S1-02; `specificity() = count of Concrete dims`; `matches(*, task, language, build) -> bool` keyword-only signature.
  - `../phase-arch-design.md §Scenarios D (lines 392–411)` — concrete walkthrough: vuln-remediation--node--npm resolved against a `(vuln, node, npm)` workflow; sort by (specificity desc, precedence desc, name asc); walk `extends_chain` (max depth 4); compose TCCM left-to-right.
  - `../phase-arch-design.md §Data model (lines 775–791)` — `ConcreteResolution.matched_scope: PluginScope` field (load-bearing for property test); `UniversalFallbackResolution.candidates_considered: list[PluginId]`; `Annotated[..., Discriminator("kind")]`.
  - `../phase-arch-design.md §Edge cases E2, E9, E10` — fallback path; ambiguous ties; deep `extends` chains.
- Phase ADRs:
  - `../ADRs/0003-plugin-resolution-and-universal-fallback-semantics.md` — Option C is the decision; algorithm steps 1–4 are mandatory; universal fallback is a registered plugin under `plugins/universal--*--*/`, never a hardcoded code path. §Consequences row 5 specifies `candidates_considered = filtered-out concrete plugins`.
  - `../ADRs/0002-plugin-registry-kernel-instance-with-default-singleton.md` — ADR-0002 — `resolve` is on the registry instance; exit code 4 on cycle.
  - `../ADRs/0010-domain-modeling-discipline-scope-sum-type-and-newtypes.md` — ADR-0010 — `PluginScope` is a sum type, not `Literal["*"]`; `PluginId` newtype.
- Production ADRs:
  - `../../../production/adrs/0031-plugin-architecture.md` §Discovery and resolution + §Inheritance and override — the canonical algorithm; later-wins-on-collision for `extends`.
  - `../../../production/adrs/0009-humans-always-merge.md` — the invariant the typed `UniversalFallbackResolution` enforces statically.
- Sibling validations (recent precedent):
  - `_validation/S2-02-plugin-manifest-pydantic.md §Arch amendments` — flags that `PluginManifest.scope` is `ManifestScope` (raw form), not `PluginScope`. This story is the lift point.
  - `_validation/S2-01-plugin-registry-kernel.md` — autouse `restore_default_registry` fixture; tagged-union exception payload pattern; parametrized per-variant tests.
  - `_validation/S2-03-plugin-loader-integrity.md` — Strategy-seam-by-Protocol precedent (`PluginVerifier`); tagged-union sum-type for errors (`PluginRejected` variants); AST-source-scan fences.
  - `_validation/S1-02-plugin-scope-sum-type.md` — `matches(*, task, language, build) -> bool` keyword-only signature; round-trip via `PluginScope.parse(str(scope)).unwrap()`; AC-21 module-purity AST scan precedent.
  - `_validation/S1-03-tagged-union-outcomes.md` — discriminator-on-`kind` pattern; `assert_never` exhaustiveness AST scan precedent.
- Existing code:
  - `src/codegenie/plugins/registry.py` (S2-01) — `PluginRegistry.resolve(scope)` raises `NotImplementedError("S2-04 …")`; replace.
  - `src/codegenie/plugins/scope.py` (S1-02) — `PluginScope`, `Concrete`, `Wildcard`, `matches`, `specificity`.
  - `src/codegenie/plugins/manifest.py` (S2-02) — `PluginManifest.extends: tuple[PluginId, ...]`, `PluginManifest.precedence: int = 50` (note: 50, not 0 — fixed in S2-02 validation), `PluginManifest.scope: ManifestScope`.
  - `src/codegenie/plugins/errors.py` (S2-01 placeholders) — populate `PluginExtendsCycle(chain)` and `PluginRegistryCorrupted(reason)`.
  - `src/codegenie/plugins/resolution.py` (S2-01 placeholder) — expand `class PluginResolution: ...` to the real discriminated union OR move into `resolver.py`. Pick one location and pin it.
  - `src/codegenie/result.py` — `Result` is not used on `resolve` (resolve is total; no `Result` needed). The cycle pre-check could be a startup helper if implemented (deferred to S2-03 + S6-04).
Goal
Implement PluginRegistry.resolve(scope: PluginScope) -> PluginResolution and the supporting machinery: the ManifestScope → PluginScope lift (with list fan-out), the (specificity desc, precedence desc, name asc) sort, the extends-chain walker that composes TCCM and adapter maps left-to-right (with cycle detection + max depth 4), the typed PluginResolution discriminated union, and the Hypothesis property test that proves the totality invariant. The universal fallback fires by type narrowing on the sorted head, not by branching on a string.
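A small sketch of the `(specificity desc, precedence desc, name asc)` ordering named in the goal. `Candidate` is a hypothetical flattened stand-in for the story's `ScopedCandidate`; the point is only that negating the two numeric fields lets one ascending sort express all three criteria.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    """Hypothetical flattened stand-in for a scored candidate."""
    name: str
    specificity: int
    precedence: int


def sort_key(c: Candidate) -> tuple[int, int, str]:
    # Negate the numeric fields so an ascending sort yields
    # specificity desc, precedence desc, then name asc.
    return (-c.specificity, -c.precedence, c.name)


candidates = [
    Candidate("b-plugin", specificity=3, precedence=50),
    Candidate("wide-plugin", specificity=1, precedence=100),
    Candidate("a-plugin", specificity=3, precedence=50),
    Candidate("hi-prec", specificity=3, precedence=100),
]

ranked = sorted(candidates, key=sort_key)
print([c.name for c in ranked])
# ['hi-prec', 'a-plugin', 'b-plugin', 'wide-plugin']
```

Note that `wide-plugin` loses despite its higher precedence: specificity is compared first, and precedence only breaks ties among equally specific candidates.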
Acceptance criteria
- [ ] AC-1 — Module surface. `src/codegenie/plugins/resolver.py` exports exactly: `UNIVERSAL_FALLBACK_ID` (`Final[PluginId]`), `ScopedCandidate` (frozen dataclass), `ConcreteResolution`, `UniversalFallbackResolution`, `PluginResolution` (type alias), `lift_manifest_scope`, `compose_extends_chain`, `resolve`. Declared via `__all__: Final[tuple[str, ...]] = (...)` alphabetically sorted. `set(resolver.__all__)` equality is asserted in a test — stowaway exports fail CI (S1-02 AC-2 precedent).
- [ ] AC-2 — `UNIVERSAL_FALLBACK_ID` sentinel. Module-level `UNIVERSAL_FALLBACK_ID: Final[PluginId] = PluginId("universal--*--*")`. The string literal MUST appear in exactly one place in `src/codegenie/`; an AST-source-scan test (`tests/static/test_universal_fallback_id_single_source.py`) parses every `.py` file under `src/codegenie/` and `tests/fixtures/plugins/` and asserts the literal `"universal--*--*"` appears at most once outside of `resolver.py` and `tests/fixtures/plugins/universal_fallback_fixture.py`. (Mutation: a future contributor inlines the literal in a comparison → test fails.)
- [ ] AC-3 — `ScopedCandidate` dataclass. `@dataclass(frozen=True, slots=True) class ScopedCandidate: plugin: Plugin; lifted_scope: PluginScope`. Carries the single lifted PluginScope a candidate plugin was scored against (so multi-scope manifests fan out into multiple candidates pre-sort).
- [ ] AC-4 — `ConcreteResolution` Pydantic model. `model_config = ConfigDict(frozen=True, extra="forbid")`. Fields exactly:
  - `kind: Literal["concrete"] = "concrete"`
  - `plugin: Plugin`
  - `extends_chain: tuple[Plugin, ...]` (root → leaf; leaf is `plugin` itself)
  - `matched_scope: PluginScope` — the lifted scope that filtered through (per arch §Data model line 779)
  - `composed_tccm: ComposedTccm` — minimal placeholder Pydantic model for Step 2 (`class ComposedTccm(BaseModel): model_config = ConfigDict(frozen=True, extra="forbid"); provides: dict[str, dict[str, str]] = {}; requires: dict[str, tuple[str, ...]] = {}`). Step 3 (S3-01) replaces with the real `TCCM` model from `../ADRs/0004-plugin-private-capabilities-via-tccm.md`; this story documents the substitution point.
  - `composed_adapters: dict[PrimitiveName, Adapter]`
- [ ] AC-5 — `UniversalFallbackResolution` Pydantic model. `model_config = ConfigDict(frozen=True, extra="forbid")`. Fields exactly:
  - `kind: Literal["universal_fallback"] = "universal_fallback"`
  - `reason: Literal["no_concrete_match"]`
  - `candidates_considered: tuple[PluginId, ...]` (alphabetized; excludes `UNIVERSAL_FALLBACK_ID`)
- [ ] AC-6 — Discriminated union. `PluginResolution: TypeAlias = Annotated[ConcreteResolution | UniversalFallbackResolution, Field(discriminator="kind")]`. The `PluginRegistry.resolve` return annotation uses this alias (not the raw union); mypy narrowing flows through the alias.
- [ ] AC-7 — `lift_manifest_scope`. `lift_manifest_scope(manifest_scope: ManifestScope) -> tuple[PluginScope, ...]` lifts the raw `str | list[str]` dims into the cross-product of single-dim `PluginScope` instances. Examples (parametrized table in `test_resolver.py`):
  - `(task_class="vulnerability-remediation", languages="node", build_systems="npm")` → 1 PluginScope.
  - `(task_class="vulnerability-remediation", languages=["node", "python"], build_systems="*")` → 2 PluginScopes (one with Concrete("node"), one with Concrete("python")).
  - `(task_class=["vulnerability-remediation", "distroless-migration"], languages="*", build_systems=["npm", "pip"])` → 4 PluginScopes.
  - `(task_class="*", languages="*", build_systems="*")` → 1 PluginScope (universal).

  The function delegates per-dim string-to-`ScopeDim` lift to a helper `_lift_dim(raw: str) -> ScopeDim` (returns `Wildcard()` for `"*"`, else `Concrete(value=raw)`). Each output `PluginScope` is constructed directly (NOT via `PluginScope.parse(...)`) because the manifest YAML loader already validated the inputs.
- [ ] AC-8 — `_lift_candidates` pure helper. `_lift_candidates(plugins: Sequence[Plugin]) -> tuple[ScopedCandidate, ...]` flat-maps each plugin into its `lift_manifest_scope(plugin.manifest.scope)`-produced tuple. Output is a flat tuple of `(plugin, lifted_scope)` pairs; the same `plugin` may appear in multiple `ScopedCandidate`s.
- [ ] AC-9 — Resolution algorithm. `resolve(registry: PluginRegistry, scope: PluginScope) -> PluginResolution` executes exactly these steps:
  1. `candidates = _lift_candidates(registry.all())`.
  2. `matches = tuple(c for c in candidates if c.lifted_scope.matches(task=_unpack(scope.task_class), language=_unpack(scope.language), build=_unpack(scope.build_system)))` where `_unpack(dim: ScopeDim) -> str` returns `dim.value` for `Concrete` and `"*"` for `Wildcard` (incoming wildcards mean "operator did not specify" — they always match per S1-02's `matches` semantics applied with `"*"` interpreted as the wildcard string-form; see Notes §"Incoming scope discipline").
  3. If `matches` is empty AND the registry contains the universal plugin, return `UniversalFallbackResolution(reason="no_concrete_match", candidates_considered=())`.
  4. If `matches` is empty AND the registry does NOT contain the universal plugin, raise `PluginRegistryCorrupted(reason="missing_universal")`.
  5. Sort `matches` by `_sort_key`: `(-lifted_scope.specificity(), -plugin.manifest.precedence, plugin.manifest.name)`. (Negation for descending; the tuple is naturally ascending.)
  6. If sorted head's `plugin.manifest.name == UNIVERSAL_FALLBACK_ID`, return `UniversalFallbackResolution(reason="no_concrete_match", candidates_considered=_candidates_considered(registry))`.
  7. Else: walk the head's `extends` chain via `compose_extends_chain(head.plugin, registry)`; return the resulting `ConcreteResolution` with `matched_scope=head.lifted_scope`.
- [ ] AC-10 —
`_candidates_considered` semantics. `_candidates_considered(registry: PluginRegistry) -> tuple[PluginId, ...]` returns the alphabetized `tuple[PluginId, ...]` of every `plugin.manifest.name` in `registry.all()` whose name != `UNIVERSAL_FALLBACK_ID`. (The "concrete plugins were filtered out" semantics per ADR-0003 §Consequences row 5.) When invoked from step 6, this is the operator-visible debug surface; the universal plugin is intentionally excluded since the resolver already narrowed to it.
- [ ] AC-11 — `compose_extends_chain`. `compose_extends_chain(plugin: Plugin, registry: PluginRegistry, *, max_depth: int = _MAX_EXTENDS_DEPTH) -> ConcreteResolution`:
  - Walks `plugin.manifest.extends: tuple[PluginId, ...]` depth-first, left-to-right.
  - Threads a `visited: frozenset[PluginId]` through the recursion (initial: `frozenset()`); on entry, if `plugin.manifest.name in visited`, raise `PluginExtendsCycle(chain=tuple([*visited_path, plugin.manifest.name]))` where `visited_path` records the insertion order (use a `tuple[PluginId, ...]` accumulator alongside `visited`).
  - On `len(visited) >= max_depth`, raise `PluginRejected(reason="extends_depth_exceeded", chain=tuple([*visited_path, plugin.manifest.name]))`.
  - Resolves each `extends_id: PluginId` via `registry.get(extends_id)`; if not registered, `registry.get` already raises `PluginNotRegistered(name)` (S2-01); the resolver does NOT catch this — it propagates (the loader's startup integrity check is the right place to fail-fast for missing extends targets; see Notes §"Why `PluginNotRegistered` propagates").
  - Composes `composed_tccm` and `composed_adapters` left-to-right (extends[0] applied first, then extends[1], …, then `plugin` itself applied last). For dict merges, later wins on collision (`a | b` Python 3.9+ dict merge semantics OR `{**a, **b}`); for `provides` (nested `dict[str, dict[str, str]]`), the inner dicts are also merged later-wins per-key (one-level deep).
  - Returns `ConcreteResolution(plugin=plugin, extends_chain=(*resolved_extends_in_order, plugin), matched_scope=<filled by caller>, composed_tccm=..., composed_adapters=...)`.
- [ ] AC-12 — `PluginRegistry.resolve` delegation. `PluginRegistry.resolve(self, scope: PluginScope) -> PluginResolution` delegates to `resolver.resolve(self, scope)`. The `NotImplementedError("S2-04 …")` stub from S2-01 is removed; AST-source-scan test (`tests/static/test_no_notimplemented_in_registry.py`) asserts the substring `"NotImplementedError"` does not appear in `src/codegenie/plugins/registry.py`.
- [ ] AC-13 — Module purity (functional core). AST scan (`tests/unit/plugins/test_resolver_purity.py`) parses `src/codegenie/plugins/resolver.py` and asserts the `Import`/`ImportFrom` set is a subset of `{__future__, dataclasses, typing, pydantic, codegenie.plugins.scope, codegenie.plugins.manifest, codegenie.plugins.protocols, codegenie.plugins.registry, codegenie.plugins.errors, codegenie.types.identifiers}`. No `os`, `pathlib`, `logging`, or any I/O module. (S1-02 AC-21 precedent; ADR-0001 chokepoint hygiene generalized.)
- [ ] AC-14 — Exhaustiveness `match` AST scan. AST scan (`tests/unit/plugins/test_resolver_exhaustiveness.py`) parses `resolver.py` and asserts: every `match` block whose subject is a `PluginResolution`-typed expression contains a final `case _: assert_never(...)` arm. The test also includes a `_dispatch_example(resolution: PluginResolution) -> str` helper in the test file that does an exhaustive `match` and is type-checked via mypy's `assert_never`; adding a future variant without updating dispatch sites breaks `mypy --strict` (S1-02 AC-14 precedent).
- [ ] AC-15 — Tests in
`tests/unit/plugins/test_resolver.py`. Parametrized + per-AC tests enumerated below (thirteen concrete cases). Each MUST be individually runnable as `pytest -k "<name>"` (no shared mutable state):
  - `test_no_match_returns_universal_fallback` (red — see TDD plan).
  - `test_exact_match_beats_wildcard` — specificity-3 plugin AND specificity-1 plugin both match an incoming `(vuln, node, npm)` scope; assert the specificity-3 wins (mutation: sort key reversed → fails).
  - `test_precedence_breaks_specificity_tie` — two plugins with equal specificity; one has `precedence=100`, the other `precedence=50` (the default per S2-02 AC-2); assert the precedence-100 wins.
  - `test_name_breaks_precedence_tie` — two plugins with equal specificity, equal precedence, names `"a-plugin"` and `"b-plugin"`; assert `"a-plugin"` wins (alphabetical ascending).
  - `test_lift_manifest_scope_fans_out` — parametrized table with the four examples from AC-7; each input → exact-tuple-of-`PluginScope` output.
  - `test_extends_chain_composes_tccm_and_adapters_left_to_right` — plugin `A` extends plugin `B`; `B.transforms()` returns `{Foo: AdapterB}`; `A.transforms()` returns `{Foo: AdapterA, Bar: AdapterC}`; assert `resolution.composed_adapters == {Foo: AdapterA, Bar: AdapterC}` (later-wins on `Foo`; `Bar` added). Mirror with `composed_tccm.provides`: `B.provides = {"vuln": {"x": "vB"}}`; `A.provides = {"vuln": {"x": "vA", "y": "vA2"}}`; assert composed `{"vuln": {"x": "vA", "y": "vA2"}}` (inner dict also later-wins per-key).
  - `test_extends_depth_4_composes_correctly` — chain `A → B → C → D` (depth 4 = `len(visited) == 4` at leaf); assert no raise; assert `extends_chain` tuple length is 4 (root→leaf order: `(D, C, B, A)` — extends walked first applies first; `extends_chain[-1] is A`, i.e. `plugin` itself).
  - `test_extends_depth_5_raises_extends_depth_exceeded` — chain `A → B → C → D → E`; assert raises `PluginRejected` with `reason == "extends_depth_exceeded"` and `chain == (PluginId("A"), PluginId("B"), PluginId("C"), PluginId("D"), PluginId("E"))`.
  - `test_extends_cycle_raises_plugin_extends_cycle` — `A extends B`, `B extends A`; resolve(A); assert raises `PluginExtendsCycle` with `chain == (PluginId("A"), PluginId("B"), PluginId("A"))` (entry-point repeated at tail per Notes §"Cycle chain shape").
  - `test_only_universal_registered_returns_universal_fallback` — registry contains only the universal plugin; resolve any scope; assert `UniversalFallbackResolution(reason="no_concrete_match", candidates_considered=())`. (Mutation: AC-9 step 3 missing → falls through to step 4 → raises `PluginRegistryCorrupted` → test fails.)
  - `test_missing_universal_raises_plugin_registry_corrupted` — registry contains only concrete plugins; resolve a scope none match; assert raises `PluginRegistryCorrupted(reason="missing_universal")`.
  - `test_extends_missing_target_raises_plugin_not_registered` — `A extends B`, `B` is not registered; resolve(A); assert raises `PluginNotRegistered(name=PluginId("B"))` (the S2-01 exception propagates).
  - `test_candidates_considered_alphabetized_and_excludes_universal` — registry contains `c-plugin`, `a-plugin`, `b-plugin` (all concrete, none match the incoming scope) plus universal; resolve(non-matching-scope); assert `resolution.candidates_considered == (PluginId("a-plugin"), PluginId("b-plugin"), PluginId("c-plugin"))` (no universal).
- [ ] AC-16 — Property test in `tests/unit/plugins/test_resolver_property.py`. Hypothesis property test:
  - Strategy `concrete_plugin_name()` draws from `text(alphabet="abcdefghijklmnopqrstuvwxyz0123456789-", min_size=1, max_size=32).filter(lambda s: s != "universal--*--*" and not s.startswith("-") and not s.endswith("-"))`. The negative filter is mandatory — generating `universal--*--*` as a concrete name would corrupt the test invariant (asserted as a meta-property in the strategy via `assume(name != UNIVERSAL_FALLBACK_ID)`).
  - Strategy `concrete_plugins()` draws 0..5 fake plugins with random `ManifestScope` (each dim independently `Wildcard | Concrete(<random word>)`) and random `precedence ∈ [0, 100]`.
  - Strategy `incoming_scope()` draws a random `PluginScope` (each dim independently `Wildcard | Concrete(<random word>)`).
  - For each (registry-with-universal + 0..5 concretes, incoming scope) pair: `resolution = resolve(registry, scope)`. Assert (via exhaustive `match` over `PluginResolution` with `assert_never`): if `ConcreteResolution`, `resolution.matched_scope.matches(task=_unpack(scope.task_class), language=_unpack(scope.language), build=_unpack(scope.build_system))` is True AND `resolution.plugin in registry.all()`; if `UniversalFallbackResolution`, `resolution.plugin_id == UNIVERSAL_FALLBACK_ID` (via `registry.get(UNIVERSAL_FALLBACK_ID)`). Never raises (assume Hypothesis-generated plugins are well-formed — no cycles, depth ≤ 4); never returns `None`.
  - Decorate `@settings(max_examples=200, deadline=None)` — deadline disabled to avoid CI flakes on cold-cache machines; 200 examples to exercise the cross-product without slowing CI excessively.
- [ ] AC-17 — `_MAX_EXTENDS_DEPTH` constant. `_MAX_EXTENDS_DEPTH: Final[int] = 4` at module level with a comment naming ADR-0003 §Tradeoffs as the empirical basis. The literal `4` MUST NOT appear inline in any `if depth > 4` / `max_depth=4` site (AST scan asserts at most one occurrence of the integer literal `4` in `resolver.py`; the named constant is the single source of truth). (Mutation-resistance for "raised the cap to 10 without an ADR amendment".)
- [ ] AC-18 — Fixtures. `tests/fixtures/plugins/universal_fallback_fixture.py` exports `make_universal_fallback() -> Plugin` returning a minimal universal plugin with `manifest.name == UNIVERSAL_FALLBACK_ID`, `manifest.scope == ManifestScope(task_class="*", languages="*", build_systems="*")`, `manifest.precedence == 0` (lowest, below the S2-02 default of 50), `manifest.extends == ()`, `build_subgraph()` returns a stub, `adapters()` / `transforms()` return `{}`. `tests/fixtures/plugins/fake_plugin.py` (extended from S2-01) accepts new kwargs `extends: tuple[PluginId, ...] = ()`, `precedence: int = 50`, `manifest_scope: ManifestScope | None = None` (so resolver tests compose chains and fan-outs).
- [ ] AC-19 — `ruff check`, `ruff format --check`, `mypy --strict` clean on `src/codegenie/plugins/resolver.py`, `tests/unit/plugins/test_resolver.py`, `tests/unit/plugins/test_resolver_property.py`, `tests/unit/plugins/test_resolver_purity.py`, `tests/unit/plugins/test_resolver_exhaustiveness.py`, `tests/static/test_universal_fallback_id_single_source.py`, `tests/static/test_no_notimplemented_in_registry.py`, `tests/fixtures/plugins/universal_fallback_fixture.py`. Exhaustiveness `match` with `assert_never` demonstrated at the `_dispatch_example` site in `test_resolver_exhaustiveness.py`.
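The walker behaviour in AC-11 (depth-first, left-to-right, cycle and depth checks) can be sketched with plain strings. Everything here is a simplification under stated assumptions: `extends_of` is a hypothetical name-to-parents mapping standing in for the registry, the two exception classes stand in for `PluginExtendsCycle` and `PluginRejected(reason="extends_depth_exceeded")`, and the ordered path tuple doubles as the visited set.

```python
from __future__ import annotations

MAX_EXTENDS_DEPTH = 4  # mirrors the story's _MAX_EXTENDS_DEPTH


class ExtendsCycle(Exception):
    """Stand-in for PluginExtendsCycle(chain)."""

    def __init__(self, chain: tuple[str, ...]) -> None:
        super().__init__(" -> ".join(chain))
        self.chain = chain


class ExtendsDepthExceeded(Exception):
    """Stand-in for PluginRejected(reason="extends_depth_exceeded")."""

    def __init__(self, chain: tuple[str, ...]) -> None:
        super().__init__(" -> ".join(chain))
        self.chain = chain


def walk_extends(
    name: str,
    extends_of: dict[str, tuple[str, ...]],
    path: tuple[str, ...] = (),
) -> tuple[str, ...]:
    """Return the chain in root -> leaf order, with `name` last."""
    if name in path:
        # Entry point is repeated at the tail of the reported chain.
        raise ExtendsCycle((*path, name))
    if len(path) >= MAX_EXTENDS_DEPTH:
        raise ExtendsDepthExceeded((*path, name))
    chain: tuple[str, ...] = ()
    for parent in extends_of[name]:  # left-to-right over `extends`
        chain += walk_extends(parent, extends_of, (*path, name))
    return (*chain, name)
```

With `A → B → C → D` this returns `("D", "C", "B", "A")`, the root-to-leaf order the depth-4 test expects; adding `E` below `D` trips the depth check on entry to `E`, and `A extends B, B extends A` reports the chain `("A", "B", "A")`.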
Implementation outline
- Errors first. Populate `src/codegenie/plugins/errors.py` placeholders (if S2-01 left them as such):
  - `class PluginExtendsCycle(Exception): chain: tuple[PluginId, ...]; exit_code: ClassVar[int] = 4`.
  - `class PluginRejected(Exception): reason: Literal["extends_depth_exceeded", ...]; chain: tuple[PluginId, ...]; exit_code: ClassVar[int] = 4`. (S2-03 may have already added the seven-variant tagged union; if so, extend with the new `extends_depth_exceeded` variant additively per its hardened tagged-union shape.)
  - `class PluginRegistryCorrupted(Exception): reason: Literal["missing_universal", "empty_registry"]; exit_code: ClassVar[int] = 4`.
- Sum-type return. In `src/codegenie/plugins/resolver.py` (NEW), define `ConcreteResolution`, `UniversalFallbackResolution`, the `PluginResolution` `Annotated[..., Field(discriminator="kind")]` alias, the `ComposedTccm` minimal placeholder Pydantic model, and the `ScopedCandidate` frozen dataclass. Remove the `PluginResolution` placeholder from `src/codegenie/plugins/resolution.py` (or leave the file as a one-line `from codegenie.plugins.resolver import PluginResolution as PluginResolution` re-export — pick one and document in Notes §"Module placement").
- Sentinel + constants. `UNIVERSAL_FALLBACK_ID: Final[PluginId] = PluginId("universal--*--*")`. `_MAX_EXTENDS_DEPTH: Final[int] = 4`.
- Pure helpers. Define in this order (each is pure-given-inputs):
  - `_lift_dim(raw: str) -> ScopeDim` — `"*"` → `Wildcard()`; else `Concrete(value=raw)`.
  - `lift_manifest_scope(ms: ManifestScope) -> tuple[PluginScope, ...]` — cross-product over the three dims.
  - `_lift_candidates(plugins: Sequence[Plugin]) -> tuple[ScopedCandidate, ...]` — flat-map.
  - `_unpack(dim: ScopeDim) -> str` — `dim.value` for `Concrete`, `"*"` for `Wildcard`; total via `match` + `assert_never`.
  - `_filter_matches(candidates: tuple[ScopedCandidate, ...], scope: PluginScope) -> tuple[ScopedCandidate, ...]`.
  - `_sort_key(c: ScopedCandidate) -> tuple[int, int, str]` — returns `(-c.lifted_scope.specificity(), -c.plugin.manifest.precedence, c.plugin.manifest.name)`. Named for clarity; tested directly in a small unit test.
  - `_candidates_considered(registry: PluginRegistry) -> tuple[PluginId, ...]` — alphabetized concrete-only names.
- `compose_extends_chain`. Recursive walker with `visited: frozenset[PluginId]` + `visited_path: tuple[PluginId, ...]` threaded through; raises `PluginExtendsCycle` on cycle, `PluginRejected(extends_depth_exceeded)` on over-depth; composes `composed_tccm` (one-level-deep later-wins per `provides` key) and `composed_adapters` (single-level later-wins).
- `resolve`. Compose helpers in the AC-9 order. Raise `PluginRegistryCorrupted(reason="empty_registry")` if `registry.all()` is empty (defensive — should already be caught at loader startup, but fail-loud here too).
- Wire delegation. Edit `src/codegenie/plugins/registry.py`: replace `NotImplementedError("S2-04 …")` body with `return resolver.resolve(self, scope)`. Add the `from codegenie.plugins import resolver` import locally inside the method (avoid module-level circular import — `resolver` imports `PluginRegistry` type for annotation only via `TYPE_CHECKING`).
- Fixtures. Land `tests/fixtures/plugins/universal_fallback_fixture.py` + extend `tests/fixtures/plugins/fake_plugin.py` with the new kwargs.
- Tests in dependency order. Unit cases first (AC-15 enumeration); module-purity / exhaustiveness AST scans; property test last (AC-16).
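The two merge behaviours the outline names for `compose_extends_chain` (single-level later-wins for adapters, one-level-deep later-wins for `provides`) might look like the sketch below. The function names `merge_adapters` and `merge_provides` are hypothetical helpers for illustration, and adapters are represented as plain strings rather than real `Adapter` objects.

```python
from __future__ import annotations

from typing import Iterable, Mapping


def merge_adapters(layers: Iterable[Mapping[str, str]]) -> dict[str, str]:
    """Single-level merge: a later layer replaces an earlier key outright."""
    merged: dict[str, str] = {}
    for layer in layers:
        merged = {**merged, **layer}
    return merged


def merge_provides(
    layers: Iterable[Mapping[str, Mapping[str, str]]],
) -> dict[str, dict[str, str]]:
    """One-level-deep merge: inner dicts are merged per key, later wins."""
    merged: dict[str, dict[str, str]] = {}
    for layer in layers:
        for capability, entries in layer.items():
            # Keep earlier inner keys unless the later layer overrides them.
            merged[capability] = {**merged.get(capability, {}), **entries}
    return merged


# Layers are passed in application order: extended plugins first, leaf last.
b_provides = {"vuln": {"x": "vB", "z": "vB3"}}
a_provides = {"vuln": {"x": "vA", "y": "vA2"}}
composed = merge_provides([b_provides, a_provides])
print(composed)
# {'vuln': {'x': 'vA', 'z': 'vB3', 'y': 'vA2'}}
```

The difference matters for `provides`: a plain `{**b, **a}` at the outer level would drop `z` entirely, whereas the one-level-deep merge keeps inner keys the leaf does not override.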
TDD plan — red / green / refactor
Red — failing test first
Test file path: tests/unit/plugins/test_resolver.py
from codegenie.plugins.registry import PluginRegistry, register_plugin
from codegenie.plugins.resolver import UNIVERSAL_FALLBACK_ID, UniversalFallbackResolution
from codegenie.plugins.scope import PluginScope
from codegenie.types.identifiers import PluginId
from tests.fixtures.plugins.fake_plugin import make_fake_plugin
from tests.fixtures.plugins.universal_fallback_fixture import make_universal_fallback


def test_no_match_returns_universal_fallback() -> None:
    """ADR-0003 §Decision step 3 + §Consequences row 5: when no concrete
    plugin matches a scope and the universal fallback is registered, `resolve`
    returns `UniversalFallbackResolution` with `candidates_considered`
    listing every non-universal plugin in the registry (alphabetized). The
    fallback is a registered plugin, not a hardcoded code path; this is the
    type-level enforcement of production ADR-0009 (humans always merge).
    """
    registry = PluginRegistry()
    register_plugin(make_universal_fallback(), registry=registry)
    register_plugin(
        make_fake_plugin(
            name=PluginId("vulnerability-remediation--python--pip"),
            manifest_scope_kwargs={
                "task_class": "vulnerability-remediation",
                "languages": "python",
                "build_systems": "pip",
            },
        ),
        registry=registry,
    )

    scope = PluginScope.parse("distroless-migration--node--npm").unwrap()
    resolution = registry.resolve(scope)

    assert isinstance(resolution, UniversalFallbackResolution)
    assert resolution.kind == "universal_fallback"
    assert resolution.reason == "no_concrete_match"
    # python-pip plugin is in the registry, didn't match, so IS in candidates_considered
    assert resolution.candidates_considered == (
        PluginId("vulnerability-remediation--python--pip"),
    )
    # universal is excluded from candidates_considered (we resolved TO it)
    assert UNIVERSAL_FALLBACK_ID not in resolution.candidates_considered
Why it fails: codegenie.plugins.resolver doesn't exist; PluginRegistry.resolve still raises NotImplementedError from S2-01; UNIVERSAL_FALLBACK_ID symbol absent; make_universal_fallback fixture absent.
Green follow-on — every AC-15 test name landed¶
After the red test goes green, land each of the following one at a time (verify each individually fails BEFORE the implementation lands, then passes AFTER):
- test_exact_match_beats_wildcard: AC-15 #2.
- test_precedence_breaks_specificity_tie: AC-15 #3.
- test_name_breaks_precedence_tie: AC-15 #4.
- test_lift_manifest_scope_fans_out (parametrized, 4 cases): AC-15 #5.
- test_extends_chain_composes_tccm_and_adapters_left_to_right: AC-15 #6.
- test_extends_depth_4_composes_correctly: AC-15 #7.
- test_extends_depth_5_raises_extends_depth_exceeded: AC-15 #8.
- test_extends_cycle_raises_plugin_extends_cycle: AC-15 #9.
- test_only_universal_registered_returns_universal_fallback: AC-15 #10.
- test_missing_universal_raises_plugin_registry_corrupted: AC-15 #11.
- test_extends_missing_target_raises_plugin_not_registered: AC-15 #12.
- test_candidates_considered_alphabetized_and_excludes_universal: AC-15 #13.
- AST-scan tests (purity + exhaustiveness + single-source sentinel + no-NotImplementedError): AC-13/14/2/12.
- Hypothesis property test tests/unit/plugins/test_resolver_property.py:test_resolve_is_total: AC-16.
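The fan-out that test_lift_manifest_scope_fans_out exercises can be sketched as follows. This is a minimal illustration, not the shipped resolver: Wildcard, Concrete, PluginScope, and ManifestScope here are simplified stand-in dataclasses (the real types come from S1-02 and S2-02), and the field names task_class / languages / build_systems are borrowed from the fixture kwargs in the red test.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import product
from typing import Union


@dataclass(frozen=True)
class Wildcard:
    """Stand-in for the S1-02 wildcard dim."""


@dataclass(frozen=True)
class Concrete:
    """Stand-in for the S1-02 concrete dim."""
    value: str


Dim = Union[Wildcard, Concrete]


@dataclass(frozen=True)
class PluginScope:  # simplified stand-in, not the real class
    task: Dim
    language: Dim
    build: Dim


@dataclass(frozen=True)
class ManifestScope:  # simplified stand-in: each dim is str | list[str]
    task_class: Union[str, list]
    languages: Union[str, list]
    build_systems: Union[str, list]


def _lift_dim(raw: Union[str, list]) -> tuple:
    """Lift one manifest dim to typed dims; "*" becomes Wildcard(), never Concrete("*")."""
    values = [raw] if isinstance(raw, str) else list(raw)
    return tuple(Wildcard() if v == "*" else Concrete(v) for v in values)


def lift_manifest_scope(scope: ManifestScope) -> tuple:
    """Fan list-valued dims out into the cross-product of single-dim scopes."""
    return tuple(
        PluginScope(task, language, build)
        for task, language, build in product(
            _lift_dim(scope.task_class),
            _lift_dim(scope.languages),
            _lift_dim(scope.build_systems),
        )
    )
```

A manifest with two languages and two build systems yields four candidates; a mutation that returns only the first element (M4) or builds Concrete("*") (M12) is visible in the output tuple.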
Refactor¶
- Pull _sort_key, _unpack, _lift_dim into named functions; small unit tests for each (each is independently mutation-target-rich).
- Move the _dispatch_example(resolution: PluginResolution) -> str helper into test_resolver_exhaustiveness.py and have mypy verify it (the assert_never in the case _: arm is the type-level proof).
- The Hypothesis property test uses three strategies: concrete_plugin_name() (forbids UNIVERSAL_FALLBACK_ID), concrete_plugins() (0..5 fake plugins), incoming_scope() (random PluginScope). Assertion uses an exhaustive match over PluginResolution.
- candidates_considered is computed via a helper; mutation: returning unsorted list → test_candidates_considered_alphabetized_and_excludes_universal fails.
- The literal 4 cap: pin via _MAX_EXTENDS_DEPTH; AST scan asserts the literal 4 appears at most once in resolver.py.
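The AST scans named in this story (literal 4 at most once, no pathlib import) can be written with the standard-library ast module. A minimal sketch under assumptions: the helper names count_int_literal and imported_modules are hypothetical, and a real test would read resolver.py from disk instead of taking a source string.

```python
from __future__ import annotations

import ast


def count_int_literal(source: str, value: int) -> int:
    """Count occurrences of an integer literal in Python source."""
    return sum(
        1
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Constant)
        and type(node.value) is int  # exact int only; bools are excluded
        and node.value == value
    )


def imported_modules(source: str) -> set:
    """Return the top-level module names imported anywhere in the source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module.split(".")[0])
    return found
```

A scan test would then assert count_int_literal(resolver_source, 4) <= 1 and "pathlib" not in imported_modules(resolver_source), where resolver_source is the text of resolver.py.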
Mutation kill-list (selection)¶
| # | Mutation | Catching test |
|---|---|---|
| M1 | Sort key reversed ((specificity asc, ...)) | test_exact_match_beats_wildcard |
| M2 | Precedence ignored in sort | test_precedence_breaks_specificity_tie |
| M3 | Name tie-break uses desc instead of asc | test_name_breaks_precedence_tie |
| M4 | lift_manifest_scope returns first element only (no cross-product) | test_lift_manifest_scope_fans_out (4 cases) |
| M5 | extends merge is right-to-left (later loses on collision) | test_extends_chain_composes_tccm_and_adapters_left_to_right |
| M6 | Cycle detection uses depth-check only (no visited-set) | test_extends_cycle_raises_plugin_extends_cycle (A→B→A at depth 2 is NOT depth-cap) |
| M7 | Depth-cap uses > instead of >= (off-by-one allows depth 5) | test_extends_depth_5_raises_extends_depth_exceeded |
| M8 | Universal-only registry raises PluginRegistryCorrupted instead of fallback | test_only_universal_registered_returns_universal_fallback |
| M9 | Missing universal silently returns first concrete | test_missing_universal_raises_plugin_registry_corrupted |
| M10 | candidates_considered includes universal | test_candidates_considered_alphabetized_and_excludes_universal |
| M11 | candidates_considered not alphabetized | same |
| M12 | lift_manifest_scope constructs Concrete(value="*") instead of Wildcard() | test_lift_manifest_scope_fans_out[universal] |
| M13 | _unpack(Wildcard()) returns empty string | test_resolve_is_total property fails on incoming-wildcard cases |
| M14 | manifest.name == "universal--*--*" inlined; literal updated in one place only | test_universal_fallback_id_single_source AST scan |
| M15 | match block over PluginResolution missing case _: assert_never | test_resolver_exhaustiveness AST scan |
| M16 | resolver.py imports pathlib (impurity creep) | test_resolver_purity AST scan |
| M17 | extends_chain order reversed (leaf→root instead of root→leaf) | test_extends_depth_4_composes_correctly asserts extends_chain[-1] is plugin |
| M18 | _MAX_EXTENDS_DEPTH literal inlined at one site, refactor changes the constant only | AST scan asserts literal 4 appears at most once in resolver.py |
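Mutations M1 to M3 all target the three-part sort key. The sketch below shows one way such a key could look. It rests on two assumptions that this section does not pin: specificity is taken to be the number of non-wildcard dims, and a larger precedence value is taken to win. Candidate is a hypothetical stand-in for ScopedCandidate.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:  # hypothetical stand-in for ScopedCandidate
    name: str
    precedence: int
    dims: tuple  # (task, language, build); "*" marks a wildcard dim


def _specificity(candidate: Candidate) -> int:
    """ASSUMPTION: specificity = count of non-wildcard dims."""
    return sum(1 for dim in candidate.dims if dim != "*")


def _sort_key(candidate: Candidate) -> tuple:
    """Specificity descending, precedence descending (ASSUMPTION), name ascending."""
    return (-_specificity(candidate), -candidate.precedence, candidate.name)


def head(candidates: list) -> Candidate:
    """Return the winning candidate under the three-part ordering."""
    return sorted(candidates, key=_sort_key)[0]
```

Negating the numeric components keeps a single ascending sort, so the name tie-break stays ascending without a second pass.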
Files to touch¶
| Path | Why |
|---|---|
| src/codegenie/plugins/resolver.py | NEW: UNIVERSAL_FALLBACK_ID, ScopedCandidate, ComposedTccm placeholder, ConcreteResolution, UniversalFallbackResolution, PluginResolution alias, lift_manifest_scope, compose_extends_chain, resolve, pure helpers. |
| src/codegenie/plugins/registry.py | Replace NotImplementedError in resolve with local-import delegation to resolver.resolve. |
| src/codegenie/plugins/resolution.py | Either deleted OR collapsed to a one-line re-export (from codegenie.plugins.resolver import PluginResolution as PluginResolution). Pick one; document in Notes §"Module placement". |
| src/codegenie/plugins/errors.py | Populate PluginExtendsCycle(chain), PluginRegistryCorrupted(reason), extend PluginRejected taxonomy with extends_depth_exceeded (additively, per S2-03's hardened tagged-union shape). |
| tests/unit/plugins/test_resolver.py | Unit + parametrized tests (AC-15 enumeration, 13 named tests). |
| tests/unit/plugins/test_resolver_property.py | Hypothesis property test (≥200 examples; deadline=None). |
| tests/unit/plugins/test_resolver_purity.py | Module-purity AST scan. |
| tests/unit/plugins/test_resolver_exhaustiveness.py | match + assert_never AST scan + mypy _dispatch_example helper. |
| tests/static/test_universal_fallback_id_single_source.py | AST scan asserting "universal--*--*" literal appears in at most two files. |
| tests/static/test_no_notimplemented_in_registry.py | AST scan asserting NotImplementedError is absent from registry.py. |
| tests/fixtures/plugins/universal_fallback_fixture.py | make_universal_fallback(), reused by S7-03's HITL plugin. |
| tests/fixtures/plugins/fake_plugin.py | Extend with extends=, precedence=, manifest_scope_kwargs= kwargs. |
Out of scope¶
- Concrete universal HITL subgraph behavior: handled by S7-03 (writes sanitized handoff markdown, emits RequiresHumanReview). This story only needs a fixture plugin that registers as universal--*--*; its build_subgraph returns a stub.
- ComposedTccm real shape + provides/requires merge semantics beyond the one-level-deep later-wins: Step 3 (S3-01) lands the real TCCM Pydantic per ADR-0004. The placeholder defined here is intentional; the substitution point is documented in Notes §"TCCM substitution".
- composed_adapters real Adapter implementations: Step 7 / S7-02 lands npm-specific adapters. Resolver tests use stub Adapter instances; the composition logic (later-wins-on-collision) is what's exercised.
- PluginRegistryCorrupted spanning-event emission: that's the event log's concern (S6-01). This story raises the typed exception; the orchestrator (S6-04) maps it to event + exit code 4.
- Plugin loader integration: S2-03 already loads plugins; this story consumes whatever the loader registered. No loader changes here.
- Per-plugin RecipeRegistry: Step 5 / S5-01 (Gap 3 fix per phase-arch §Gap analysis line 1166).
- Loader-time extends cycle / depth pre-check: Notes §3 of the original story called this optional. The resolver's per-resolve cycle check is the contract; a startup pre-check is an additive safety net deferred to S2-03's hardened loader (already adopts verify-all-then-import-all per its validation).
- Pre-loader integrity check that extends references resolve at startup time: the resolver propagates PluginNotRegistered (a clean S2-01 exception); a loader-time pre-check is additive and deferred (see Notes §"Why PluginNotRegistered propagates").
- registry.all() empty: a defensive PluginRegistryCorrupted(reason="empty_registry") is raised, but the loader's startup check (S2-03) is the canonical place to fail-fast on an empty registry; this story's raise is belt-and-braces.
Notes for the implementer¶
Why a Final[PluginId] sentinel, not a config knob¶
ADR-0003 §Decision step 3 reads literally "If the head plugin's id is universal--*--*". The string is the load-bearing convention. The hardening introduces UNIVERSAL_FALLBACK_ID as a typed Final constant so the literal lives in one place: S7-03's real fallback plugin, the make_universal_fallback fixture, and the loader's startup check all reference that one symbol. Do not parameterize the name. Adding an alternate fallback (e.g., a team-specific HITL handler) is an ADR amendment, not a code-time decision. The AST single-source-of-truth scan catches a future contributor inlining the literal.
Incoming scope discipline¶
The resolver receives a PluginScope from the orchestrator (S6-04). That PluginScope is constructed from repo-context.yaml evidence — the task class is always concrete; the language is determined by Phase 1's LanguageDetection (concrete OR ambiguous, but never * from a deterministic detector); the build system is similarly concrete. So in production, the incoming scope is fully Concrete on all three dims. Still, the resolver must handle wildcards in the incoming scope (for tests, for Phase 4's LLM-fallback experiments where intent is intentionally broader, and for codegenie remediate --any-language-style operator overrides). _unpack(Wildcard()) -> "*" and PluginScope.matches(task="*", language=..., build=...) returns True iff every plugin dim either is Wildcard() or its Concrete.value == "*" — but no plugin should declare Concrete("*") per S1-02's parser (which would lift "*" to Wildcard()). The result: incoming-scope wildcards match every plugin candidate on that dim, which is what an operator means by "I don't care about the language."
assert_never discipline on match resolution¶
assert_never on match resolution: case ConcreteResolution() | UniversalFallbackResolution(): ... is the type-level enforcement that production ADR-0009 lives or dies on. mypy will catch a missed variant when Phase 4 adds (hypothetically) a LlmFallbackResolution; a reviewer will reject a Plugin | None regression. The _dispatch_example helper in test_resolver_exhaustiveness.py is the test asset that proves exhaustiveness today; do not delete it during refactor.
Depth-cap empirical basis¶
_MAX_EXTENDS_DEPTH = 4 is empirical (per ADR-0003 §Tradeoffs). The depth-5 test should construct a chain A → B → C → D → E and assert PluginRejected(reason="extends_depth_exceeded", chain=(A, B, C, D, E)). The chain length is 5; the visited-set size at the point of the check is what crosses the threshold (i.e., check fires when len(visited) >= _MAX_EXTENDS_DEPTH AND we are about to descend further — equivalently, when len(visited) == 4 and we try to walk into the 5th level).
Cycle chain shape¶
A extends B extends A. The cycle exception's chain field carries (PluginId("A"), PluginId("B"), PluginId("A")) — repeat the entry-point at the tail so an operator reading the stack can immediately see "we came back to where we started." This is more useful than (A, B). Pin in the test: full-tuple equality.
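The depth cap and the cycle chain shape from the two notes above can be sketched together in one walker. This is an illustration under simplifying assumptions: each plugin has at most one extends parent, plugins are plain strings, the name walk_extends is hypothetical, and the two exception classes are local stand-ins for the types in errors.py (the real constructors may differ).

```python
from __future__ import annotations

from typing import Mapping, Optional

_MAX_EXTENDS_DEPTH = 4


class PluginExtendsCycle(Exception):  # stand-in for the errors.py type
    def __init__(self, chain: tuple) -> None:
        super().__init__(" -> ".join(chain))
        self.chain = chain


class PluginRejected(Exception):  # stand-in for the errors.py type
    def __init__(self, reason: str, chain: tuple) -> None:
        super().__init__(f"{reason}: {' -> '.join(chain)}")
        self.reason = reason
        self.chain = chain


def walk_extends(start: str, parent_of: Mapping[str, Optional[str]]) -> tuple:
    """Walk extends links from `start`; return the chain root-first, plugin last."""
    visited = [start]
    while (parent := parent_of[visited[-1]]) is not None:
        if parent in visited:
            # Repeat the revisited id at the tail: (A, B, A), not (A, B).
            raise PluginExtendsCycle(chain=(*visited, parent))
        if len(visited) >= _MAX_EXTENDS_DEPTH:
            # About to descend into a 5th level: reject with the full chain.
            raise PluginRejected(
                reason="extends_depth_exceeded", chain=(*visited, parent)
            )
        visited.append(parent)
    return tuple(reversed(visited))
```

Checking the visited set before the depth cap is what lets A→B→A surface as a cycle at depth 2 instead of waiting for the cap (mutation M6), and the >= comparison is the one M7 flips.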
Left-to-right extends merge with later-wins-on-collision¶
When merging two adapter maps, {**a, **b} style suffices (Python dict update is "later wins"). But the chain order is extends[0] applied first → extends[1] applied second → ... → plugin itself applied last. The "leaf wins" property emerges from putting the plugin at the end of the chain (production ADR-0031 §Inheritance and override is explicit). Read it twice; this is the most common bug class in this story. The test test_extends_chain_composes_tccm_and_adapters_left_to_right pins it.
For composed_tccm.provides (a dict[str, dict[str, str]]), the inner dicts are also merged later-wins per-key (one-level deep). Beyond one level deep is out of scope — TCCM real shape lands in S3-01.
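The two merge rules above can be sketched with plain dicts. This is a minimal illustration: the function names merge_adapters and merge_provides are hypothetical, adapters are represented as strings instead of real Adapter instances, and the chain is assumed to be ordered with the plugin itself last.

```python
from __future__ import annotations

from typing import Iterable, Mapping


def merge_adapters(chain: Iterable[Mapping[str, str]]) -> dict:
    """Flat later-wins merge: the last layer in the chain overrides earlier ones."""
    merged: dict = {}
    for layer in chain:
        merged = {**merged, **layer}
    return merged


def merge_provides(chain: Iterable[Mapping[str, Mapping[str, str]]]) -> dict:
    """Later-wins merge that also merges inner dicts per key (one level deep only)."""
    merged: dict = {}
    for layer in chain:
        for key, inner in layer.items():
            merged[key] = {**merged.get(key, {}), **inner}
    return merged


# Chain order: extended plugins first, the plugin itself last ("leaf wins").
base_adapters = {"lockfile": "base.Lock", "audit": "base.Audit"}
leaf_adapters = {"lockfile": "leaf.Lock"}
composed_adapters = merge_adapters([base_adapters, leaf_adapters])

base_provides = {"ast": {"parser": "tree-sitter", "mode": "full"}}
leaf_provides = {"ast": {"mode": "incremental"}}
composed_provides = merge_provides([base_provides, leaf_provides])
```

Here composed_adapters keeps the base audit adapter but takes the leaf's lockfile adapter, and composed_provides["ast"] keeps the base parser while the leaf's mode overrides. Reversing the chain order flips both results, which is the M5 mutation.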
Why PluginNotRegistered propagates¶
If A extends B and B is not in the registry, registry.get(PluginId("B")) raises PluginNotRegistered(name=PluginId("B")) per S2-01. The resolver does NOT catch this — three reasons:
- The loader's startup integrity check (S2-03 hardened) should refuse to start when an extends target is missing; if this happens at resolve-time it's a real corruption case.
- Catching here would force the resolver to re-emit a different typed error (e.g., PluginRejected(reason="extends_target_missing")), bloating the resolver's error surface.
- PluginNotRegistered already carries exit_code: ClassVar[int] = 4 per S2-01, so the orchestrator's outer handler maps it correctly.
The AC pins this propagation behavior so a future contributor doesn't "improve" it.
TCCM substitution¶
ComposedTccm here is a minimal Pydantic placeholder (provides: dict[str, dict[str, str]] = {}, requires: dict[str, tuple[str, ...]] = {}). Step 3 / S3-01 lands the real TCCM per ADR-0004 §Decision. The substitution will be: change ConcreteResolution.composed_tccm: ComposedTccm → composed_tccm: TCCM. The resolver's left-to-right merge logic stays the same (the dict shapes line up); the property test grows new assertions about must_read / should_read / may_read query composition.
Module placement (resolver.py vs resolution.py)¶
S2-01 shipped a placeholder class PluginResolution: ... in src/codegenie/plugins/resolution.py. This story can:
- (a) Move the real definitions into resolver.py and reduce resolution.py to a one-line re-export (from codegenie.plugins.resolver import PluginResolution as PluginResolution).
- (b) Move the real definitions into resolution.py and have resolver.py import from there.
Pick (a). Reason: callers already import PluginResolution from resolution.py (the S2-01 contract); the re-export preserves the import path. New consumers should import from resolver.py. Document in this story; the executor's Files to touch row for resolution.py is the one-line shim.
Hypothesis strategy discipline¶
The strategy must NOT generate UNIVERSAL_FALLBACK_ID as a concrete plugin name — reserve that string for the fallback fixture. The .filter(lambda s: s != "universal--*--*") plus an assume(name != UNIVERSAL_FALLBACK_ID) is belt-and-braces. The property is non-negotiable; if the test gets flaky, debug the strategy (likely max_examples exhaustion against an over-tight filter), not the assertion.
Mypy strictness¶
PluginResolution is Annotated[ConcreteResolution | UniversalFallbackResolution, Field(discriminator="kind")] — the discriminated kind is what Pydantic uses for narrowing at runtime; mypy uses the structural union for static narrowing. Annotate resolve's return as PluginResolution (the typed alias). Do NOT annotate as ConcreteResolution | UniversalFallbackResolution — that bypasses the alias and produces a less-readable error message when a future variant is added.
Sanitization of candidates_considered¶
candidates_considered carries PluginIds only, not paths or adapter import strings. The PluginId newtype (S1-01) is str at runtime but constrained at the smart-constructor boundary. Stating this in the docstring prevents a future contributor from "enriching" the list with file paths the operator never asked for. ADR-0003 §Consequences row 5 names the constraint.
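A minimal sketch of a helper that produces candidates_considered as described: ids only, alphabetized, universal fallback excluded. PluginId is modeled here as a bare NewType (the real S1-01 newtype has a smart constructor), and the function name is hypothetical.

```python
from __future__ import annotations

from typing import Final, Iterable, NewType

PluginId = NewType("PluginId", str)  # stand-in for the S1-01 newtype

# Single source of truth for the fallback id.
UNIVERSAL_FALLBACK_ID: Final[PluginId] = PluginId("universal--*--*")


def compute_candidates_considered(registered: Iterable[PluginId]) -> tuple:
    """Return non-universal plugin ids, alphabetized; ids only, no paths."""
    return tuple(sorted(pid for pid in registered if pid != UNIVERSAL_FALLBACK_ID))


considered = compute_candidates_considered(
    [
        PluginId("vulnerability-remediation--python--pip"),
        UNIVERSAL_FALLBACK_ID,
        PluginId("distroless-migration--node--npm"),
    ]
)
```

Returning a tuple keeps the value hashable and immutable for event payloads; dropping the sorted call or the filter corresponds to mutations M11 and M10.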
Property test is the headline assertion¶
resolve is total — never raises (modulo well-formedness assumptions on the registry), never returns None. The Hypothesis property test is the proof. If the test grows flaky or times out, debug the strategy and the deadline=None setting, NOT the assertion. The property is the contract.