Story S2-03 — Reference TCCM + roundtrip integration exercising every Protocol method¶
Step: Step 2 — Plant kernel-side loaders (SkillsLoader, ConventionsCatalogLoader) and reference TCCM
Status: Done (2026-05-16) — see _attempts/S2-03.md
Effort: S
Depends on: S1-04 (Done — TCCMLoader, TCCM model, 5-variant DerivedQuery discriminated union, TCCMLoadError marker), S1-05 (SkillId newtype — required_skills carries list[SkillId]). Not S2-01: the reference TCCM names required_skills as data only; SkillsLoader.find_applicable(...) is never invoked (Out of scope §5). The story is implementable today against the current master.
ADRs honored: 02-ADR-0007 (kernel-side scaffolding only; reference TCCM lives in docs/, not in plugins/ — "documentation as code, deliberately outside the plugin namespace"), production ADR-0029 (Task-Class Context Manifests), production ADR-0030 (graph-aware context queries — five primitives), production ADR-0032 (Language Search Adapters — four Protocols); closes Gap 1 from ../phase-arch-design.md §"Gap analysis" ("Adapter Protocol drift between Phase 2 and Phase 3" / "Protocols defined, never called in Phase 2")
Validation notes (2026-05-15 — HARDENED)¶
The phase-story-validator identified four block-tier defects against the actually-shipped S1-04 modules (src/codegenie/tccm/queries.py, loader.py, model.py; src/codegenie/result.py; src/codegenie/errors.py) and rewrote the YAML, the _expected_tccm() constructors, and the failure-path assertions to match the source-of-truth shape. The original draft would have had every test red on phase-story-executor's first attempt for non-design reasons (constructor errors, wrong API surface) — the executor would have burned attempts hunting parser bugs that don't exist.
Block-tier corrections applied:
- YAML
compute:form rewritten to discriminator-form. The original draft prescribedcompute: dep_graph.consumers_of("@codegenie/scip")(an inline DSL). S1-04'sTCCMLoaderis a thinsafe_yaml.load → TCCM.model_validateshim — there is no DSL parser. EachDerivedQueryvariant is a Pydantic discriminated-union member withcompute: Literal[<primitive>]+ a single payload field (pkg/module/symbol) andextra="forbid". The YAML now usescompute: consumers_of+pkg: "@codegenie/scip". (Critic CO1 / T2 — block.) nameandmax_filesdropped from variants and YAML. The original draft constructedConsumersOf(name="who_consumes_scip_index", pkg=..., max_files=50). Neithernamenormax_filesis a field on anyDerivedQueryvariant inqueries.py;extra="forbid"would raise on every constructor call. They have been removed everywhere. If a future story wants per-query naming or budgeting, that's an ADR amendment to theDerivedQuerymodel — out of scope here. (Critic CO3 / T1 — block.)TCCMLoadError(reason=...)corrected to positionalargs[0]prefix.errors.pydefinesTCCMLoadErroras a marker (no__init__, no.reasonattribute).loader.pyencodes the reason as a prefix inargs[0](e.g.,"unknown_query_primitive: <detail>"). AC-7 now assertserr.args[0].startswith("unknown_query_primitive:"). (Critic CO7 / T3 — block.)scip.refscorrected torefs_to. ADR-0030's author-facing primitive name isscip.refs(symbol); the literalcomputetoken shipped by S1-04 (per phase-arch §"Data model" line 721) isrefs_to. Story prose now uses the literal token consistently and cites the alias once. (Critic CO2 — block.)
Harden-tier strengthening applied:
- All three
AdapterConfidencevariants exercised forconfidence_floorround-trip (was: onlyDegraded). New AC-3b parametrizesTrusted/Degraded(reason=...)/Unavailable(reason=...)against three sibling fixtures under_reference-tccm/_floors/. (Critic C2.) frozen=True/extra="forbid"invariants pinned by AC-3c (mutation raises; extra top-level field raises). Arch §"Data model" requires both. (Critic C3.)- JSON round-trip identity per variant pinned by AC-3d (
q == type(q).model_validate_json(q.model_dump_json())). Catches discriminator-tag-name mismatches before Phase 3. (Critic C9.) _dispatchProtocol-typed parameters pinned by AC-4b —dep: DepGraphAdapter,imp: ImportGraphAdapter, etc. — somypy --strictvalidates the call signatures against the Protocol surface, not the concrete mock. This is the actual Gap-1 closer for signature drift (mere "method called" tests don't catch aconsumers(self, pkg: str)→consumers(self, pkg: PackageId)change). (Critic C1.)mypy --strictscoped totests/integration/tccm/by AC-11b. (Critic C6.)pytest.raises(AssertionError)only for theassert_nevertest — the over-broad(AssertionError, TypeError, ValueError)umbrella would have hidden genuine dispatcher bugs.typing.assert_neverdeterministically raisesAssertionError(Python 3.11+). (Critics C7 / T5 / CO9.)- Single-defect invalid fixture — dropped extra
name/max_fileskeys from_invalid/unknown_compute.yamlso the only Pydantic error is the unknown-discriminator one. Avoids accidental coupling to Pydantic v2's error-emission order. (Critic CO5.) - Multi-defect coverage spread to siblings under
_invalid/— three new fixtures (bad_floor.yaml,missing_required_probes.yaml,extra_top_level_key.yaml) parametrized in AC-7b, each pinning one row of the loader'sLoaderReasontaxonomy. (Critic C4.) - AC-1 reworded to allow
README.mdalongsidetccm.yamland_invalid/(was: "only the reference TCCM and the_invalid/sibling"). (Critic C8.) MagicMockimport dropped — never used in the prescribed code; would have failedruff check. (Critic T7.)Trusted()/Degraded(reason=...)/Unavailable(reason=...)in mockconfidence()returns —kindis defaulted, the redundantkind=kwarg suggests the author didn't read the model. (Critic T8.)Counter-style multiset assertion for AC-2 (was: set equality, which would silently accept a fixture with the right set of variants but the wrong count). (Critic T11.)
Design-pattern strengthening applied (Notes for implementer §"Design patterns"):
_ProtocolMethodStrEnum introduced as the typed Protocol-method-name surface. Recorder and assertion both import the same enum; a typo ("reverse_lookups") surfaces at import, not silently. The "fan of nineassert called(...)lines" is collapsed to one enum-iterating loop. The original story's Refactor §6 prohibition on a "central Protocol coverage map" was the right spirit (no production registry) but the wrong form — a typed enum is the Protocol surface (mirror of S1-04'sLoaderReason: TypeAlias = Literal[...]), not a registry. The forcing-function discipline is preserved: a new Protocol method requires a new enum value, single-place edit, loud test failure on omission. (Critics DP2 / DP3.)- Coverage ratchet for the
matcharms — AC-13 sister-test (test_dispatcher_coverage_ratchet.py) deletes one arm from a copy of_dispatchat runtime, runsmypy --warn-unreachableon the copy, asserts mypy reports theassert_neverarm reachable. This is the production ADR-0030 §Consequences ratchet (same shape S1-04 already uses for_classify). Pins extension-by-addition: a sixthDerivedQueryvariant cannot land green without a newcasearm. (Critic DP1.) - Mock-class duplication is intentional — the four
_Mock*classes share an__init__+confidence()shape that violates three-strikes on its face. Rule 3 (Surgical Changes) wins here: these mocks are deleted when Phase 8's Bundle Builder ships real adapters; introducing a_RecordingAdaptermixin would couple the test fixtures to a micro-abstraction with one consumer. Notes-for-implementer flags it explicitly so a future contributor doesn't "fix" it. (Critic DP4.) - Newtype opportunity flagged for future ADR —
pkg: str,module: str,symbol: stron the Protocols are primitive-obsessed (production ADR-0033 §3); semantically distinct domains. S2-03 does not widen S1-03's Protocols (out of scope), but Notes-for-implementer records the opportunity for a Phase-3-entry amendment ADR. (Critic DP5.)
Twelve ACs original → fifteen ACs after hardening (AC-3b, AC-3c, AC-3d, AC-4b, AC-7b, AC-11b, AC-13 added; AC-1, AC-6 reworded). Implementation outline §1 / §2 / §4 rewritten. TDD-Red plan unchanged. Story is now ready for phase-story-executor.
Full critic reports + edit log: _validation/S2-03-reference-tccm-roundtrip.md.
Context¶
After S1-04 ships TCCMLoader and S2-01 ships SkillsLoader, Phase 2 still has a load-bearing hole: the four adapter Protocols at src/codegenie/adapters/protocols.py (S1-03 — DepGraphAdapter, ImportGraphAdapter, ScipAdapter, TestInventoryAdapter) are typed but never invoked. The arch's Gap 1 is explicit (../phase-arch-design.md §"Gap analysis" §1):
The synthesis ships four
Protocolclasses with zero implementations in Phase 2; the only Phase-2-internal proof they're shaped correctly is the integration test loading the reference TCCM and dispatching to a mock. Phase 3's first adapter (adapters/scip_node.py, etc.) may discover the Protocol signature is wrong … Discovering it at Phase 3 land means amending a Phase 2 module, which ripples throughreport/confidence_section.pyand any Phase 3 plugin code that was prototyped against the wrong shape.
This story closes Gap 1 by shipping two artifacts together:
-
A reference TCCM under
docs/phases/02-context-gather-layers-b-g/_reference-tccm/tccm.yaml— an illustrative manifest for anindex-health-self-checktask class. Documentation as code, deliberately outsideplugins/(the plugin namespace is Phase 3's per ADR-0031 §Consequences §1; this fixture is indocs/because shipping it undertests/fixtures/plugins/would imply pluggability Phase 3 owns — final-design §"Departures from all three inputs" §6). The TCCM exercises every field of theTCCMPydantic model and every variant of the five-primitiveDerivedQuerydiscriminated union (ConsumersOf,ProducersOf,ReverseLookup,RefsTo,TestsExercising). -
tests/integration/tccm/test_reference_tccm_roundtrips.py— loads the reference TCCM viaTCCMLoader; asserts the loaded model equals a hand-constructed Pydantic instance (field-for-field); and — closing Gap 1 — dispatches eachDerivedQueryvariant to a mock adapter that implements the four Phase 2Protocols structurally, asserting every Protocol method is invoked at least once. The mock dispatcher is the Phase-2-internal proof that the Protocol method signatures are shaped to receive theDerivedQueryvariants they were designed for.
The mock dispatcher is not the Bundle Builder (Phase 8 owns that). It is a ~40-LOC pattern-matching match query: case ConsumersOf(): dep_graph_adapter.consumers(...) switch that lives inside the test file. Its purpose is to make "Protocol method called with DerivedQuery payload" type-checkable at PR review time. If Phase 3 discovers consumers(self, pkg: str) should have been consumers(self, pkg: PackageId, *, transitively: bool = False), the amendment ADR points at this test file as the location where the drift was discovered. The "Phase-3 handoff trip-wire" (tests/integration/adapters/test_phase3_handoff_smoke.py, landed skipped in S7-04) is the cross-phase counterpart; this story is the in-Phase-2 proof.
References — where to look¶
- Architecture:
../phase-arch-design.md §"Component design" #7— the four adapterProtocolsignatures (the mock dispatcher must call methods matching these signatures verbatim).../phase-arch-design.md §"Component design" #8—TCCMLoader.load(path) -> Result[TCCM, TCCMLoadError]; five-variantDerivedQuery;AdapterConfidenceplaceholder.../phase-arch-design.md §"Gap analysis" Gap 1— the gap this story closes; cross-references S7-04's Phase-3 handoff trip-wire.../phase-arch-design.md §"Data model"—TCCMfield list (schema_version,task_class,required_probes,required_skills,derived_queries,confidence_floor);AdapterConfidence = Trusted | Degraded(reason) | Unavailable(reason).../phase-arch-design.md §"Scenarios" → Scenario 2—IndexHealthProbeself-check is the conceptual fit for the reference TCCM'stask_class.- Phase ADRs:
../ADRs/0007-no-plugin-loader-in-phase-2.md— reference TCCM lives underdocs/, notplugins/; the integration test loads fromdocs/.- Production ADRs:
../../production/adrs/0029-task-class-context-manifests.md— TCCM schema;required_probes,required_skills,derived_queriessemantics.../../production/adrs/0030-graph-aware-context-queries.md— five-primitiveDerivedQuerytaxonomy. The literalcomputetokens shipped by S1-04 (per phase-arch §"Data model" line 721) areconsumers_of,producers_of,reverse_lookup,refs_to,tests_exercising. ADR-0030's author-facing alias forrefs_toisscip.refs(symbol)— the alias is documentation; the literal token is the Pydantic discriminator.../../production/adrs/0032-language-search-adapters.md— the fourProtocols and which queries each method serves.- Source design:
../final-design.md §"Components" #8— Phase-2-internal consumer commitment forTCCMLoader; "reference TCCM underdocs/_reference-tccm/, notplugins/" pattern decision.../final-design.md §"Departures from all three inputs" §6— whydocs/overtests/fixtures/plugins/.../final-design.md §"Risks and mitigations"row "Adapter Protocol drift" — Gap 1 mitigation.- Existing code (Step 1 + S2-01):
src/codegenie/tccm/loader.py,model.py,queries.py(S1-04) —TCCMLoader.load,TCCM,DerivedQuery.src/codegenie/adapters/protocols.py(S1-03) —DepGraphAdapter,ImportGraphAdapter,ScipAdapter,TestInventoryAdapterruntime_checkableProtocols.src/codegenie/adapters/confidence.py—AdapterConfidencediscriminated union.src/codegenie/skills/__init__.py(S2-01) —Skill,SkillsLoader; the reference TCCM'srequired_skillsfield names IDs that S2-01's loader would index.- Validation lineage:
_validation/S1-04-*.md(if it lands) — confirms five-variantDerivedQueryshape this story relies on.
Goal¶
Ship two artifacts that together close Gap 1:
docs/phases/02-context-gather-layers-b-g/_reference-tccm/tccm.yaml— a single illustrative TCCM fortask_class: index-health-self-check. The manifest:- Sets
schema_version: "1". - Names
required_probes: [b2_index_health, b1_scip, b3_tree_sitter_imports](Phase-2 probe IDs). - Names
required_skills: [diagnose-stale-scip](illustrative — the SkillsLoader does not have to find an actualSKILL.mdfor this test; the field is data, not a resolution). - Sets
confidence_floor: {kind: degraded, reason: "stale_scip_acceptable_for_self_check"}. -
Includes five
derived_queriesentries — one perDerivedQueryvariant (ConsumersOf,ProducersOf,ReverseLookup,RefsTo,TestsExercising) — each in S1-04's discriminator-form (compute: <literal>+ payload fieldpkg/module/symbol; nonameormax_filesfields — neither exists on the variants andextra="forbid"rejects them). -
tests/integration/tccm/test_reference_tccm_roundtrips.py— six named tests: test_reference_tccm_loads_and_equals_expected_pydantic_instance—TCCMLoader().load(REFERENCE_PATH)returns anOk(value=tccm)(fromcodegenie.result;Ok/Errare exported at module top-level — there is noResult.Oknamespacing);tccm == _expected_tccm()where_expected_tccm()constructs the TCCM by hand from the canonical Pydantic constructors. Pins the YAML-to-Pydantic deserialization shape end-to-end.test_reference_tccm_exercises_every_derived_query_variant— exactly one entry per variant intccm.derived_queries; the five-element set oftype(q).__name__overtccm.derived_queriesequals{"ConsumersOf", "ProducersOf", "ReverseLookup", "RefsTo", "TestsExercising"}. Pins the "exercises every variant" commitment.test_mock_dispatcher_invokes_every_protocol_method_at_least_once— the load-bearing AC. A_MockDispatcherdefines four mock adapters that structurally satisfy the fourProtocols (isinstance(mock_dep, DepGraphAdapter)is True underruntime_checkable). The dispatcher routes eachDerivedQueryto the appropriate Protocol method; a call-recorder asserts that every method on every Protocol is invoked at least once across the five queries. Specifically:ConsumersOf→DepGraphAdapter.consumers(...)ProducersOf→DepGraphAdapter.producers(...)ReverseLookup→ImportGraphAdapter.reverse_lookup(...)RefsTo→ScipAdapter.refs(...)TestsExercising→TestInventoryAdapter.tests_exercising(...)- All four
*.confidence()methods are called once at the dispatcher boundary (single-batch confidence reporting); this picks up theProtocol.confidence()method from all four.
test_dispatcher_match_is_exhaustive_assert_never_fires_on_smuggled_variant— runtime smoke: a hand-constructedobject()with akindattribute the discriminator does not recognize triggers the dispatcher'scase _ as unreachable: assert_never(unreachable)branch. Pins ADR-0033 §4.test_unknown_compute_primitive_returns_typed_err_prefix— a sibling fixture (_reference-tccm/_invalid/unknown_compute.yaml) withcompute: graph_exploderound-trips throughTCCMLoaderand yieldsErr(error=TCCMLoadError(...))whoseerror.args[0]starts with"unknown_query_primitive:". TheTCCMLoadErrormarker has no.reasonattribute (pererrors.py+loader.pydocstring); the reason is encoded as the positionalargs[0]prefix. Pins S1-04's failure-path contract from the perspective of a real on-disk fixture.test_reference_tccm_lives_under_docs_not_under_plugins— directory-discipline guard: the resolved path ofREFERENCE_PATHstarts with<repo_root>/docs/, and<repo_root>/plugins/either does not exist or contains zerotccm.yamlfiles. If a future contributor moves the fixture totests/fixtures/plugins/, this test fails — closes the boundary 02-ADR-0007 sets.
Acceptance criteria¶
- [ ] AC-1 — reference TCCM exists at the canonical path.
docs/phases/02-context-gather-layers-b-g/_reference-tccm/tccm.yamlexists; the directory holds exactlytccm.yaml,README.md, the_invalid/sibling subdirectory (AC-7, AC-7b), and the_floors/sibling subdirectory (AC-3b). No other files. Pinned bytest_reference_tccm_lives_under_docs_not_under_plugins(path + walk assertion). - [ ] AC-2 — five
derived_queriesentries, one per variant. The fixture YAML'sderived_querieslist has exactly five entries (len(tccm.derived_queries) == 5); afterTCCMLoader().load,Counter(type(q).__name__ for q in tccm.derived_queries)equalsCounter({"ConsumersOf": 1, "ProducersOf": 1, "ReverseLookup": 1, "RefsTo": 1, "TestsExercising": 1})(multiset equality — no duplicates, no omissions, no over-counts). Pinned bytest_reference_tccm_exercises_every_derived_query_variant. - [ ] AC-3 — round-trip equality.
TCCMLoader().load(REFERENCE_PATH)returns anOk(value=tccm)(from codegenie.result import Ok) andtccm == _expected_tccm(), where_expected_tccm()is a hand-constructedTCCM(...)using the actual Pydantic constructors fromsrc/codegenie/tccm/{model,queries}.pyandsrc/codegenie/adapters/confidence.py. Asserts equality on every field (schema_version,task_class,required_probes,required_skills,derived_queries,confidence_floor). The variant constructors carry onlycompute(defaulted Literal) + one payload field (pkg/module/symbol); noname, nomax_files(those fields do not exist;extra="forbid"would reject them). Pinned bytest_reference_tccm_loads_and_equals_expected_pydantic_instance. - [ ] AC-3b — every
AdapterConfidencevariant round-trips throughconfidence_floor. Three sibling fixtures under_reference-tccm/_floors/(trusted.yaml,degraded.yaml,unavailable.yaml) — each a minimal TCCM identical to the reference except forconfidence_floor. Parametrized test assertsTCCMLoader().load(p).unwrap().confidence_flooris the expectedTrusted()/Degraded(reason=...)/Unavailable(reason=...)instance. Without this, an S1-04 discriminator-handling bug onUnavailablewould ship uncaught (the canonical reference TCCM exercises onlyDegraded). Pinned bytest_confidence_floor_round_trips_for_every_variant. - [ ] AC-3c —
frozen=Trueandextra="forbid"invariants enforced on the loadedTCCM. Two assertions: (a) attemptingtccm.task_class = "x"raisesValidationError(Pydantic v2frozensemantics); (b) loading a fixture (_invalid/extra_top_level_key.yaml) with an unknown top-level key (unexpected_field: 1) returnsErr(error=TCCMLoadError(...))whoseargs[0]starts with"schema:". Pins arch §"Data model" requirements explicitly. Pinned bytest_loaded_tccm_is_frozen_and_forbids_extras. - [ ] AC-3d — JSON round-trip identity per variant. For each
qintccm.derived_queries:q == type(q).model_validate_json(q.model_dump_json()). Catches discriminator-tag-name drift and any custom-validator regressions before Phase 3 plugins serialize/deserialize across IPC. Pinned bytest_every_derived_query_round_trips_through_json. - [ ] AC-4 — every Protocol method invoked. The mock dispatcher's call recorder reports at least one invocation for each of the nine Protocol methods enumerated by the in-test
_ProtocolMethod(StrEnum)(see Implementation outline §4):DepGraphAdapter.{consumers, producers, confidence},ImportGraphAdapter.{reverse_lookup, confidence},ScipAdapter.{refs, confidence},TestInventoryAdapter.{tests_exercising, confidence}. Asserted by iterating the enum, not by nine independent string-literalassert called(...)calls (DP3 — typo discipline). Adding a Protocol method requires a new enum value (single-place edit); the test fails loudly on omission. This is the Gap 1 closer for invocation. Pinned bytest_mock_dispatcher_invokes_every_protocol_method_at_least_once. - [ ] AC-4b —
_dispatchparameters are typed against theProtocols, not the concrete mocks. The_dispatchsignature isdef _dispatch(query: DerivedQuery, *, dep: DepGraphAdapter, imp: ImportGraphAdapter, scip: ScipAdapter, tests: TestInventoryAdapter) -> None. Undermypy --strict tests/integration/tccm/, the call sites inside_dispatch(dep.consumers(p), etc.) are checked against the Protocol signatures, not the mocks' duck-typed methods. This is the Gap 1 closer for signature drift — a future change ofconsumers(self, pkg: str)toconsumers(self, pkg: PackageId, *, transitively: bool = False)on the Protocol would break this file at type-check, surfacing the drift in this story's test (the in-Phase-2 anchor) the way Gap 1 promises. Pinned by AC-11b. - [ ] AC-5 — mocks structurally satisfy the Protocols.
isinstance(mock_dep, DepGraphAdapter),isinstance(mock_import, ImportGraphAdapter),isinstance(mock_scip, ScipAdapter),isinstance(mock_test, TestInventoryAdapter)are allTrue(relying on@runtime_checkable). Pins that the mocks aren't faking the conformance. - [ ] AC-6 — exhaustive
matchdiscipline. The dispatcher'smatch query: case ...: ... case _ as unreachable: assert_never(unreachable)branch fires when a hand-constructed imposter object is passed; the test assertspytest.raises(AssertionError)exactly (Python 3.11+typing.assert_neverraisesAssertionErrordeterministically — the previous draft's(AssertionError, TypeError, ValueError)umbrella was over-defensive and would have hidden genuine dispatcher bugs that swallowed exceptions or returnedNone). Pinned bytest_dispatcher_match_is_exhaustive_assert_never_fires_on_smuggled_variant. - [ ] AC-7 — invalid-fixture path returns typed
Errwithunknown_query_primitive:prefix._reference-tccm/_invalid/unknown_compute.yaml(which contains a singlederived_queriesentry withcompute: graph_explode+pkg: "@x"— single-defect fixture, no extra fields) loads viaTCCMLoader().load(...)toErr(error=TCCMLoadError(...));result.unwrap_err().args[0].startswith("unknown_query_primitive:")isTrue. TheTCCMLoadErrormarker carries no.reasonattribute (errors.py+loader.pydocstring) — the test reads the prefix offargs[0], noterr.reason. Pins the S1-04 failure-path contract under realistic on-disk input. - [ ] AC-7b —
LoaderReasontaxonomy covered by sibling invalid fixtures. Three additional fixtures parametrize the loader's three-rowLoaderReasontaxonomy (parse | schema | unknown_query_primitive): _invalid/malformed.yaml— invalid YAML bytes ([unclosed) →args[0].startswith("parse:")._invalid/missing_required_probes.yaml— well-formed YAML missing therequired_probesfield →args[0].startswith("schema:")._invalid/extra_top_level_key.yaml— well-formed YAML withunexpected_field: 1at top-level →args[0].startswith("schema:")(covered by AC-3c too — single fixture, two consumers). Each loaded fixture'sErr.erroris an instance ofTCCMLoadError. Pinned bytest_invalid_fixtures_cover_loader_reason_taxonomy.- [ ] AC-8 — directory discipline (02-ADR-0007). The reference TCCM's resolved path is rooted at
<repo_root>/docs/. A walk of<repo_root>/plugins/(if it exists) finds zero files namedtccm.yaml. Pinned bytest_reference_tccm_lives_under_docs_not_under_plugins. This guards the Phase-2/Phase-3 directory boundary. - [ ] AC-9 — fixture is
safe_yaml.load-parseable independent ofTCCMLoader. Smoke-check that the on-disk reference fixture is well-formed at the parser-layer too:safe_yaml.load(REFERENCE_PATH, max_bytes=64 << 10)returns adictwithdata["schema_version"] == "1". Not a parser-layer test (S1-04 owns those); a fixture-validity smoke. Asserted in one-line sanity test. - [ ] AC-10 — README under
_reference-tccm/. A_reference-tccm/README.md(one paragraph) explains (a) why this directory lives underdocs/, notplugins/; (b) which Phase 2 test consumes it; (c) cross-links to 02-ADR-0007 + Gap 1 in the arch design. Future contributors must not delete or move this fixture casually. - [ ] AC-11 — toolchain.
ruff check,ruff format --check,mypy --strict,mypy --warn-unreachableclean.pytest tests/integration/tccm/passes. - [ ] AC-11b —
mypy --stricton the test directory.mypy --strict tests/integration/tccm/exits zero. The test file's_dispatchis annotated with Protocol-typed parameters (per AC-4b); noAny, no unparameterized generics, no untypeddict. This is what makes AC-4b load-bearing: withoutmypy --stricton the test file, the Protocol-typed signature is just a comment. - [ ] AC-12 — TDD discipline. Red tests committed failing (because the fixture YAML and the mock dispatcher do not yet exist); green commit makes them pass; refactor commit is no-op behavior. Validator can reproduce.
- [ ] AC-13 — coverage ratchet for the
matcharms. A sibling integration testtests/integration/tccm/test_dispatcher_coverage_ratchet.pywrites a synthetic copy of the_dispatchfunction (undertests/integration/tccm/_ratchet_fixtures/) with onecasearm intentionally removed, runsmypy --warn-unreachableagainst it viasubprocess, and asserts mypy reports theassert_neverarm reachable (i.e., theerrorcount includes a hit on the deleted arm's coverage). This is the production ADR-0030 §Consequences ratchet — mirrors the discipline S1-04's_classifyfunction uses. A sixthDerivedQueryvariant (added under an ADR amendment) cannot land green without a correspondingcasearm. Pinned bytest_dispatcher_match_arms_are_under_mypy_warn_unreachable_ratchet.
Implementation outline¶
- Author the reference TCCM at
docs/phases/02-context-gather-layers-b-g/_reference-tccm/tccm.yaml. Eachderived_queriesentry uses the Pydantic discriminator-form:compute: <literal>+ a single payload field (pkg/module/symbol). Noname, nomax_files— neither field exists on any variant insrc/codegenie/tccm/queries.pyandextra="forbid"rejects them. Verify the literalcomputetokens againstqueries.pybefore writing the file:The five literalschema_version: "1" task_class: index-health-self-check required_probes: - b2_index_health - b1_scip - b3_tree_sitter_imports required_skills: - diagnose-stale-scip confidence_floor: kind: degraded reason: stale_scip_acceptable_for_self_check derived_queries: - compute: consumers_of pkg: "@codegenie/scip" - compute: producers_of pkg: "@codegenie/scip" - compute: reverse_lookup module: "codegenie/probes/layer_b/index_health.py" - compute: refs_to symbol: "codegenie/indices/freshness.IndexFreshness" - compute: tests_exercising symbol: "codegenie/probes/layer_b/index_health.py"computetokens are owned by S1-04'squeries.py(consumers_of/producers_of/reverse_lookup/refs_to/tests_exercising). If a future amendment renames any token, this fixture and_expected_tccm()move together. - Author the invalid fixtures under
docs/phases/02-context-gather-layers-b-g/_reference-tccm/_invalid/. Each is a single-defect fixture (avoids accidental coupling to Pydantic v2's error-emission order):# _invalid/unknown_compute.yaml — AC-7 (unknown_query_primitive: prefix) schema_version: "1" task_class: index-health-self-check required_probes: [b2_index_health] required_skills: [] confidence_floor: {kind: trusted} derived_queries: - compute: graph_explode pkg: "@codegenie/scip"# _invalid/missing_required_probes.yaml — AC-7b (schema: prefix) schema_version: "1" task_class: index-health-self-check required_skills: [] confidence_floor: {kind: trusted} derived_queries: []And the three# _invalid/extra_top_level_key.yaml — AC-3c + AC-7b (schema: prefix) schema_version: "1" task_class: index-health-self-check required_probes: [] required_skills: [] confidence_floor: {kind: trusted} derived_queries: [] unexpected_field: 1_floors/fixtures (AC-3b) — each identical to the reference TCCM except for theconfidence_floorblock: - Author
_reference-tccm/README.md— one paragraph per AC-10. - Test file
tests/integration/tccm/__init__.py(empty) andtests/integration/tccm/test_reference_tccm_roundtrips.py. The_ProtocolMethod(StrEnum)is the typed surface of the nine Protocol methods; recorder + assertions both consume it (DP2/DP3)._dispatch's parameters are typed against the Protocols, not the mocks (AC-4b). AllTrusted/Degraded/Unavailableconstructors omit the redundantkind=kwarg (the field has a default):# tests/integration/tccm/test_reference_tccm_roundtrips.py from __future__ import annotations from enum import StrEnum from pathlib import Path from typing import assert_never import pytest from pydantic import ValidationError from codegenie.adapters.confidence import ( AdapterConfidence, Degraded, Trusted, Unavailable, ) from codegenie.adapters.protocols import ( DepGraphAdapter, ImportGraphAdapter, ScipAdapter, TestInventoryAdapter, Occurrence, TestId, ) from codegenie.errors import TCCMLoadError from codegenie.parsers import safe_yaml from codegenie.result import Ok from codegenie.tccm.loader import TCCMLoader from codegenie.tccm.model import TCCM from codegenie.tccm.queries import ( ConsumersOf, DerivedQuery, ProducersOf, ReverseLookup, RefsTo, TestsExercising, ) from codegenie.types.identifiers import ProbeId, SkillId, TaskClassId REPO_ROOT = Path(__file__).resolve().parents[3] REFERENCE_PATH = REPO_ROOT / "docs/phases/02-context-gather-layers-b-g/_reference-tccm/tccm.yaml" INVALID_DIR = REFERENCE_PATH.parent / "_invalid" FLOORS_DIR = REFERENCE_PATH.parent / "_floors" class _ProtocolMethod(StrEnum): """Typed enumeration of the nine Phase-2 Protocol methods. Recorder and assertions both consume this — typo discipline (DP2/DP3). Adding a Protocol method adds an enum value; the test then auto-asserts it.""" DEP_CONSUMERS = "DepGraphAdapter.consumers" DEP_PRODUCERS = "DepGraphAdapter.producers" DEP_CONFIDENCE = "DepGraphAdapter.confidence" IMP_REVERSE_LOOKUP = "ImportGraphAdapter.reverse_lookup" IMP_CONFIDENCE = "ImportGraphAdapter.confidence" SCIP_REFS = "ScipAdapter.refs" SCIP_CONFIDENCE = "ScipAdapter.confidence" TESTS_EXERCISING = "TestInventoryAdapter.tests_exercising" TESTS_CONFIDENCE = "TestInventoryAdapter.confidence" # ---- Mocks that structurally satisfy the four Protocols -------------------- # Duplication across the four mocks is intentional: these are deleted when # Phase 8 ships real adapters. Do NOT pull a `_RecordingAdapter` mixin (DP4). class _MockDepGraph: def __init__(self) -> None: self.calls: list[_ProtocolMethod] = [] def consumers(self, pkg: str) -> list[str]: self.calls.append(_ProtocolMethod.DEP_CONSUMERS); return ["a", "b"] def producers(self, pkg: str) -> list[str]: self.calls.append(_ProtocolMethod.DEP_PRODUCERS); return ["c"] def confidence(self) -> AdapterConfidence: self.calls.append(_ProtocolMethod.DEP_CONFIDENCE); return Trusted() class _MockImportGraph: def __init__(self) -> None: self.calls: list[_ProtocolMethod] = [] def reverse_lookup(self, module: str) -> list[str]: self.calls.append(_ProtocolMethod.IMP_REVERSE_LOOKUP); return ["x.py"] def confidence(self) -> AdapterConfidence: self.calls.append(_ProtocolMethod.IMP_CONFIDENCE); return Trusted() class _MockScip: def __init__(self) -> None: self.calls: list[_ProtocolMethod] = [] def refs(self, symbol: str) -> list[Occurrence]: self.calls.append(_ProtocolMethod.SCIP_REFS); return [] def confidence(self) -> AdapterConfidence: self.calls.append(_ProtocolMethod.SCIP_CONFIDENCE) return Degraded(reason="self_check") class _MockTestInventory: def __init__(self) -> None: self.calls: list[_ProtocolMethod] = [] def tests_exercising(self, symbol: str) -> list[TestId]: self.calls.append(_ProtocolMethod.TESTS_EXERCISING); return [] def confidence(self) -> AdapterConfidence: self.calls.append(_ProtocolMethod.TESTS_CONFIDENCE); return Trusted() def _dispatch( query: DerivedQuery, *, dep: DepGraphAdapter, imp: ImportGraphAdapter, scip: ScipAdapter, tests: TestInventoryAdapter, ) -> None: """Route a DerivedQuery to the appropriate Protocol method. Parameter types are the *Protocols*, not the concrete mocks: under `mypy --strict` the calls below are checked against the Protocol signatures — that's the in-Phase-2 anchor for Gap-1 signature drift (AC-4b). `assert_never` on unreachable is the exhaustiveness guard (AC-6); production ADR-0030 §Consequences ratchet (AC-13).""" match query: case ConsumersOf(pkg=p): dep.consumers(p) case ProducersOf(pkg=p): dep.producers(p) case ReverseLookup(module=m): imp.reverse_lookup(m) case RefsTo(symbol=s): scip.refs(s) case TestsExercising(symbol=s): tests.tests_exercising(s) case _ as unreachable: assert_never(unreachable) # ---- AC-1, AC-8 (location discipline) -------------------------------------- def test_reference_tccm_lives_under_docs_not_under_plugins() -> None: assert REFERENCE_PATH.exists(), f"missing fixture: {REFERENCE_PATH}" assert str(REFERENCE_PATH).startswith(str(REPO_ROOT / "docs")) plugins_root = REPO_ROOT / "plugins" if plugins_root.exists(): stray = list(plugins_root.rglob("tccm.yaml")) assert stray == [], ( "02-ADR-0007 violation: reference TCCMs MUST live under docs/, " f"not plugins/. Found: {stray}" ) # ---- AC-9 (fixture-validity smoke) ----------------------------------------- def test_reference_tccm_is_safe_yaml_parseable() -> None: data = safe_yaml.load(REFERENCE_PATH, max_bytes=64 << 10) assert isinstance(data, dict) assert data["schema_version"] == "1" # ---- AC-3 (round-trip equality) -------------------------------------------- def _expected_tccm() -> TCCM: return TCCM( schema_version="1", task_class=TaskClassId("index-health-self-check"), required_probes=[ProbeId("b2_index_health"), ProbeId("b1_scip"), ProbeId("b3_tree_sitter_imports")], required_skills=[SkillId("diagnose-stale-scip")], confidence_floor=Degraded(reason="stale_scip_acceptable_for_self_check"), derived_queries=[ ConsumersOf(pkg="@codegenie/scip"), ProducersOf(pkg="@codegenie/scip"), ReverseLookup(module="codegenie/probes/layer_b/index_health.py"), RefsTo(symbol="codegenie/indices/freshness.IndexFreshness"), TestsExercising(symbol="codegenie/probes/layer_b/index_health.py"), ], ) def test_reference_tccm_loads_and_equals_expected_pydantic_instance() -> None: result = TCCMLoader().load(REFERENCE_PATH) assert result.is_ok(), f"TCCMLoader failed: {result}" assert isinstance(result, Ok) assert result.unwrap() == _expected_tccm() # ---- AC-2 (Counter-style multiset) ----------------------------------------- def test_reference_tccm_exercises_every_derived_query_variant() -> None: from collections import Counter tccm = TCCMLoader().load(REFERENCE_PATH).unwrap() assert len(tccm.derived_queries) == 5 counts = Counter(type(q).__name__ for q in tccm.derived_queries) assert counts == Counter({ "ConsumersOf": 1, "ProducersOf": 1, "ReverseLookup": 1, "RefsTo": 1, "TestsExercising": 1, }) # ---- AC-3b (every confidence_floor variant round-trips) -------------------- @pytest.mark.parametrize(("filename", "expected"), [ ("trusted.yaml", Trusted()), ("degraded.yaml", Degraded(reason="self_check_acceptable")), ("unavailable.yaml", Unavailable(reason="scip_offline")), ]) def test_confidence_floor_round_trips_for_every_variant( filename: str, expected: AdapterConfidence, ) -> None: result = TCCMLoader().load(FLOORS_DIR / filename) assert result.is_ok(), f"{filename}: {result}" assert result.unwrap().confidence_floor == expected # ---- AC-3c (frozen + extra=forbid) ----------------------------------------- def test_loaded_tccm_is_frozen_and_forbids_extras() -> None: tccm = TCCMLoader().load(REFERENCE_PATH).unwrap() with pytest.raises(ValidationError): tccm.task_class = "x" # type: ignore[misc] result = TCCMLoader().load(INVALID_DIR / "extra_top_level_key.yaml") assert result.is_err() assert result.unwrap_err().args[0].startswith("schema:") # ---- AC-3d (per-variant JSON round-trip) ----------------------------------- def test_every_derived_query_round_trips_through_json() -> None: tccm = TCCMLoader().load(REFERENCE_PATH).unwrap() for q in tccm.derived_queries: same = type(q).model_validate_json(q.model_dump_json()) assert q == same, f"JSON round-trip drift on {type(q).__name__}" # ---- AC-4, AC-5 (Gap 1 closer — invocation) -------------------------------- def test_mock_dispatcher_invokes_every_protocol_method_at_least_once() -> None: tccm = TCCMLoader().load(REFERENCE_PATH).unwrap() dep, imp, scip, tests = ( _MockDepGraph(), _MockImportGraph(), _MockScip(), _MockTestInventory(), ) # AC-5 — structural Protocol conformance. assert isinstance(dep, DepGraphAdapter) assert isinstance(imp, ImportGraphAdapter) assert isinstance(scip, ScipAdapter) assert isinstance(tests, TestInventoryAdapter) for query in tccm.derived_queries: _dispatch(query, dep=dep, imp=imp, scip=scip, tests=tests) for adapter in (dep, imp, scip, tests): adapter.confidence() all_calls: set[_ProtocolMethod] = ( set(dep.calls) | set(imp.calls) | set(scip.calls) | set(tests.calls) ) missing = set(_ProtocolMethod) - all_calls assert not missing, f"never invoked: {sorted(m.value for m in missing)}" # ---- AC-6 (assert_never fires on imposter — narrow exception) -------------- def test_dispatcher_match_is_exhaustive_assert_never_fires_on_smuggled_variant() -> None: # `match` matches ConsumersOf/etc. by class identity. An arbitrary # object falls cleanly through to `case _ as unreachable`. class _Imposter: pass with pytest.raises(AssertionError): _dispatch( _Imposter(), # type: ignore[arg-type] dep=_MockDepGraph(), imp=_MockImportGraph(), scip=_MockScip(), tests=_MockTestInventory(), ) # ---- AC-7 (unknown_query_primitive: prefix on args[0]) --------------------- def test_unknown_compute_primitive_returns_typed_err_prefix() -> None: result = TCCMLoader().load(INVALID_DIR / "unknown_compute.yaml") assert result.is_err(), f"expected Err, got {result}" err = result.unwrap_err() assert isinstance(err, TCCMLoadError) # TCCMLoadError is a marker — no .reason attribute. Reason lives in # args[0] as a prefix (errors.py + loader.py docstring). assert err.args[0].startswith("unknown_query_primitive:"), err.args # ---- AC-7b (LoaderReason taxonomy via sibling fixtures) -------------------- @pytest.mark.parametrize(("filename", "expected_prefix"), [ ("malformed.yaml", "parse:"), ("missing_required_probes.yaml", "schema:"), ("extra_top_level_key.yaml", "schema:"), ("unknown_compute.yaml", "unknown_query_primitive:"), ]) def test_invalid_fixtures_cover_loader_reason_taxonomy( filename: str, expected_prefix: str, ) -> None: result = TCCMLoader().load(INVALID_DIR / filename) assert result.is_err(), f"{filename}: expected Err, got {result}" err = result.unwrap_err() assert isinstance(err, TCCMLoadError) assert err.args[0].startswith(expected_prefix), \ f"{filename}: expected '{expected_prefix}…' got '{err.args[0]}'"
The AC-13 ratchet test lives in a sibling file tests/integration/tccm/test_dispatcher_coverage_ratchet.py to keep this file focused on the round-trip story. It writes a copy of _dispatch with one case arm removed to a tmp file under _ratchet_fixtures/, runs mypy --warn-unreachable via subprocess, and asserts the assert_never arm is reported as unreachable (i.e., the inverse coverage ratchet fires).
5. Sanity-check against the live queries.py before committing. Run python -c "from codegenie.tccm.queries import ConsumersOf, ProducersOf, ReverseLookup, RefsTo, TestsExercising; [print(c.__name__, list(c.model_fields)) for c in (ConsumersOf, ProducersOf, ReverseLookup, RefsTo, TestsExercising)]". The expected output (verified against master 2026-05-15) is two fields per variant: compute (defaulted Literal) and one of pkg / module / symbol. If master has drifted (additional fields, renamed payload field), align the YAML and _expected_tccm() to whatever queries.py ships and record the resolution in the attempt log — queries.py is the source of truth, this story conforms.
6. _reference-tccm/README.md — short, one paragraph:
# Reference TCCM for `index-health-self-check`
This directory holds the Phase 2 reference TCCM — a minimal Task-Class
Context Manifest (ADR-0029) that exercises every field of the `TCCM`
Pydantic model and every variant of the five-primitive `DerivedQuery`
discriminated union (ADR-0030). It is **documentation**, not a plugin
(02-ADR-0007); it lives under `docs/`, not under `plugins/`, deliberately
outside the namespace Phase 3 owns (ADR-0031 §Consequences §1).
Consumed by `tests/integration/tccm/test_reference_tccm_roundtrips.py`,
which closes `phase-arch-design.md §"Gap analysis" Gap 1`
("Protocols defined, never called in Phase 2") by dispatching each
`DerivedQuery` variant to a mock adapter implementing all four Phase 2
`Protocol`s and asserting every Protocol method is invoked at least once.
Do not move this fixture without updating the test paths and 02-ADR-0007.
`_invalid/unknown_compute.yaml` is a sibling negative fixture for the
`TCCMLoadError(reason="unknown_query_primitive")` path (S1-04).
TDD plan — red / green / refactor¶
Red — write the failing test first¶
Land tests/integration/tccm/__init__.py + test_reference_tccm_roundtrips.py (the file shown in Implementation outline §4) before the YAML fixture. Every test fails because REFERENCE_PATH does not exist (or, in the case of _expected_tccm, because the TCCMLoader call returns an error). Commit as red.
Green — make it pass¶
- Create the directory
docs/phases/02-context-gather-layers-b-g/_reference-tccm/. - Write
tccm.yamlper Implementation outline §1. Run the test once — it will likely fail with a PydanticValidationErroragainst_expected_tccm()if the variant field names differ from this story's prescription. Adjust the YAML and_expected_tccm()until they agree, matching whatever shape S1-04 actually shipped (which is the authoritative parser). - Write
_invalid/unknown_compute.yamlper Implementation outline §2. - Write
_reference-tccm/README.mdper Implementation outline §6. - Re-run the test suite. All six tests pass.
Refactor — clean up¶
- The mock adapter classes stay in the test file (one-shot mocks are not worth a fixture module).
- The
_dispatchhelper stays in the test file as a private function; it is the in-Phase-2 proof of correctness for the Protocol/Query alignment, not production code (Phase 8's Bundle Builder is the production dispatcher). - If
_expected_tccm()grows past ~30 lines, factor it to a_fixtures.pysibling; otherwise inline. - Do not centralize a "Protocol coverage map" — the
called(...)assertions intest_mock_dispatcher_invokes_every_protocol_method_at_least_onceare the documentation. A central map would have to be edited every time a Protocol method is added; the test failure on a missed method is the better forcing function.
Files to touch¶
| Path | Why |
|---|---|
docs/phases/02-context-gather-layers-b-g/_reference-tccm/tccm.yaml |
New — reference TCCM exercising every TCCM field + every DerivedQuery variant |
docs/phases/02-context-gather-layers-b-g/_reference-tccm/README.md |
New — explains why the fixture lives under docs/, not plugins/; cites 02-ADR-0007 + Gap 1 |
docs/phases/02-context-gather-layers-b-g/_reference-tccm/_invalid/unknown_compute.yaml |
New — unknown_query_primitive: prefix (AC-7) |
docs/phases/02-context-gather-layers-b-g/_reference-tccm/_invalid/malformed.yaml |
New — parse: prefix (AC-7b) |
docs/phases/02-context-gather-layers-b-g/_reference-tccm/_invalid/missing_required_probes.yaml |
New — schema: prefix (AC-7b) |
docs/phases/02-context-gather-layers-b-g/_reference-tccm/_invalid/extra_top_level_key.yaml |
New — schema: prefix (AC-3c, AC-7b) |
docs/phases/02-context-gather-layers-b-g/_reference-tccm/_floors/trusted.yaml |
New — confidence_floor: Trusted round-trip (AC-3b) |
docs/phases/02-context-gather-layers-b-g/_reference-tccm/_floors/degraded.yaml |
New — confidence_floor: Degraded round-trip (AC-3b) |
docs/phases/02-context-gather-layers-b-g/_reference-tccm/_floors/unavailable.yaml |
New — confidence_floor: Unavailable round-trip (AC-3b) |
tests/integration/__init__.py |
New if absent — package marker |
tests/integration/tccm/__init__.py |
New — package marker |
tests/integration/tccm/test_reference_tccm_roundtrips.py |
New — eleven named integration tests (incl. parametrized) closing Gap 1 |
tests/integration/tccm/test_dispatcher_coverage_ratchet.py |
New — AC-13 coverage ratchet via mypy --warn-unreachable subprocess |
Out of scope¶
- Bundle Builder — Phase 8 owns the production dispatcher. The mock dispatcher in this story is a test-time proxy; it is not exported and not consumed by any production code.
- A second TCCM — Phase 3 ships the first real plugin's
tccm.yaml(plugins/vulnerability-remediation--node--npm/tccm.yaml); Phase 2 ships one reference TCCM, deliberately underdocs/. - Real adapter implementations — Phase 3 (
adapters/scip_node.py, etc.) per 02-ADR-0007. The mocks here are deliberately structural, not behavioral. @register_index_freshness_checkregistration for the reference TCCM — Phase 2's freshness registry is exercised byIndexHealthProbe(S4-01), not by the TCCM loader. The reference TCCM'sconfidence_floor: degradedis a model field, not a runtime registration.- A
SkillsLoader.find_applicable(["diagnose-stale-scip"])call against the reference TCCM — the reference TCCM'srequired_skillsis data, not resolved against a live SkillsLoader; this story does not assert that the Skill named there exists. A future Phase 2 contributor addingtests/fixtures/skills/diagnose-stale-scip/SKILL.mdand atest_required_skill_is_findableis welcome but outside this story. tests/integration/adapters/test_phase3_handoff_smoke.py— that file lands skipped in S7-04 as the cross-phase trip-wire. This story is the in-Phase-2 counterpart; the two are complementary, not duplicative.
Notes for the implementer¶
- Pin S1-04's variant field names before writing the YAML. The story prescribes
ConsumersOf(pkg=..., name=..., max_files=...)etc., but the authoritative definition lives insrc/codegenie/tccm/queries.py(S1-04). Runpython -c "from codegenie.tccm.queries import ConsumersOf; print(ConsumersOf.model_fields.keys())"first; align the YAML,_expected_tccm(), and thematcharms to whatever S1-04 actually shipped. If a variant field name disagrees with this story's prescription, adjust this story's test file and YAML to match S1-04 — do not amend S1-04. - The
compute:string parsing belongs to S1-04, not to this story. S1-04 owns the regex / string-to-variant mapping (e.g., doesdep_graph.consumers_of("@x")parse asConsumersOf(pkg="@x")?). This story's YAML must produce a tree that S1-04's loader accepts. If S1-04'scompute:form iscompute: { primitive: consumers_of, args: { pkg: "@x" } }instead of the inline-DSL shape this story shows, adopt S1-04's form verbatim and update_expected_tccm(). runtime_checkableis the load-bearing decorator on the Protocols. AC-5'sisinstance(mock_dep, DepGraphAdapter)only works because S1-03 applied@runtime_checkableto the Protocols. If for some reason S1-03 omitted that decorator, file a Phase 2 amendment ADR and add it — this is the discoverability mechanism for Phase 3 plugin authors.- Adapter-confidence shape. The
AdapterConfidencediscriminated union lives atsrc/codegenie/adapters/confidence.py. The mockconfidence()methods return one ofTrusted,Degraded,Unavailablefrom there; do not invent a fourth state in this test. If the dispatcher boundary returnsUnavailable(reason="scip_offline_in_test"), that's fine for the mock — the test only asserts the method was called, not that it returned a specific value. _invalid/sibling directory. The negative fixture (unknown_compute.yaml) lives under_reference-tccm/_invalid/so it's discoverable to a future contributor looking at the reference fixture. Do not put it undertests/fixtures/— keeping the positive and negative TCCM examples together is the documentation discipline 02-ADR-0007 implies.assert_neverin the test's_dispatch. AC-6 demands acase _ as unreachable: assert_never(unreachable)arm. Python'styping.assert_neverraisesAssertionErrorat runtime when the static-typing escape hatch is reached;pytest.raises((AssertionError, TypeError, ValueError))is the defensive catch (different Python minor versions have raised different exception types).- The mock dispatcher is NOT production code. Do not export it from a
src/module. Do not let it leak into a future story. It is a test fixture that proves the Phase 2 Protocols are wired to accept the Phase 2DerivedQueryshapes. Phase 8's Bundle Builder is the production dispatcher; that is a separate phase's design. - If the Phase 1
ResultAPI isResult.Ok(value)/Result.Err(error)constructors plus.is_ok()/.unwrap()/.is_err()/.unwrap_err()— verify the method names insrc/codegenie/result.pybefore writing the test. The exact API was nailed down in S1-04; mirror it. Ifresultuses.value/.errorattribute access instead of.unwrap()methods, conform. mypy --warn-unreachableon the dispatcher. The five-armmatchoverDerivedQueryplus anassert_neverfinal arm should type-check cleanly. Ifmypycomplains that the_ as unreachablearm is reachable, you have a missingcase— fix the dispatcher, do not silence the warning. This is the exhaustiveness ratchetphase-arch-design.md §"Agentic best practices" → "Typed state"requires from day 1.
Design patterns¶
- Sum type + exhaustive
match+assert_neveris the correct shape for_dispatch. Sibling-symmetric with S1-01 (StaleReason), S1-04 (_classifyoverLoaderReason), and theAdapterConfidenceconsumers. Do not introduce a registry pattern here (@register_dispatcher_for(ConsumersOf)) — three-strikes hasn't fired (one dispatcher in this phase) and Phase 8's Bundle Builder is the production dispatcher per ADR-0030's design lineage. The Refactor section codifies "do not export" — keep it that way. _ProtocolMethod(StrEnum)is the typed surface, not a registry. Mirror of S1-04'sLoaderReason: TypeAlias = Literal[...]. Adding a Protocol method requires adding an enum value (single-place edit); the AC-4 assertion auto-iterates. The forcing-function discipline survives without a "central Protocol coverage map" — the StrEnum is the surface.- Mock-class duplication is intentional. The four
_Mock*classes share an__init__+confidence()shape that violates three-strikes on its face. Rule 3 wins: these mocks are deleted when Phase 8's Bundle Builder ships real adapters. Pulling a_RecordingAdaptermixin couples four test fixtures to a micro-abstraction with one consumer, makes the mocks harder to delete later, and obscures the structural-Protocol-conformance proof (the explicit method list is the documentation). - Coverage ratchet via
mypy --warn-unreachablesubprocess. AC-13's sibling test is the inverse coverage ratchet: a copy of_dispatchwith onecasearm deleted should makemypy --warn-unreachableflag theassert_neverarm as reachable. This is the production ADR-0030 §Consequences ratchet — not a new mechanism, an enforcement of an existing one. Keep the synthetic copy under_ratchet_fixtures/(gitignored) so future renames don't ripple into the test. - Newtype opportunity flagged for future ADR (out of scope here). The four Protocols take
pkg: str,module: str,symbol: str— semantically distinct domains all collapsed tostr(production ADR-0033 §3 "primitive obsession on domain identifiers" review-blocker pattern). S2-03 does not widen S1-03's Protocol surface; that's a Phase-3-entry amendment ADR. Record the opportunity here but do not act on it. - Functional-core / imperative-shell:
_dispatchmutates the mocks via the Protocol calls — tangled. Pure version would returnlist[Call]. Resist the refactor: this is a test fixture that proves Protocol invocation; the side effect IS the assertion target. Pure version would obscure intent.