Attempt log — S2-02 ConventionsCatalogLoader¶
Attempt 1 — 2026-05-16 — GREEN¶
Outcome¶
All 16 ACs satisfied; 55 new tests; full suite green (1946 passed, 5 skipped,
2 xfail). Coverage on codegenie.conventions.*: 89–100% per file; well
above the 85% phase floor.
Files created¶
src/codegenie/conventions/__init__.py—__all__re-exports + package docstringsrc/codegenie/conventions/_io.py—read_capped_text(path, *, max_bytes)src/codegenie/conventions/model.py— fourConventionRule*Pydantic variants withmodel_validator(mode="after")regex compile,ConventionRulediscriminated union,Pass/Fail/NotApplicableminimal-field models,ConventionResultdiscriminated unionsrc/codegenie/conventions/loader.py— sevenConventionsErrorvariants,CatalogLoadOutcome,FatalLoadError,ConventionsCatalogLoaderwith multi-file partial-success driver,_classify_validation_error/_classify_yaml_failurepure helperssrc/codegenie/conventions/catalog.py—Catalogmodel with_memo(PrivateAttr-backed) for AC-12, module-level_apply_oneexhaustivematch+assert_never, four independent_apply_*helperstests/unit/conventions/__init__.pytests/unit/conventions/test_catalog.py— 47 behavioural tests (AC-1..AC-13d, AC-15, AC-16)tests/unit/conventions/test_no_direct_yaml_import.py— AST scan (AC-10)tests/unit/conventions/test_no_model_construct.py— AST scan (AC-14)tests/unit/conventions/test_inverted_helper_is_independent.py— AST scan (AC-5a)tests/unit/conventions/test_no_local_convention_id.py— AST scan (AC-1a)tests/unit/conventions/test_apply_match_is_exhaustive_compile_time.py— mypy invocation (AC-9 compile-time half)tests/fixtures/mypy_warn_unreachable/incomplete_match_conventions.py— deliberately incomplete match fixture
Files modified¶
src/codegenie/types/identifiers.py— additive only:ConventionId = NewType("ConventionId", str)+__all__extension. Zero edits to existing newtypes (B2 closure).
Acceptance-criteria evidence¶
| AC | Evidence |
|---|---|
| AC-1 | test_ac1_module_all_is_exact_set (21-symbol set pinned) + parametrized test_ac1_models_are_frozen_and_extra_forbid (9 models) |
| AC-1a | test_no_local_convention_id_newtype (AST scan) + test_convention_id_resolves_from_identifiers_module |
| AC-2 | test_ac2_constructor_does_no_io (monkeypatch 8 I/O primitives including Path.glob and Path.iterdir) |
| AC-3 | test_ac3_happy_path_per_kind parametrized over all four kind literals; per-kind field assertion-strict |
| AC-3a | test_ac3a_multi_rule_single_file_in_order — two-rule catalog, order pinned |
| AC-3b | test_ac3b_multi_file_merge_is_sorted_relpath — b.yaml written first; loader returns [from-a, from-b] |
| AC-4 | test_ac4_dockerfile_pattern_three_outcomes — Pass/Fail/NA + exact rule_id checks + reason == "no_dockerfile_present" |
| AC-4d | test_ac4d_dockerfile_pattern_uses_re_multiline — ^FROM cgr\.dev/chainguard/ against # comment\nFROM cgr.dev/...\n |
| AC-5 | test_ac5_dockerfile_pattern_inverted_three_outcomes — ^USER root match→Fail, no-match→Pass, missing→NA |
| AC-5a | test_inverted_helper_does_not_call_main_helper (AST walk over _apply_dockerfile_pattern_inverted body) |
| AC-6 | test_ac6_file_pattern_zero_matches_is_not_applicable — reason="file_glob_no_matches" pinned |
| AC-6a | test_ac6a_file_pattern_pass_when_all_match — two passing tsconfigs |
| AC-6b | test_ac6b_file_pattern_fail_names_lex_first_failing — b/tsconfig.json named, z/tsconfig.json not |
| AC-6c | test_ac6c_file_glob_pathlib_recursive_and_dot_exclusion — .hidden/foo.json excluded by pathlib default |
| AC-7 | test_ac7_missing_file_passes_when_absent_fails_when_present — Pass on clean repo, Fail on dirty (evidence contains "Dockerfile") |
| AC-8 | test_ac8_unknown_pattern_type_isolated_other_rules_unaffected — partial-success contract verified |
| AC-8a | test_ac8a_unsafe_yaml_python_object_does_not_execute (sentinel uncreated; AC-8a syntax-error sibling for umbrella honesty) |
| AC-8b | test_ac8b_size_cap_exceeded — 1.1 MiB padded file |
| AC-8c | test_ac8c_depth_cap_exceeded — > 64 nesting levels |
| AC-9 | runtime: test_ac9_apply_match_smoke_asserts_assert_never_only (pytest.raises(AssertionError) only — B4); compile-time: test_incomplete_conventions_match_fails_mypy_strict |
| AC-9a | test_ac9a_result_model_field_sets_are_minimal — Pass.model_dump() exact dict + rejection of stray evidence |
| AC-10 | AST: test_no_conventions_module_imports_yaml_directly; runtime: test_ac10_safe_yaml_chokepoint_is_only_yaml_call_site (spy counts == 2 for two catalog files) |
| AC-11 | test_ac11_extra_field_yields_typed_schema_error — unexpected_key lands in details |
| AC-11a | test_ac11a_uncompilable_regex_pattern_lands_at_load — [unterminated → SchemaError with details[*].loc containing "pattern" |
| AC-12 | test_ac12_catalog_apply_is_idempotent_without_repeated_io — second apply() call yields zero Path.open over the repo (memoization on id(repo)) |
| AC-13 | parametrized test_ac13_conventions_error_discriminated_union_seven_reasons (7 reasons round-trip via TypeAdapter) + test_ac13_symlink_refused_json_shape + test_ac13_unknown_reason_raises |
| AC-13a | test_ac13a_toctou_disappearance_yields_catalog_file_unreadable — monkeypatch raises FileNotFoundError for one file; other file's rule persists |
| AC-13b | test_ac13b_partial_success_under_mixed_quality — three-file fixture (ok.yaml, unknown.yaml, broken.yaml) → one rule + two errors (unknown_pattern_type, unsafe_yaml) |
| AC-13c | test_ac13c_empty_search_paths_returns_ok_empty |
| AC-13d | test_ac13d_fatal_when_every_search_path_unreadable — os.access patched to always-False → Result.Err(FatalLoadError(...)) |
| AC-14 | test_no_model_construct_call_or_assignment (AST scan) + pre-commit forbidden-patterns already includes conventions in the Phase-2 packages set (existing rule in scripts/check_forbidden_patterns.py) |
| AC-15 | ruff check, ruff format --check, mypy --strict --warn-unreachable all clean; pytest tests/unit/conventions/ passes (55 tests in 0.33s) |
| AC-16 | RED-GREEN: confirmed first that the test file fails to import (ModuleNotFoundError: No module named 'codegenie.conventions'), then landed the implementation in the order prescribed by the story (identifiers → _io → model → loader → catalog → __init__); refactor was a no-op behavior |
Design-pattern decisions recorded¶
- Strategy + Open/Closed at the file boundary. Four
_apply_*helpers live as independent module-level pure functions, dispatched via an exhaustivematchon theConventionRulediscriminated union. Adding a fifth variant in Phase 3+ is a new model class + new helper + newcasearm — zero edits to existing helpers,Catalog, orConventionsCatalogLoader. Themypy --warn-unreachableratchet (repo-wide, Phase 0) makes a missingcasearm a build failure. - Functional core / imperative shell.
_apply_*helpers are pure given aRepoSnapshot; reading repo files via_io.read_capped_textkeeps the seam atpathlib.Pathrather than amending the frozenRepoSnapshotcontract (Phase 0 ADR-0007 — B1 closure).ConventionsCatalogLoader.load_all()is the I/O shell. - Smart constructor on regex patterns. Each
pattern-carrying variant compiles its regex at load time viamodel_validator(mode="after")and stashes the compiledre.Patternon_compiled_pattern. Compilation failure surfaces as aSchemaErrorat the gather phase, not aRuntimeErrormid-Catalog.apply.re.MULTILINEis the compile-time flag. - Newtype
ConventionIdat the canonical home.codegenie.types.identifiersgains exactly one new symbol; AST source-scan ratchet forbids local redefinition (B2). - Tagged-union
ConventionsErrormirrorsSkillsLoadError(S2-01). Same field names (reason,path,details), sevenreasonLiterals. Phase 3 plugin code can pattern-match uniformly across the two. - Make-illegal-states-unrepresentable on
ConventionResult.Passhas noevidence,NotApplicablehas noevidence,Failhas noreason.extra="forbid"rejects defensive leakage at construction time. Catalog.applymemoization onid(repo)._memo: dict[int, list[ConventionResult]]via PydanticPrivateAttrso two consecutive applies against the same snapshot reuse results — the snapshot is the I/O boundary (AC-12). A fresh snapshot wires a fresh evaluation.unsafe_yamlumbrella keeps operator response coherent.MalformedYAMLErrorcovers parser, scanner, constructor, and top-level-non-mapping; operator response is the same (inspect before re-running) so the umbrella names the response. AC-8a verifies both the!!python/objectconstructor case AND a pure syntax-error case land in the umbrella, and that no code executes for either.
Lessons (carry forward)¶
- L1 — Pydantic v2 discriminator-tag failures locate at the parent list slot, not at
kind. The error row carriesloc=['rules', 0](or[0, '<variant_kind>']) andtype='union_tag_invalid'; the offending tag is inctx.tag(string) orinput['kind'](fallback). The original "loc ends in 'kind'" filter never fires — match ontype ∈ {union_tag_invalid, union_tag_not_found}and pull fromctx.tag. - L2 —
ValidationError.errors()may include non-JSON-serialisable values. When amodel_validator(mode="after")raises a Python exception (e.g.,ValueErrorfrom a regexre.errorre-raise), Pydantic surfaces the exception object insideerrors()[i].ctx.error. Storing the rows directly inSchemaError.details: list[dict[str, object]]lets construction succeed but blows up atmodel_dump(mode="json"). Fix: round-trip throughexc.json()(JSON-safe) andjson.loadsback to Python — single line, alias-resistant. - L3 —
frozen=TruePydantic models can still hold a mutable_memoviaPrivateAttr.PrivateAttr(default_factory=dict)is excluded from serialization, equality, and the frozen write-block — exactly the shape needed for per-instance memoization of pure-functional outputs. Stash sites must mutate the dict in place; reassigningself._memo = ...would still violate frozen. - L4 — Compile-time pattern compilation at the discriminator boundary needs explicit MULTILINE.
re.search(pattern, text)with^anchors to string start only; users authoring Dockerfile patterns expect line anchors. Compile withre.MULTILINEonce at load time so every_apply_*helper inherits the semantics without re-passing flags at call sites. - L5 — The Pydantic v2
frozen=True+ private-attribute write needsobject.__setattr__. Themodel_validator(mode="after")runs after the model has been frozen — directself._compiled_pattern = ...raises.object.__setattr__(self, "_compiled_pattern", compiled)is the documented escape hatch.