
Story S5-01 — ScenarioResult + ScannerOutcome shared discriminated unions

Status: Done
Completed: 2026-05-17
Attempts: 1 (GREEN on first pass — see _attempts/S5-01.md)
Evidence:

  • Files: src/codegenie/probes/_shared/__init__.py, src/codegenie/probes/_shared/scanner_outcome.py, src/codegenie/probes/layer_c/__init__.py, src/codegenie/probes/layer_c/scenario_result.py, scripts/check_forbidden_patterns.py (S5-01 path predicate + Rule row)
  • Tests: tests/unit/probes/_shared/test_scanner_outcome.py (24 tests / 36 rows), tests/unit/probes/layer_c/test_scenario_result.py (22 tests / 33 rows), tests/property/test_sum_types_roundtrip.py (2 Hypothesis properties), tests/unit/pre_commit/test_forbidden_patterns_phase2_extension.py (+15 new cells: 12 positive S5-01 + 3 negative neighbours)
  • Gates: pytest 2431 passed / 15 skipped / 2 xfailed (pre-existing); ruff check clean; ruff format --check clean (304 files); mypy --strict src/ clean (97 files); lint-imports 2 contracts kept / 0 broken; coverage 93.30% (above 85% floor)
  • Commit: (pending push)

Step: Step 5 — Ship Layer C (runtime + container) probes
Original status (pre-execution): Ready — HARDENED (validated 2026-05-16)
Effort: S
Depends on: S1-07 (run_external_cli lands the ProcessResult shape ScannerFailed mirrors), S3-03 (writer signature tightening — ScannerOutcome flows through the redaction chokepoint)
ADRs honored: 02-ADR-0001 (Layer C/G binaries — outcome types model the failure modes), 02-ADR-0006 (sum-type discipline for state machines)

Validation notes (2026-05-16)

Story hardened by phase-story-validator (_validation/S5-01-scenario-scanner-outcome-types.md). This is the 2nd canonical sum-type story in Phase 2 (S1-01 IndexFreshness was the 1st); the precedent set there now applies symmetrically. Verdict: HARDENED. Twelve in-place edits applied:

  1. Discriminator-string pinning (new AC-12): exact "ran"/"skipped"/"failed" (ScannerOutcome) and the matching strings for ScenarioResult / TraceFailureReason / TraceSkipReason variants — symmetric kind swaps would round-trip but break every downstream consumer. Mirrors S1-01 hardening F2.
  2. Nested-type roundtrip preservation (AC-5 tightened): type(decoded.reason) is type(instance.reason) for every TraceScenarioFailed/TraceScenarioSkipped and Finding element on ScannerRan. Guards a regression that drops Field(discriminator="kind") from the inner Annotated wrapper (the same regression S1-01 F1/mutation #3 caught).
  3. JSON-shape pin (new AC-13): literal model_dump(mode="json") snapshot for one ScannerOutcome variant and one ScenarioResult variant — pins the cross-doc kind discriminator-field name at the JSON boundary, not just the Python-object boundary.
  4. Unknown-discriminator rejection (new AC-14): TypeAdapter(<Union>).validate_python({"kind": "bogus"}) raises ValidationError for all four unions (ScannerOutcome, ScenarioResult, TraceFailureReason, TraceSkipReason).
  5. Exhaustive match over inner unions (new AC-6a): TraceFailureReason and TraceSkipReason get their own assert_never consumer helpers; rehearses the discipline at every level, not just the top.
  6. Hypothesis property test (new AC-15): adds tests/property/test_sum_types_roundtrip.py covering both unions, mirroring S1-01's ADR-0006-§Consequences-anchored property test. Symmetry argument is load-bearing.
  7. Finding.metadata JSONValue round-trip (new AC-16): an arbitrary nested JSONValue payload round-trips byte-for-byte through ScannerRan(findings=[Finding(...)]) — pins consumption of Phase 1's existing JSONValue and catches a regression to dict[str, Any].
  8. Frozen + extra=forbid mutation-resistance test (new AC-17): inst.kind = "other" raises ValidationError (frozen); Model(..., extra_field=1) raises ValidationError (extra=forbid) — for every variant of both unions. Mirrors S1-01.
  9. __all__ is pinned literally (new AC-18): regression test asserts set(module.__all__) == EXPECTED to catch silent export drift.
  10. AC-10 rewritten — repo-wide mypy flag (consistency fix): S1-11 validation confirmed [tool.mypy] warn_unreachable = true is already repo-wide (pyproject.toml line 141, since Phase 0 S1-02). Per-module overrides for these two modules would be redundant noise; AC-10 now asserts the repo-wide flag is in force and the module is included in default mypy --strict runs.
  11. AC-11 hedge dropped — forbidden-patterns extension is required (consistency fix): inspection of scripts/check_forbidden_patterns.py confirms _is_under_phase2_banned_package covers {indices, tccm, skills, conventions, adapters, depgraph, output} — does NOT cover probes/_shared/ or probes/layer_c/. The script must be extended (new path predicate or extended package set) with a dedicated test parametrized over the four model_construct source-forms × the two new path scopes, mirroring S1-11's AC-2/AC-3 pattern.
  12. Smart-constructor module constant (CF8 / DF2): STDERR_TAIL_CAP_BYTES: Final[int] = 4096 is exposed as a module-level constant, both the field_validator and the boundary tests import it (no magic number duplication); module docstring contrasts this per-outcome cap with the S3-03 writer's 64 MB cap.

Notes-for-implementer extended with four new paragraphs: variant-set extension is ADR-amendment-gated (NOT Open/Closed — mirrors S1-01); producer/consumer assert_never ladder discipline (this module is the producer; S5-02 / S5-04 / S6-06 / S6-07 / S6-08 are consumers); arch-doc drift note (phase-arch-design.md §"Data model" line 731 still pins layer_g/scanner_outcome.py — High-level-impl.md §172 is the authoritative location); scenario_name newtype deferral to S1-05.

Context

Layer C's RuntimeTraceProbe (S5-02) and Layer G's scanner family (SemgrepProbe, SyftProbe, GrypeProbe, GitleaksProbe; S5-04 + S6-06 + S6-07) both need typed outcomes. RuntimeTraceProbe runs 5 scenarios per gather; each scenario can complete, fail (timeout / docker-build error / strace-unavailable), or be skipped (no Dockerfile present, image-digest unresolved). Every Layer G scanner can run, be skipped (tool missing), or fail (non-zero exit, invalid-JSON stdout). The architecture (phase-arch-design.md §"Data model" + §"Component design" #5–#6) names two discriminated unions:

  • ScenarioResult = TraceScenarioCompleted | TraceScenarioFailed | TraceScenarioSkipped — Layer C only.
  • ScannerOutcome = ScannerRan | ScannerSkipped | ScannerFailed, shared between Layer C (SyftProbe/GrypeProbe in S5-04) and Layer G (S6-06/S6-07/S6-08). Both layers must import the same type, so the type lives under codegenie/probes/_shared/ per the manifest's pinned location.

This story plants both unions before any probe consumes them. ADR-0006's sum-type discipline (ADR-0033 §3, make-illegal-states-unrepresentable) applies: every variant carries a kind: Literal[…] discriminator, Pydantic frozen=True, extra="forbid", round-trip identity through model_dump_json / model_validate_json is asserted by test, and consumers match exhaustively with assert_never on the otherwise-reachable branch.
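A minimal sketch of the discipline described above, assuming Pydantic v2. The `_FrozenVariant` base class and the `round_trip` helper are illustrative names that do not come from the story, and `ScannerRan` omits its `findings` field for brevity.

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, ConfigDict, Field, TypeAdapter


class _FrozenVariant(BaseModel):
    # Hypothetical shared base: immutable instances, unknown fields rejected.
    model_config = ConfigDict(frozen=True, extra="forbid")


class ScannerRan(_FrozenVariant):
    kind: Literal["ran"] = "ran"  # findings omitted in this sketch


class ScannerSkipped(_FrozenVariant):
    kind: Literal["skipped"] = "skipped"
    reason: Literal["tool_missing", "tool_unhealthy", "upstream_unavailable"]


class ScannerFailed(_FrozenVariant):
    kind: Literal["failed"] = "failed"
    exit_code: int
    stderr_tail: str


# The union is the public surface; "kind" selects the variant when decoding.
ScannerOutcome = Annotated[
    Union[ScannerRan, ScannerSkipped, ScannerFailed],
    Field(discriminator="kind"),
]

_ADAPTER = TypeAdapter(ScannerOutcome)


def round_trip(value: object) -> object:
    """Dump a variant to JSON and decode it back through the union."""
    return _ADAPTER.validate_json(_ADAPTER.dump_json(value))
```

Decoding through the union rather than through a concrete variant is what exercises the discriminator: the `kind` value alone decides which class is instantiated.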

Goal

Land two pure-typing modules — src/codegenie/probes/layer_c/scenario_result.py and src/codegenie/probes/_shared/scanner_outcome.py — exporting Pydantic discriminated unions with kind discriminators, JSON round-trip identity, and exhaustive match enforced at the type level for downstream consumers. Zero probes consume these in this story; S5-02 / S5-04 / S6-06 / S6-07 / S6-08 are the consumers.

Acceptance criteria

  • [x] src/codegenie/probes/layer_c/scenario_result.py exists and exports TraceScenarioCompleted, TraceScenarioFailed, TraceScenarioSkipped, StraceUnavailable, ScenarioResult, and the variants' kind Literal values; __all__ is the authoritative export list.
  • [x] src/codegenie/probes/_shared/__init__.py and src/codegenie/probes/_shared/scanner_outcome.py exist; exports ScannerRan, ScannerSkipped, ScannerFailed, ScannerOutcome; both Layer C (S5-04) and Layer G (S6-06/07/08) probes import from this location (no duplicate definitions).
  • [x] Every variant is a Pydantic BaseModel with model_config = ConfigDict(frozen=True, extra="forbid") and a kind: Literal["..."] field with a unique value.
  • [x] ScenarioResult and ScannerOutcome are Annotated[Union[...], Field(discriminator="kind")] (exactly as phase-arch-design.md §"Data model" prescribes).
  • [x] Round-trip identity test: for every variant of both unions, parse(dump(v)) == v byte-for-byte through model_dump_json / model_validate_json (Hypothesis-friendly; the property test in S7-05 extends this). The roundtrip MUST additionally preserve nested discriminated-union types: for every TraceScenarioFailed(reason=R) and TraceScenarioSkipped(reason=R) instance, type(decoded.reason) is type(instance.reason) AND decoded.reason == instance.reason; for every ScannerRan(findings=[Finding(...)]) instance, [type(f) for f in decoded.findings] == [type(f) for f in instance.findings]. This guards a regression that drops Field(discriminator="kind") from the inner Annotated wrapper — a regression which would otherwise round-trip Stale-style equality while silently deserializing reason as a plain dict.
  • [x] Exhaustive match test: a helper consumer function _describe(outcome) -> str matches every variant and assert_nevers the otherwise branch; deliberately removing one case and running mypy --warn-unreachable against the test file produces a build error (proves the discipline is enforceable). The deletion test is documented but not committed in the deleted state — it is the smoke-test of mypy configuration once S1-11's per-module override is in place.
  • [x] StraceUnavailable is a Pydantic model carried as the reason field on TraceScenarioFailed when the macOS path triggers; the reason field's type is itself a discriminated union (placeholder variants today: StraceUnavailable | DockerBuildFailed | ScenarioTimeout | ImageDigestUnresolved) so S5-02 cannot smuggle a string. Each placeholder variant ships with kind: Literal[…] + round-trip; new variants are added by S5-02 implementation as needed (not invented speculatively here).
  • [x] TraceScenarioSkipped carries a typed reason (e.g., NoDockerfile, ImageBuildUnavailable) — same discriminated-union discipline.
  • [x] ScannerFailed carries exit_code: int and stderr_tail: str (capped at 4 KB at construction time — field_validator truncates if longer; documented in module docstring as "the writer caps further at 64 MB; this is the per-outcome cap").
  • [x] ScannerSkipped carries reason: Literal["tool_missing", "tool_unhealthy", "upstream_unavailable"] — a Literal-string enum keeps the slack tight; adding a fourth requires an ADR amendment to 02-ADR-0001's "what shape do scanner outcomes take" footnote or a follow-up ADR.
  • [x] mypy --strict clean on both modules. No per-module [[tool.mypy.overrides]] edit is required: [tool.mypy] warn_unreachable = true is already repo-wide since Phase 0 S1-02 (verified at pyproject.toml line 141; established by S1-11 validation as "honored-broader-than-arch"). A test (or static assertion in the CI fence) asserts the repo-wide flag is present and unmodified; both new modules are included in the default mypy --strict glob (i.e., no exclude = … entry added).
  • [x] No file under src/codegenie/probes/ imports a discriminated-union variant by name from outside _shared/ or layer_c/scenario_result.py (a smoke import test asserts this); the contract is "import the union, not the variants" except for construction.
  • [x] scripts/check_forbidden_patterns.py is extended to ban model_construct under src/codegenie/probes/_shared/** and src/codegenie/probes/layer_c/scenario_result.py. Inspection at validation time confirms _is_under_phase2_banned_package currently covers {indices, tccm, skills, conventions, adapters, depgraph, output} and does NOT cover probes/_shared/ or probes/layer_c/; the hedge "if not already covered" from the original draft is dropped. The extension MUST live inside the script's applies_when predicate (path-scoped rule) per S1-11 AC-1's discipline — NOT in .pre-commit-config.yaml's files:/exclude: regex — so the test surface and runtime surface are the same. A dedicated test parametrizes the four source_form ∈ {class_call, instance_call, kwarg, renamed_class} × two new paths (src/codegenie/probes/_shared/synth.py, src/codegenie/probes/layer_c/scenario_result.py-style synth) — 8 combinations, each expected to exit non-zero with both 02-ADR-0010 §Decision and production ADR-0033 §3 substrings emitted (mirrors S1-11 AC-2). Negative coverage: probes/layer_a/synth.py and probes/layer_b/synth.py MUST exit zero (the predicate is surgical, not blanket — mirrors S1-11 AC-3).
  • [x] Discriminator strings are exactly pinned (cross-doc contract). For each variant of each union, a dedicated test asserts the exact string named in phase-arch-design.md §"Component design" #5 / #6 — for ScannerOutcome: ScannerRan().kind == "ran", ScannerSkipped(reason="tool_missing").kind == "skipped", ScannerFailed(exit_code=1, stderr_tail="").kind == "failed". For ScenarioResult: the matching strings ("completed", "failed", "skipped"). For TraceFailureReason: "strace_unavailable", "docker_build_failed", "scenario_timeout", "image_digest_unresolved". For TraceSkipReason: "no_dockerfile", "image_build_unavailable". A symmetric swap (e.g., ScannerRan.kind = "failed" + ScannerFailed.kind = "ran") would round-trip cleanly but break every downstream consumer + every renderer / golden file — this test is the structural pin against that mutation.
  • [x] JSON-shape pin: a dedicated test asserts model_dump(mode="json") produces a literal dict with key "kind" (not "tag", "type", etc.) for one ScannerOutcome variant (ScannerSkipped(reason="tool_missing") → {"kind": "skipped", "reason": "tool_missing"}) and one ScenarioResult variant (TraceScenarioFailed(scenario_name="startup", reason=StraceUnavailable()) → {"kind": "failed", "scenario_name": "startup", "reason": {"kind": "strace_unavailable"}}). Catches a symmetric rename of the discriminator field name (e.g., kind → tag on every variant) which round-trip identity would otherwise tolerate.
  • [x] Unknown discriminator rejection at the top-level AND every nested level. For each of ScannerOutcome, ScenarioResult, TraceFailureReason, TraceSkipReason, a parametrized test asserts TypeAdapter(<Union>).validate_python({"kind": "bogus_<name>"}) raises pydantic.ValidationError. Pins the Field(discriminator="kind") wrapper at every level of the sum-of-sums.
  • [x] Exhaustive match test over the inner unions (AC-6a): helper consumer functions _describe_failure_reason(r: TraceFailureReason) -> str and _describe_skip_reason(r: TraceSkipReason) -> str each match every variant and assert_never on the otherwise branch — symmetric with the top-level discipline in AC-6. The producer/consumer assert_never ladder discipline must be rehearsed at EVERY level of the sum, not just the top (S5-05 freshness + S8-01 renderer will match on reason too).
  • [x] Hypothesis property test (new file tests/property/__init__.py if not present, plus tests/property/test_sum_types_roundtrip.py): for any Hypothesis-generated value of ScannerOutcome AND any Hypothesis-generated value of ScenarioResult, the JSON round-trip is byte-identical AND nested-type-preserving. Strategies registered for each variant; integer ranges bounded (exit_code ∈ [0, 255], stderr_tail length ∈ [0, 4096]); the property is the load-bearing argument that pre-shipping the sum types pays — exhaustive over input space, not just example-based. Mirrors S1-01's tests/property/test_index_freshness_roundtrip.py. Forms the basis of S7-05's portfolio integration.
  • [x] Finding.metadata JSONValue round-trip: a dedicated test constructs ScannerRan(findings=[Finding(id="rule-1", severity="medium", metadata={"a": [1, 2.0, "x", True, None, {"nested": [{"deep": [None]}]}]})]) and asserts byte-identical JSON round-trip preserves the metadata tree, including nested list / dict / None. Pins consumption of Phase 1's existing JSONValue (at src/codegenie/parsers/__init__.py:34) and catches a regression to metadata: dict[str, Any] (which would still round-trip primitives but lose the recursive JSONValue constraint mypy enforces).
  • [x] Frozen + extra="forbid" mutation-resistance: for every variant of both unions, a parametrized test asserts (a) inst.kind = "other" raises ValidationError (frozen discipline); (b) constructing with an unexpected field — e.g., ScannerRan(findings=[], extra_field=1) — raises ValidationError (extra=forbid discipline). One test per (variant, mutation) crosscut.
  • [x] __all__ is pinned literally: a test asserts set(module.__all__) == EXPECTED_NAMES_SCANNER for _shared/scanner_outcome.py and set(module.__all__) == EXPECTED_NAMES_SCENARIO for layer_c/scenario_result.py, where the expected sets are the variant names + the union alias + (for scenario) the inner TraceFailureReason / TraceSkipReason reason types. Catches silent export drift.
  • [x] stderr_tail cap as named module constant: STDERR_TAIL_CAP_BYTES: Final[int] = 4096 is exposed as a module-level constant in _shared/scanner_outcome.py. Both the field_validator and the boundary tests import it (no magic number duplication). Module docstring contrasts this per-outcome cap with the S3-03 writer's 64 MB cap; the test asserts the constant is Final[int] = 4096.
  • [x] stderr_tail cap boundary mutation-resistance: parametrized test over input lengths {0, 1, 4095, 4096, 4097, 8192}; expected output lengths {0, 1, 4095, 4096, 4096, 4096}. Pins off-by-one in [: cap] slicing.
  • [x] ScannerSkipped.reason Literal closure: a parametrized test cycles every value in {"tool_missing", "tool_unhealthy", "upstream_unavailable"} (success) and asserts at least three out-of-set strings ("", "ad_hoc", "TOOL_MISSING" — note casing) each raise ValidationError.
  • [x] Finding.severity Literal closure: a parametrized test cycles every value in {"info", "low", "medium", "high", "critical"} (success) and at least three out-of-set strings ("unknown", "INFO", "") each raise ValidationError.
  • [x] Source-scan for model_construct in both new modules: the text of each module file (Path(...).read_text() on scanner_outcome.py and on the scenario module) does not contain the literal substring model_construct (other than potentially in a docstring naming the ban). Complementary to the forbidden-patterns script extension; survives even if pre-commit is bypassed.
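The stderr_tail cap criteria above can be sketched as follows, assuming Pydantic v2. Two choices here are assumptions rather than facts from the story: the cap is applied to characters (plain `str` slicing), and the slice keeps the leading portion, following the `[: cap]` form the boundary criterion mentions.

```python
from typing import Final, Literal

from pydantic import BaseModel, ConfigDict, field_validator

# Single source of truth for the per-outcome cap; tests import this name.
STDERR_TAIL_CAP_BYTES: Final[int] = 4096


class ScannerFailed(BaseModel):
    model_config = ConfigDict(frozen=True, extra="forbid")

    kind: Literal["failed"] = "failed"
    exit_code: int
    stderr_tail: str

    @field_validator("stderr_tail")
    @classmethod
    def _cap_stderr_tail(cls, value: str) -> str:
        # Truncate at construction time so an over-long value never exists.
        return value[:STDERR_TAIL_CAP_BYTES]
```

Because the validator runs during construction, a caller cannot obtain a `ScannerFailed` whose `stderr_tail` exceeds the cap, which is the smart-constructor property the boundary test pins.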

Implementation outline

  1. Create src/codegenie/probes/_shared/__init__.py and src/codegenie/probes/_shared/scanner_outcome.py.
  2. Define ScannerRan, ScannerSkipped, ScannerFailed as Pydantic models with frozen=True, extra="forbid" and kind discriminators. ScannerRan.findings: list[Finding] is a forward reference to a Finding placeholder Pydantic model (also defined in this module — minimal shape: kind: Literal["finding"], id: str, severity: Literal["info","low","medium","high","critical"], metadata: dict[str, JSONValue]). The full Finding shape evolves with S5-04 / S6-06 / S6-07; today it is the smallest model that satisfies round-trip.
  3. Export ScannerOutcome = Annotated[Union[ScannerRan, ScannerSkipped, ScannerFailed], Field(discriminator="kind")].
  4. Create src/codegenie/probes/layer_c/__init__.py and src/codegenie/probes/layer_c/scenario_result.py.
  5. Define StraceUnavailable, DockerBuildFailed, ScenarioTimeout, ImageDigestUnresolved as Pydantic models (each kind: Literal["…"]), and a TraceFailureReason = Annotated[Union[...], Field(discriminator="kind")] union for the inner reason field.
  6. Define NoDockerfile, ImageBuildUnavailable as Pydantic models, and a TraceSkipReason = Annotated[Union[...], Field(discriminator="kind")] union for TraceScenarioSkipped.reason.
  7. Define TraceScenarioCompleted(kind, scenario_name: str, artifact_uri: Path, wall_clock_ms: int, syscalls_observed: int, shared_libs_count: int); TraceScenarioFailed(kind, scenario_name, reason: TraceFailureReason); TraceScenarioSkipped(kind, scenario_name, reason: TraceSkipReason).
  8. Export ScenarioResult = Annotated[Union[TraceScenarioCompleted, TraceScenarioFailed, TraceScenarioSkipped], Field(discriminator="kind")].
  9. Write the round-trip + exhaustive-match tests under tests/unit/probes/_shared/test_scanner_outcome.py and tests/unit/probes/layer_c/test_scenario_result.py.
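  10. Confirm that no pyproject.toml edit is needed: [tool.mypy] warn_unreachable = true is already repo-wide (see "Files to touch"), so codegenie.probes._shared.* and codegenie.probes.layer_c.scenario_result need no per-module override. If the flag is found missing or overridden, surface the diff in "Notes for the implementer".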
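Steps 5 to 8 of the outline might look like the sketch below, assuming Pydantic v2. The `_Frozen` base class and the `SCENARIO_RESULT_ADAPTER` name are illustrative, and the reason variants are shown without payload fields, which is an assumption; the story leaves their eventual shape to S5-02.

```python
from pathlib import Path
from typing import Annotated, Literal, Union

from pydantic import BaseModel, ConfigDict, Field, TypeAdapter


class _Frozen(BaseModel):
    # Hypothetical shared base for every variant in this sketch.
    model_config = ConfigDict(frozen=True, extra="forbid")


class StraceUnavailable(_Frozen):
    kind: Literal["strace_unavailable"] = "strace_unavailable"


class DockerBuildFailed(_Frozen):
    kind: Literal["docker_build_failed"] = "docker_build_failed"


class ScenarioTimeout(_Frozen):
    kind: Literal["scenario_timeout"] = "scenario_timeout"


class ImageDigestUnresolved(_Frozen):
    kind: Literal["image_digest_unresolved"] = "image_digest_unresolved"


TraceFailureReason = Annotated[
    Union[StraceUnavailable, DockerBuildFailed, ScenarioTimeout, ImageDigestUnresolved],
    Field(discriminator="kind"),
]


class NoDockerfile(_Frozen):
    kind: Literal["no_dockerfile"] = "no_dockerfile"


class ImageBuildUnavailable(_Frozen):
    kind: Literal["image_build_unavailable"] = "image_build_unavailable"


TraceSkipReason = Annotated[
    Union[NoDockerfile, ImageBuildUnavailable],
    Field(discriminator="kind"),
]


class TraceScenarioCompleted(_Frozen):
    kind: Literal["completed"] = "completed"
    scenario_name: str
    artifact_uri: Path
    wall_clock_ms: int
    syscalls_observed: int
    shared_libs_count: int


class TraceScenarioFailed(_Frozen):
    kind: Literal["failed"] = "failed"
    scenario_name: str
    reason: TraceFailureReason  # inner union, never a bare string


class TraceScenarioSkipped(_Frozen):
    kind: Literal["skipped"] = "skipped"
    scenario_name: str
    reason: TraceSkipReason


ScenarioResult = Annotated[
    Union[TraceScenarioCompleted, TraceScenarioFailed, TraceScenarioSkipped],
    Field(discriminator="kind"),
]

SCENARIO_RESULT_ADAPTER = TypeAdapter(ScenarioResult)
```

Keeping `Field(discriminator="kind")` on the inner `Annotated` wrappers as well as the outer one is what lets a decoded `reason` come back as its concrete variant class.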

TDD plan — red / green / refactor

Red (write before code; both files start absent or empty):

  1. test_scanner_outcome_roundtrip (tests/unit/probes/_shared/test_scanner_outcome.py): import ScannerOutcome and each variant; construct one of each (with ScannerRan(findings=[Finding(...)]) carrying at least one element); for each constructed value v, assert type(v).model_validate_json(v.model_dump_json()) == v AND [type(f) for f in decoded.findings] == [type(f) for f in v.findings] for ScannerRan. Initial state: ModuleNotFoundError.
  2. test_scanner_outcome_match_exhaustive: a private _describe(outcome: ScannerOutcome) -> str defined inside the test module matches each kind and assert_never(outcome) on the otherwise branch; assert each variant's string. Initial state: import fails.
  3. test_scenario_result_roundtrip (tests/unit/probes/layer_c/test_scenario_result.py): construct one of each top-level variant; for TraceScenarioFailed, parametrize over every TraceFailureReason variant (4) and assert nested-type preservation type(decoded.reason) is type(instance.reason); for TraceScenarioSkipped, parametrize over every TraceSkipReason variant (2) with the same nested-type assertion. Initial state: ModuleNotFoundError.
  4. test_scenario_result_match_exhaustive: helper _describe(result: ScenarioResult) -> str with exhaustive match + assert_never on the otherwise branch. Initial state: import fails.
  5. test_trace_failure_reason_match_exhaustive and test_trace_skip_reason_match_exhaustive (AC-6a): helpers _describe_failure_reason(r: TraceFailureReason) -> str and _describe_skip_reason(r: TraceSkipReason) -> str each match every variant with assert_never. Initial state: import fails.
  6. test_strace_unavailable_is_typed: TraceScenarioFailed(scenario_name="startup", reason=StraceUnavailable()) round-trips with reason.kind == "strace_unavailable"; the parametrized matrix in Test 3 already covers the other inner variants.
  7. test_scanner_failed_stderr_tail_truncates: parametrized over input lengths {0, 1, 4095, 4096, 4097, 8192} (expected output lengths {0, 1, 4095, 4096, 4096, 4096}); the test imports STDERR_TAIL_CAP_BYTES from the module — no magic number repetition.
  8. test_scanner_skipped_reason_literal_closure: parametrized successes over {"tool_missing", "tool_unhealthy", "upstream_unavailable"}; parametrized failures over {"", "ad_hoc", "TOOL_MISSING"} (each raises ValidationError).
  9. test_finding_severity_literal_closure: parametrized successes over {"info", "low", "medium", "high", "critical"}; parametrized failures over {"unknown", "INFO", ""} (each raises ValidationError).
  10. test_discriminator_strings_are_exactly_pinned: for every variant of every union, assert Variant(...).kind == "<exact string named in phase-arch-design.md §Component design #5/#6>". Strings as a literal table in the test file; would catch any symmetric swap.
  11. test_json_shape_pinned: assert ScannerSkipped(reason="tool_missing").model_dump(mode="json") == {"kind": "skipped", "reason": "tool_missing"} and TraceScenarioFailed(scenario_name="startup", reason=StraceUnavailable()).model_dump(mode="json") == {"kind": "failed", "scenario_name": "startup", "reason": {"kind": "strace_unavailable"}}. Catches symmetric kind → tag rename.
  12. test_unknown_discriminator_is_rejected: parametrized over all four unions (ScannerOutcome, ScenarioResult, TraceFailureReason, TraceSkipReason); each asserts TypeAdapter(<Union>).validate_python({"kind": f"bogus_{name}"}) raises pydantic.ValidationError.
  13. test_models_are_frozen_and_forbid_extra: parametrized over every variant of both unions × {mutate_kind, extra_field}; asserts each (variant, mutation) raises ValidationError. Mirrors S1-01.
  14. test_all_exports_are_pinned: assert set(scanner_outcome.__all__) == EXPECTED_SCANNER_NAMES and set(scenario_result.__all__) == EXPECTED_SCENARIO_NAMES, with the expected sets literal in the test.
  15. test_finding_metadata_jsonvalue_roundtrip: constructs ScannerRan(findings=[Finding(id="rule-1", severity="medium", metadata={"a": [1, 2.0, "x", True, None, {"nested": [{"deep": [None]}]}]})]); asserts byte-identical JSON round-trip AND decoded.findings[0].metadata == original.metadata.
  16. test_modules_have_no_model_construct: source-scan over the bytes of both new module files; asserts the literal substring model_construct does not appear (a docstring naming the ban is the only allowed occurrence; the test pins zero occurrences and the implementer omits the substring entirely from the source — name it pydantic ctor in docstrings instead).
  17. test_forbidden_patterns_extension_covers_shared_and_scenario_result (tests/unit/pre_commit/test_forbidden_patterns_phase2_extension.py — extends existing S1-11 test): parametrized over 4 source_form × 2 new path scopes (src/codegenie/probes/_shared/synth.py, src/codegenie/probes/layer_c/synth.py); each of the 8 combinations writes a synthetic .py file under tmp_path, runs scripts/check_forbidden_patterns.py via subprocess, asserts exit non-zero AND output contains both 02-ADR-0010 §Decision and production ADR-0033 §3. Negative coverage: same source_form writing under probes/layer_a/synth.py and probes/layer_b/synth.py MUST exit zero.
  18. test_mypy_warn_unreachable_is_repo_wide: parses pyproject.toml and asserts [tool.mypy] warn_unreachable == True; asserts no [[tool.mypy.overrides]] block has exclude matching _shared/scanner_outcome or layer_c/scenario_result (both modules covered by default).
  19. tests/property/test_sum_types_roundtrip.py (AC-15): Hypothesis strategies registered for every variant of both unions; the property given(scanner_outcomes()) assert round_trip(v) == v and type_preserves(v, round_trip(v)) and the equivalent for scenario_results(). Bounds: exit_code ∈ [0, 255], stderr_tail length ∈ [0, 4096], scenario_name printable ASCII length ∈ [1, 64], findings list length ∈ [0, 16]; metadata is recursive(json_value()) with depth bound 4 (matches Phase 1's JSONValue depth cap).

Green:

  1. Create the two modules with the variant models, the Annotated[Union, Field(discriminator)] unions, and the helper validators (stderr_tail truncation; Literal reasons).
  2. Make every test pass without touching any consumer probe.
  3. Extend scripts/check_forbidden_patterns.py's _is_under_phase2_banned_package (or its applies_when predicate) so the existing model_construct rule fires for probes/_shared/** and probes/layer_c/scenario_result.py.
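For step 3, a path-scoped predicate could be sketched as below. The function name `is_in_s5_01_scope` is hypothetical, the sketch assumes repo-relative POSIX-style paths, and it does not reflect the real internals of scripts/check_forbidden_patterns.py, which this story does not show.

```python
from pathlib import PurePosixPath

# Hypothetical constant; the real script may organise its scopes differently.
_PROBES_ROOT = ("src", "codegenie", "probes")


def is_in_s5_01_scope(repo_relative_path: str) -> bool:
    """True for probes/_shared/** and probes/layer_c/scenario_result.py only."""
    parts = PurePosixPath(repo_relative_path).parts
    if parts[: len(_PROBES_ROOT)] != _PROBES_ROOT:
        return False
    rest = parts[len(_PROBES_ROOT):]
    # probes/_shared/** covers any file at any depth below the package.
    if len(rest) >= 2 and rest[0] == "_shared":
        return True
    # probes/layer_c/scenario_result.py is matched as exactly one module.
    return rest == ("layer_c", "scenario_result.py")
```

Comparing path components rather than substrings keeps the rule surgical: a neighbouring package such as probes/layer_a/ cannot match by accident.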

Refactor:

  1. Extract JSONValue type alias usage to match Phase 1's existing JSONValue import from codegenie.parsers (do not re-define).
  2. Add module docstrings naming: (a) the consumers (S5-02 / S5-04 / S6-06 / S6-07 / S6-08); (b) the producer/consumer assert_never ladder discipline ("this module is the producer; consumers match exhaustively and a new variant requires coordinated edits on every consumer + an ADR amendment to 02-ADR-0006 / a follow-up ADR"); (c) the per-outcome STDERR_TAIL_CAP_BYTES = 4096 cap vs. the S3-03 writer's 64 MB cap.
  3. Confirm __all__ exports are the union + the variant names + the placeholder reason types — the union is the public surface; variants are public only for construction.
  4. Confirm STDERR_TAIL_CAP_BYTES: Final[int] = 4096 is exposed at module level and consumed by the field_validator (no magic number duplication).

Files to touch

  • New: src/codegenie/probes/_shared/__init__.py, src/codegenie/probes/_shared/scanner_outcome.py, src/codegenie/probes/layer_c/__init__.py, src/codegenie/probes/layer_c/scenario_result.py.
  • New tests: tests/unit/probes/_shared/__init__.py, tests/unit/probes/_shared/test_scanner_outcome.py, tests/unit/probes/layer_c/__init__.py, tests/unit/probes/layer_c/test_scenario_result.py, tests/property/__init__.py (if absent), tests/property/test_sum_types_roundtrip.py.
  • Extend (required): scripts/check_forbidden_patterns.py — extend the model_construct rule's applies_when predicate (or _is_under_phase2_banned_package set) to cover probes/_shared/** and probes/layer_c/scenario_result.py. Verified at validation time: NOT currently covered.
  • Extend (required): tests/unit/pre_commit/test_forbidden_patterns_phase2_extension.py — add parametrization for the two new path scopes (8 new positive + 2+ negative cases), mirroring S1-11 AC-2 / AC-3.
  • NO EDIT: pyproject.toml [tool.mypy] (warn_unreachable = true is already repo-wide since Phase 0 S1-02; pyproject.toml line 141). Per-module overrides would be redundant. The test_mypy_warn_unreachable_is_repo_wide test asserts the repo-wide flag is unchanged.

Out of scope

  • Any probe implementation that constructs these values (RuntimeTraceProbe = S5-02; SyftProbe / GrypeProbe = S5-04; Layer G scanners = S6-06 / S6-07 / S6-08).
  • The Finding shape's eventual full schema — placeholder model lands here; real shape evolves with consumers.
  • Writer composition through RedactedSlice (ScannerOutcome flows through S3-03's writer signature — the writer already accepts RedactedSlice, and this story doesn't change that).
  • Any change to Probe ABC or ProbeContext (banned in Phase 2 except for S1-09's image_digest_resolver).

Notes for the implementer

  • This is the 2nd canonical sum-type story in Phase 2 (S1-01 IndexFreshness was the 1st). The validator-hardened S1-01 is the precedent template: discriminator-string pinning, JSON-shape pinning, nested-type roundtrip, exhaustive match at every level, Hypothesis property test, source-scan for model_construct, __all__ pinning. Read _validation/S1-01-index-freshness-sum-type.md before implementing — it explains why each test exists and which mutation it catches.
  • Variant-set extension is deliberately NON-Open/Closed (mirrors S1-01 / 02-ADR-0006). The proliferation of @register_* decorators elsewhere in Phase 2 (@register_probe, @register_index_freshness_check, @register_dep_graph_strategy) is NOT license to make these unions pluggable. Adding a fourth ScannerOutcome variant or a fifth TraceFailureReason variant is an ADR amendment to 02-ADR-0006 (or a follow-up ADR), not a registry-by-addition. The assert_never arms on every consumer's match are the structural enforcement; silent Union widening is impossible without breaking every consumer at mypy time.
  • Producer/consumer assert_never ladder discipline (mirrors IndexFreshness → confidence_section.py). This module is the producer; consumers are S5-02 (RuntimeTraceProbe), S5-04 (SyftProbe/GrypeProbe), S6-06 (SemgrepProbe), S6-07 (GitleaksProbe), S6-08 (coverage mapping + freshness registry). The module docstring MUST name them. Every consumer's match ladder must assert_never on the otherwise branch; mypy --warn-unreachable (repo-wide since Phase 0 S1-02) enforces the discipline once consumers land.
  • Documentation-debt acknowledgement. phase-arch-design.md §"Data model" line 731 still pins [internal] codegenie/probes/layer_g/scanner_outcome.py — the location was revised during High-level-impl.md authoring (§172) when it became clear that Layer C's SyftProbe/GrypeProbe (S5-04) ALSO consume ScannerOutcome. _shared/ is the correct location; the arch document is stale on this point. Do NOT edit the arch (Rule 3 — surgical changes); the High-level-impl.md location is authoritative. If a reviewer asks, point to this Validation note and S1-04 / S6-06 / S6-07 / S6-08 consumers from layer_g/ + S5-04 from layer_c/.
  • Smart constructor at the cap boundary (DF-2): ScannerFailed.stderr_tail is capped at construction by a field_validator — Pydantic's smart-constructor idiom. The cap value MUST live as STDERR_TAIL_CAP_BYTES: Final[int] = 4096 at module level; tests, the validator, AND any future consumer that needs to compare bounds all import it. Do NOT inline the literal 4096 anywhere else in the module.
  • scenario_name newtype deferral. scenario_name: str crosses ≥ 2 module boundaries (this module → S5-02 → S5-05 → S8-01); the closed set of 5 default scenarios (startup, smoke_test, healthcheck, shutdown, error_path) per High-level-impl.md §165 suggests a Literal[...] or NewType("ScenarioName", str). S1-05 is the canonical newtype story (identifiers-newtypes); pre-empting that scope here is creep. Use raw str for now; S5-02 may extract to S1-05's newtype kernel once cross-module usage is concrete.
  • The choice to put scanner_outcome.py under _shared/ (not layer_c/ and not layer_g/) is load-bearing — Layer C's SyftProbe/GrypeProbe and Layer G's curated scanners (SemgrepProbe, GitleaksProbe, etc.) must import the same ScannerOutcome type. Duplicating it under layer_c/scanner_outcome.py and layer_g/scanner_outcome.py would re-introduce the structural drift Phase 2 is rejecting. If you find yourself wanting two locations, surface it in "Notes" and stop — that's an ADR-amend trigger.
  • Finding is intentionally minimal here. Resist the urge to model semgrep / gitleaks / grype finding shapes now — S5-04 / S6-06 / S6-07 each emit their own metadata payload and the union's job in this story is only to round-trip. If you find yourself adding scanner-specific fields, you've slipped into S6-06 / S6-07 / S6-08.
  • The reason field on TraceScenarioFailed is itself a discriminated union — not a string. This is deliberate per phase-arch-design.md §"Edge cases" rows 5/6 + final-design.md §"Components" #6: macOS's permanent path emits StraceUnavailable() and the consumer (S5-05's freshness check + S8-01's renderer) must match on it as a typed value. A stringly-typed reason: str would silently lose the mypy --warn-unreachable enforcement and was the exact "anti-pattern" called out in §"Anti-patterns avoided".
  • ScannerSkipped.reason is the one place a Literal makes more sense than a sum type — three closed alternatives, no payload differs. If a fourth reason needs structured payload (e.g., "upstream_unavailable" needs the upstream slice name), promote to a discriminated union in a follow-up ADR rather than adding a metadata: dict escape hatch.
  • macOS dtruss is not used; we emit StraceUnavailable() for any non-Linux host. The macOS path is permanent — final-design.md §"Where security/best-practices traded off perf" makes this explicit.
  • Open-question echo: the S1-11 per-module override list should include both new modules. If it doesn't (because S1-11 landed before this story was scoped), this story's PR extends pyproject.toml minimally; document the diff in PR description so the S1-11 ADR's "Consequences" can be reviewed.
  • The deliberate-case-deletion exhaustiveness smoke test is documented in "Acceptance criteria" but not committed in deleted state. Treat it as a developer-runnable check: remove one case, run mypy --warn-unreachable, confirm the error, restore the case. This is part of S8-01's renderer Implementation risk #4 verification — landing the discipline here de-risks Step 8.
  • If pytest-subprocess or any other test dep is needed, it lands as [project.optional-dependencies] dev = […] and is verified by Phase 0 fence (it's not an LLM dep).
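The note above on ScannerSkipped.reason being a closed Literal, together with the frozen and extra="forbid" discipline, can be checked with a small sketch like this one (Pydantic v2 assumed). The `raises_validation_error` helper is an illustrative name, not part of the story.

```python
from typing import Callable, Literal

from pydantic import BaseModel, ConfigDict, ValidationError


class ScannerSkipped(BaseModel):
    model_config = ConfigDict(frozen=True, extra="forbid")

    kind: Literal["skipped"] = "skipped"
    reason: Literal["tool_missing", "tool_unhealthy", "upstream_unavailable"]


def raises_validation_error(action: Callable[[], object]) -> bool:
    """Run action and report whether it raised pydantic.ValidationError."""
    try:
        action()
    except ValidationError:
        return True
    return False


inst = ScannerSkipped(reason="tool_missing")

# Literal closure: casing matters, so "TOOL_MISSING" is outside the set.
literal_closed = raises_validation_error(
    lambda: ScannerSkipped(reason="TOOL_MISSING")
)
# Frozen: assigning to a field of an existing instance is rejected.
frozen_holds = raises_validation_error(lambda: setattr(inst, "kind", "other"))
# extra="forbid": an unexpected field is rejected at construction.
extra_forbidden = raises_validation_error(
    lambda: ScannerSkipped(reason="tool_missing", extra_field=1)
)
```

All three checks fail with the same exception type, which is why a single parametrized test over (variant, mutation) pairs is enough to cover them.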