S6-04 attempt log
Attempt 1: 2026-05-18 (phase-story-executor)
Result: GREEN. All 15 unit tests pass (tests/unit/probes/layer_d/test_external_docs.py).
Full suite: 2857 passed (+15 net), 30 skipped, 2 xfailed in 59s.
Lint (ruff check), format (ruff format --check), mypy --strict, and
lint-imports are all clean. The make fence test file itself passes 9/9.
The make fence target does fail the coverage gate when invoked on the
single fence file in isolation, but that is the documented CLAUDE.md
narrow-subset coverage-gate caveat; the assertions themselves are
green.
Per-AC evidence
| AC | Evidence |
|---|---|
| AC-1 (`__all__`, slice rename) | external_docs.py:57 has `__all__ = ["ExternalDocsProbe", "NotOptedInExternalDocsSlice"]` (alphabetical). |
| AC-2 (Open Q 4 phrase in docstring) | test_module_docstring_contains_open_q_4_phrase (whitespace-normalised match — see "Test-fix #2"). |
| AC-3 (slice fields, frozen, extra="forbid") | test_slice_rejects_opted_in_true_at_pydantic_level, test_slice_rejects_extra_fields. |
| AC-4 (full frozen ABC field set + _PROBE_ID) | test_probe_id_constant_exists; module-level _PROBE_ID: Final[ProbeId] = ProbeId("external_docs") at external_docs.py:59. |
| AC-5 (async def run, six-field ProbeOutput, no I/O beyond raw artifact) | test_run_returns_high_confidence_not_opted_in_by_default, test_run_performs_no_repo_or_ctx_config_reads. |
| AC-NEW-1 (single raw artifact, atomic write) | test_run_returns_high_confidence_not_opted_in_by_default asserts raw_artifacts == [ctx.output_dir / "external_docs.json"]; external_docs.py:106-108 uses .tmp → os.replace. |
| AC-NEW-2 (_PROBE_ID Final constant) | test_probe_id_constant_exists. |
| AC-6 (no HTTP/socket imports) | test_no_forbidden_http_or_socket_imports. |
| AC-7 (no speculative schema) | test_no_speculative_allowlist_schema (regex word-boundary match — see "Test-fix #1"). |
| AC-8 (sub-schema with strict additionalProperties) | test_slice_matches_subschema_with_strict_additional_properties against src/codegenie/schema/probes/external_docs.schema.json. |
| AC-9 (heaviness="light" registry-verified) | test_registry_heaviness_is_light. |
| AC-NEW-3 (registry membership) | test_registry_membership_present. |
| AC-10 (determinism — byte-identical runs) | test_two_consecutive_runs_byte_identical. |
| AC-11 (mypy --strict) | .venv/bin/mypy --strict src/codegenie/probes/layer_d/external_docs.py → "Success: no issues found in 1 source file". |
| AC-12 (fence re-check) | Fence test file passes (9/9); no new forbidden imports added. |
| AC-13 (grep-able deferral) | grep -rn "Open Q 4" src/codegenie/ finds external_docs.py. |
| AC-14 (two-place documentation) | test_manifest_readme_documents_deferral; manifest already had ExternalDocsProbe + opt-in at stories/README.md:90,187,194,276 from S6-04's harden pass. |
| AC-NEW-4 (discriminator="opted_in" in source) | test_module_source_names_opted_in_discriminator. The string lives in the slice's docstring. |
| AC-NEW-5 (no subclass-based extension) | test_no_subclass_extension_path (AST walk; asserts no class X(ExternalDocsProbe)). |
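
The `.tmp` → `os.replace` write cited for AC-NEW-1, together with the byte-identical requirement of AC-10, can be illustrated with a minimal sketch. The helper name `write_json_atomically`, the payload keys, and the choice of sorted-key JSON are assumptions made for illustration; they are not taken from external_docs.py, whose actual write path may differ.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomically(target: Path, payload: dict) -> Path:
    """Write payload to target via a sibling .tmp file and os.replace.

    Hypothetical helper, not the probe's real implementation.
    """
    # sort_keys plus fixed separators make the bytes independent of
    # dict insertion order, which is what a byte-identical check needs.
    data = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    tmp = target.with_name(target.name + ".tmp")
    tmp.write_bytes(data)
    # os.replace renames atomically when both paths are on the same filesystem,
    # so readers see either the old file or the complete new one.
    os.replace(tmp, target)
    return target


def demo_two_runs_byte_identical() -> bool:
    """Run the write twice with differently ordered keys and compare bytes."""
    with tempfile.TemporaryDirectory() as d:
        out = Path(d) / "external_docs.json"
        write_json_atomically(out, {"opted_in": False, "probe": "external_docs"})
        first = out.read_bytes()
        write_json_atomically(out, {"probe": "external_docs", "opted_in": False})
        second = out.read_bytes()
        leftover_tmp = out.with_name(out.name + ".tmp").exists()
        return first == second and not leftover_tmp
```

Calling `demo_two_runs_byte_identical()` exercises the same property that test_two_consecutive_runs_byte_identical is described as checking, under the assumptions above.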
Story → reality deltas (logged for the validator + future-stories)
- Test-fix #1: AC-7 substring vs. word-boundary contradiction. The story's TDD plan asserts `"OptedInExternalDocsSlice" not in src`, but AC-1 mandates the slice be named `NotOptedInExternalDocsSlice`, whose source identifier contains the forbidden substring as a tail. The substring test makes ACs 1 and 7 jointly unsatisfiable. Resolution: rewrote the assertion as `re.search(rf"\b{re.escape(token)}\b", src)`, a word-boundary match. This preserves AC-7's intent (forbid a standalone identifier `OptedInExternalDocsSlice` from being defined or referenced) without colliding with the AC-1 class name. The slice docstring was also re-phrased to drop a bare `OptedInExternalDocsSlice` reference that was a speculative pre-commitment to the eventual sibling's class name (re-spelled as "opted-in sibling" / "future tagged union").
- Test-fix #2: AC-2 phrase 123-char single line vs. ruff E501. The story's docstring places the AC-2 phrase on a single 123-char line; ruff `E501` (line-length = 100) refuses it. Resolution: wrapped the docstring sentence across two lines and made `test_module_docstring_contains_open_q_4_phrase` collapse whitespace (`re.sub(r"\s+", " ", ed.__doc__)`) before checking for the normalised target. AC-13's `grep -rn "Open Q 4" src/codegenie/` discoverability semantics are unchanged ("Open Q 4" still appears verbatim in the file); AC-2's load-bearing assertion ("the phrase appears as a single semantic unit") still fires.
- Conftest extension: `_PROJECT_ROOT` constant. The story's TDD plan called for a `_project_root` fixture; the existing Layer-D conftest exposes `_make_repo` / `_make_context` as plain functions (imported, not pytest fixtures). Per Rule 11 (match convention), added a module-level `_PROJECT_ROOT: Path = Path(codegenie.__file__).resolve().parents[2]` constant and imported it directly. This has the same anti-brittleness goal as the fixture and conforms to the established conftest shape.
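
The two test-fixes above can be sketched in a few lines. The helper names `mentions_standalone` and `contains_phrase`, the sample sources, and the sample docstring phrase are hypothetical stand-ins chosen for illustration; the real test module and the real AC-2 phrase are not reproduced here.

```python
import re

FORBIDDEN = "OptedInExternalDocsSlice"


def mentions_standalone(token: str, src: str) -> bool:
    """True only if token appears as a whole identifier, not as a tail of a longer one."""
    return re.search(rf"\b{re.escape(token)}\b", src) is not None


def contains_phrase(doc: str, phrase: str) -> bool:
    """Collapse every whitespace run to one space, then look for the phrase."""
    return phrase in re.sub(r"\s+", " ", doc)


# Test-fix #1: the legitimate class name contains the forbidden token as a tail.
legit_src = "class NotOptedInExternalDocsSlice:\n    pass\n"
speculative_src = "class OptedInExternalDocsSlice:\n    pass\n"

substring_hit = FORBIDDEN in legit_src                         # naive check fires
boundary_hit = mentions_standalone(FORBIDDEN, legit_src)       # word-boundary check does not
boundary_hit_bad = mentions_standalone(FORBIDDEN, speculative_src)

# Test-fix #2: a phrase wrapped across two docstring lines (hypothetical wording).
wrapped_doc = "Allowlist design is deferred pending\n    Open Q 4 resolution."
found_wrapped = contains_phrase(wrapped_doc, "pending Open Q 4 resolution")
```

Here the naive `in` check flags the legitimate class name while the word-boundary search flags only the standalone identifier, and the wrapped phrase is still found after normalisation. The literal "Open Q 4" stays on one physical line in this sample, so a plain grep for it would still succeed.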
Files touched
| Path | Op | Notes |
|---|---|---|
| src/codegenie/probes/layer_d/external_docs.py | create (115 LOC) | Probe + slice. |
| src/codegenie/schema/probes/external_docs.schema.json | create | Flat schema layout per AC-8. |
| src/codegenie/probes/__init__.py | edit (+2 lines) | Explicit-import registration + `__all__` alphabetised. |
| tests/unit/probes/layer_d/test_external_docs.py | create (15 tests) | All ACs covered. |
| tests/unit/probes/layer_d/conftest.py | edit (+8 lines) | _PROJECT_ROOT constant for AC-14. |
Lessons for future Phase 2 stories
- Substring vs. word-boundary forbidden-token tests. When a test asserts an identifier is absent from source AND the codebase legitimately contains a longer identifier with that name as a tail, the test must use regex word boundaries (`\b`). A bare `in src` check fires on the legitimate identifier too. The S6-04 case (`OptedInExternalDocsSlice` is a substring of `NotOptedInExternalDocsSlice`) is the textbook trigger; future "forbid this identifier" arch tests should default to `\b<token>\b`.
- `E501` vs. exact-phrase docstring ACs. When an AC dictates an exact prose phrase that exceeds the line-length limit, the right move is to wrap the prose AND have the test whitespace-normalise before comparing. Don't `# noqa: E501` the docstring or split with a zero-width hack; the test should match the semantic assertion (one phrase, grep-findable) rather than the physical byte sequence.
- Null Object stub keeps the registry seam honest. The probe ships 20 lines of "do-nothing" `run()` but earns the registry slot, so the coordinator/renderer/Planner never need an `if probe == "external_docs"` special case. The cost of the null-object stub (≈ 115 LOC including the docstring + slice) is much lower than the cost of contaminating every downstream consumer with a skip-this-probe branch. Repeat this pattern when a future Phase 2 probe family wants to ship the kernel before the implementation.
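
The Null Object lesson can be made concrete with a small sketch. Everything here is a simplified stand-in: the `Output` dataclass, the `Probe` protocol, the `REGISTRY` list, `SomeRealProbe`, and `run_all` are hypothetical names, and the real ProbeOutput has six fields rather than the three shown. The point is only that the caller's loop has no branch for the stub.

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Output:
    """Simplified stand-in for the real six-field ProbeOutput."""
    probe_id: str
    confidence: str
    payload: dict


class Probe(Protocol):
    probe_id: str

    async def run(self) -> Output: ...


class NotOptedInStub:
    """Null Object: does no work but honours the full probe contract."""
    probe_id = "external_docs"

    async def run(self) -> Output:
        return Output(self.probe_id, "high", {"opted_in": False})


class SomeRealProbe:
    """Hypothetical probe that does real work, for contrast."""
    probe_id = "some_real_probe"

    async def run(self) -> Output:
        return Output(self.probe_id, "high", {"files_seen": 3})


# Hypothetical registry: the stub occupies its slot like any other probe.
REGISTRY: list[Probe] = [NotOptedInStub(), SomeRealProbe()]


async def run_all(probes: list[Probe]) -> list[Output]:
    # No `if probe.probe_id == "external_docs"` branch is needed here.
    return [await p.run() for p in probes]


outputs = asyncio.run(run_all(REGISTRY))
```

When the opted-in implementation eventually lands, only the stub class would change in a design like this; the loop in `run_all` and anything consuming `outputs` stay as they are.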