ADR-0018: Hierarchical Planner — pure routing vs LLM-driven¶

Status: Deferred Date: 2026-05-11 Tags: orchestration · cost Related: ADR-0001, ADR-0002

Context¶

The Hierarchical Planner / Supervisor (Layer 1) reads intent and dispatches to the appropriate subgraph. The intent space in Phase 1 is small and well-structured: "Migrate to distroless," "Fix CVE-X," "Upgrade Y version," with structured metadata attached. Routing based on intent type is mostly a lookup.

Two implementation choices: - Pure routing — a deterministic classifier (intent.task_type → subgraph) plus structured rules. Cheap, predictable, deterministic. - LLM-driven supervisor — an LLM reads the request (potentially natural language) and decides which subgraph, optionally with task decomposition logic.

The SHERPA paper allows ML-driven decisions in state-machine routing; it does not require them.

Options considered¶

Pure routing. Intent.task_type is a structured field on the input event. Switch statement → subgraph. No LLM.
LLM-driven supervisor. LLM reads the request and emits structured routing output. Higher cost; can handle natural-language requests; can also decompose composite tasks ("Migrate every Node service AND upgrade them to Node 20").
Hybrid. Default to pure routing; fall back to LLM only when the intent.task_type is unrecognized or the request is natural-language.

Default until decided¶

Pure routing for Phase 1. The intent space is bounded (3–5 task types), the input comes from structured triggers (CVE feed, Stage 0 Discovery, scheduled jobs), and the supervisor's job is dispatch, not creativity.

LLM-driven supervisor is an upgrade path, not a Phase 1 requirement.

Evidence needed to resolve¶

Intent-distribution data. How many distinct intent types arise in production? If <10, pure routing trivially handles them.
Natural-language request volume. Are humans submitting prose requests, or do all triggers come structured? If prose volume > ~5% of total, LLM-supervisor becomes attractive.
Composite-task frequency. How often does one request decompose into multiple subworkflows? Pure routing handles this via Temporal child workflows; LLM-supervisor handles it more elegantly.
Cost data. What does each LLM-supervisor invocation cost at the planned token budget?

Reversibility (of the eventual choice)¶

Low cost. The supervisor is a single LangGraph node — its implementation can be swapped between pure-routing and LLM-driven without touching subgraphs. The Pydantic state ledger carries the routing decision either way.

Evidence / sources¶

../design.md §4.1 (Layer 1 — Hierarchical Planner)
../design.md §7 (Open questions — Hierarchical Planner implementation)
arXiv 2509.00272 — SHERPA paper allows ML-driven decisions in routing