ADR-0018: Hierarchical Planner — pure routing vs LLM-driven

Status: Deferred
Date: 2026-05-11
Tags: orchestration · cost
Related: ADR-0001, ADR-0002

Context

The Hierarchical Planner / Supervisor (Layer 1) reads intent and dispatches to the appropriate subgraph. The intent space in Phase 1 is small and well-structured: "Migrate to distroless," "Fix CVE-X," "Upgrade Y version," with structured metadata attached. Routing based on intent type is mostly a lookup.

Two implementation choices:

  • Pure routing: a deterministic classifier (intent.task_type → subgraph) plus structured rules. Cheap and predictable.
  • LLM-driven supervisor: an LLM reads the request (potentially natural language) and decides which subgraph to dispatch to, optionally with task-decomposition logic.

The SHERPA paper allows ML-driven decisions in state-machine routing; it does not require them.

Options considered

  • Pure routing. intent.task_type is a structured field on the input event. A switch statement maps it to a subgraph. No LLM is involved.
  • LLM-driven supervisor. An LLM reads the request and emits structured routing output. Higher cost, but it can handle natural-language requests and can also decompose composite tasks ("Migrate every Node service AND upgrade them to Node 20").
  • Hybrid. Default to pure routing; fall back to the LLM only when intent.task_type is unrecognized or the request is natural language.
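
The pure-routing and hybrid options can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: the `Intent` dataclass, the task-type keys, the subgraph names, and the `llm_fallback` callable are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical task types and subgraph names, for illustration only.
ROUTES: dict[str, str] = {
    "migrate_distroless": "distroless_migration_subgraph",
    "fix_cve": "cve_remediation_subgraph",
    "upgrade_version": "version_upgrade_subgraph",
}


@dataclass(frozen=True)
class Intent:
    task_type: Optional[str]  # structured field; None for prose-only requests
    text: str = ""            # free-form request text, if any


class UnroutableIntent(Exception):
    """Raised when no route exists and no fallback is configured."""


def route(
    intent: Intent,
    llm_fallback: Optional[Callable[[Intent], str]] = None,
) -> str:
    """Pure routing first; consult the fallback only for unrecognized intents."""
    if intent.task_type in ROUTES:
        # Pure routing: a plain dictionary lookup, no LLM call.
        return ROUTES[intent.task_type]
    if llm_fallback is not None:
        # Hybrid option: hand unrecognized or prose requests to the LLM.
        return llm_fallback(intent)
    raise UnroutableIntent(f"no route for task_type={intent.task_type!r}")
```

Calling `route(intent)` with no fallback gives the pure-routing behaviour; passing a callable as `llm_fallback` gives the hybrid behaviour without changing the lookup path for recognized task types.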

Default until decided

Pure routing for Phase 1. The intent space is bounded (3–5 task types), the input comes from structured triggers (CVE feed, Stage 0 Discovery, scheduled jobs), and the supervisor's job is dispatch, not creativity.

LLM-driven supervisor is an upgrade path, not a Phase 1 requirement.

Evidence needed to resolve

  • Intent-distribution data. How many distinct intent types arise in production? If <10, pure routing trivially handles them.
  • Natural-language request volume. Are humans submitting prose requests, or do all triggers arrive structured? If prose volume exceeds ~5% of total, an LLM-driven supervisor becomes attractive.
  • Composite-task frequency. How often does one request decompose into multiple subworkflows? Pure routing handles this via Temporal child workflows; an LLM-driven supervisor can decompose the request itself.
  • Cost data. What does each LLM-driven supervisor invocation cost at the planned token budget?
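
The first three evidence items reduce to simple counts over a request log. The sketch below shows one way to compute them, under stated assumptions: the `RequestRecord` fields are hypothetical, the two thresholds are taken from the bullets above, and combining them into a single `favors_pure_routing` flag is an illustrative simplification rather than a rule this ADR defines. Composite share is reported without a threshold because the ADR does not give one.

```python
from dataclasses import dataclass
from typing import Iterable

MAX_TYPES_FOR_PURE_ROUTING = 10  # from the ADR: "<10" distinct intent types
PROSE_SHARE_THRESHOLD = 0.05     # from the ADR: "~5% of total"


@dataclass(frozen=True)
class RequestRecord:
    """One logged request; field names are hypothetical."""
    task_type: str               # structured task type, "" for prose requests
    is_prose: bool               # True if a human submitted free text
    subworkflow_count: int = 1   # >1 means the request was composite


@dataclass(frozen=True)
class EvidenceSummary:
    distinct_task_types: int
    prose_share: float
    composite_share: float

    @property
    def favors_pure_routing(self) -> bool:
        # Assumption: both ADR thresholds must hold at the same time.
        return (
            self.distinct_task_types < MAX_TYPES_FOR_PURE_ROUTING
            and self.prose_share <= PROSE_SHARE_THRESHOLD
        )


def summarize(records: Iterable[RequestRecord]) -> EvidenceSummary:
    """Compute the intent-distribution, prose, and composite statistics."""
    rows = list(records)
    if not rows:
        raise ValueError("no records to summarize")
    total = len(rows)
    return EvidenceSummary(
        # Count distinct types among structured requests only.
        distinct_task_types=len({r.task_type for r in rows if not r.is_prose}),
        prose_share=sum(r.is_prose for r in rows) / total,
        composite_share=sum(r.subworkflow_count > 1 for r in rows) / total,
    )
```

Cost data is left out deliberately: it depends on model pricing and the planned token budget, neither of which is specified here.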

Reversibility (of the eventual choice)

Low cost. The supervisor is a single LangGraph node — its implementation can be swapped between pure-routing and LLM-driven without touching subgraphs. The Pydantic state ledger carries the routing decision either way.
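
The swap described above can be illustrated with a small sketch in which both supervisor variants share one signature: state in, updated state out. Everything here is an assumption made for illustration: `LedgerState` is a plain dataclass standing in for the Pydantic state ledger, the field and subgraph names are hypothetical, and `choose` is a stub where a real LLM call would go. No LangGraph API is used.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional


@dataclass(frozen=True)
class LedgerState:
    """Stand-in for the Pydantic state ledger; field names are hypothetical."""
    task_type: str
    request_text: str = ""
    routed_to: Optional[str] = None  # the routing decision
    routed_by: Optional[str] = None  # which supervisor variant made it


# Both supervisor variants share this signature, so either can fill the node.
Supervisor = Callable[[LedgerState], LedgerState]

# Hypothetical routing table.
_ROUTES = {
    "fix_cve": "cve_remediation_subgraph",
    "upgrade_version": "version_upgrade_subgraph",
}


def pure_routing_supervisor(state: LedgerState) -> LedgerState:
    """Deterministic lookup on the structured task type."""
    return replace(
        state,
        routed_to=_ROUTES.get(state.task_type),
        routed_by="pure_routing",
    )


def make_llm_supervisor(choose: Callable[[str], str]) -> Supervisor:
    """Wrap a text-to-subgraph chooser (an LLM call in practice, a stub here)."""
    def supervisor(state: LedgerState) -> LedgerState:
        return replace(
            state,
            routed_to=choose(state.request_text),
            routed_by="llm",
        )
    return supervisor


def run_layer1(state: LedgerState, supervisor: Supervisor) -> LedgerState:
    """Layer 1 entry point; downstream code only reads routed_to."""
    routed = supervisor(state)
    if routed.routed_to is None:
        raise ValueError(f"supervisor produced no route for {state.task_type!r}")
    return routed
```

Because subgraphs would read only `routed_to` from the ledger, replacing `pure_routing_supervisor` with the result of `make_llm_supervisor(...)` changes a single argument to `run_layer1` and nothing downstream.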

Evidence / sources

  • ../design.md §4.1 (Layer 1 — Hierarchical Planner)
  • ../design.md §7 (Open questions — Hierarchical Planner implementation)
  • arXiv 2509.00272 — SHERPA paper allows ML-driven decisions in routing