ADR-0012: microVM sandbox isolation for Trust-Aware gates
Status: Accepted Date: 2026-05-11 Tags: trust · sandbox · safety Related: ADR-0008, ADR-0019
Context
Trust-Aware gates evaluate agent output by running it: building the candidate image, executing the test suite, running SAST/DAST, capturing a runtime trace. This requires executing untrusted code (the agent's output is, by definition, code the system has not yet verified) on infrastructure that runs alongside production-adjacent systems.
The isolation boundary between "agent output" and "everything else" is therefore safety-critical. A weak sandbox lets an agent's mistake (or worse — a prompt-injection-induced malicious change) reach the Temporal cluster, the MCP servers, the credentials, or the host kernel.
Options considered
- Bare-metal / shared kernel containers. Run `docker build` on the same host as workers. Fastest, lowest isolation. Container escapes have been demonstrated against real kernels; this is unsafe for production.
- Standard Docker containers with seccomp/AppArmor. Better than bare metal; still shared kernel. Acceptable for trusted workloads, not for agent-generated code.
- microVM (Firecracker / gVisor / nested QEMU). Each gate evaluation runs against its own kernel: a guest kernel under Firecracker or QEMU, a user-space application kernel under gVisor. Hardware-isolated boundary. Higher cold-start cost; significantly stronger isolation.
- Remote sandbox service (e.g., third-party). Outsources the safety boundary. Cost and data-residency tradeoffs; introduces vendor dependency.
Decision
Every Trust-Aware gate evaluation runs inside a microVM. Each gate invocation gets its own ephemeral microVM with no persistent storage and no network access except to the artifact registry (for base-image pulls) and the gate result reporting endpoint.
The specific microVM stack (Firecracker vs gVisor vs nested QEMU) is deferred to ADR-0019 — that decision depends on cold-start latency tolerance and kernel-feature requirements for strace/eBPF inside the sandbox.
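The per-gate lifecycle described above (fresh sandbox, explicit copy-in, evaluation, explicit copy-out, unconditional teardown) can be sketched as follows. This is a minimal illustration, not the project's implementation: `ephemeral_sandbox` and `run_gate` are hypothetical names, and a temporary directory stands in for the microVM so that only the "nothing persists between gates" rule is modelled.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def ephemeral_sandbox() -> Iterator[Path]:
    """Yield a fresh scratch area and destroy it afterwards, even on failure.

    Hypothetical stand-in: a real implementation would boot and tear down a
    microVM here; a temporary directory only models the "no persistent
    storage" rule.
    """
    root = Path(tempfile.mkdtemp(prefix="gate-sandbox-"))
    try:
        yield root
    finally:
        shutil.rmtree(root, ignore_errors=True)


def run_gate(candidate_source: str) -> dict:
    """Copy the candidate in, evaluate it, and copy a plain result out."""
    with ephemeral_sandbox() as root:
        # Explicit copy-in: the sandbox never sees a host mount.
        candidate = root / "candidate.txt"
        candidate.write_text(candidate_source)
        # Placeholder evaluation: a real gate would build and test here.
        line_count = len(candidate.read_text().splitlines())
        # Explicit copy-out: only plain data leaves the sandbox.
        return {"sandbox": str(root), "lines": line_count}


first = run_gate("print('a')\nprint('b')\n")
second = run_gate("print('c')\n")
```

Because teardown sits in a `finally` clause, the scratch area is removed whether the evaluation succeeds or raises, so no state from one gate can leak into the next.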
Tradeoffs
| Gain | Cost |
|---|---|
| Hardware-grade isolation between agent output and the orchestrator | Cold-start latency higher than shared-kernel containers (depends on stack: on the order of 100 ms for Firecracker, seconds for nested QEMU) |
| A malicious or buggy agent output cannot escape into the host environment | Operational complexity — microVM clusters are not "just run docker" |
| Reproducible — sandbox state is ephemeral, every gate starts clean | Resource overhead: full kernel per gate, not shared |
| Container-escape CVEs in shared-kernel runtimes do not threaten the orchestrator | Cannot mount host directories trivially — must use explicit copy-in/copy-out |
| Compliance posture is dramatically simpler — "agent code runs in its own VM" is a clean story | Sandbox stack itself becomes attack surface; must be patched aggressively |
Consequences
- The `sandbox/` package in the codebase wraps a microVM client. The Trust-Aware gate logic calls into it via a stable RPC contract; the contract is the same regardless of which stack ADR-0019 picks.
- microVM clusters are autoscaled independently from worker pods (`../design.md §8.4` physical view). Capacity planning treats them as a distinct concern.
- Cost model: each gate evaluation pays for the microVM lifecycle plus the actual build/test compute. At portfolio scale this is the dominant non-LLM cost.
- Network policy inside the sandbox: deny-all by default, with an allowlist for `cgr.dev`, `docker.io` (or the org's internal registry), and the gate-result endpoint. No connections to the Temporal cluster, MCP servers, or anything else.
- Pulling intermediate build artifacts back to the orchestrator goes through explicit copy-out steps, not shared volumes.
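The deny-all-by-default network policy can be illustrated with a small host check. This is a sketch under stated assumptions: the hostname `gate-results.internal.example` is a hypothetical placeholder for the gate-result endpoint, `egress_allowed` is an invented helper, and real enforcement would live in the sandbox's network layer rather than in application code.

```python
from urllib.parse import urlsplit

# Registries named in this ADR, plus a hypothetical gate-result host.
ALLOWED_HOSTS = frozenset(
    {"cgr.dev", "docker.io", "gate-results.internal.example"}
)


def egress_allowed(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Return True only when the URL's host is exactly on the allowlist.

    Anything unparseable or unlisted is denied, which is the deny-all default.
    """
    host = urlsplit(url).hostname  # None when the URL has no host part
    if host is None:
        return False
    return host.lower() in allowed_hosts


checks = {
    "registry": egress_allowed("https://cgr.dev/chainguard/python"),
    "temporal": egress_allowed("https://temporal.internal.example:7233"),
    "lookalike": egress_allowed("https://docker.io.attacker.example/x"),
}
```

Exact host matching is deliberate here: suffix or substring matching would let a lookalike domain through. It is also a simplification, since registry pulls commonly involve additional hostnames that a production allowlist would have to enumerate.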
Reversibility
Medium. Replacing the microVM stack (e.g., Firecracker → gVisor) is well-scoped — change the sandbox client implementation, keep the RPC contract. Reversing the decision to sandbox (going back to shared-kernel containers) is high cost both technically (lose isolation guarantees) and socially (any security review of the system would re-flag the decision).
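One way to picture why a stack swap stays well-scoped: the gate logic depends only on a contract, and each stack supplies its own client behind it. The sketch below is illustrative only. `GateRequest`, `GateResult`, `SandboxClient`, and both fake clients are hypothetical names, and an in-process Python protocol stands in for the RPC contract, whose real shape this ADR does not specify.

```python
from dataclasses import dataclass
from typing import Protocol, Tuple


@dataclass(frozen=True)
class GateRequest:
    image_ref: str             # candidate image to evaluate
    command: Tuple[str, ...]   # command to run inside the sandbox


@dataclass(frozen=True)
class GateResult:
    passed: bool
    backend: str               # which stack served the request


class SandboxClient(Protocol):
    """The stable contract: every stack must offer this one method."""

    def evaluate(self, request: GateRequest) -> GateResult: ...


class FakeFirecrackerClient:
    """Stand-in backend; a real one would drive Firecracker."""

    def evaluate(self, request: GateRequest) -> GateResult:
        return GateResult(passed=bool(request.command), backend="firecracker")


class FakeGVisorClient:
    """Stand-in backend; a real one would drive gVisor."""

    def evaluate(self, request: GateRequest) -> GateResult:
        return GateResult(passed=bool(request.command), backend="gvisor")


def run_trust_gate(client: SandboxClient, request: GateRequest) -> GateResult:
    # Gate logic sees only the contract, never the concrete stack.
    return client.evaluate(request)


request = GateRequest(image_ref="candidate:latest", command=("pytest", "-q"))
result_a = run_trust_gate(FakeFirecrackerClient(), request)
result_b = run_trust_gate(FakeGVisorClient(), request)
```

Swapping stacks changes which client is constructed; `run_trust_gate` and its callers are untouched.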
Evidence / sources
- `../design.md §4.1` (Layer 3: Trust-Aware Verification)
- `../design.md §5` (sandboxed reality checks subsection)
- `../design.md §8.4` (physical view: sandbox cluster as separate trust boundary)
- `../../gemini-auto-agent-design.md §"Multi-Repository Orchestration and Control Planes"`: Environment Agents in sandboxes
- OpenHands V1 architecture: Docker-based isolation for agent code execution
- Firecracker production usage at AWS Lambda: millisecond-scale cold start, hardware isolation