ADR-0015: Trust-score threshold calibration
Status: Deferred
Date: 2026-05-11
Tags: trust · calibration
Related: ADR-0008
Context
ADR-0008 commits to computing the trust score from objective signals only — sandbox build, test pass/fail, SAST findings, CVE delta, runtime-trace coverage, policy-engine block events, coverage of changed code. Each of those signals can be normalized to a score; the trust score is a weighted aggregate.
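As a minimal sketch of what "weighted aggregate" could mean here, the snippet below computes a weighted average of signals that have already been normalized to [0, 1]. The function name `trust_score`, the signal keys, and the weighted-average form are assumptions for illustration; this ADR defers the actual formula and weights.

```python
from typing import Mapping


def trust_score(signals: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted average of normalized signals (hypothetical form, not the decided formula).

    Each signal is expected in [0, 1], where 1.0 means "fully passing".
    Weights are non-negative and need not sum to 1; they are renormalized here.
    """
    if set(signals) != set(weights):
        raise ValueError("signals and weights must cover the same signal names")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    for name, value in signals.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"signal {name!r} is not normalized to [0, 1]: {value}")
    return sum(weights[name] * signals[name] for name in signals) / total_weight


# Placeholder equal weights; real weights are exactly what this ADR defers.
example = trust_score(
    {"build": 1.0, "tests": 0.5},
    {"build": 1.0, "tests": 1.0},
)
```

Renormalizing by the weight sum keeps the result in [0, 1] regardless of how the weights are scaled, which makes a single threshold comparable across weight revisions.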
That leaves two open questions: what are the weights, and what is the threshold below which a transition is blocked or routed to human review? Setting these a priori, without production data, is guesswork. Calibrating once production data lands is empirically defensible.
The user's initial proposal was to reject any transition with T_conf ≤ 0.90. Whether that number is right depends on the distribution of objective-signal values across successful and failed migrations, which we do not yet have.
Options considered
- Set thresholds a priori. Pick T ≤ 0.90 and ship. Risk: miscalibrated, either too permissive (catastrophic merges) or too strict (every transition escalates).
- Binary pass/fail until calibration. Until production data exists, gates use binary "did all critical signals pass?" with no scoring threshold. Conservative.
- Defer formula and threshold; calibrate against first N production migrations. Start binary; aggregate signal distributions; fit a threshold against post-merge incident data once N is large enough.
Default until decided
Binary pass/fail on the most direct objective signal at each gate. Build must pass; tests must pass; SAST must not find new high-severity issues; CVE delta must be non-positive. If any of these checks fails, the gate fails (route back / escalate per ADR-0014). No scoring threshold yet.
The trust-score formula — what signals contribute how much — is also deferred. Until calibration, gates evaluate signals independently with AND semantics.
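The interim default can be sketched as a plain conjunction over the four checks named above. The `GateSignals` type, its field names, and the helper functions are hypothetical; only the four conditions and the AND semantics come from this ADR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateSignals:
    """Hypothetical container for the signals the interim gate inspects."""
    build_passed: bool
    tests_passed: bool
    new_high_severity_sast: int  # count of new high-severity SAST findings
    cve_delta: int               # CVE count after the change minus before


def failed_checks(s: GateSignals) -> list[str]:
    """Return the names of the checks that failed (empty list means all passed)."""
    checks = {
        "build": s.build_passed,
        "tests": s.tests_passed,
        "sast": s.new_high_severity_sast == 0,
        "cve_delta": s.cve_delta <= 0,  # non-positive: no net new CVEs
    }
    return [name for name, ok in checks.items() if not ok]


def gate_passes(s: GateSignals) -> bool:
    """AND semantics: the gate passes only if every check passes."""
    return not failed_checks(s)
```

Returning the list of failed checks, rather than only a boolean, gives the route-back or escalation step something concrete to report.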
Evidence needed to resolve
- N = 50 production migrations with full objective-signal traces and post-merge outcome data (merged-clean, merged-then-rolled-back, abandoned).
- Signal distribution analysis. Which signals are noisy in successful migrations? Which are reliable predictors of post-merge incidents?
- Threshold calibration. Receiver operating characteristic (ROC) curve against post-merge incidents to pick a threshold that balances false-positive (overflagged) and false-negative (missed regressions) rates.
- Per-task-class refinement. Vulnerability patches likely tolerate a higher gate threshold than convenience migrations; calibrate each task class separately.
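The threshold-calibration step above might look roughly like the following sketch, which sweeps candidate thresholds, computes the ROC points, and picks one. Two things here are assumptions rather than decisions of this ADR: a migration is treated as flagged when its score is at or below the threshold, and the selection rule is Youden's J (TPR minus FPR), which is one common way to balance the two error rates. The sample scores are made-up toy values, not measurements.

```python
from typing import Sequence


def roc_points(scores: Sequence[float], incidents: Sequence[bool]) -> list[tuple[float, float, float]]:
    """Return (threshold, fpr, tpr) for each distinct score used as a threshold.

    A migration is flagged when score <= threshold.
    tpr: fraction of incident migrations that were flagged.
    fpr: fraction of clean migrations that were flagged (overflagged).
    """
    positives = sum(1 for y in incidents if y)
    negatives = len(incidents) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one incident and one clean migration")
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, incidents) if y and s <= t)
        fp = sum(1 for s, y in zip(scores, incidents) if not y and s <= t)
        points.append((t, fp / negatives, tp / positives))
    return points


def pick_threshold(scores: Sequence[float], incidents: Sequence[bool]) -> float:
    """Pick the threshold maximizing Youden's J = TPR - FPR (an assumed criterion)."""
    best = max(roc_points(scores, incidents), key=lambda p: p[2] - p[1])
    return best[0]


# Toy data for illustration only: trust scores and whether a post-merge incident occurred.
toy_scores = [0.95, 0.92, 0.88, 0.97, 0.70, 0.85, 0.60, 0.91]
toy_incidents = [False, False, False, False, True, True, True, False]
toy_threshold = pick_threshold(toy_scores, toy_incidents)
```

If missed regressions are costlier than overflagging, a cost-weighted criterion would be a natural substitute for Youden's J. Per-task-class calibration would run the same procedure on each class's subset of migrations.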
Reversibility (of the eventual calibration)
Low cost. Once a threshold is set, adjusting it is a config change. The gate logic is signal-agnostic; the threshold is a knob.
Evidence / sources
- ../design.md §4.6 (push-back, objective signals)
- ../design.md §5 (Trust score and gates subsection)
- ../design.md §7 (Open questions: Trust-score threshold calibration)
- ADR-0008 (the underlying commitment to objective signals)