We introduce KSL (kill-switch latency) as a primary SLO for production agents. Across 9 deployments, we observe that KSL correlates more strongly than accuracy with operator trust and retention. We propose an eval harness and release baseline numbers.
What is KSL (Kill-Switch Latency)?
KSL measures the delay between:
1. A human (or system) issuing a stop command
2. The agent actually halting execution
In simple terms:
“How fast can I stop this thing when it starts going wrong?”
Why KSL is critical for long-horizon agents
Long-horizon agents (multi-step tasks, workflows, tool usage, etc.) have a higher risk surface:
- They call multiple tools
- They operate over time
- They can drift from intent
- They may produce compounding errors
In these systems, failure is not binary — it unfolds over time.
That means control becomes more important than correctness.
Key insight from our deployments
Across 9 production deployments, we observed:
- KSL correlates more strongly with operator trust than accuracy does
- Faster KSL → users are more willing to:
  - delegate tasks
  - run longer workflows
  - retry after failures
- Slow or unreliable stop → users abandon the system entirely
👉 Users don’t trust systems they can’t interrupt.
Why accuracy is not enough
An agent can be 95% accurate but still unusable if:
- it ignores stop signals
- it finishes long tool chains after cancellation
- it keeps mutating state after being “stopped”
From a UX perspective, this feels like loss of control.
Designing for low KSL
To make KSL a first-class SLO, you need to treat stopping as a core feature — not an afterthought.
Some practical approaches:
1. Interruptible execution
- Break tasks into small steps
- Check cancellation signals between steps
2. Tool-level cancellation
- Ensure external calls (APIs, DB ops) can be aborted
- Use timeouts aggressively
3. Streaming + early exit
- Don’t wait for full completion
- Allow stop during generation or reasoning
4. Idempotent side effects
- Avoid irreversible actions before confirmation
- Support rollback where possible
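As a rough illustration of approaches 1 and 2, here is a minimal Python sketch that checks a stop flag between steps and bounds every tool call with a timeout. The names `call_tool`, `run_agent`, and `demo` are hypothetical, and the "tools" are stand-in coroutines rather than real APIs; treat it as one possible shape under those assumptions, not a reference implementation.

```python
import asyncio
from typing import Awaitable, Callable, List

# Hypothetical tool type: a zero-argument coroutine function returning text.
Tool = Callable[[], Awaitable[str]]


async def call_tool(tool: Tool, timeout_s: float) -> str:
    # Bound every external call so a hung tool cannot delay a stop forever.
    # On timeout, wait_for cancels the underlying call and raises TimeoutError.
    return await asyncio.wait_for(tool(), timeout=timeout_s)


async def run_agent(tools: List[Tool], stop: asyncio.Event,
                    timeout_s: float = 1.0) -> List[str]:
    results: List[str] = []
    for tool in tools:
        # Cancellation checkpoint between steps.
        if stop.is_set():
            break
        try:
            results.append(await call_tool(tool, timeout_s))
        except asyncio.TimeoutError:
            results.append("timeout")
    return results


async def demo() -> List[str]:
    stop = asyncio.Event()

    async def fast() -> str:
        return "ok"

    async def slow() -> str:
        await asyncio.sleep(10)  # simulates a hung external call
        return "never"

    async def stopper() -> str:
        stop.set()  # simulates an operator pressing stop mid-run
        return "stopped"

    # The final `fast` step is skipped because the stop flag is already set.
    return await run_agent([fast, slow, stopper, fast], stop, timeout_s=0.05)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the stop flag is only read between steps, so the worst-case KSL is roughly one per-call timeout. Smaller steps and tighter timeouts both push that bound down.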
Eval harness for KSL
We propose measuring KSL using a simple evaluation setup:
1. Trigger the agent on a multi-step task
2. Send a stop signal at random timestamps
3. Measure:
   - time to halt
   - number of extra actions after stop
   - side effects produced post-stop
This gives you a realistic picture of how the system behaves under interruption.
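The setup above can be sketched in a few lines of Python. This is a toy version under stated assumptions: `toy_agent` is a hypothetical stand-in for a real agent (one simulated action per step), `measure_ksl` and `KSLResult` are illustrative names, and side effects are not modeled, so only time to halt and extra actions are reported.

```python
import random
import threading
import time
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class KSLResult:
    ksl_s: float        # seconds from stop signal to the agent halting
    extra_actions: int  # actions started after the stop signal was sent


def toy_agent(stop: threading.Event, action_log: List[float],
              n_steps: int, step_s: float) -> None:
    """Hypothetical agent: one simulated action per step."""
    for _ in range(n_steps):
        if stop.is_set():  # cancellation checkpoint between steps
            return
        action_log.append(time.monotonic())  # record when the action starts
        time.sleep(step_s)                   # simulate tool or model latency


def measure_ksl(n_steps: int = 50, step_s: float = 0.01,
                rng: Optional[random.Random] = None) -> KSLResult:
    rng = rng or random.Random()
    stop = threading.Event()
    action_log: List[float] = []
    worker = threading.Thread(
        target=toy_agent, args=(stop, action_log, n_steps, step_s))
    worker.start()

    # Send the stop signal at a random point in the first half of the task.
    time.sleep(rng.uniform(0, n_steps * step_s * 0.5))
    t_stop = time.monotonic()
    stop.set()

    worker.join()  # the agent has halted once its thread exits
    t_halt = time.monotonic()

    extra = sum(1 for t in action_log if t > t_stop)
    return KSLResult(ksl_s=t_halt - t_stop, extra_actions=extra)


if __name__ == "__main__":
    runs = [measure_ksl() for _ in range(20)]
    worst = max(r.ksl_s for r in runs)
    print(f"worst KSL over {len(runs)} runs: {worst * 1000:.1f} ms")
```

For a real system, the same loop would wrap the actual agent entry point, and the post-stop side-effect count would come from inspecting external state (for example, rows written or API calls logged after `t_stop`).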
Baseline expectations
From our observations:
- < 200ms → feels instant, high trust
- 200ms – 1s → acceptable, but noticeable
- 1s – 3s → users start to hesitate
- > 3s → trust drops significantly
Final thought
As agents become more autonomous, the question shifts from:
“Is the agent correct?”
to:
“Can I control it when it’s wrong?”
KSL gives you a measurable way to answer that.
And in production systems, control is what builds trust.
/ revisions
- Published · 29 Apr 2026
- POST-000
- CC BY 4.0