RunGuard
Breaker tripped · 2026-03-18 02:41 UTC

browser-research-bot (browser-use 0.2.1) · $437.19 burned in 43 minutes across 812 tool calls.

Context overflowed at turn 87. The model silently truncated its system prompt, then looped retry_on_error against a 429 for another 36 minutes — until the team lead woke up.

Your agent just spent $437 in a loop. Here's how we'd have stopped it in 40ms.

RunGuard is a runtime circuit breaker for AI agents. It installs in one line and trips the moment a tool-call pattern shows a loop, a context-window truncation, or a budget blow-through, so the run fails closed instead of running up a four-figure invoice over the weekend.

One line

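// TypeScript · also: pip install runguard
import { guard } from '@runguard/sdk';
// Configure the breaker: a spend budget and a cap on repeats of the same tool call.
const run = guard({ budget: 10, maxLoopReps: 3 });
// Wrap the agent invocation; if the breaker trips, the guarded call throws RunGuardTripped.
await run(() => myAgent.invoke(task));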

Drops into browser-use, CrewAI, LangGraph, agentkit, or your own runner. Node and Python.
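
For intuition about what "fails closed" means for a budget, here is a minimal sketch. It is not RunGuard's implementation: the `BudgetBreaker` class, its `charge` method, and the `spent` getter are hypothetical names, and the sketch assumes the budget is denominated in US dollars and that each tool call's cost can be estimated before it runs.

```typescript
// Illustrative sketch only: BudgetBreaker, charge(), and spent are hypothetical
// names, not part of the RunGuard SDK. Assumes costs are tracked in US dollars.
class BudgetBreaker {
  private spentUsd = 0;
  private tripped = false;
  private readonly budgetUsd: number;

  constructor(budgetUsd: number) {
    this.budgetUsd = budgetUsd;
  }

  // Call before each tool call with its estimated cost.
  // Throws instead of letting a call push spend past the budget.
  charge(costUsd: number): void {
    if (this.tripped || this.spentUsd + costUsd > this.budgetUsd) {
      // Fail closed: once tripped, every later call is refused as well.
      this.tripped = true;
      throw new Error(
        `budget_exhausted: spent ${this.spentUsd} of ${this.budgetUsd}`
      );
    }
    this.spentUsd += costUsd;
  }

  // Total spend accepted so far; refused calls are never counted.
  get spent(): number {
    return this.spentUsd;
  }
}
```

The design choice worth noting is the sticky `tripped` flag: after the first refusal, even a call that would fit under the budget is rejected, so a retry loop cannot keep nibbling at whatever headroom is left.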

What trips the breaker

Langfuse shows you the crime scene. RunGuard stops the crime.

Langfuse, LangSmith, and Braintrust are trace viewers — they show you what went wrong after the invoice lands. RunGuard runs in-process and trips before the next tool call. They're complementary: use whatever observability you already have, and add RunGuard for the failure modes that cost money in real time.

FAQ

How is RunGuard different from Langfuse, LangSmith, or Braintrust?
Those are trace viewers — they record what happened so you can debug it later. RunGuard is a runtime guard — it sits inside your agent process and refuses the next tool call when a loop, a context truncation, or a budget breach is in flight. They're complementary: keep your trace viewer for forensics, add RunGuard so you don't need forensics on a $437 incident.
Will RunGuard add latency to my agent runs?
The breaker check is in-process and resolves in well under a millisecond per guarded call — no network hop, no remote service. The only operation that takes ~40ms is the trip itself: the moment a loop pattern is confirmed, RunGuard halts the run, dispatches the alert, and returns a structured error. On the happy path you won't notice it.
What languages and frameworks does it support?
TypeScript and Python at launch. Tested wrappers ship for browser-use, CrewAI, LangGraph, and agentkit; for any custom runner the guard() primitive wraps a single async function so it works with whatever you've built.
How do you tell a loop from a legitimate retry?
RunGuard fingerprints each tool call by name + argument signature and watches for the same fingerprint repeating inside one run. Legitimate retries against transient 429s or network blips are usually 1–2 attempts with backoff, which stays under the default maxLoopReps: 3. You can raise that per call site (or pass an explicit retryable: true) when a tool genuinely needs more bites at the apple.
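
The fingerprinting idea can be sketched roughly as follows. This is an illustration under stated assumptions, not RunGuard's actual code: `ToolCall`, `stableStringify`, `fingerprint`, and `LoopDetector` are hypothetical names, and the sketch assumes the breaker trips on the first call that exceeds `maxLoopReps` repeats of one fingerprint (the real threshold semantics may differ).

```typescript
// Illustrative sketch only: all names here are hypothetical, not RunGuard's API.
type ToolCall = { name: string; args: Record<string, unknown> };

// Serialize with sorted keys so {a: 1, b: 2} and {b: 2, a: 1} produce the same string.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${stableStringify(obj[k])}`);
    return `{${entries.join(',')}}`;
  }
  return JSON.stringify(value);
}

// Fingerprint = tool name plus a key-order-independent argument signature.
function fingerprint(call: ToolCall): string {
  return `${call.name}:${stableStringify(call.args)}`;
}

class LoopDetector {
  private counts = new Map<string, number>();
  private readonly maxLoopReps: number;

  constructor(maxLoopReps: number = 3) {
    this.maxLoopReps = maxLoopReps;
  }

  // Records one call in the current run. Returns true when this fingerprint
  // has now been seen more than maxLoopReps times (assumed trip condition).
  record(call: ToolCall): boolean {
    const key = fingerprint(call);
    const seen = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, seen);
    return seen > this.maxLoopReps;
  }
}
```

Under these assumptions, an original call plus two retries with identical arguments passes, while a fourth identical call trips. A call with the same tool name but different arguments gets its own fingerprint and its own counter.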
What happens when the breaker trips — does my run die?
The guarded function throws a structured RunGuardTripped error back to your caller, with the trip reason (loop, context_truncated, or budget_exhausted), the offending pattern, and a trip ID. Your code decides whether to surface it, fall back, or page someone. RunGuard also fires the Slack / PagerDuty webhook (Team plan) and writes the trip to the audit log. The run halts; the bill stops.
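
A caller-side handler for a trip might look something like the sketch below. The three reason values come from the answer above; everything else is an assumption for illustration: the local `RunGuardTrippedStandIn` class, its field names (`reason`, `pattern`, `tripId`), the `decideAction` helper, and the reason-to-action mapping, which is just a placeholder policy. Check the SDK's own types for the real error shape.

```typescript
// Illustrative sketch only. The reason values come from the FAQ; the stand-in
// class, its field names, and decideAction() are assumptions, not the SDK's API.
type TripReason = 'loop' | 'context_truncated' | 'budget_exhausted';

class RunGuardTrippedStandIn extends Error {
  reason: TripReason;
  pattern: string;
  tripId: string;

  constructor(reason: TripReason, pattern: string, tripId: string) {
    super(`RunGuard tripped: ${reason}`);
    // Keep instanceof working even when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = 'RunGuardTripped';
    this.reason = reason;
    this.pattern = pattern;
    this.tripId = tripId;
  }
}

type Action = 'fallback' | 'page' | 'surface';

// Map a trip to what the caller does next. Any other error is rethrown
// untouched so unrelated failures are not swallowed.
function decideAction(err: unknown): Action {
  if (!(err instanceof RunGuardTrippedStandIn)) throw err;
  switch (err.reason) {
    case 'loop':
      return 'fallback'; // placeholder policy: degrade to a cheaper path
    case 'budget_exhausted':
      return 'page'; // placeholder policy: a human should look at spend
    case 'context_truncated':
      return 'surface'; // placeholder policy: show the error to the user
  }
}

// Usage: simulate a trip and decide how to respond.
try {
  throw new RunGuardTrippedStandIn('loop', 'example-pattern', 'example-trip-id');
} catch (err) {
  const action = decideAction(err);
  console.log(`breaker tripped, taking action: ${action}`);
}
```

Switching on the reason rather than on the error message keeps the handler stable if the message wording changes.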
