browser-research-bot (browser-use 0.2.1) · $437.19 burned in 43 minutes across 812 tool calls.
Context overflowed at turn 87. The model silently truncated its system prompt, then looped retry_on_error against a 429 for another 36 minutes — until the team lead woke up.
Your agent just spent $437 in a loop. Here's how we'd have stopped it in 40ms.
RunGuard is a runtime circuit breaker for AI agents. Install it with one line, and it trips the moment a tool-call pattern shows a loop, a context-window truncation, or a budget blow-through, so the run fails closed instead of running up a four-figure invoice over the weekend.
One line
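// TypeScript · also: pip install runguard
import { guard } from '@runguard/sdk';
// Per-run budget ceiling, plus the repeat count that trips the loop breaker.
const run = guard({ budget: 10, maxLoopReps: 3 });
// Wrap the agent call; if the breaker trips, the run fails closed.
await run(() => myAgent.invoke(task));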
Drops into browser-use, CrewAI, LangGraph, agentkit, or your own runner. Node and Python.
What trips the breaker
- Loops. A repeated tool-call signature inside a run. Breaker trips in 40ms, returns a structured error to the caller, pages Slack. No silent $400 bill.
- Context blowouts. When the model drops your system prompt, RunGuard detects the truncation before it shows up in the output, instead of letting the agent free-associate for another 2,000 turns.
- Budget ceilings. Set $X per run or $Y per hour. Breaker trips on the token that crosses it. You won't find out on Monday morning.
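To make the loop and budget checks above concrete, here is a minimal sketch of how a breaker like this could work. It is not RunGuard's implementation: BreakerSketch, ToolCall, costUsd, and stableStringify are hypothetical names, the per-call cost is assumed to be supplied by the caller, and "trip once a fingerprint repeats more than maxLoopReps times" is one guess at the threshold semantics.

```typescript
// Hypothetical sketch; none of these names come from the RunGuard SDK.
type TripReason = "loop" | "budget_exhausted";

interface ToolCall {
  name: string;
  args: Record<string, unknown>;
  costUsd: number; // assumption: the caller supplies a per-call cost estimate
}

// Serialize with sorted keys so {a, b} and {b, a} yield the same fingerprint.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${stableStringify(obj[k])}`);
    return `{${entries.join(",")}}`;
  }
  return String(JSON.stringify(value));
}

class BreakerSketch {
  private seen = new Map<string, number>(); // fingerprint -> times seen this run
  private spentUsd = 0;
  private budgetUsd: number;
  private maxLoopReps: number;

  constructor(budgetUsd: number, maxLoopReps: number) {
    this.budgetUsd = budgetUsd;
    this.maxLoopReps = maxLoopReps;
  }

  // Returns a trip reason if this call should be refused, otherwise null.
  check(call: ToolCall): TripReason | null {
    const fingerprint = `${call.name}:${stableStringify(call.args)}`;
    const reps = (this.seen.get(fingerprint) ?? 0) + 1;
    this.seen.set(fingerprint, reps);
    // Assumed semantics: trip once the same call repeats more than maxLoopReps times.
    if (reps > this.maxLoopReps) return "loop";

    this.spentUsd += call.costUsd;
    if (this.spentUsd > this.budgetUsd) return "budget_exhausted";
    return null;
  }
}
```

Everything here is in-memory bookkeeping: one map lookup and one addition per call, with no network hop. A real guard would also need a per-hour window and a way to price tokens, both left out of this sketch.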
Langfuse shows you the crime scene. RunGuard stops the crime.
Langfuse, LangSmith, and Braintrust are trace viewers — they show you what went wrong after the invoice lands. RunGuard runs in-process and trips before the next tool call. They're complementary: use whatever observability you already have, and add RunGuard for the failure modes that cost money in real time.
FAQ
- How is RunGuard different from Langfuse, LangSmith, or Braintrust?
- Those are trace viewers — they record what happened so you can debug it later. RunGuard is a runtime guard — it sits inside your agent process and refuses the next tool call when a loop, a context truncation, or a budget breach is in flight. They're complementary: keep your trace viewer for forensics, add RunGuard so you don't need forensics on a $437 incident.
- Will RunGuard add latency to my agent runs?
- The breaker check is in-process and resolves in well under a millisecond per guarded call — no network hop, no remote service. The only operation that takes ~40ms is the trip itself: the moment a loop pattern is confirmed, RunGuard halts the run, dispatches the alert, and returns a structured error. On the happy path you won't notice it.
- What languages and frameworks does it support?
- TypeScript and Python at launch. Tested wrappers ship for browser-use, CrewAI, LangGraph, and agentkit; for any custom runner the guard() primitive wraps a single async function so it works with whatever you've built.
- How do you tell a loop from a legitimate retry?
- RunGuard fingerprints each tool call by name + argument signature and watches for the same fingerprint repeating inside one run. Legitimate retries against transient 429s or network blips are usually 1–2 attempts with backoff, under the default maxLoopReps: 3. You can raise that per call site (or pass an explicit retryable: true) when a tool genuinely needs more bites at the apple.
- What happens when the breaker trips? Does my run die?
- The guarded function throws a structured RunGuardTripped error back to your caller, with the trip reason (loop, context_truncated, or budget_exhausted), the offending pattern, and a trip ID. Your code decides whether to surface it, fall back, or page someone. RunGuard also fires the Slack / PagerDuty webhook (Team plan) and writes the trip to the audit log. The run halts; the bill stops.
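The last answer leaves the decision to your code, so here is a sketch of what that handling might look like. It uses a local stand-in class rather than the SDK's RunGuardTripped, because the page doesn't spell out the error's shape: the field names reason, pattern, and tripId, and the helper runWithFallback, are assumptions for illustration only.

```typescript
// Local stand-in for the RunGuardTripped error described above.
// Field names (reason, pattern, tripId) are assumptions, not the SDK's documented shape.
type TripReason = "loop" | "context_truncated" | "budget_exhausted";

class TripErrorSketch extends Error {
  readonly reason: TripReason;
  readonly pattern: string;
  readonly tripId: string;

  constructor(reason: TripReason, pattern: string, tripId: string) {
    super(`breaker tripped: ${reason}`);
    // Keep instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "TripErrorSketch";
    this.reason = reason;
    this.pattern = pattern;
    this.tripId = tripId;
  }
}

// Run a guarded task. On a breaker trip, return a fallback value;
// rethrow anything else so unrelated failures are not swallowed.
async function runWithFallback<T>(
  guarded: () => Promise<T>,
  fallback: (trip: TripErrorSketch) => T,
): Promise<T> {
  try {
    return await guarded();
  } catch (err) {
    if (err instanceof TripErrorSketch) {
      return fallback(err);
    }
    throw err;
  }
}
```

With the real SDK, the guarded argument would be something like the run(() => myAgent.invoke(task)) call from the install snippet, and the instanceof check would target whatever error class the SDK actually exports.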