Runtime runaway prevention for the Claude Agent SDK
The Claude Agent SDK ships maxTurns on every query() call and on every ClaudeAgent session, and people read it as a budget knob. It is not. It is a count of model-side turns, not a cap on the run’s bill. One turn on claude-opus-4-7 can read a 30K-line repo file with Read, fan out to three Task subagents, run a Grep with multiline: true that streams 200KB of matches, and shell out to a Bash(run_in_background: true) that floods stdout for the next minute — every byte of those tool results lands in the next turn’s input, every retry on a schema-failed tool call is a full paid turn, every subagent spawned by the Task tool gets its own maxTurns budget by default, and the SDK has no per-run dollar cap, no rolling-window throttle, no onBudgetExceeded hook. maxTurns: 25 is twenty-five times whatever the heaviest turn cost; permissionMode gates which tools execute, not how many dollars they cost; the result message’s total_cost_usd field is the morning-after audit, not the breaker. None of them look at cumulative dollars spent so far in this run and none of them stop the next turn before it fires. This page is the runtime runaway-prevention guard we ship and how it slots around a Claude Agent SDK call in eight lines.
Where the dollars actually accumulate inside a Claude Agent SDK run
- Tool results re-bill as input on every following turn. The SDK threads the assistant's tool-use block and the tool_result back into the next turn's message list. A single Read of a 30K-line file lands ~150K tokens; a Grep with multiline: true across the repo can return 50–500KB of content; a Bash that runs npm install floods 20K tokens of progress output. On claude-opus-4-7 at $15 per million input tokens, that one Read alone is $2.25 of input bill on every subsequent turn until auto-compaction kicks in — and the auto-compaction call itself bills the full pre-compaction context as input plus the summary as output before the new context is in place.
- Subagents inherit no parent budget by default. Task with subagent_type: "general-purpose" spawns a fresh ClaudeAgent with its own maxTurns count, its own system prompt, its own tool set. The parent agent sees one tool result; the bill sees the full subagent transcript. A parent that fans out three Task calls in a single turn (the canonical "research these in parallel" pattern) hides three child sessions behind one parent step. A subagent that spawns its own Task hides another fan-out behind a fan-out. A run with three nested fan-outs of three is one parent maxTurns times 27 child sessions of bill, none of which the parent's maxTurns: 25 ever sees.
- Hook retries on schema-failed tool calls are billed retries. When the model emits arguments that the SDK's InputValidationError path rejects — a missing required field, wrong-shape JSON, an unknown tool name — the canonical handling is to feed the validation error back as a tool result and let the model correct course. That correction is a fresh paid turn: same system prompt, same accumulated history, fresh output tokens. A subtly-wrong tool argument can chew three or four paid turns before the model corrects; a tool whose schema the model genuinely cannot satisfy chews retries until maxTurns saves you.
- MCP server tool outputs are uncapped. An mcp__github__list_issues that returns the full body of 200 issues stuffs the next turn's input with the entire payload. An mcp__database__query that returns 50K rows lands as 50K rows of input on the next turn. The SDK does not size-limit MCP responses by default; it ships them through to the model the same way it ships a Bash stdout or a Read body. A misconfigured MCP server that paginates by accident (returning everything on every call) compounds that on every turn.
- Background tasks keep waking the agent. Bash with run_in_background: true returns control to the model immediately, then notifies the SDK on each new line of stdout. Each notification is a fresh user turn injected into the session — another paid round-trip for the model to acknowledge or react. A long-running build with verbose output (npm install, cargo build, a jest --watch session) can fire dozens of these wake-ups in a minute, none of which the agent's maxTurns counter pre-budgeted for.
- Session resume re-bills the prior context on cache miss. The Anthropic API's prompt cache has a five-minute TTL. ClaudeAgent.resume(session_id) after a sleep, a ScheduleWakeup(delaySeconds=1800), an overnight pause, or an out-of-band crash re-hydrates the entire prior message history into the next request and pays full input rates on it — a 200K-token context is $3 of input bill on the resume turn alone, before the model emits a single output token. The cache shows up in response.usage.cache_read_input_tokens and cache_creation_input_tokens; on a cold resume both are zero and input_tokens is the full count.
- Stop reasons that don't stop billing. stop_reason: "max_tokens" means the model hit its per-turn output cap mid-thought — the agent loop typically retries the same turn with a continuation prompt, billing the same input plus another tranche of output. stop_reason: "pause_turn" on extended-thinking models means the model paused for a tool call and is about to keep going — the next request is full-cost. stop_reason: "refusal" means the model declined — output is short, but full input was billed, and the agent loop's default is to nudge and retry. None of these shorten the bill the way stop_reason: "end_turn" does.
- System prompt and CLAUDE.md re-bill on every cold turn. The SDK assembles a system prompt that includes the agent definition, the available tool schemas, every CLAUDE.md the harness loads from the working directory and parent folders, and any appended skill bodies. Cached, this is a single cache-read charge across an active session. Cold — first turn, post-TTL turn, post-restart turn — it bills as fresh input. A 30K-token system prompt on Opus 4.7 is $0.45 of input on every cold turn, multiplied by every restart of every session of every subagent.
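The re-bill arithmetic above is worth seeing as numbers. A minimal sketch, using the text's opus-class rate of $15 per million input tokens (the function name and the turns-until-compaction parameter are illustrative, not SDK API):

```javascript
// Sketch of the re-bill arithmetic above: a tool result carried in context
// re-bills as input on every turn that follows it. Illustrative numbers only,
// using the $15-per-million-input-tokens rate quoted in the text.
const RATE_IN = 15e-6; // dollars per input token

// Cost of carrying one tool result for N subsequent turns before compaction.
function rebillCost(resultTokens, turnsUntilCompaction) {
  return resultTokens * RATE_IN * turnsUntilCompaction;
}

// A single ~150K-token file read:
console.log(rebillCost(150_000, 1).toFixed(2));  // on one following turn: "2.25"
console.log(rebillCost(150_000, 10).toFixed(2)); // carried across ten turns: "22.50"
```

The point of the sketch is the multiplier: the Read billed once as output of a tool, but it bills again as input on every turn it stays in context.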
What the Claude Agent SDK’s knobs give you and what they don’t
The Claude Agent SDK’s knobs are correct in shape and wrong in unit. maxTurns: 25 is a turn count, not a dollar cap — one turn on claude-opus-4-7 with a 100K-token tool-result history and a Task fan-out can cost $5; twenty-five of those is $125 before maxTurns trips. The trip happens after the offending turn returns, by which time the offending turn’s LLM call has been billed in full, and the SDK has no built-in onMaxTurnsExceeded hook to page Slack — the host code finds out by checking the result message’s num_turns against the cap. permissionMode ("default", "acceptEdits", "plan", "bypassPermissions") gates which tool calls execute and which prompt the host for approval, but it does not bound the bill — an approved Read of a 50K-line file is the same input-token rebill on the next turn whether it executed in default or bypassPermissions. cwd, system_prompt, tools, mcp_servers are configuration knobs, not budget rules; they shape what the agent can do, not how much that costs. abort_signal on the SDK’s query() is a wall-clock cancel — the inner Anthropic API request has already billed its input and emitted whatever tokens it had at cancel; the abort drops the rest, not the bill. model: "claude-haiku-4-5" drops the per-token rate by ~5x but doesn’t change the cost shape; a runaway loop on Haiku still climbs unbounded, just slower. The result message’s total_cost_usd tells you what the run cost after it ran; the per-message usage in each assistant turn tells you what that turn cost after it returned; neither stops the next turn before it fires. The Anthropic console’s monthly soft cap is org-wide; it stops new requests org-wide once tripped, but it doesn’t stop the in-flight turn that crossed the line, and one runaway agent script can lock out every other workload sharing the org. 
None of these look at cumulative dollars spent so far in this run, none of them stop the next turn before it fires, and none of them tell you which turn inside your loop is the one that finally crossed the cap.
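To make the "morning-after audit" point concrete: the only built-in places to read turn count and spend are the result message's num_turns and total_cost_usd fields named above, and both are only available after the run has billed. A stand-in sketch (the result object here is a mock, not a live SDK response):

```javascript
// Post-hoc audit sketch: everything here runs AFTER the run has billed.
// Field names (num_turns, total_cost_usd) are those quoted in the text;
// the result object is a stand-in, not a live SDK response.
function auditRun(result, { maxTurns, softUsd }) {
  return {
    hitTurnCap: result.num_turns >= maxTurns,    // known only after the cap tripped
    overBudget: result.total_cost_usd > softUsd, // known only after the bill landed
  };
}

const result = { num_turns: 25, total_cost_usd: 125.0 }; // the $125 worst case above
console.log(auditRun(result, { maxTurns: 25, softUsd: 5 })); // { hitTurnCap: true, overBudget: true }
```

Useful for finance; useless for prevention — which is the gap the next section fills.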
What a runtime runaway-prevention guard actually has to do
- Detect the cycle on a fingerprint, not a turn count. The same prompt fired three times in a row is a stuck retry — same system, same trailing user message, same tool result handler. The same prompt fired three times against three different Task subagent types is a real fan-out doing real work. A turn-count guard can't tell them apart; a signature guard can. The detector takes a per-turn signature — the model name plus the canonicalised tail of the assistant's last tool-use plus the next user message's tool-result hash, optionally a hash of the active tool definitions, optionally the agent type — and looks for any cycle of length 1–8 repeating 3+ times in the most-recent 32 turns. Length 1 catches the stuck retry on the same tool call. Length 2 catches the agent/tool ping-pong (model proposes Bash X, tool returns error E, model re-proposes Bash X). Length 3 catches the draft-critique-revise refinement that stops actually revising.
- Track real dollars, not turn count. A turn on claude-opus-4-7 at $15 per million input plus $75 per million output costs roughly five times what a turn on claude-sonnet-4-6 at $3 per million input plus $15 per million output costs; claude-haiku-4-5 at $0.80 per million input plus $4 per million output is another four times cheaper. A turn with cache-read input at $1.50 per million is ten times cheaper than the same turn with cold input. The tracker reads response.usage.input_tokens, cache_creation_input_tokens, cache_read_input_tokens, and output_tokens, multiplies by the published per-token rate for the chosen model and the right cache tier, and adds the result to a rolling-window or cumulative ledger. The SDK's assistant message usage field is the canonical place to read these.
- Trip before the next turn fires, not after. The check is in-process, on a numeric accumulator and a small ring buffer. It runs in microseconds. When the cap is crossed or the cycle threshold is hit, the next call into the wrapped function raises a typed error and the host halts — the next query() never executes, the next Task subagent never spawns, the next hook-validation retry never schedules. The previous turn's result is preserved on the host's side; the cap-crossing turn simply never executes.
- Be a primitive, not a framework opinion. The same wrap should compose with the streaming query() generator, with the ClaudeAgent.send() per-turn API, with the Task subagent spawn point, with whatever the SDK adds next quarter. A breaker that ships as a Claude SDK monkey-patch or a bespoke ClaudeAgentWithGuard client class is brittle; a breaker that wraps any callable is portable.
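The cycle scan described above fits in a few lines. A minimal illustrative re-implementation (not the shipped detector): walk backwards from the tail of the signature window in cycle-sized chunks and count consecutive repeats for each candidate cycle length.

```javascript
// Minimal sketch of the tail-cycle scan: look for any cycle of length
// 1..maxCycleLen that repeats `repeats`+ times at the tail of the window.
// Illustrative only — not the shipped LoopDetector.
function scanTail(window, { maxCycleLen = 8, repeats = 3 } = {}) {
  for (let len = 1; len <= maxCycleLen; len++) {
    if (window.length < len * repeats) continue;
    const cycle = window.slice(-len);
    let reps = 0;
    // walk backwards in cycle-sized chunks while they match the tail cycle
    for (let end = window.length; end - len >= 0; end -= len) {
      const chunk = window.slice(end - len, end);
      if (chunk.every((s, i) => s === cycle[i])) reps++;
      else break;
    }
    if (reps >= repeats) return { cycleLen: len, cycle, reps };
  }
  return null;
}

// Length-1 stuck retry: same Bash signature three times in a row.
console.log(scanTail(["Read:a", "Bash:x", "Bash:x", "Bash:x"]));
// Length-2 ping-pong: propose X, get error E, propose X again...
console.log(scanTail(["Bash:x", "err:E", "Bash:x", "err:E", "Bash:x", "err:E"]));
```

Note the fan-out case passes clean: three different Task subagent signatures in a row form no cycle, so scanTail returns null.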
Wrapping a Claude Agent SDK turn with runguard
// claude agent sdk + runguard. Wrap the per-turn step so the loop detector
// and budget tracker see every paid turn before the next.
import { ClaudeAgent } from "@anthropic-ai/claude-agent-sdk";
import { guard, BudgetExceededError, LoopDetectedError } from "@runguard/sdk";

const agent = new ClaudeAgent({ model: "claude-opus-4-7", maxTurns: 25 });

// opus 4.7 rates: $15/M input, $1.50/M cache read, $18.75/M cache write (1.25x base), $75/M output
const RATE_IN = 15e-6, RATE_CACHE_R = 1.5e-6, RATE_CACHE_W = 18.75e-6, RATE_OUT = 75e-6;

async function _step(userMsg) {
  const resp = await agent.send(userMsg);
  // usage reports each input tier separately — input_tokens excludes cache reads and writes
  const u = resp.usage;
  const usd = u.input_tokens * RATE_IN
    + (u.cache_read_input_tokens ?? 0) * RATE_CACHE_R
    + (u.cache_creation_input_tokens ?? 0) * RATE_CACHE_W
    + u.output_tokens * RATE_OUT;
  const tool = resp.tool_use?.name ?? "end_turn";
  const args = JSON.stringify(resp.tool_use?.input ?? {}).slice(0, 64);
  return { resp, usd, sig: `claude:${agent.model}:${tool}:${args}` };
}

const guardedStep = guard(_step, {
  signature: (_args, out) => out.sig,
  cost: (_args, out) => out.usd,
  budget: { maxUsd: 5, windowMs: 60_000 },
  loop: { repeats: 3, maxCycleLen: 8 },
  onTrip: (e) => console.log("[runguard]", e.reason, e.spent, "of", e.cap),
});

try {
  while (!agent.done) await guardedStep(agent.nextUserMsg());
} catch (e) {
  if (e instanceof BudgetExceededError) console.log("halted: budget", e);
  if (e instanceof LoopDetectedError) console.log("halted: loop", e);
}
The loop primitive is the LoopDetector shipped at product/sdk/src/loop-detector.ts: defaults windowSize: 32, minCycleLen: 1, maxCycleLen: 8, repeats: 3 — a push(signature) the wrap calls per turn, a scan() that returns a typed match, a reset() for fresh runs, and constructor-time validation that rejects repeats < 2 and windowSize < maxCycleLen * repeats. The budget primitive is the BudgetTracker at product/sdk/src/budget.ts: maxUsd for the cap, optional windowMs for rolling-window throttles, an add(usd) the host calls post-call (which silently no-ops on zero, if (usd === 0) return), and an exceeded() the wrap reads pre-call. The BudgetTracker file is 84 lines; the LoopDetector is 111 lines — both are pure in-process primitives, no daemon, no telemetry. The fingerprint-and-window approach is documented at how to detect LLM tool-call loops in production; the LangChain AgentExecutor wrap is here; the multi-agent CrewAI wrap is here; the browser-use wrap is here; the OpenAI AgentKit wrap is here; the LangGraph StateGraph wrap is here; the bare-OpenAI-SDK wrap is here.
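The ordering the two primitives enforce — check the breaker before the paid call, record spend after it returns — is the whole trick, and it stands alone. A stand-in sketch (makeGuard and its error shape are illustrative, not the shipped guard() API):

```javascript
// The pre-call / post-call ordering described above, as a standalone sketch:
// check the breaker BEFORE the paid call, record spend AFTER it returns.
// Stand-in implementation — the shipped BudgetTracker/guard() are as described in the text.
function makeGuard(step, { maxUsd }) {
  let spent = 0;
  const guarded = async (...args) => {
    if (spent >= maxUsd) { // trip BEFORE the next paid turn fires
      throw Object.assign(new Error("budget"), { reason: "budget", spent, cap: maxUsd });
    }
    const out = await step(...args); // the paid turn
    if (out.usd !== 0) spent += out.usd; // like the tracker, add() no-ops on zero
    return out;
  };
  guarded.reset = () => { spent = 0; }; // explicit per-run reset
  return guarded;
}

// Demo: $2 turns against a $5 cap — the first turn that would start with
// spend >= cap never executes.
(async () => {
  const g = makeGuard(async () => ({ usd: 2 }), { maxUsd: 5 });
  await g(); await g(); await g();            // spend climbs 2, 4, 6 — each started under the cap
  try { await g(); } catch (e) { console.log("tripped:", e.reason); } // "tripped: budget"
})();
```

Note the asymmetry baked into any post-call ledger: the turn that crosses the cap still bills (spend reaches $6 against a $5 cap); what the breaker guarantees is that no turn after it does.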
How the breaker behaves around ClaudeAgent.send()
- Cost accumulates after each turn returns. The wrap reads the usd field on the inner function's output object and pushes it into the BudgetTracker. Successful turns under the cap pass through transparently — the host sees the assistant message, dispatches the tool calls, builds the next user message. Zero-cost calls (a fully cache-read continuation, an early refusal that cost nothing because the SDK short-circuited) never trip the budget; the tracker explicitly skips zero entries via if (usd === 0) return.
- The first turn over the cap throws before its API call goes out. BudgetExceededError is constructed with the cumulative spend, the cap, and a reason field. It propagates out of the wrap before the next agent.send() fires — no in-flight HTTP request, no bandwidth out, no tokens billed, no Bash shell-out, no Task subagent spawned. The previous turn's assistant message is preserved on the host's side; the cap-crossing turn simply never executes.
- The loop detector trips on the third repeat of any signature cycle. The wrap pushes the signature into a 32-entry sliding window after each turn and scans for a length-1 to length-8 cycle that's repeated three or more times in a row at the tail. LoopDetectedError carries the cycle length, the pattern itself, and the repeat count — the calling code dispatches on the type. A length-1 trip is the canonical "model proposes the same Bash three times in a row after the same error" loop. A length-2 trip is the agent/tool ping-pong — Read X → result → Read X → result → Read X, with the model never moving on. A length-3 trip is the draft-critique-revise refinement that stops actually revising. The same primitive caught our own X-API 402 CreditsDepleted retry loop at signature three (see below).
- Your onTrip hook fires before the throw. Page Slack with the spend curve, the offending cycle pattern, the model name, the active tool, the last tool input snippet — whatever you wire. Sync hooks run inline; async hooks are awaited. An onTrip exception propagates instead of the trip error, by design (the host explicitly opted in to side-effecting on trip).
- Reset is explicit. When a fresh agent run starts, call guardedStep.reset() to clear both the spend ledger and the loop window. The tracker is per-guarded-fn, not per-process — you can wrap one _step for the parent agent and another _subagentStep for Task-spawned subagents with independent budgets, or share one guard() across both for a global per-run cap. Pair the wrap with maxTurns: 50 as a coarse backstop — the breaker is the per-dollar fence; maxTurns is the per-turn safety net for pathologies the dollar guard misses.
Tuning for the Claude Agent SDK cost shape
A typical assistant turn on claude-sonnet-4-6 with 30K tokens of cached context, a 1K-token user message, and a 500-token tool-use response lands around $0.06 of bill ($0.045 cache-read + $0.003 cold input + $0.0075 output). On claude-opus-4-7 the same turn is around $0.30; on claude-haiku-4-5 around $0.012. The default maxUsd: 5 on the budget tracker corresponds to roughly 80 turns on Sonnet, 16 on Opus, 400 on Haiku — a normal job finishes well inside the cap; a stuck retry loop trips the breaker before the bill triples. For interactive applications behind a per-user request handler (a chat UI, a coding-agent endpoint, a one-off research helper), set windowMs: 60_000 with the same maxUsd: 5: the cap rolls; old spend evicts; the cumulative invoice over an hour is unbounded but the per-minute spike is bounded. For unattended automation that runs for hours overnight (a documentation re-indexer, a nightly code-review pass, a knowledge-base re-embedder), drop windowMs entirely — you want a hard cumulative cap on the whole job. For high-stakes high-cost work where an over-spend is worse than an under-spend (production data enrichment hitting Opus, paid-content moderation passes, large-context legal extraction), drop to maxUsd: 1 — a tighter cap costs you one re-run on legitimate workflows; a looser cap costs you one weekend incident. Stack the budget guard with the loop detector on the same wrap: a stuck tool retry usually trips the loop guard first (same model plus same tool name plus same arg-snippet hashes to the same signature on each retry), but a slow-burn drift on subagents that legitimately do different work each turn trips the budget instead — both stop the run, both leave a typed error, both are cheap to retry. Keep maxTurns set high (50–100) as a coarse backstop; the breaker is the per-dollar fence, maxTurns is the per-turn safety net.
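The windowMs behaviour described above — old spend evicts, the per-minute spike is bounded, the cumulative total is not — can be sketched as a tiny rolling ledger (illustrative, with the clock injected so eviction is visible; not the shipped BudgetTracker):

```javascript
// Rolling-window sketch of the windowMs behaviour: spend entries older than
// the window evict, so the per-minute spike is bounded while the hourly
// cumulative total is not. Illustrative only; `now` is injected for clarity.
function makeRollingLedger({ maxUsd, windowMs }) {
  const entries = []; // [timestampMs, usd] pairs, oldest first
  return {
    add(usd, now) { if (usd !== 0) entries.push([now, usd]); },
    exceeded(now) {
      while (entries.length && now - entries[0][0] > windowMs) entries.shift(); // evict stale spend
      return entries.reduce((sum, [, usd]) => sum + usd, 0) >= maxUsd;
    },
  };
}

const ledger = makeRollingLedger({ maxUsd: 5, windowMs: 60_000 });
ledger.add(3, 0);
ledger.add(3, 1_000);
console.log(ledger.exceeded(2_000));  // true  — $6 inside one window
console.log(ledger.exceeded(61_000)); // false — the $3 at t=0 evicted; $3 remains
```

Dropping windowMs turns the same ledger into the hard cumulative cap recommended for unattended overnight jobs: nothing ever evicts, so the total only climbs.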
The subagent, hook-retry, and MCP-output shapes on the same wrap
- Signature is the run fingerprint. The default claude:<model>:<tool_name>:<tool_input[:64]> covers the canonical stuck-retry — the model proposes the same Bash with the same args three turns in a row, the breaker halts before the fourth attempt. For runs that legitimately call the same tool with different args every turn (a multi-file Read sweep, a Grep across distinct patterns), the tool_input[:64] slice is enough to distinguish. For Task subagent spawns, signature on (model, "Task", subagent_type, prompt[:64]) — the same subagent type with the same opening prompt is a stuck respawn; three different subagent types are a real fan-out. For hook retries on schema-failed tool calls, the model emits the same tool name with similar-but-not-identical args — the tool_input[:64] slice catches the "same first 64 chars" case (which is the canonical retry shape, since the model is correcting one field at a time).
- Trip event tells you which fired. reason: "loop" for a cycle hit; reason: "budget" for a cost cap; reason: "context" if you also pass a context-window guard for input-token bloat (e.g. when the message history is monotonically growing because the model never calls a summary tool). The typed error is one of LoopDetectedError, BudgetExceededError, ContextLimitError — the calling code dispatches on the type, not on string parsing.
- Per-wrap or shared. One guard() per logical agent gives you per-agent isolation — the parent has one budget, each Task subagent has another, the cleanup pass at the end has a third. One shared guard() across the whole run (every wrapped callable references the same guard() instance) gives you cross-agent loop detection — useful when a parent and a subagent keep tripping the same prompt against the same tool error and no single agent is repeating in isolation. Shared budgets also catch the "three subagents fan out, each under their own cap, but cumulatively a runaway" case the per-agent breaker can't see.
- Plays nicely with session resume. The wrap is in-process and stateless across runs — it doesn't persist its loop window or spend ledger across processes. A ClaudeAgent.resume(session_id) after a sleep is a fresh run from the breaker's point of view, which is what you want for the legitimate-resume case (the prior session's budget is irrelevant; this morning's budget is what matters). For the breaker-state-must-survive-restart case (a long-running daemon, a worker pool that recycles processes, a session that the orchestrator pauses and resumes overnight), persist the trip event yourself in your onTrip hook and refuse to resume the session if the trip event is still open.
- Zero outbound calls. The whole check is pure data flow inside your Node (or Python) process. No telemetry, no daemon, no SaaS, nothing leaves your VPC. The wrap is the only thing in your process that knows the turn is loop-stuck or over-budget; the only place it surfaces is the typed error, the onTrip hook you wrote, and a structured event in the trip log.
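The two signature shapes above — the default tool fingerprint and the Task-spawn fingerprint — are both a few string operations. Illustrative helpers (the field layout mirrors the text; the helper names are not a shipped API):

```javascript
// Signature builders for the shapes above. Illustrative helpers only;
// the field layout mirrors the text, not a shipped API.
const sig = (model, tool, input) =>
  `claude:${model}:${tool}:${JSON.stringify(input ?? {}).slice(0, 64)}`;

const taskSig = (model, subagentType, prompt) =>
  `claude:${model}:Task:${subagentType}:${prompt.slice(0, 64)}`;

// Stuck retry: same tool, same args -> identical signature three turns running.
const a = sig("claude-opus-4-7", "Bash", { command: "npm test" });
const b = sig("claude-opus-4-7", "Bash", { command: "npm test" });
console.log(a === b); // true — same fingerprint, feeds the length-1 cycle

// Real fan-out: three different subagent types -> three distinct signatures.
const spawns = ["researcher", "reviewer", "summarizer"].map(
  (t) => taskSig("claude-opus-4-7", t, "Survey the repo for flaky tests")
);
console.log(new Set(spawns).size); // 3 — no cycle, the detector stays quiet
```

This is the distinction a turn counter cannot make: the stuck retry and the real fan-out both burn three turns, but only one of them collapses to a single repeating fingerprint.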
The first loop our SDK caught was ours — on a Claude Agent SDK harness
It wasn’t a query() call — it was our own launch script firing a six-tweet thread against a paid X API, scheduled by a Claude Agent SDK session running once a day. The first attempt came back with HTTP 402 CreditsDepleted. Six consecutive sessions later, six identical signatures — post_tweet:402:CreditsDepleted — were sitting in a flat JSON file on disk. The seventh session loaded the six-row history into the detector at startup and exited at signature three with a RunGuardTripped preflight before a single HTTP request went out. The session rebooted itself, re-loaded the history, re-tripped, exited — for thirty-five consecutive sessions and counting. Read the dogfood story on the 30-day log; the same pattern slots into a Claude Agent SDK turn loop when the model keeps proposing the same Bash against the same shell error three turns in a row, when a Task subagent keeps respawning with the same opening prompt because its parent never sees the failure resolve, or when an mcp__ server keeps returning the same paginated payload because the pagination cursor never advances and the model never notices.
What this is not
- Not a replacement for maxTurns. Keep maxTurns set on every agent — it's a coarse upper bound on a session's shape, complementary to the per-run dollar cap. The two are different units: maxTurns bounds turn count; guard() bounds cumulative dollars. Set maxTurns high enough that legitimate runs finish (50–100); set maxUsd tight enough that runaway loops trip first. The SDK at product/sdk/src/budget.ts is 84 lines; the loop detector at product/sdk/src/loop-detector.ts is 111 lines; both are in-process primitives.
- Not a monkey-patch on the Claude Agent SDK. RunGuard does not subclass ClaudeAgent, ship a ClaudeAgentWithGuard drop-in, or hook into the SDK's internals. It wraps the underlying callable that calls the SDK. That is the design — the same wrap composes with query(), with ClaudeAgent.send(), with the Task subagent spawn point, with the streaming JSON event API, with whatever the SDK adds next quarter. A breaker that depends on the SDK's shape is a maintenance liability the first time the SDK pivots; a breaker that wraps any callable is portable.
- Not Langfuse, Helicone, or the Anthropic console. Those answer "what did the run do yesterday and how much did it cost?". A runtime runaway-prevention guard answers "should the next paid turn fire?". The two are complementary — one for finance, one for prevention. Run both. The trace is your morning-after audit; the breaker is your tonight-before-bed insurance. The Anthropic console's monthly soft cap is an org-wide blast radius; the per-run maxUsd is the per-job blast radius. Both are useful; only one stops the runaway script before it locks out every other workload sharing the org.
- Not a server, not a proxy. No outbound network, no telemetry, no cookies, no daemon, no LLM-proxy gateway in front of your SDK calls. The check is pure data flow inside your Node or Python process. The same in-process discipline shows up in the embed-preview widget; the policy is one repo away in llms.txt. If your security review asked "does this guard ship our prompts off-prem?", the answer is: the wrap reads the assistant message's usage off your SDK's response object, and that's the entire data flow.
The minimum Claude Agent SDK integration
One npm i @runguard/sdk (or pip install runguard for the Python SDK), one guard() wrap around a thin _step that calls agent.send() and returns {resp, usd, sig}, and one onTrip that pages the channel you actually read. Eight lines of wrap, no ClaudeAgent subclass to register, no SDK monkey-patch, no proxy gateway in front of every call. The breaker trips on the dollar cap or the third repeat of any turn signature, halts the run, and leaves a structured event and a typed error behind for the post-mortem — long before maxTurns would have bounded the next turn count, long before the Anthropic console’s monthly soft cap fires, and long before the bill arrives. RunGuard ships it as @runguard/sdk on npm and runguard on PyPI — same primitive, both runtimes, in-process, zero deps. Same wrap composes with the bare Anthropic Messages API on this page’s pattern, the Claude Agent SDK’s ClaudeAgent.send() per-turn API, the streaming query() generator, and the Task subagent spawn point; pick whichever level your code lives at and the breaker reads the same usage in the end.