Loop detection for Anthropic Computer Use
Anthropic’s Computer Use beta gives Claude a computer tool whose actions are screenshot, left_click, right_click, middle_click, double_click, mouse_move, type, key, scroll, and cursor_position. Every screenshot action returns a base64 PNG that the SDK threads back into the next messages.create request as an image content block, and the model takes that image as input on the next turn. At Anthropic’s tokens ≈ (width × height) / 750 vision rate, a 1280×800 screenshot lands at ~1,366 tokens, a 1920×1080 screenshot at ~2,765 tokens, and a 2560×1440 screenshot at ~4,915 tokens — on every turn until the model stops needing the prior history. Fifty screenshots at 1920×1080 is ~138K tokens of vision input that re-bills on every following turn; on claude-sonnet-4-6 at $3 per million input tokens that’s about $0.41 of input billing per turn from turn fifty onward, and on claude-opus-4-7 at $15 per million it’s ~$2.07 per turn.
The Anthropic Computer Use beta’s knobs — display_width_px, display_height_px, disable_parallel_tool_use, the per-turn max_tokens on output, the three versioned tool types computer_20241022 / computer_20250124 / computer_20250429 — shape what the agent sees and does, not what it costs over time. There is no max_screenshots knob, no per-run dollar cap, no onBudgetExceeded hook, and no onLoopDetected hook. response.usage.input_tokens tells you what the turn billed after it returned; stop_reason: "tool_use" means the model wants another tool call, which the SDK will dutifully execute and re-bill. None of this looks at cumulative dollars spent so far in the run, and none of it stops the next turn before it fires. This page is the runtime loop detector and budget guard we ship and how it slots around an Anthropic Computer Use call in eight lines.
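The arithmetic in that paragraph is worth having as code. A minimal sketch — the /750 divisor and the $3-per-million Sonnet rate are the figures used throughout this page; nothing here calls the API:
// vision-token and rebill math — pure arithmetic, no SDK call
const visionTokens = (w, h) => Math.ceil((w * h) / 750);
const RATE_IN_USD = 3e-6; // claude-sonnet-4-6 input rate used on this page
// every accumulated screenshot re-bills as input on each following turn
const rebillPerTurn = (shots, w, h) => shots * visionTokens(w, h) * RATE_IN_USD;
console.log(visionTokens(1920, 1080));                  // 2765 tokens per shot
console.log(rebillPerTurn(50, 1920, 1080).toFixed(2));  // "0.41" — dollars of input per turn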
Where the dollars actually accumulate inside an Anthropic Computer Use run
- Screenshots accumulate as vision input on every following turn. The Anthropic Computer Use loop is request → assistant emits tool_use with {action: "screenshot"} → client runs the action, captures a PNG, base64-encodes it → client sends the next messages.create request with the assistant’s prior tool_use block plus a fresh user-role message containing a tool_result whose content is a single image block. That image is now permanently in the message history. The next turn’s input is the system prompt + every prior assistant turn + every prior user turn including every prior screenshot. At Anthropic’s vision tokenization (~width×height/750), a 1280×800 screenshot is ~1,366 tokens; ten of them stacked is 13,660 tokens of repeated input on every turn from then on; thirty is 40,980 tokens; fifty is 68,300 tokens at that resolution and ~138,250 tokens at 1920×1080 (the compounding is sketched in code after this list).
- Click-then-screenshot is two billed turns, not one. Each Computer Use action is a separate tool_use + tool_result pair. When the model decides to click and then verify, that’s a left_click turn (model emits {action: "left_click", coordinate: [512, 384]}, client runs it, returns a tiny text confirmation), followed by a screenshot turn (model emits {action: "screenshot"}, client returns the PNG). Two assistant turns, two billed input passes over the entire growing history, and the second turn is the one that pays vision rates on the freshly captured screenshot from then on.
- The retype-on-focus-lost loop. The model emits type with the text it wants to enter, the OS-level focus is on the wrong window or the input lost focus on click-through, the screenshot the model takes next shows the text didn’t land where expected, the model retypes the same text, the screenshot shows the same wrong state, the model retypes again. Three type turns plus three screenshot turns is six billed passes, plus the screenshots from each round permanently in history. The signature is identical each time: same model, same action (type), same text. The breaker catches it at signature three.
- The misclick loop. The model emits left_click with coordinate: [512, 384] believing it’s clicking a button. The actual button moved a pixel because of a CSS hover transition, or the click landed on the button’s label, not its target. The screenshot looks identical to the model. The model emits left_click at [512, 384] again. Three identical (action, coordinate) tuples in a row. Same signature each time. The breaker trips on signature three.
- The CAPTCHA / auth-wall trap. The model navigates a flow that surfaces a CAPTCHA, a 2FA prompt, a “please solve this puzzle” gate, or a redirect to a login page it doesn’t have credentials for. It can’t pass the gate. Default Computer Use behavior is to take a screenshot, reason about it, try a click, take another screenshot, try a different click. Most CAPTCHAs intentionally don’t pass; the agent burns turns until max_tokens caps a single turn or the host’s loop counter saves it. Each round adds a screenshot that re-bills on every following turn.
- Bash and text_editor tool outputs accumulate too. Computer Use is rarely shipped alone — the canonical reference implementation pairs it with the bash tool (for shell access to the same VM) and the text_editor tool (for file-level edits). Both return text on every call, and that text threads back into the next turn’s input. A bash that runs npm install stuffs 20K tokens of progress output into the history; a text_editor view of a 2K-line file lands ~10K tokens. None of this is uniquely a Computer Use cost shape, but it stacks on top of the screenshot rebilling.
- Display resolution silently doubles cost. Anthropic’s reference implementations default to 1024×768 (the literal image size in their example Docker container). Many production setups bump display_width_px / display_height_px to 1920×1080 because the actual app being driven is laid out for it. That’s a 2.6x increase in vision tokens per screenshot and therefore in cumulative rebill per turn — the same agent shape costs 2.6x more, with no visible change in the run.
- Versioned tool types share the cost shape. computer_20241022 (the original Sonnet 3.5 launch type, mouse + keyboard + screenshot only), computer_20250124 (added scroll, key combinations, hold actions), and computer_20250429 (the current type for Sonnet 4.6 / Opus 4.7) all return PNG screenshots, and all bill those screenshots into the next turn’s input the same way. The exact type string in your tools array changes the action vocabulary the model can emit; it does not change how the client threads results back nor how the bill stacks.
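To see how the shapes above compound over a whole run, here is a back-of-envelope simulation — a sketch, not the shipped tracker; it assumes one screenshot per turn and a flat ~2K tokens of text traffic per turn, which real runs will vary around:
// cumulative input bill across a run, one screenshot per turn
function cumulativeInputUsd(turns, w, h, rate = 3e-6) {
  const perShot = Math.ceil((w * h) / 750);
  let total = 0;
  for (let t = 1; t <= turns; t++) {
    total += (t * perShot + 2_000) * rate; // turn t re-bills all t accumulated shots
  }
  return total;
}
console.log(cumulativeInputUsd(50, 1280, 800).toFixed(2));  // "5.52"
console.log(cumulativeInputUsd(50, 1920, 1080).toFixed(2)); // "10.88" — resolution roughly doubles it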
What the Anthropic Computer Use knobs give you and what they don’t
The Anthropic Computer Use beta’s knobs are correct in shape and wrong in unit for runtime cost control. display_width_px and display_height_px shape per-screenshot tokens (lower resolution = cheaper per shot) but don’t bound how many shots accumulate; a session that takes one hundred 1024×768 screenshots still pays ~104K tokens of input on every turn from screenshot one hundred onward. disable_parallel_tool_use on the computer-use-2025-04-29 beta header (or whichever beta header the chosen model needs) prevents the model from emitting two computer tool calls in one turn — useful for sequential UI flows where parallel actions race against the same screen state, irrelevant to whether the next turn fires when you’ve already crossed your dollar cap. The per-turn max_tokens on the request bounds what one assistant turn can output; it doesn’t bound the next turn, or the next, or the next.
stop_reason on the response is your only signal — "tool_use" means “the model wants another action, please run it and call messages.create again”, "end_turn" means “done, stop”, "max_tokens" means “the assistant ran out of output budget mid-thought, please call again with a continuation”. None of those stop the host loop on a dollar threshold. response.usage.input_tokens tells you what this turn billed after it returned; response.usage.cache_creation_input_tokens and response.usage.cache_read_input_tokens tell you which cache tier; nothing in the usage field is a cumulative ledger across turns of the same run.
The model parameter (claude-sonnet-4-6, claude-opus-4-7, claude-haiku-4-5) shifts the per-token rate — Haiku is roughly 4–5x cheaper than Sonnet, Opus roughly 5x more expensive — but doesn’t change the cost shape; a 200-screenshot run on Haiku is still a 200-screenshot run, just at a quarter of the bill. The Anthropic console’s monthly soft cap is org-wide; one runaway Computer Use script that loops on a CAPTCHA can lock out every other workload sharing the org. None of these knobs looks at cumulative dollars spent so far in this run, none of them stops the next turn before it fires, and none of them tells you which click in your loop finally crossed the cap.
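The missing cumulative ledger is a few lines to keep yourself. A sketch, assuming the usage field shape of the TypeScript SDK response and this page’s Sonnet rates (cache writes bill at 1.25x the input rate, cache reads at 0.1x):
// run-level spend ledger the API does not keep for you
let runUsd = 0;
function recordTurn(u) {
  // u is response.usage; input_tokens excludes the cache fields
  const turnUsd = u.input_tokens * 3e-6
    + (u.cache_read_input_tokens ?? 0) * 0.3e-6
    + (u.cache_creation_input_tokens ?? 0) * 3.75e-6
    + u.output_tokens * 15e-6;
  runUsd += turnUsd;
  return runUsd; // cumulative dollars so far in this run
}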
What a runtime loop detector for Computer Use actually has to do
- Detect the cycle on a fingerprint, not a turn count. The same (action, coordinate) tuple emitted three times in a row is a misclick loop — same model, same trailing tool name, same arg payload. Three screenshot actions in a row, taken between three real type actions that legitimately enter different text, are fine. A turn-count guard can’t tell them apart; a signature guard can. The detector takes a per-turn signature — the model name plus the action name plus a hash of the coordinate (for clicks/moves), the text (for type), the key (for key), the scroll direction-and-amount (for scroll), or the prior action (for bare screenshot) — and looks for any cycle of length 1–8 repeating 3+ times in the most recent 32 turns. Length 1 catches the stuck retry on the same coordinate. Length 2 catches the canonical click→screenshot→click→screenshot UI-stuck pattern. Length 3 catches a type→screenshot→click pattern that doesn’t advance (a minimal sketch of this scan follows this list).
- Track real dollars across rebilled vision input, not turn count. A turn that pays for ten accumulated screenshots is more expensive than a turn that pays for one, even if both turns emit the same single action. The tracker reads response.usage.input_tokens, cache_creation_input_tokens, cache_read_input_tokens, and output_tokens, multiplies by the published per-token rate for the chosen model and the right cache tier, and adds the result to a rolling-window or cumulative ledger. Vision tokens are billed at the same rate as text input tokens on Claude models — the count Anthropic surfaces in input_tokens already includes the vision tokens; you don’t need a separate vision rate.
- Trip before the next turn fires, not after. The check is in-process, on a numeric accumulator and a small ring buffer, and runs in microseconds. When the cap is crossed or the cycle threshold is hit, the next call into the wrapped function raises a typed error and the host halts — the next messages.create never executes, the next screenshot is never captured, the next click is never simulated. The previous turn’s tool_use result is preserved on the host’s side; the cap-crossing turn simply never executes.
- Be a primitive, not an SDK opinion. The same wrap should compose with the Anthropic Python SDK’s client.messages.create, with the TypeScript SDK’s anthropic.messages.create, with the streaming variant messages.stream, with the Bedrock-routed AnthropicBedrock client, and with whatever Anthropic adds next quarter. A breaker that ships as a Computer Use SDK monkey-patch or a bespoke ComputerUseAgentWithGuard client class is brittle; a breaker that wraps any callable is portable.
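The fingerprint-and-window scan in the first bullet is small enough to show inline. A minimal sketch of the core check — not the shipped LoopDetector, just the scan it performs: does any cycle of length 1–8 repeat three or more times at the tail of the last 32 signatures?
// scan the signature window for a repeating tail cycle
function detectCycle(sigs, maxCycleLen = 8, repeats = 3) {
  const win = sigs.slice(-32);
  for (let len = 1; len <= maxCycleLen; len++) {
    if (win.length < len * repeats) continue;
    const tail = win.slice(-len * repeats);
    const cycle = tail.slice(0, len).join("|");
    let hit = true;
    for (let r = 1; r < repeats; r++) {
      if (tail.slice(r * len, (r + 1) * len).join("|") !== cycle) { hit = false; break; }
    }
    if (hit) return len; // cycle length that tripped
  }
  return null;
}
// the length-1 misclick loop: same click signature three turns running
detectCycle(["cu:left_click:[512,384]", "cu:left_click:[512,384]", "cu:left_click:[512,384]"]); // 1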
Wrapping an Anthropic Computer Use turn with runguard
// anthropic computer use + runguard. Wrap the per-turn step so the loop
// detector and budget tracker see every paid turn before the next.
import Anthropic from "@anthropic-ai/sdk";
import { guard, BudgetExceededError, LoopDetectedError } from "@runguard/sdk";
const client = new Anthropic();
const RATE_IN = 3e-6, RATE_CACHE_READ = 0.3e-6, RATE_CACHE_WRITE = 3.75e-6, RATE_OUT = 15e-6; // sonnet 4.6
const tools = [{ type: "computer_20250429", name: "computer", display_width_px: 1280, display_height_px: 800 }];
async function _step(messages) {
  const resp = await client.beta.messages.create({ // the beta namespace carries the betas flag
    model: "claude-sonnet-4-6", max_tokens: 2048, tools, messages,
    betas: ["computer-use-2025-04-29"],
  });
  const u = resp.usage;
  // usage.input_tokens excludes cache reads and writes — they bill separately
  const usd = u.input_tokens * RATE_IN
    + (u.cache_read_input_tokens ?? 0) * RATE_CACHE_READ
    + (u.cache_creation_input_tokens ?? 0) * RATE_CACHE_WRITE
    + u.output_tokens * RATE_OUT;
const tu = resp.content.find(b => b.type === "tool_use");
const action = tu?.input?.action ?? "end_turn";
const arg = JSON.stringify(tu?.input?.coordinate ?? tu?.input?.text ?? tu?.input?.key ?? "").slice(0, 64);
return { resp, usd, sig: `anthropic-cu:sonnet-4-6:${action}:${arg}` };
}
const guardedStep = guard(_step, {
signature: (_args, out) => out.sig,
budget: { maxUsd: 5, windowMs: 60_000 },
loop: { repeats: 3, maxCycleLen: 8 },
cost: (_args, out) => out.usd,
onTrip: (e) => console.log("[runguard]", e.reason, e.spent, "of", e.cap),
});
const messages = [{ role: "user", content: "Open the dashboard and export this week's report." }];
let done = false;
try {
  while (!done) {
    const { resp } = await guardedStep(messages);
    done = resp.stop_reason !== "tool_use";
    messages.push({ role: "assistant", content: resp.content });
    // dispatchComputerAction is your host-side executor (hypothetical name):
    // runs the click / keystroke / screenshot, returns the tool_result user message
    if (!done) messages.push(await dispatchComputerAction(resp));
  }
} catch (e) {
  if (e instanceof BudgetExceededError) console.log("halted: budget", e);
  else if (e instanceof LoopDetectedError) console.log("halted: loop", e);
}
The loop primitive is the LoopDetector shipped at product/sdk/src/loop-detector.ts: defaults windowSize: 32, minCycleLen: 1, maxCycleLen: 8, repeats: 3 — a push(signature) the wrap calls per turn, a scan() that returns a typed match, a reset() for fresh runs, and constructor-time validation that rejects repeats < 2 and windowSize < maxCycleLen * repeats. The budget primitive is the BudgetTracker at product/sdk/src/budget.ts: maxUsd for the cap, optional windowMs for rolling-window throttles, an add(usd) the host calls post-call (which silently no-ops on zero, if (usd === 0) return), and an exceeded() the wrap reads pre-call. The BudgetTracker file is 84 lines; the LoopDetector is 111 lines — both are pure in-process primitives, no daemon, no telemetry. The fingerprint-and-window approach is documented at how to detect LLM tool-call loops in production; the LangChain AgentExecutor wrap is here; the multi-agent CrewAI wrap is here; the browser-use wrap is here; the OpenAI AgentKit wrap is here; the LangGraph StateGraph wrap is here; the bare-OpenAI-SDK wrap is here; the Claude Agent SDK wrap is here.
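If you’d rather hold the primitives yourself than use the guard() wrap, the per-turn discipline looks like this — a sketch against the constructor options and method names documented above (the field names on the scan() match are an assumption):
import { LoopDetector, BudgetTracker } from "@runguard/sdk";
const loops = new LoopDetector({ windowSize: 32, minCycleLen: 1, maxCycleLen: 8, repeats: 3 });
const budget = new BudgetTracker({ maxUsd: 5 });
function checkTurn(turnUsd, signature) {
  if (budget.exceeded()) throw new Error("budget cap crossed"); // pre-call check
  budget.add(turnUsd);   // silently no-ops on zero
  loops.push(signature); // per-turn fingerprint
  const match = loops.scan();
  if (match) throw new Error(`loop: length-${match.cycleLen} repeated ${match.repeats}x`);
}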
How the breaker behaves around messages.create
- Cost accumulates after each turn returns. The wrap reads the usd field on the inner function’s output object and pushes it into the BudgetTracker. Successful turns under the cap pass through transparently — the host sees the assistant message, dispatches the computer tool action (taking a screenshot, simulating a click, typing text), captures the result, builds the next user message containing the tool_result, and calls the wrap again. Zero-cost calls (a fully cache-read continuation, an early refusal that cost nothing because the SDK short-circuited) never trip the budget; the tracker explicitly skips zero entries via if (usd === 0) return.
- The first turn over the cap throws before its API call goes out. BudgetExceededError is constructed with the cumulative spend, the cap, and a reason field. It propagates out of the wrap before the next client.messages.create fires — no in-flight HTTP request, no bandwidth out, no tokens billed, no screenshot captured, no click simulated. The previous turn’s assistant message is preserved on the host’s side; the cap-crossing turn simply never executes.
- The loop detector trips on the third repeat of any signature cycle. The wrap pushes the signature into a 32-entry sliding window after each turn and scans for a length-1 to length-8 cycle repeated three or more times in a row at the tail. LoopDetectedError carries the cycle length, the pattern itself, and the repeat count — the calling code dispatches on the type. A length-1 trip on Computer Use is the canonical “model emits the same (left_click, [512, 384]) tuple three times in a row” misclick loop. A length-2 trip is the click→screenshot→click→screenshot→click pattern — same coordinate, alternating with the screenshot the model takes after each click to see whether anything changed. A length-3 trip is the type→screenshot→left_click pattern that legitimately tries each round but never makes progress.
- Your onTrip hook fires before the throw. Page Slack with the spend curve, the offending cycle pattern, the model name, the active tool action, the last coordinate or text snippet, the count of accumulated screenshots in the message history — whatever you wire (one possible hook is sketched after this list). Sync hooks run inline; async hooks are awaited. An onTrip exception propagates instead of the trip error, by design (the host explicitly opted in to side-effecting on trip).
- Reset is explicit. When a fresh Computer Use task starts — the user says “okay, do this next thing”, the host clears the message history, the agent starts from a fresh screenshot — call guardedStep.reset() to clear both the spend ledger and the loop window. The tracker is per-guarded-fn, not per-process — you can wrap one _step for the parent agent and another _subagentStep for nested calls (an agent that spawns a sub-flow to handle a CAPTCHA challenge separately, for instance) with independent budgets. Pair the wrap with a host-side max-iteration count of 50–100 as a coarse backstop — the breaker is the per-dollar fence; the host loop’s iteration cap is the per-turn safety net for pathologies the dollar guard misses.
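One possible onTrip hook, as referenced in the list above — a hedged sketch: the webhook URL is your own secret, and the fields read off the trip event (reason, spent, cap, pattern) are the ones named on this page:
// page Slack before the typed error propagates; async hooks are awaited
const onTrip = async (e) => {
  await fetch(process.env.SLACK_WEBHOOK_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      text: `runguard trip: ${e.reason} — $${e.spent?.toFixed(2)} of $${e.cap}` +
            (e.pattern ? ` (cycle: ${e.pattern.join(" → ")})` : ""),
    }),
  });
};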
Tuning for the Anthropic Computer Use cost shape
A typical Computer Use turn at message-history index ten on claude-sonnet-4-6 with five accumulated 1280×800 screenshots (~6,830 tokens of vision), a 1K-token system prompt, a 1K-token user message, and a 200-token tool-use response lands around $0.03 of bill. By turn thirty with thirty accumulated screenshots that same shape lands around $0.13, because the accumulated vision input grows sixfold. By turn fifty with fifty accumulated screenshots it’s ~$0.21. On claude-opus-4-7 multiply by five; on claude-haiku-4-5 divide by roughly four. The default maxUsd: 5 on the budget tracker corresponds to roughly 50–100 typical-shape turns on Sonnet, 10–20 on Opus, 200–400 on Haiku — a normal task finishes well inside the cap; a stuck retry loop trips the breaker before the bill triples.
For interactive applications behind a per-user request handler (a UI-driving chat agent, a one-off browser-control flow, a customer-facing automation), set windowMs: 60_000 with the same maxUsd: 5: the cap rolls, old spend evicts, and while the cumulative invoice over an hour is unbounded, the per-minute spike is bounded. For unattended automation that runs Computer Use overnight (a regression-test harness driving a real desktop app, a nightly screenshot-diffing pass on production URLs, a knowledge-base re-screening that drives a browser through hundreds of pages), drop windowMs entirely — you want a hard cumulative cap on the whole job. For high-stakes Opus runs (production data-entry agents driving SaaS dashboards, paid-content moderation that drives a CMS, large-context legal-document UI reading), drop to maxUsd: 1 — a tighter cap costs you one re-run on legitimate workflows; a looser cap costs you one weekend incident.
Stack the budget guard with the loop detector on the same wrap: a misclick loop usually trips the loop guard first (same model plus same action plus same coordinate hashes to the same signature on each retry), while a slow-burn screenshot-accumulation drift on a session that legitimately takes new actions every turn trips the budget instead — both stop the run, both leave a typed error, both are cheap to retry. Drop your screenshot resolution if you can: 1280×800 is half the per-shot tokens of 1920×1080 and most desktop UIs render fine at the lower resolution; you save 50% of vision rebill on every turn.
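The three shapes above, as guard() options — same option names as the wrap earlier on this page:
// interactive, per-user: rolling one-minute cap
const interactive = { budget: { maxUsd: 5, windowMs: 60_000 }, loop: { repeats: 3, maxCycleLen: 8 } };
// unattended overnight job: hard cumulative cap, no window
const unattended = { budget: { maxUsd: 5 }, loop: { repeats: 3, maxCycleLen: 8 } };
// high-stakes Opus run: tight cap — a cheap re-run beats a weekend incident
const highStakes = { budget: { maxUsd: 1 }, loop: { repeats: 3, maxCycleLen: 8 } };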
The misclick, retype, and CAPTCHA shapes on the same wrap
- Signature is the action fingerprint. The default anthropic-cu:<model>:<action>:<arg> covers the canonical stuck retry — the model proposes the same left_click with the same coordinate three turns in a row, and the breaker halts before the fourth attempt. For runs that legitimately call the same action with different args every turn (a multi-position click sweep, a typed-text entry that varies each turn), the arg[:64] slice on the coordinate or text is enough to distinguish. For bare screenshot actions (which legitimately repeat with no varying arg), key the signature on the prior tool action in the message history — three back-to-back screenshot turns each preceded by a left_click at the same coordinate hash to the same length-2 cycle and trip the breaker (a sketch of this signature follows this list). For type actions, the text[:64] slice catches the canonical retype-on-focus-lost loop where the same string is fed in three turns in a row.
- Trip event tells you which guard fired. reason: "loop" for a cycle hit; reason: "budget" for a cost cap; reason: "context" if you also pass a context-window guard for input-token bloat (which on Computer Use is typically the screenshot-stack-rebill case rather than a text-history blowup — same primitive, different cause). The typed error is one of LoopDetectedError, BudgetExceededError, ContextLimitError — the calling code dispatches on the type, not on string parsing.
- Per-wrap or shared. One guard() per Computer Use task gives you per-task isolation — the morning’s data-entry run has one budget, the afternoon’s screenshot-diff run has another. One shared guard() across a multi-task agent that drives a desktop session gives you cross-task loop detection — useful when an agent keeps trying the same flow against the same dialog box in different tasks and no single task repeats in isolation. Shared budgets also catch the “ten short tasks fan out, each under its own cap, but cumulatively a runaway” case the per-task breaker can’t see.
- Plays nicely with screenshot-history pruning. Many Computer Use harnesses (Anthropic’s computer-use-demo reference repo, third-party frameworks, internal tools) prune old screenshots from the message history above some threshold — keep the most recent N screenshots, drop earlier ones, replace them with text summaries. The breaker is orthogonal: pruning shrinks the per-turn input but doesn’t stop a misclick loop (the loop is the same three-action cycle whether the history has 50 screenshots or 5). Run pruning and the breaker: pruning bounds linear growth; the breaker bounds pathological growth.
- Zero outbound calls. The whole check is pure data flow inside your Node (or Python) process. No telemetry, no daemon, no SaaS; nothing leaves your VPC. The wrap is the only thing in your process that knows the turn is loop-stuck or over-budget; the only places it surfaces are the typed error, the onTrip hook you wrote, and a structured event in the trip log.
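The prior-action signature for bare screenshot turns, sketched — the helper and its history shape are illustrative, not shipped SDK surface:
// fingerprint a turn; bare screenshots key on the action that preceded them
function actionSignature(model, history) {
  const cur = history[history.length - 1]; // { action, arg } per turn
  if (cur.action !== "screenshot") {
    return `anthropic-cu:${model}:${cur.action}:${(cur.arg ?? "").slice(0, 64)}`;
  }
  const prior = history[history.length - 2];
  // click@[512,384] → screenshot → click@[512,384] → screenshot now hashes
  // to a clean length-2 cycle instead of an ambiguous bare "screenshot"
  return `anthropic-cu:${model}:screenshot-after:${prior?.action ?? "start"}:${(prior?.arg ?? "").slice(0, 64)}`;
}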
The first loop our SDK caught was ours — same primitive, different surface
It wasn’t a Computer Use call — it was our own launch script firing a six-tweet thread against a paid X API, scheduled by a Claude Agent SDK session running once a day. The first attempt came back with HTTP 402 CreditsDepleted. Six consecutive sessions later, six identical signatures — post_tweet:402:CreditsDepleted — were sitting in a flat JSON file on disk. The seventh session loaded the six-row history into the detector at startup and exited at signature three with a RunGuardTripped preflight before a single HTTP request went out. The session rebooted itself, re-loaded the history, re-tripped, exited — for thirty-five consecutive sessions and counting. Read the dogfood story on the 30-day log; the same pattern slots into a Computer Use turn loop when the model keeps proposing the same left_click against the same on-screen rendering three turns in a row, when a CAPTCHA gate keeps surfacing the same screenshot because the model can’t pass it, or when a retype loop fires the same text into a window that lost focus three turns in a row.
What this is not
- Not a replacement for the host’s iteration cap. Keep your Computer Use harness’s per-task max-iteration count set — it’s a coarse upper bound on a session’s shape, complementary to the per-run dollar cap. The two are different units: max-iterations bounds turn count; guard() bounds cumulative dollars. Set the iteration count high enough that legitimate runs finish (50–100); set maxUsd tight enough that runaway loops trip first. The SDK at product/sdk/src/budget.ts is 84 lines; the loop detector at product/sdk/src/loop-detector.ts is 111 lines; both are in-process primitives.
- Not a monkey-patch on the Anthropic SDK. RunGuard does not subclass Anthropic, ship an AnthropicWithGuard drop-in, or hook into the SDK’s internals. It wraps the underlying callable that calls messages.create. That is the design — the same wrap composes with the Python SDK, the TypeScript SDK, the streaming variant messages.stream, the Bedrock-routed AnthropicBedrock client, the Vertex-routed AnthropicVertex client, and whatever Anthropic adds next quarter. A breaker that depends on the SDK’s shape is a maintenance liability the first time the SDK pivots; a breaker that wraps any callable is portable.
- Not a CAPTCHA solver, not a screenshot pruner. The breaker doesn’t pass the gate, doesn’t prune the screenshot history, doesn’t change the model’s next action — it stops the next action when the run is loop-stuck or over-budget. Pair it with whatever screenshot-pruning policy your harness ships (the Anthropic reference implementation drops all but the last three image content blocks and replaces them with a text marker, which is reasonable; a minimal version is sketched after this list) and whatever CAPTCHA-handoff policy your product has (escalate to a human, switch to a vision-only model, fall back to a non-UI flow). The breaker is the floor, not the ceiling.
- Not Langfuse, Helicone, or the Anthropic console. Those answer “what did the run do yesterday and how much did it cost?”. A runtime loop detector answers “should the next paid screenshot fire?”. The two are complementary — one for finance, one for prevention. Run both. The trace is your morning-after audit; the breaker is your tonight-before-bed insurance. The Anthropic console’s monthly soft cap is an org-wide blast radius; the per-run maxUsd is the per-job blast radius. Both are useful; only one stops the runaway script before it locks out every other workload sharing the org.
- Not a server, not a proxy. No outbound network, no telemetry, no cookies, no daemon, no LLM-proxy gateway in front of your messages.create calls. The check is pure data flow inside your Node or Python process. The same in-process discipline shows up in the embed-preview widget; the policy is one repo away in llms.txt. If your security review asks “does this guard ship our screenshots off-prem?”, the answer is that the wrap reads response.usage off your SDK’s response object — integers only — and that’s the entire data flow.
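A minimal version of the keep-last-N pruning referenced in the list above — a sketch over the Messages API shapes (it mutates the history in place; the reference implementation’s actual filter differs in detail):
// keep the newest `keep` screenshot tool_results; older ones become a marker
function pruneScreenshots(messages, keep = 3) {
  let kept = 0;
  for (let i = messages.length - 1; i >= 0; i--) { // walk newest first
    const content = messages[i].content;
    if (!Array.isArray(content)) continue;
    for (const block of content) {
      if (block.type !== "tool_result" || !Array.isArray(block.content)) continue;
      if (!block.content.some((b) => b.type === "image")) continue;
      if (kept < keep) { kept++; continue; }
      block.content = [{ type: "text", text: "[screenshot pruned]" }];
    }
  }
  return messages;
}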
The minimum Anthropic Computer Use integration
One npm i @runguard/sdk (or pip install runguard for the Python SDK), one guard() wrap around a thin _step that calls client.messages.create and returns {resp, usd, sig}, and one onTrip that pages the channel you actually read. Eight lines of wrap, no Anthropic subclass to register, no SDK monkey-patch, no proxy gateway in front of every Computer Use call. The breaker trips on the dollar cap or the third repeat of any action signature, halts the run, and leaves a structured event and a typed error behind for the post-mortem — long before your host’s iteration cap would have bounded the turn count, long before the Anthropic console’s monthly soft cap fires, and long before the bill arrives. RunGuard ships it as @runguard/sdk on npm and runguard on PyPI — same primitive, both runtimes, in-process, zero deps. The same wrap composes with the bare Anthropic Messages API, with the Computer Use beta’s computer_20250429 tool type (and the earlier computer_20241022 / computer_20250124 versions), with the streaming messages.stream variant, with the Bedrock-routed and Vertex-routed clients, and with whatever wrapper sits above (the Claude Agent SDK’s ClaudeAgent.send(), an internal tool-use harness, a third-party framework); pick whichever level your code lives at — the breaker reads the same usage fields either way.