Cohere Command R agent budget limit: adding a cost cap and loop detector to Command R tool-use agents

Cohere’s Command R and Command R+ models support native tool use via the /v1/chat API. The tool-use interface is well designed: the model returns structured tool_calls objects, you execute the tools and return tool_results, and the model continues reasoning. What Cohere’s API does not provide is a per-run cost cap or a loop detector. An agent built on Command R+ can enter a tool-call loop, calling the same tool repeatedly with the same arguments when the result doesn’t advance its goal, and keep running until your Cohere billing quota is exhausted. At Command R+ pricing, a 500-turn loop costs roughly $2–$10 depending on context size. That is not catastrophic for a single incident, but when it happens at production scale across many concurrent users, the bill grows faster than you can intervene. This page shows how to add RunGuard’s circuit breaker to a Command R agent in Python.
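The $2–$10 range for a 500-turn loop can be sanity-checked with simple arithmetic. The sketch below is illustrative only: the per-million-token prices and the token counts per turn are assumptions chosen for the example, not figures from Cohere, and it treats every turn as resending a similar-sized context (in a real loop the context usually grows).

```python
def estimate_loop_cost(turns, input_tokens_per_turn, output_tokens_per_turn,
                       usd_per_m_input, usd_per_m_output):
    """Rough cost of a loop in which each turn sends a similar-sized context."""
    input_cost = turns * input_tokens_per_turn * usd_per_m_input / 1_000_000
    output_cost = turns * output_tokens_per_turn * usd_per_m_output / 1_000_000
    return input_cost + output_cost


# ASSUMED prices in USD per million tokens; check Cohere's current pricing.
PRICE_IN = 2.50
PRICE_OUT = 10.00

# ASSUMED context sizes: a small and a larger prompt, 50 output tokens per turn.
low = estimate_loop_cost(500, 1_500, 50, PRICE_IN, PRICE_OUT)
high = estimate_loop_cost(500, 7_500, 50, PRICE_IN, PRICE_OUT)

print(f"500-turn loop, small context: about ${low:.2f}")
print(f"500-turn loop, large context: about ${high:.2f}")
```

Under these assumptions input tokens dominate the bill, which is why context size drives the spread.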

Cohere Command R tool-use: how agents loop
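A tool-call loop of the kind described in the introduction, the same tool called again and again with the same arguments, can be spotted by fingerprinting each call and looking for a repeating pattern at the end of the call history. The following is a minimal sketch of that idea, not RunGuard's implementation; the names `LoopDetector`, `LoopDetected`, `fingerprint`, and `max_period` are hypothetical.

```python
import json
from collections import deque


class LoopDetected(RuntimeError):
    """Raised when the recent tool calls form a repeating pattern."""


def fingerprint(name, parameters):
    # sort_keys makes the fingerprint independent of argument order.
    return name + ":" + json.dumps(parameters, sort_keys=True, default=str)


class LoopDetector:
    """Hypothetical detector: trips when a pattern of tool calls repeats."""

    def __init__(self, repeats=3, max_period=4):
        self.repeats = repeats
        self.max_period = max_period
        self.history = deque(maxlen=repeats * max_period)

    def observe(self, name, parameters):
        self.history.append(fingerprint(name, parameters))
        calls = list(self.history)
        for period in range(1, self.max_period + 1):
            window = self.repeats * period
            if len(calls) < window:
                break  # not enough history for this or any longer period
            tail = calls[-window:]
            pattern = tail[:period]
            # The tail must be the same pattern repeated `repeats` times.
            if tail == pattern * self.repeats:
                raise LoopDetected(
                    f"pattern of length {period} repeated {self.repeats} times"
                )
```

Calling `observe()` once per tool call before executing the tool lets the agent stop on the third identical call rather than the five-hundredth. A period of 1 covers the plain "same call, same arguments" case; longer periods catch alternating patterns such as A, B, A, B.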

Adding a budget limit and loop detector to a Command R agent
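To illustrate the general shape of a per-run cost cap, here is a minimal sketch that checks accumulated spend before each chat call and records token usage afterwards. It is not RunGuard's API: `BudgetGuard`, `BudgetExceeded`, and `guarded_chat` are hypothetical names, the prices are assumptions, and `chat_fn` stands in for whatever function wraps your Cohere chat call and returns the billed token counts.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a run has already spent its budget."""


class BudgetGuard:
    """Hypothetical per-run cost cap, checked before every chat call."""

    def __init__(self, max_usd, usd_per_m_input, usd_per_m_output):
        self.max_usd = max_usd
        self.usd_per_m_input = usd_per_m_input
        self.usd_per_m_output = usd_per_m_output
        self.spent_usd = 0.0

    def check(self):
        if self.spent_usd >= self.max_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.4f} of ${self.max_usd:.4f} budget"
            )

    def record(self, input_tokens, output_tokens):
        self.spent_usd += input_tokens * self.usd_per_m_input / 1_000_000
        self.spent_usd += output_tokens * self.usd_per_m_output / 1_000_000


def guarded_chat(guard, chat_fn, **kwargs):
    """Refuse to call chat_fn once the budget is spent; record usage after."""
    guard.check()
    response = chat_fn(**kwargs)
    guard.record(response["input_tokens"], response["output_tokens"])
    return response


# Stand-in for a real chat call; a real wrapper would read token counts
# from the API response instead of returning fixed numbers.
def fake_chat(**kwargs):
    return {"text": "ok", "input_tokens": 2_000, "output_tokens": 100}


guard = BudgetGuard(max_usd=0.01, usd_per_m_input=2.50, usd_per_m_output=10.00)
calls_made = 0
try:
    while True:
        guarded_chat(guard, fake_chat, message="hello")
        calls_made += 1
except BudgetExceeded as exc:
    print(f"stopped after {calls_made} calls: {exc}")
```

Because usage is only known after a call returns, a check of this kind can overshoot the cap by up to one call; setting `max_usd` slightly below the true limit absorbs that.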

Budget calibration for Command R agents

Cohere API defaults vs. RunGuard

| Control | Cohere API default | RunGuard |
|---|---|---|
| Tool-call loop detection | Not supported | loop: repeats=3 fires on 3rd repeat of same pattern |
| Per-run cost cap | Not supported (account-level quota only) | budget: max_usd fires before each chat() call |
| Max turns | Not supported | Implicit via loop + budget caps |
| Context-window guard | 400 after request sent | Pre-call ContextOverflowError before request sent |
| Slack/PagerDuty alert on trip | Not supported | alerts: slack_webhook or pagerduty_key |
| RAG-specific loop detection | Not supported | Same loop detector; the RAG repeated-retrieval pattern is a period-1 cycle |
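The context-window row contrasts an error returned after the request is sent with a check made before it. A pre-call check of that kind might look like the sketch below. It is an assumption-laden illustration: the `ContextOverflowError` class here is a local stand-in that only borrows the name from the table, `check_context` and `estimate_tokens` are hypothetical helpers, and the four-characters-per-token estimate is a crude heuristic that a real tokenizer would replace.

```python
class ContextOverflowError(ValueError):
    """Local stand-in: raised before a request that would not fit."""


def estimate_tokens(text, chars_per_token=4):
    # ASSUMPTION: roughly four characters per token; use a tokenizer for accuracy.
    return len(text) // chars_per_token + 1


def check_context(messages, max_context_tokens, reserve_for_output=1_000):
    """Raise before sending if the estimated prompt would exceed the window."""
    total = sum(estimate_tokens(m) for m in messages)
    if total + reserve_for_output > max_context_tokens:
        raise ContextOverflowError(
            f"estimated {total} prompt tokens plus {reserve_for_output} reserved "
            f"exceeds the {max_context_tokens}-token window"
        )
    return total
```

Running this check before each chat call avoids paying the round trip for a request the API would reject anyway.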