smolagents CodeAgent generates code to call tools. RunGuard catches the loops before the code keeps running.

HuggingFace’s smolagents library takes a code-first approach to AI agents. Instead of emitting structured tool-call JSON, the CodeAgent class generates executable Python code on each step: the model writes a snippet that calls one or more tool functions and assigns their outputs to variables. The generated code is executed in a sandboxed interpreter, the outputs are fed back into the agent’s context, and the model generates the next code snippet. ToolCallingAgent uses the more conventional JSON tool-call format. Both agents support a max_steps parameter that limits the number of agent iterations, similar to AutoGen’s MaxMessageTermination. Like all step-count terminators, max_steps fires when the iteration budget is exhausted, not when a repeated pattern is detected: a CodeAgent that generates result = web_search("AI safety 2026") on step 2, then the identical code on step 4, then again on step 6, pays for every generation in full before max_steps fires at step 10 (or wherever you set it). RunGuard wraps the model’s generate call at the layer smolagents uses internally, so the loop detector sees each step’s code generation as a tool-call signature and fires on the third repeated pattern — before step 7’s generation goes out and before the LLM cost for that step lands on your invoice.

How smolagents works and where loops form

Where to wrap RunGuard in a smolagents stack

smolagents agents use a model object (an HfApiModel, OpenAIServerModel, LiteLLMModel, or a custom model class) that exposes a __call__ method. The agent calls model(messages, stop_sequences=...) on each step, and the model object handles the HTTP call to the LLM provider. The correct place to add RunGuard is as a wrapper around the model object’s __call__ method, before the HTTP call goes out. You subclass or monkey-patch the model to run the guard on each call, extract the usd and sig from the response, and either return the response (if the guard passes) or raise LoopDetectedError / BudgetExceededError (if the guard fires). The agent’s step loop will receive the exception from the model call and can route it to a graceful error handler. Alternatively, you can subclass CodeAgent or ToolCallingAgent and override the step() method to wrap the model call with guard() inline, but the model-object approach is simpler and works without touching the agent’s internal logic.

Implementation: smolagents with RunGuard budget and loop guard

The code-generation loop: a unique signature challenge

smolagents’ CodeAgent generates free-form Python rather than structured tool-call JSON. This creates a nuance for signature-based loop detection: the “tool call” is implicit in the code (the function name and arguments that appear in the generated Python snippet) rather than explicit in a structured response field. To extract a signature from a code block, you need to parse the generated code and find the first tool function call. The examples above extract the tool name from the model’s response object (the formatted message that smolagents’ model client returns) which, for ToolCallingAgent, is a structured tool call. For CodeAgent, the response is raw text; in that case, a practical signature is the first function call token in the generated code, extracted with a simple regex: re.search(r"\b(\w+)\(", generated_code). The first function call in the code is usually the tool the agent is trying to invoke; if the agent is calling the same tool with the same arguments on consecutive steps, the first-function-call signature will repeat and the loop detector will fire. For finer-grained detection, include a 64-byte hash of the function call arguments: first_call_name + ":" + md5(first_call_args[:64]).hexdigest(). The examples above use smolagents’ response format where the tool name is available directly; adapt the signature extraction to your specific model client’s response format.

What this is not