AutoGen’s termination conditions stop conversations by turn count. RunGuard stops them by loop pattern.
Microsoft AutoGen (both v0.2 and the newer v0.4/AgentChat API) is a multi-agent conversation framework. You compose ConversableAgent instances, or the AssistantAgent and UserProxyAgent wrappers around them, and orchestrate them via GroupChat and GroupChatManager or the newer RoundRobinGroupChat / SelectorGroupChat teams. Each agent can have registered tools (functions the LLM can call via tool use), and the conversation proceeds until a termination condition is met. AutoGen provides built-in termination conditions: MaxMessageTermination(10) stops after 10 messages, TextMentionTermination("TERMINATE") stops when a message contains a configured keyword (StopMessageTermination() is the related condition that stops on an explicit StopMessage), and ExternalTermination() accepts a programmatic signal.

These termination conditions are turn-level instruments: they count messages, check string content, or respond to an external signal. What they do not do is analyze the sequence of tool-call signatures across turns and detect that it has entered a repeating cycle. An AutoGen assistant that calls web_search("latest AI safety regulations") on turn 3, then again on turn 5, then again on turn 7 (same tool, same arguments, same results) keeps going until it reaches the MaxMessageTermination ceiling you configured, not until the third identical search. With a 12-turn budget, the four redundant searches on turns 5, 7, 9, and 11 are paid for in full before the termination condition fires. RunGuard’s LoopDetector fires at the third identical search, before turn 7’s search goes out, and throws an exception that your error handler can route to a graceful fallback instead of letting the conversation run to the turn limit.
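To make "detect whether the signature sequence has entered a repeating cycle" concrete, here is a minimal sketch of one way to do it. It is not RunGuard’s actual algorithm: the function name find_repeating_cycle is hypothetical, and the repeats and max_cycle_len parameters are only assumed to mean roughly what the same-named loop options mean in the examples later on this page.

```python
from typing import Optional, Sequence, Tuple


def find_repeating_cycle(
    sigs: Sequence[str], repeats: int = 3, max_cycle_len: int = 8
) -> Optional[Tuple[str, ...]]:
    """Return the cycle if the tail of `sigs` is one cycle repeated `repeats` times.

    Illustrative only; the name and the semantics are assumptions.
    """
    for cycle_len in range(1, max_cycle_len + 1):
        window = cycle_len * repeats
        if window > len(sigs):
            break  # not enough history for this (or any longer) cycle length
        tail = list(sigs[-window:])
        cycle = tuple(tail[:cycle_len])
        # The tail must split into `repeats` consecutive copies of `cycle`.
        if all(
            tuple(tail[i:i + cycle_len]) == cycle
            for i in range(0, window, cycle_len)
        ):
            return cycle
    return None


if __name__ == "__main__":
    search = 'web_search:{"query":"latest AI safety regulations"}'
    # Same call three times in a row: a cycle of length 1.
    print(find_repeating_cycle(["plan", search, search, search]))
    # Search alternating with a plain text turn: a cycle of length 2.
    print(find_repeating_cycle([search, "end_turn"] * 3))
    # No repetition yet.
    print(find_repeating_cycle(["plan", search, "end_turn"]))
```

A message counter cannot tell the first two histories apart from a productive run of the same length; a check like this one can, because it looks at what was called rather than how many times anything was called.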
AutoGen’s termination conditions: what they catch and what they miss
- MaxMessageTermination: turn-count ceiling, not pattern detector. MaxMessageTermination(max_messages=10) fires when the conversation has accumulated 10 messages (agent turns). This is a useful backstop: it prevents infinite conversations that never satisfy a string-based termination. But it counts every turn equally, whether it makes progress (new information, new tool results, forward movement toward the task) or not (same tool, same arguments, same results as two turns ago). A loop that repeats on turns 3, 5, 7, and 9 contributes 4 repeated turns before MaxMessageTermination(10) fires, paying for 4 redundant generations in full. A signature-based loop detector fires at the third repeated turn (turn 7 in this example) and prevents turn 9’s generation entirely.
- TextMentionTermination: keyword in message content, not in tool-call sequence. TextMentionTermination("TERMINATE") fires when any message contains the configured text; to watch for several keywords, combine conditions with the | operator, as in TextMentionTermination("TERMINATE") | TextMentionTermination("DONE"). (StopMessageTermination() is the related condition that fires on an explicit StopMessage rather than on a keyword.) This is useful for instructing agents to signal completion explicitly ("Response: ... TERMINATE"), but it depends on the agent correctly recognizing that the task is done and including the keyword. An agent in a tool-call loop has already failed to recognize that it is stuck: it keeps calling the same tool because it believes the tool output will eventually be different. That agent will not say "TERMINATE"; it will say something like "Let me search again for more specific information." String-based termination is powerless against a model that never produces the stop keyword because it is caught in a reasoning loop.
- ExternalTermination(): manual or signal-based, not automatic. ExternalTermination gives you a programmatic handle to terminate the conversation from outside the agent loop. You can call external_termination.set() from a monitoring thread, a timeout callback, or a cost-alert webhook. This is the right design for human-in-the-loop monitoring, but it requires an external system to detect the runaway condition first. If your monitoring fires a webhook when daily spend exceeds $500 and you call set() in response, you are stopping the conversation after the spend threshold has already been crossed. RunGuard fires before the call that would push the run past its max_usd budget, not after a fleet-level threshold fires.
- The right combination: termination conditions as backstops, RunGuard as pattern detector. The correct architecture is both. Keep your MaxMessageTermination(50) as the outer ceiling for conversations that make genuine progress but need a hard limit. Add RunGuard around the model call made through each agent’s model_client (not around individual tool functions, for the reasons given in the tool registration section below). Let RunGuard catch the structural loop and cost runaway early; let MaxMessageTermination catch the cases where RunGuard doesn’t fire (legitimate long conversations that eventually exhaust the turn budget).
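The turn arithmetic in the first bullet can be checked with a small simulation. This is a sketch under simplifying assumptions, not AutoGen or RunGuard code: turns_paid is a hypothetical helper, every turn is treated as one billed generation, and the pattern check is reduced to "the same tool-call signature has now been seen repeats times".

```python
from collections import Counter
from typing import List


def turns_paid(sigs_by_turn: List[str], max_messages: int, repeats: int = 0) -> int:
    """Count the generations billed before the run stops.

    repeats=0 disables the pattern check, leaving only the turn ceiling.
    Hypothetical helper for illustration.
    """
    seen: Counter = Counter()
    paid = 0
    for sig in sigs_by_turn[:max_messages]:
        paid += 1  # this turn's generation has been billed
        seen[sig] += 1
        # Plain text turns ("end_turn") never count as a loop.
        if repeats and sig != "end_turn" and seen[sig] >= repeats:
            break
    return paid


SEARCH = 'web_search:{"query":"latest AI safety regulations"}'
# Ten turns; the identical search lands on turns 3, 5, 7, and 9.
TURNS = ["end_turn", "end_turn", SEARCH, "end_turn", SEARCH,
         "end_turn", SEARCH, "end_turn", SEARCH, "end_turn"]

if __name__ == "__main__":
    print("ceiling only:", turns_paid(TURNS, max_messages=10))
    print("ceiling + pattern check:", turns_paid(TURNS, max_messages=10, repeats=3))
```

With the ceiling alone all ten turns are billed. With the pattern check the run stops at turn 7, so turns 8 through 10, including the fourth search on turn 9, are never generated.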
Where to add RunGuard in an AutoGen v0.4 stack
AutoGen v0.4 (the AgentChat API) uses ChatCompletionClient implementations (OpenAIChatCompletionClient, AnthropicChatCompletionClient) as the model-calling layer. The agents pass their message history to the client and receive a completion back. The cleanest place to add RunGuard is at the model client call, not at the agent level. You wrap the async function that calls model_client.create(messages) with guard_async() and have it return usd (computed from the token counts in the response) and sig (a signature derived from the first tool call in the response, or "end_turn" when there is none) alongside the response; the guard’s LoopDetector and BudgetTracker then run before each model call. Because the guard sits at the model-client level, it sees the full tool-call sequence across all agents in a GroupChat, not just the tool calls of a single agent, provided you share the guard instance across all agents that use the same model client. A shared guard instance is the correct setup for detecting cross-agent loops: agent A calls search_tool and delegates to agent B, agent B calls search_tool with the same arguments and delegates back to A, and A calls it again. That loop spans multiple agents and is invisible to a per-agent guard.
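The wrapping contract described above (the wrapped function returns usd and sig, and the checks run before each call) can be sketched without either library. Everything below is a hypothetical stand-in: sketch_guard_async, the two exception classes, and fake_create are illustrative names, and the checks are deliberately simpler than whatever RunGuard actually does.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List


class SketchBudgetExceeded(Exception):
    """Raised before a call when accumulated spend has reached the cap."""


class SketchLoopDetected(Exception):
    """Raised before a call when the most recent signatures are all identical."""


def sketch_guard_async(
    fn: Callable[..., Awaitable[Dict[str, Any]]],
    max_usd: float,
    repeats: int = 3,
) -> Callable[..., Awaitable[Dict[str, Any]]]:
    """Hypothetical stand-in for a guard wrapper; not RunGuard's implementation."""
    spent = 0.0
    sigs: List[str] = []

    async def wrapper(*args: Any, **kwargs: Any) -> Dict[str, Any]:
        nonlocal spent
        # Both checks run before the wrapped call, so a tripped guard
        # never pays for another generation.
        if spent >= max_usd:
            raise SketchBudgetExceeded(f"spent ${spent:.4f} of ${max_usd:.2f}")
        tail = sigs[-repeats:]
        if len(tail) == repeats and len(set(tail)) == 1 and tail[0] != "end_turn":
            raise SketchLoopDetected(f"{tail[0]} repeated {repeats}x")
        result = await fn(*args, **kwargs)
        # State is shared by every caller that goes through this wrapper.
        spent += result["usd"]
        sigs.append(result["sig"])
        return result

    return wrapper


async def fake_create(messages: List[str]) -> Dict[str, Any]:
    """Stub model call that always asks for the same search."""
    return {"response": "tool call", "usd": 0.01, "sig": 'web_search:{"query":"x"}'}


async def demo() -> int:
    """Return how many calls completed before the guard stopped the run."""
    guarded = sketch_guard_async(fake_create, max_usd=5.0, repeats=3)
    completed = 0
    try:
        for _ in range(10):
            await guarded(["task"])
            completed += 1
    except SketchLoopDetected:
        pass
    return completed


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because the spend total and the signature history live in the wrapper’s closure, any number of agents that call the same wrapped function contribute to one history, which is the property the cross-agent case depends on.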
Implementation: AutoGen v0.4 with AssistantAgent
Python (AutoGen v0.4 AgentChat API)

import asyncio

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.conditions import MaxMessageTermination
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_ext.models.openai import OpenAIChatCompletionClient
from runguard import guard_async, LoopDetectedError, BudgetExceededError

# Wrap the model client's create method
original_client = OpenAIChatCompletionClient(model="gpt-4o")
_original_create = original_client.create


async def guarded_create(messages, **kwargs):
    response = await _original_create(messages, **kwargs)

    # Compute cost from usage. Rates are USD per 1M tokens; adjust them
    # to the current pricing of the model you use.
    usage = getattr(response, "usage", None)
    usd = 0.0
    if usage:
        usd = (usage.prompt_tokens * 2.5 + usage.completion_tokens * 10) / 1_000_000

    # Extract tool-call signature: tool name plus the first 64 characters
    # of its arguments, taken from the first FunctionCall in the response
    content = getattr(response, "content", [])
    sig = "end_turn"
    if isinstance(content, list):
        for item in content:
            if hasattr(item, "name"):  # FunctionCall
                sig = f"{item.name}:{str(getattr(item, 'arguments', ''))[:64]}"
                break

    return {"response": response, "usd": usd, "sig": sig}


guarded = guard_async(
    guarded_create,
    budget={"max_usd": 5},
    loop={"repeats": 3, "max_cycle_len": 8},
)


# Monkey-patch the client to use the guarded version
async def _patched_create(messages, **kwargs):
    result = await guarded(messages, **kwargs)
    return result["response"]


original_client.create = _patched_create

# Build agents using the guarded client
researcher = AssistantAgent(
    "researcher",
    model_client=original_client,
    system_message="You are a research agent. Use the search tool to answer questions.",
)
writer = AssistantAgent(
    "writer",
    model_client=original_client,
    system_message="You write summaries based on research findings.",
)
team = RoundRobinGroupChat(
    [researcher, writer],
    termination_condition=MaxMessageTermination(max_messages=20),
)


async def main():
    try:
        result = await team.run(task="Research the latest developments in AI agent safety.")
        print(result.messages[-1].content)
    except LoopDetectedError as e:
        print(f"Loop detected: {e.pattern} repeated {e.repeats}x")
        # log, escalate, or return partial result
    except BudgetExceededError as e:
        print(f"Budget exceeded: spent ${e.spent:.4f}")


asyncio.run(main())
Python (AutoGen v0.2 ConversableAgent)
from autogen import ConversableAgent
from openai.resources.chat.completions import Completions
from runguard import guard, LoopDetectedError, BudgetExceededError

# Standard AutoGen LLM config
llm_config = {
    "config_list": [{"model": "gpt-4o", "api_key": "YOUR_KEY"}],
    "timeout": 120,
}

# Patch Completions.create at the class level before AutoGen calls it.
# AutoGen v0.2 builds its own OpenAI client instances, so patching the
# module-level openai.chat.completions.create would not intercept its calls.
_orig_create = Completions.create


def my_create(completions, **kwargs):
    response = _orig_create(completions, **kwargs)

    # Rates are USD per 1M tokens; adjust them to your model's pricing.
    usage = response.usage
    usd = 0.0
    if usage:
        usd = (usage.prompt_tokens * 2.5 + usage.completion_tokens * 10) / 1_000_000

    # Signature: tool name plus the first 64 characters of its arguments
    tool_calls = getattr(response.choices[0].message, "tool_calls", None)
    if tool_calls:
        fn = tool_calls[0].function
        sig = f"{fn.name}:{fn.arguments[:64]}"
    else:
        sig = "end_turn"

    return {"response": response, "usd": usd, "sig": sig}


_guarded = guard(my_create, budget={"max_usd": 5}, loop={"repeats": 3})


def patched_create(completions, **kwargs):
    return _guarded(completions, **kwargs)["response"]


Completions.create = patched_create

# Build agents as normal
assistant = ConversableAgent("assistant", llm_config=llm_config)
user_proxy = ConversableAgent(
    "user_proxy",
    llm_config=False,  # this agent never calls a model
    human_input_mode="NEVER",
    code_execution_config={"use_docker": False},
)

try:
    user_proxy.initiate_chat(
        assistant,
        message="Research and summarize the latest AI safety frameworks.",
        max_turns=15,
    )
except LoopDetectedError as e:
    print(f"Loop detected in AutoGen conversation: {e.pattern}")
except BudgetExceededError as e:
    print(f"Budget exceeded: ${e.spent:.4f}")
GroupChat: sharing the guard instance across agents
When multiple agents in a GroupChat or RoundRobinGroupChat use the same model, sharing a single guard instance ensures the loop detector sees the full cross-agent tool-call sequence. Suppose each agent has its own independent guard instance and a loop spans two agents: researcher calls search("AI safety") and passes to writer, writer calls search("AI safety") and passes back to researcher, and researcher calls search("AI safety") again. That loop would not be detected, because no single guard sees more than two of the three calls, which is below the repeats: 3 threshold. A shared guard instance in the monkey-patched model client call sees all three search("AI safety") calls and fires on the third one. The monkey-patching approach in the examples above shares the guard automatically because every agent goes through the same patched create function: the v0.4 example passes one patched client instance to both agents, and the v0.2 example patches the OpenAI client globally. If you are using per-agent client instances, instantiate the guard once outside the agent definitions and pass the guarded function to each agent’s patched client.
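The difference between a shared guard and per-agent guards in this scenario comes down to which history each call is counted in. The sketch below reproduces the three calls with plain counters; first_trip is a hypothetical helper, and counting identical signatures is a simplification of real loop detection.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# The cross-agent sequence from the paragraph above: (agent, signature).
CALLS: List[Tuple[str, str]] = [
    ("researcher", 'search("AI safety")'),
    ("writer", 'search("AI safety")'),
    ("researcher", 'search("AI safety")'),
]


def first_trip(calls: List[Tuple[str, str]], repeats: int, shared: bool) -> Optional[int]:
    """Return the 1-based index of the call that reaches `repeats`, or None.

    shared=True keeps one history for all agents; shared=False keeps one
    history per agent. Hypothetical helper for illustration.
    """
    histories: Dict[str, Counter] = {}
    for index, (agent, sig) in enumerate(calls, start=1):
        key = "shared" if shared else agent
        history = histories.setdefault(key, Counter())
        history[sig] += 1
        if history[sig] >= repeats:
            return index
    return None


if __name__ == "__main__":
    print("shared guard trips at call:", first_trip(CALLS, repeats=3, shared=True))
    print("per-agent guards trip at call:", first_trip(CALLS, repeats=3, shared=False))
```

The shared history reaches three on the third call. The per-agent histories end at two (researcher) and one (writer), so neither ever trips.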
AutoGen’s tool registration and where RunGuard fits
- Tool-registered functions vs. model-client calls. AutoGen agents call tools (registered Python functions) based on the model’s tool-call response. The tool functions themselves are not where RunGuard should be placed: wrapping each tool function with a guard would give each tool its own budget and loop history, which is not what you want. The loop detector should see the full sequence of tool calls across all tools, so that a cycle that alternates between two tools, such as [search, summarize, search, summarize], is recognized as one repeating pattern rather than as unrelated entries in separate per-tool histories. Place RunGuard at the model-client call level, where it sees every tool-call decision the model makes, regardless of which specific tool is called.
- AutoGen’s built-in tool call result caching. AutoGen v0.4 includes optional tool result caching (same arguments → same result returned from cache, no re-execution of the tool function). Tool result caching reduces the execution cost of repeated tool calls but does not reduce the LLM call cost: the model still generates a tool-call response for each turn, and that generation is billed at full price even if the tool execution is cached. RunGuard fires at the model-call level before the generation cost is incurred, so it catches the loop before the generation happens regardless of whether tool result caching is active.
- The sig field and how to compute it for AutoGen. The sig your guarded function returns should encode the tool name plus a truncated snapshot (the first 64 characters) of the tool arguments. For AutoGen v0.4 responses, the tool call information is in response.content as a list of FunctionCall objects, each with name and arguments fields. For OpenAI-format responses, it is in response.choices[0].message.tool_calls[0].function.name and response.choices[0].message.tool_calls[0].function.arguments[:64]. The exact extraction depends on which AutoGen model client you are using; the examples above show where the tool call lives in the OpenAI and AgentChat response formats.
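The sig rule in the last bullet (tool name plus the first 64 characters of the arguments) can be put into a small helper that works for either response format once you have the name and the raw arguments string. The helper name tool_call_signature is hypothetical. Re-serializing the JSON with sorted keys is an extra step assumed here, not something the rule above requires; it keeps two calls whose arguments differ only in key order from looking like different calls.

```python
import json
from typing import Optional


def tool_call_signature(
    name: Optional[str], arguments: Optional[str], max_arg_chars: int = 64
) -> str:
    """Build a loop-detection signature from a tool name and its JSON arguments.

    Hypothetical helper. Returns "end_turn" when there is no tool call.
    """
    if not name:
        return "end_turn"
    try:
        # Assumption: canonicalize so key order and whitespace do not matter.
        canonical = json.dumps(
            json.loads(arguments or "{}"), sort_keys=True, separators=(",", ":")
        )
    except (TypeError, ValueError):
        # Not valid JSON: fall back to the raw text.
        canonical = str(arguments or "")
    return f"{name}:{canonical[:max_arg_chars]}"


if __name__ == "__main__":
    a = tool_call_signature("web_search", '{"query": "AI safety", "limit": 5}')
    b = tool_call_signature("web_search", '{"limit": 5, "query": "AI safety"}')
    print(a)
    print(a == b)
    print(tool_call_signature(None, None))
```

With AutoGen v0.4 you would pass item.name and item.arguments from the first FunctionCall in response.content; with OpenAI-format responses, the function.name and function.arguments of the first entry in tool_calls.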
What this is not
RunGuard is not an AutoGen plugin or an AutoGen-specific integration. It wraps the model-client call layer, below AutoGen’s agent abstraction, which means it works with any AutoGen version (v0.2, v0.3, v0.4) and any model client (OpenAI, Anthropic, Azure OpenAI, Ollama) without requiring AutoGen to expose a custom circuit-breaker hook. RunGuard does not understand AutoGen’s agent protocols, team structures, or message routing — it only sees the model-client calls that flow through the patched function. This is intentional: the loop detection algorithm works on tool-call signatures, which are model-level signals, not agent-level signals. A future AutoGen version that changes its agent protocol will not affect RunGuard’s loop detection as long as the model-client call is still the innermost LLM-calling layer. RunGuard is also not a replacement for AutoGen’s termination conditions. Use MaxMessageTermination as the outer ceiling and RunGuard as the pattern-aware inner guard. The CrewAI loop detection page, LangChain circuit breaker page, and LangGraph infinite loop guard page cover the same RunGuard integration pattern for other multi-agent frameworks. RunGuard ships as @runguard/sdk on npm and runguard on PyPI. The full API is in llms.txt.