Haystack agent loop detection: circuit breaker for deepset pipelines
Haystack (by deepset) is a production-grade framework for building NLP pipelines and, since Haystack 2.x, full LLM agent systems. Its component-based architecture — where each node is a typed Python class with declared inputs and outputs — makes pipelines composable and testable. The same design creates a specific loop pattern: when your pipeline branches on a condition (e.g., “did the retriever find relevant documents?”), and the condition never resolves to the happy path, the pipeline re-enters the branch indefinitely. Each iteration calls your LLM to re-evaluate or re-rank. Haystack’s Pipeline.run() does not impose an iteration ceiling by default. This page covers the two loop archetypes in Haystack — retrieval-rerank loops and agent tool-call loops — and shows how RunGuard interrupts both patterns before they generate a four-figure API invoice.
The two loop patterns in Haystack
Pattern 1: Retrieval-rerank loop. Haystack pipelines often chain a Retriever, a Reranker, and a threshold check: if the top-ranked document scores below N, re-retrieve with a rephrased query. This pattern is legitimate and effective — until the document store has no relevant content for the query. At that point, every rephrase attempt returns the same low-scoring documents, the threshold check fails, and the pipeline loops. Each loop iteration calls both the embedding model (re-vectorize the rephrased query) and the reranker (LLM or cross-encoder). If you are using a generative LLM as the reranker, each loop iteration is a full LLM call.
Pattern 2: Haystack Agent tool-call loop. Haystack 2.x introduced the Agent component, which wraps an LLM and a set of tools in a standard Reason+Act loop. When a tool returns a generic error message (as a string, because Haystack tools cannot raise exceptions mid-pipeline), the LLM treats it as a partial result and generates a follow-up tool call. This can continue for as many iterations as the agent’s max_steps allows — which defaults to 10, generating 10 LLM calls on what is effectively a broken tool call.
Haystack’s native loop controls
Haystack provides two relevant controls:
Agent(max_steps=N)— caps the total Reason+Act iterations for theAgentcomponent. This prevents runaway loops but not inefficient ones. An agent that detects a failing tool at step 1 and retries the same tool call 9 more times hits no ceiling until step 10 — at which point 9 redundant LLM calls have already fired.Pipeline.run(max_runs_per_component=N)— introduced in Haystack 2.3+. Caps how many times a single component can be invoked perPipeline.run()call. This is the most direct native protection against retrieval-rerank loops. The default is often set high (or unlimited in older versions). Even with this cap, there is no detection of whether the repeated runs are making progress — just a raw count.
Neither control detects repeated identical tool-call fingerprints, and neither enforces a dollar budget. RunGuard adds both.
Wrapping Haystack tools with RunGuard loop detection
For Haystack Agent components, the integration point is the tool functions you register with the agent. Wrap each function with RunGuard’s guard() before passing it to the agent’s tools list.
from haystack.components.agents import Agent
from haystack.components.generators import OpenAIGenerator
from haystack.tools import Tool
from runguard import guard, BudgetTracker, LoopDetectedError
# Define your tool functions
def search_documents(query: str) -> str:
results = document_store.filter_documents(...)
return "\n".join(d.content for d in results) if results else "no results"
def fetch_page(url: str) -> str:
return scraper.fetch(url)
# Wrap with RunGuard — trips on 2nd identical (args, result) fingerprint
guarded_search = guard(search_documents, loop_window=6, loop_threshold=2)
guarded_fetch = guard(fetch_page, loop_window=6, loop_threshold=2)
budget = BudgetTracker(max_usd=1.50)
agent = Agent(
generator=OpenAIGenerator(model="gpt-4.1-mini"),
tools=[
Tool(name="search_documents", function=guarded_search, description="Search internal docs"),
Tool(name="fetch_page", function=guarded_fetch, description="Fetch a web page"),
],
max_steps=15,
)
try:
result = agent.run(messages=[ChatMessage.from_user("What is our refund policy?")])
except LoopDetectedError as e:
print(f"Loop in {e.tool_name}: same call repeated {e.count}x")
# e.signature has the repeated args for debugging
Guarding a Haystack retrieve-rerank pipeline
For retrieval-rerank loops in Pipeline (not Agent), RunGuard provides a PipelineLoopGuard component you can insert as a conditional gate in your pipeline graph. It tracks how many times the branch has re-entered and raises after the configured maximum.
from haystack import Pipeline
from haystack.components.retrievers import InMemoryBM25Retriever
from haystack.components.rankers import TransformersSimilarityRanker
from runguard.haystack import PipelineLoopGuard
pipeline = Pipeline()
pipeline.add_component("retriever", InMemoryBM25Retriever(document_store=store))
pipeline.add_component("reranker", TransformersSimilarityRanker())
# PipelineLoopGuard is a Haystack component that counts branch passes
# It raises LoopDetectedError after max_passes identical low-score iterations
pipeline.add_component(
"loop_guard",
PipelineLoopGuard(max_passes=3, score_improvement_threshold=0.05)
)
# Wire: retriever → reranker → loop_guard → retriever (if score too low)
pipeline.connect("retriever", "reranker.documents")
pipeline.connect("reranker", "loop_guard.ranked_docs")
pipeline.connect("loop_guard.retry_query", "retriever.query")
# loop_guard.documents goes to your generator if score threshold is met
Haystack native vs RunGuard: loop control comparison
| Scenario | Haystack native | RunGuard |
|---|---|---|
| Agent tool-call loop | max_steps cap (count) | LoopDetectedError on 2nd identical fingerprint |
| Retrieval-rerank loop | max_runs_per_component (count) | PipelineLoopGuard with score-improvement detection |
| Per-run dollar cap | No | BudgetTracker raises before next LLM call |
| Error string masking | No detection | Fingerprints include result content, catches repeated error strings |
| Alert on trip | No | Slack/PagerDuty webhook on LoopDetectedError |
| Context window fill | LLM raises at limit | ContextOverflowError at configurable threshold |
Haystack 1.x vs 2.x: integration differences
Haystack 1.x used a different component model (Nodes, Pipelines with run() routing). If you’re still on Haystack 1.x, RunGuard integrates at the BaseComponent.run() method level via a subclass mixin. Haystack 2.x’s explicit tool functions and the Agent component make the guard() wrapper the cleaner integration point. Both paths are documented in the RunGuard Python SDK readme.
The most common migration blocker is that Haystack 1.x tools were Node subclasses with run() methods, not standalone functions. RunGuard’s guard_method() helper wraps a Node’s run() in-place for 1.x compatibility:
# Haystack 1.x — wrap a Node's run() method
from runguard import guard_method
class GuardedRetriever(EmbeddingRetriever):
def run(self, query_emb, filters=None, top_k=None):
return guard_method(
super().run,
loop_window=5,
loop_threshold=2,
)(query_emb=query_emb, filters=filters, top_k=top_k)
Add loop detection to your Haystack pipeline today
RunGuard’s Python SDK installs with pip install runguard. For Haystack 2.x agents, wrap tool functions with guard(). For retrieval-rerank pipelines, drop in the PipelineLoopGuard component. Both approaches add protection in minutes with no architectural changes to your existing pipeline.
Get started with RunGuard — or see the same pattern for PydanticAI, Phidata / Agno, and Python agents generally.