Haystack agent loop detection: circuit breaker for deepset pipelines

Haystack (by deepset) is a production-grade framework for building NLP pipelines and, since Haystack 2.x, full LLM agent systems. Its component-based architecture, where each node is a typed Python class with declared inputs and outputs, makes pipelines composable and testable. The same design creates a specific loop pattern: when your pipeline branches on a condition (e.g., “did the retriever find relevant documents?”) and the condition never resolves to the happy path, the pipeline re-enters the branch again and again. Each iteration calls your LLM to re-evaluate or re-rank. Haystack’s built-in ceilings are raw run counts with generous defaults, so a stuck loop can make many paid calls before anything stops it. This page covers the two loop archetypes in Haystack, retrieval-rerank loops and agent tool-call loops, and shows how RunGuard interrupts both patterns before they generate a four-figure API invoice.

The two loop patterns in Haystack

Pattern 1: Retrieval-rerank loop. Haystack pipelines often chain a Retriever, a Reranker, and a threshold check: if the top-ranked document scores below N, re-retrieve with a rephrased query. This pattern is legitimate and effective — until the document store has no relevant content for the query. At that point, every rephrase attempt returns the same low-scoring documents, the threshold check fails, and the pipeline loops. Each loop iteration calls both the embedding model (re-vectorize the rephrased query) and the reranker (LLM or cross-encoder). If you are using a generative LLM as the reranker, each loop iteration is a full LLM call.
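
To make the failure mode concrete, here is a minimal plain-Python simulation of the pattern. It is not Haystack code: retrieve_and_score, rephrase, the 0.7 threshold, and the 0.2 score are hypothetical stand-ins. The point is only that when the store has nothing relevant, the score never crosses the threshold and the loop runs until some external cap stops it.

```python
from typing import Callable, Tuple


def rerank_loop(
    query: str,
    retrieve_and_score: Callable[[str], float],
    rephrase: Callable[[str], str],
    threshold: float,
    max_iterations: int,
) -> Tuple[bool, int]:
    """Retry retrieval with rephrased queries until the top score clears the threshold.

    Returns (succeeded, model_calls). Each iteration stands in for one
    embedding call plus one reranker call.
    """
    model_calls = 0
    for _ in range(max_iterations):
        top_score = retrieve_and_score(query)
        model_calls += 2  # one embedding call + one reranker call
        if top_score >= threshold:
            return True, model_calls
        query = rephrase(query)
    return False, model_calls


# Hypothetical store with no relevant content: every query scores 0.2.
def empty_store_score(query: str) -> float:
    return 0.2


def naive_rephrase(query: str) -> str:
    return query + " (rephrased)"


succeeded, calls = rerank_loop(
    "What is our refund policy?",
    empty_store_score,
    naive_rephrase,
    threshold=0.7,
    max_iterations=100,
)
print(succeeded, calls)  # False 200
```

With a cap of 100 iterations, the loop burns 200 model calls and still fails, because nothing in it checks whether the score is moving.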

Pattern 2: Haystack Agent tool-call loop. Haystack 2.x introduced the Agent component, which wraps a chat LLM and a set of tools in a standard Reason+Act loop. When a tool returns a generic error message as a string (for example, because the tool catches its own exceptions, or because the agent is configured with raise_on_failure=False, which turns tool failures into messages for the LLM), the LLM treats it as a partial result and generates a follow-up tool call. This can continue for as many iterations as the agent’s max_agent_steps allows. That limit defaults to 100, so a single broken tool can trigger up to 100 LLM calls.

Haystack’s native loop controls

Haystack provides two relevant controls:

Agent(max_agent_steps=N): caps the total Reason+Act iterations for the Agent component (default 100). This prevents runaway loops but not inefficient ones. With a cap of 10, an agent whose tool fails at step 1 and that retries the same call 9 more times hits no ceiling until step 10, by which point 9 redundant LLM calls have already fired.
Pipeline(max_runs_per_component=N): a constructor argument in recent Haystack 2.x releases, where it replaced the older max_loops_allowed setting. It caps how many times a single component can be invoked per Pipeline.run() call and defaults to 100. This is the most direct native protection against retrieval-rerank loops. Even with this cap, there is no detection of whether the repeated runs are making progress, just a raw count.

Neither control detects repeated identical tool-call fingerprints, and neither enforces a dollar budget. RunGuard adds both.
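
To show what fingerprint-based detection means in practice, here is a minimal plain-Python sketch. It is not RunGuard’s implementation: guard_tool, RepeatedCallError, and the window and threshold parameters are hypothetical names chosen for illustration. The wrapper hashes the tool name, arguments, and result of each call, keeps the last few hashes in a sliding window, and raises once the same hash has appeared threshold times.

```python
import hashlib
import json
from collections import deque
from typing import Any, Callable, Deque


class RepeatedCallError(RuntimeError):
    """Raised when the same (tool, args, result) fingerprint repeats too often."""


def guard_tool(fn: Callable[..., Any], window: int = 6, threshold: int = 2) -> Callable[..., Any]:
    """Wrap fn so identical calls with identical results trip an error."""
    recent: Deque[str] = deque(maxlen=window)  # sliding window of fingerprints

    def wrapper(*args: Any, **kwargs: Any) -> Any:
        result = fn(*args, **kwargs)
        # Include the result so a repeated error string counts as a repeat.
        payload = json.dumps(
            {"tool": fn.__name__, "args": args, "kwargs": kwargs, "result": result},
            sort_keys=True,
            default=repr,
        )
        fingerprint = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        recent.append(fingerprint)
        count = recent.count(fingerprint)
        if count >= threshold:
            raise RepeatedCallError(
                f"{fn.__name__} repeated {count}x with identical args and result"
            )
        return result

    return wrapper


def search_documents(query: str) -> str:
    return "no results"  # a tool that always fails with the same string


guarded = guard_tool(search_documents)
guarded("refund policy")  # first call passes through
try:
    guarded("refund policy")  # identical args and result: trips
except RepeatedCallError as exc:
    print(exc)
```

Because the result is part of the fingerprint, a tool that returns fresh data for the same arguments does not trip the guard; only a call that keeps producing the same output does.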

Wrapping Haystack tools with RunGuard loop detection

For Haystack Agent components, the integration point is the tool functions you register with the agent. Wrap each function with RunGuard’s guard() before passing it to the agent’s tools list.

from haystack.components.agents import Agent
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.tools import Tool
from runguard import guard, BudgetTracker, LoopDetectedError

# Define your tool functions.
# document_store and scraper are assumed to be created elsewhere in your app.
def search_documents(query: str) -> str:
    results = document_store.filter_documents(...)
    return "\n".join(d.content for d in results) if results else "no results"

def fetch_page(url: str) -> str:
    return scraper.fetch(url)

# Wrap with RunGuard: trips on the 2nd identical (args, result) fingerprint
guarded_search = guard(search_documents, loop_window=6, loop_threshold=2)
guarded_fetch = guard(fetch_page, loop_window=6, loop_threshold=2)

# Per-run dollar cap; wiring it into the agent is not shown in this snippet
budget = BudgetTracker(max_usd=1.50)

# Haystack's Tool needs a JSON schema describing the function's parameters
agent = Agent(
    chat_generator=OpenAIChatGenerator(model="gpt-4.1-mini"),
    tools=[
        Tool(
            name="search_documents",
            description="Search internal docs",
            parameters={"type": "object", "properties": {"query": {"type": "string"}}, "required": ["query"]},
            function=guarded_search,
        ),
        Tool(
            name="fetch_page",
            description="Fetch a web page",
            parameters={"type": "object", "properties": {"url": {"type": "string"}}, "required": ["url"]},
            function=guarded_fetch,
        ),
    ],
    max_agent_steps=15,
)

try:
    result = agent.run(messages=[ChatMessage.from_user("What is our refund policy?")])
except LoopDetectedError as e:
    print(f"Loop in {e.tool_name}: same call repeated {e.count}x")
    # e.signature has the repeated args for debugging

Guarding a Haystack retrieve-rerank pipeline

For retrieval-rerank loops in Pipeline (not Agent), RunGuard provides a PipelineLoopGuard component you can insert as a conditional gate in your pipeline graph. It tracks how many times the branch has re-entered and raises after the configured maximum.

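from haystack import Pipeline
from haystack.components.rankers import TransformersSimilarityRanker
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore
from runguard.haystack import PipelineLoopGuard

# Populate the store with your documents before running the pipeline
store = InMemoryDocumentStore()

pipeline = Pipeline()
pipeline.add_component("retriever", InMemoryBM25Retriever(document_store=store))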
pipeline.add_component("reranker", TransformersSimilarityRanker())

# PipelineLoopGuard is a Haystack component that counts branch passes
# It raises LoopDetectedError after max_passes identical low-score iterations
pipeline.add_component(
    "loop_guard",
    PipelineLoopGuard(max_passes=3, score_improvement_threshold=0.05)
)

# Wire: retriever → reranker → loop_guard → retriever (if score too low)
pipeline.connect("retriever", "reranker.documents")
pipeline.connect("reranker", "loop_guard.ranked_docs")
pipeline.connect("loop_guard.retry_query", "retriever.query")

# loop_guard.documents goes to your generator if score threshold is met
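
To illustrate what score-improvement detection adds over a raw pass count, here is a minimal plain-Python sketch. It is not the PipelineLoopGuard source: ScoreProgressGuard, StalledLoopError, and min_improvement are hypothetical names, and the logic is one reasonable reading of max_passes and score_improvement_threshold. The guard remembers the best top score seen so far and raises once a configured number of consecutive passes fail to beat it by the minimum margin.

```python
from typing import Optional


class StalledLoopError(RuntimeError):
    """Raised when repeated passes stop improving the top score."""


class ScoreProgressGuard:
    def __init__(self, max_passes: int = 3, min_improvement: float = 0.05) -> None:
        self.max_passes = max_passes
        self.min_improvement = min_improvement
        self.best_score: Optional[float] = None
        self.stalled_passes = 0

    def check(self, top_score: float) -> None:
        """Record one pass; raise after max_passes consecutive passes without progress."""
        improved = (
            self.best_score is None
            or top_score >= self.best_score + self.min_improvement
        )
        if improved:
            self.best_score = top_score
            self.stalled_passes = 0  # progress resets the counter
            return
        self.stalled_passes += 1
        if self.stalled_passes >= self.max_passes:
            raise StalledLoopError(
                f"{self.stalled_passes} passes without improving on {self.best_score}"
            )


guard = ScoreProgressGuard(max_passes=3, min_improvement=0.05)
stopped_at = None
# Hypothetical top scores from a store with nothing relevant to the query.
for pass_number, score in enumerate([0.20, 0.21, 0.20, 0.22, 0.21], start=1):
    try:
        guard.check(score)
    except StalledLoopError:
        stopped_at = pass_number
        break
print(stopped_at)  # 4
```

A query whose score keeps climbing is allowed to continue, while a flat sequence is cut off after a few passes regardless of how high the raw run cap is set.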

Haystack native vs RunGuard: loop control comparison

Scenario	Haystack native	RunGuard
Agent tool-call loop	max_agent_steps cap (count)	LoopDetectedError on 2nd identical fingerprint
Retrieval-rerank loop	max_runs_per_component (count)	PipelineLoopGuard with score-improvement detection
Per-run dollar cap	No	BudgetTracker raises before next LLM call
Error string masking	No detection	Fingerprints include result content, catches repeated error strings
Alert on trip	No	Slack/PagerDuty webhook on LoopDetectedError
Context window fill	LLM raises at limit	ContextOverflowError at configurable threshold
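
The per-run dollar cap row can be pictured with a short plain-Python sketch. It is not RunGuard’s BudgetTracker: SimpleBudget, BudgetExceededError, and the 0.40 USD per-call figure are hypothetical, chosen only to show the check-before-call idea. The budget is consulted before each LLM request and refuses the request if its estimated cost would push total spend past the cap.

```python
class BudgetExceededError(RuntimeError):
    """Raised when the next call would push spend past the cap."""


class SimpleBudget:
    def __init__(self, max_usd: float) -> None:
        self.max_usd = max_usd
        self.spent_usd = 0.0

    def check(self, estimated_usd: float) -> None:
        """Call before an LLM request; refuse it if it would exceed the cap."""
        if self.spent_usd + estimated_usd > self.max_usd:
            raise BudgetExceededError(
                f"spent {self.spent_usd:.2f} USD, next call would exceed {self.max_usd:.2f} USD"
            )

    def record(self, actual_usd: float) -> None:
        """Call after the request with its actual cost."""
        self.spent_usd += actual_usd


budget = SimpleBudget(max_usd=1.50)
calls_made = 0
try:
    for _ in range(10):
        budget.check(estimated_usd=0.40)  # hypothetical per-call estimate
        calls_made += 1  # the LLM call would happen here
        budget.record(actual_usd=0.40)
except BudgetExceededError as exc:
    print(f"stopped after {calls_made} calls: {exc}")
```

Checking before the call rather than after means the run stops with spend still under the cap, instead of discovering the overrun once the money is already gone.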

Haystack 1.x vs 2.x: integration differences

Haystack 1.x used a different component model (Nodes, Pipelines with run() routing). If you’re still on Haystack 1.x, RunGuard integrates at the BaseComponent.run() method level via a subclass mixin. Haystack 2.x’s explicit tool functions and the Agent component make the guard() wrapper the cleaner integration point. Both paths are documented in the RunGuard Python SDK readme.

The most common migration blocker is that Haystack 1.x tools were Node subclasses with run() methods, not standalone functions. RunGuard’s guard_method() helper wraps a Node’s run() in-place for 1.x compatibility:

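# Haystack 1.x: wrap a Node's run() method
from haystack.nodes import EmbeddingRetriever
from runguard import guard_method

class GuardedRetriever(EmbeddingRetriever):
    _guarded_run = None  # built on first use so the loop window persists across calls

    def run(self, *args, **kwargs):
        if self._guarded_run is None:
            # Assumes guard_method keeps its call history on the wrapper it returns
            self._guarded_run = guard_method(
                super().run,
                loop_window=5,
                loop_threshold=2,
            )
        # Pass arguments through unchanged so the Node's own run() signature applies
        return self._guarded_run(*args, **kwargs)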

Add loop detection to your Haystack pipeline today

RunGuard’s Python SDK installs with pip install runguard. For Haystack 2.x agents, wrap tool functions with guard(). For retrieval-rerank pipelines, drop in the PipelineLoopGuard component. Both approaches add protection in minutes with no architectural changes to your existing pipeline.

Get started with RunGuard — or see the same pattern for PydanticAI, Phidata / Agno, and Python agents generally.