LLM output validation loop prevention: why validate-retry cycles become infinite and how to stop them

A common agent pattern is structured output validation: ask the model to produce JSON (or another structured format), parse the output, validate it against a schema, and if validation fails, send the error back to the model and ask it to fix the output. This works well when the model occasionally produces minor formatting errors that it can self-correct in one or two retries. It becomes a runaway cost problem when the model enters a validation failure loop — consistently producing output that fails the same schema constraint, retrying indefinitely, paying for each retry. The failure mode is subtle: the model is not “stuck” in the sense of producing identical output; it may produce different invalid outputs on each retry, which means a loop detector based purely on output deduplication will not catch it. What makes it a loop is the pattern: “generate → validate → fail → generate → validate → fail,” regardless of the specific error message. This guide covers how to detect validation loops by their structural pattern rather than their content, how to enforce hard retry limits, and how to use RunGuard’s loop detector with a validation-specific signature scheme to catch these storms before they exhaust your API budget.

Why validation loops are hard to catch with standard retry logic

The model produces different invalid outputs on each retry. A model asked to produce a JSON object with a required status field of type "success" | "failure" might produce "status": "ok" on retry 1, "status": "completed" on retry 2, and "status": "done" on retry 3. Each is a different string, so output-deduplication-based loop detection reports no loop. But the underlying constraint violation is the same on all three retries: the value is not one of the allowed enum values. The loop is real; the detector missed it because it was looking at values, not constraint violations.
Validation error messages are not stable identifiers. If you route the validation error back to the model as part of the retry prompt, different invalid outputs produce different error messages (“'ok' is not a valid value for status” vs “'completed' is not a valid value for status”). Hashing the full error message as the loop signature produces a different signature each time, so three retries never accumulate to the threshold needed to trip the breaker.
The fix: signature by constraint violation category, not by specific value. Rather than using the full validation error message as the loop signature, extract the constraint violation type: enum_violation:status, missing_required:user_id, type_error:amount:expected_number. This signature is stable across retries that fail for the same structural reason, even when the specific invalid values differ. Three retries that all map to enum_violation:status trip the loop breaker; three retries signed by their three different full error messages do not.
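To make the difference concrete, the sketch below signs the three error messages from the example above in both ways. The `category_signature` helper is a hypothetical, simplified normalizer written for this illustration (it only recognizes the enum wording used here); it is not part of any library.

```python
import hashlib
import re

# Error messages from three retries that all violate the same enum constraint.
errors = [
    "'ok' is not a valid value for status",
    "'completed' is not a valid value for status",
    "'done' is not a valid value for status",
]

def message_signature(error: str) -> str:
    """Naive signature: hash of the full error text, which changes with every invalid value."""
    return hashlib.sha256(error.encode("utf-8")).hexdigest()[:12]

def category_signature(error: str) -> str:
    """Hypothetical normalizer: keep the constraint type and field, drop the value."""
    m = re.search(r"is not a valid value for (\w+)", error)
    return f"enum_violation:{m.group(1)}" if m else "unknown_violation"

message_sigs = {message_signature(e) for e in errors}
category_sigs = {category_signature(e) for e in errors}

print(len(message_sigs))  # 3
print(category_sigs)      # {'enum_violation:status'}
```

Three distinct full-message signatures never accumulate toward a repeat threshold, while the single category signature is seen three times.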

Python: validation loop detection with constraint-violation signatures

Python: structured output validator with RunGuard loop detection

import json
import re
import anthropic
from runguard import LoopDetector, LoopDetectedError, BudgetExceededError, guard
from typing import Any

client = anthropic.Anthropic()

# Constraint violation categories — stable loop signatures
def extract_violation_signature(validation_error: str) -> str:
    """
    Convert a validation error message to a stable violation-category signature.
    This prevents the loop detector from being fooled by varying error messages
    that represent the same structural constraint failure.
    """
    error_lower = validation_error.lower()

    # Enum / allowed values violation. The invalid value itself is ignored,
    # so multi-word or non-alphanumeric values still map to the same signature.
    if re.search(r"(?:is not|not) (?:a valid|an? allowed|one of)", error_lower):
        # Try to extract field name from the error
        field = re.search(r"for (?:field|key|property) ['\"]?(\w+)", error_lower)
        field_name = field.group(1) if field else "unknown_field"
        return f"enum_violation:{field_name}"

    # Missing required field
    m = re.search(r"(?:required|missing) (?:field|key|property) ['\"]?(\w+)", error_lower)
    if m:
        return f"missing_required:{m.group(1)}"

    # Type error, e.g. "field 'amount' expected number, got str".
    # The field name is the word immediately before "expected" / "should be".
    m = re.search(r"['\"]?(\w+)['\"]? (?:expected|should be) (\w+)", error_lower)
    if m:
        return f"type_error:{m.group(1)}:expected_{m.group(2)}"

    # JSON parse failure
    if "json" in error_lower and ("parse" in error_lower or "decode" in error_lower or "invalid" in error_lower):
        return "json_parse_error"

    # Fallback: use first 40 chars of error stripped to safe chars
    safe = re.sub(r"[^a-zA-Z0-9_]", "_", error_lower[:40]).strip("_")
    return f"validation_error:{safe}"


def validate_task_output(output: str, schema: dict) -> tuple[bool, str]:
    """
    Validate output against schema. Returns (is_valid, error_message).
    Simple demo validator; replace with jsonschema.validate() in production.
    """
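    try:
        data = json.loads(output)
    except json.JSONDecodeError as e:
        return False, f"JSON parse error: {e}"

    # Valid JSON that is not an object (e.g. a bare string or list) cannot match the schema.
    if not isinstance(data, dict):
        return False, f"Top-level JSON value must be an object, got {type(data).__name__}"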

    for required_field in schema.get("required", []):
        if required_field not in data:
            return False, f"Missing required field '{required_field}'"

    properties = schema.get("properties", {})
    for field, rules in properties.items():
        if field not in data:
            continue
        value = data[field]
        if "enum" in rules and value not in rules["enum"]:
            return False, f"'{value}' is not a valid value for field '{field}'. Allowed: {rules['enum']}"
        if "type" in rules:
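            py_type = {"string": str, "number": (int, float), "integer": int, "boolean": bool}.get(rules["type"])
            # bool is a subclass of int in Python, so reject True/False for numeric types explicitly.
            bool_as_number = isinstance(value, bool) and rules["type"] in ("number", "integer")
            if py_type and (bool_as_number or not isinstance(value, py_type)):
                return False, f"Field '{field}' expected {rules['type']}, got {type(value).__name__}"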

    return True, ""


def generate_with_validation(
    prompt: str,
    schema: dict,
    max_retries: int = 4,
    budget_usd: float = 0.10,
) -> dict[str, Any]:
    """
    Generate structured output with validation-loop protection.
    Uses RunGuard's LoopDetector with constraint-violation signatures
    to detect systematic validation failures before exhausting the retry budget.
    """
    loop_detector = LoopDetector(max_repeats=2, max_cycle_len=3)
    total_spent = 0.0
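    # Per-token prices in USD. These must match the model called below.
    # $1.00 / $5.00 per million tokens is assumed for Claude Haiku 4.5; verify against current pricing.
    HAIKU_IN  = 1.00 / 1_000_000
    HAIKU_OUT = 5.00 / 1_000_000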

    messages = [{"role": "user", "content": prompt}]
    schema_description = json.dumps(schema, indent=2)

    for attempt in range(max_retries + 1):
        if total_spent >= budget_usd:
            raise BudgetExceededError(f"Validation budget exhausted: ${total_spent:.5f} >= ${budget_usd}")

        resp = client.messages.create(
            model="claude-haiku-4-5-20251001",
            max_tokens=512,
            system=f"Always respond with valid JSON matching this schema:\n{schema_description}",
            messages=messages,
        )
        output = resp.content[0].text
        total_spent += resp.usage.input_tokens * HAIKU_IN + resp.usage.output_tokens * HAIKU_OUT

        is_valid, error_msg = validate_task_output(output, schema)
        if is_valid:
            return {"output": json.loads(output), "attempts": attempt + 1, "spent_usd": total_spent}

        # Extract violation signature and check for loop
        violation_sig = extract_violation_signature(error_msg)
        match = loop_detector.record(violation_sig)
        if match:
            raise LoopDetectedError(
                f"Validation loop detected after {attempt + 1} attempts: "
                f"repeated violation '{violation_sig}' — model cannot satisfy constraint. "
                f"Error: {error_msg}"
            )

        if attempt < max_retries:
            # Add error feedback to conversation for next attempt
            messages.append({"role": "assistant", "content": output})
            messages.append({
                "role": "user",
                "content": f"The output failed validation: {error_msg}\nPlease fix and return valid JSON.",
            })

    raise RuntimeError(f"Validation failed after {max_retries + 1} attempts. Last error: {error_msg}")


# Example usage
TASK_SCHEMA = {
    "type": "object",
    "required": ["status", "confidence", "reasoning"],
    "properties": {
        "status": {"type": "string", "enum": ["success", "failure", "uncertain"]},
        "confidence": {"type": "number"},
        "reasoning": {"type": "string"},
    },
}

try:
    result = generate_with_validation(
        prompt="Classify this customer support ticket: 'My order hasn't arrived after 3 weeks.'",
        schema=TASK_SCHEMA,
        max_retries=4,
        budget_usd=0.05,
    )
    print(f"Result after {result['attempts']} attempts: {result['output']}")
    print(f"Cost: ${result['spent_usd']:.5f}")
except LoopDetectedError as e:
    print(f"[LOOP] {e}")
except BudgetExceededError as e:
    print(f"[BUDGET] {e}")

The loop detector trips on constraint violation category, not specific error text. If the model produces "status": "ok", then "status": "resolved", then "status": "done", all three produce the signature enum_violation:status. On the third attempt, loop_detector.record("enum_violation:status") returns a match, and the loop is detected and terminated. Without this signature normalization, all three retries produce different signatures and the detector never trips.
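The counting behavior described above can be illustrated with a small stand-in. `RepeatCounter` below is a hypothetical class written for this example; it is not RunGuard's implementation, and it assumes that max_repeats=2 means "tolerate two occurrences of a signature, trip on the third", as stated above.

```python
from collections import Counter

class RepeatCounter:
    """Hypothetical stand-in for a repeat-based loop detector (not RunGuard's code)."""

    def __init__(self, max_repeats: int = 2) -> None:
        self.max_repeats = max_repeats
        self.counts = Counter()  # signature -> number of times seen

    def record(self, signature: str) -> bool:
        """Record one occurrence; return True once the signature exceeds max_repeats."""
        self.counts[signature] += 1
        return self.counts[signature] > self.max_repeats

# Normalized signatures: the same category is recorded three times.
detector = RepeatCounter(max_repeats=2)
normalized = [detector.record("enum_violation:status") for _ in range(3)]

# Raw error text as the signature: every retry looks new.
raw_detector = RepeatCounter(max_repeats=2)
raw = [raw_detector.record(msg) for msg in (
    "'ok' is not a valid value",
    "'resolved' is not a valid value",
    "'done' is not a valid value",
)]

print(normalized)  # [False, False, True]
print(raw)         # [False, False, False]
```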

Validation retry strategies and their loop risk

Strategy	Loop detection quality	Cost risk	Recommended cap
Hard retry limit only	Good: terminates after N retries	Low if N is small (3–5)	max_retries=3 for most validation tasks
Loop detector with full error message as signature	Poor: same constraint with different messages is a missed loop	High: detector never trips on varying errors	Do not use without signature normalization
Loop detector with violation-category signature	Excellent: catches same-constraint failures regardless of value	Low: trips after 2–3 retries of the same constraint failure	max_repeats=2 trips on 3rd same-constraint failure
Structured output APIs (JSON mode, tool use)	Prevention, not detection: reduces the validation failure rate	Low, but keep a validator: JSON mode typically guarantees only well-formed JSON, and schema enforcement depends on the API and its strictness settings	Prefer over free-text + validation when schema is fixed
Budget cap only	None: spends up to the cap before stopping	Medium: total spend is bounded, but the whole cap can be consumed by a single loop	Always pair with loop detector, never use alone
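The cost-risk column can be made concrete with a rough worst-case estimate. In the retry loop shown earlier, every retry resends the whole conversation, so input tokens grow with each failed attempt. The sketch below is a back-of-the-envelope model; the function name, token counts, and per-token prices are all illustrative assumptions, not measured values or current list prices.

```python
def worst_case_cost(
    attempts: int,
    prompt_tokens: int,
    output_tokens: int,
    feedback_tokens: int,
    price_in: float,
    price_out: float,
) -> float:
    """Upper-bound spend when every attempt fails and the history accumulates."""
    total = 0.0
    for attempt in range(attempts):
        # Attempt k resends the prompt plus k earlier (output + error feedback) pairs.
        input_tokens = prompt_tokens + attempt * (output_tokens + feedback_tokens)
        total += input_tokens * price_in + output_tokens * price_out
    return total

# Illustrative numbers only: 1,000-token prompt, 300-token outputs, 60-token error feedback.
PRICE_IN = 1.00 / 1_000_000   # assumed USD per input token
PRICE_OUT = 5.00 / 1_000_000  # assumed USD per output token

for n in (3, 5, 20):
    print(n, round(worst_case_cost(n, 1000, 300, 60, PRICE_IN, PRICE_OUT), 5))
```

Under these assumed numbers, twenty failing attempts cost roughly fourteen times as much as three, because the resent history makes input cost grow faster than the attempt count. That is the argument for a small retry cap or an early loop trip rather than a budget cap alone.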

For the general retry storm pattern, see AI agent retry storm prevention. For loop detection in tool call chains, see how to detect LLM tool call loops in production.

Stop validation retry storms before they drain your budget

RunGuard’s LoopDetector is the right primitive for validation loop prevention when used with normalized violation-category signatures. Set max_repeats=2 to trip after the third occurrence of the same constraint violation type, and pair it with a hard budget_usd cap via guard(). The combination ensures you never pay for more than a few validation retries regardless of whether the loop is detected structurally (same violation category) or by cost exhaustion (budget cap).

RunGuard pricing: Solo plan at $19/month for individual developers. Team plan at $79/month adds Slack and PagerDuty webhook alerts, shared dashboards, and audit log. Both plans include a 14-day free trial — no credit card required.
