LangSmith doesn’t have a runtime loop detector. Here’s what does.

LangSmith is LangChain’s observability and evaluation platform. You add the LangSmith tracer to your AgentExecutor, StateGraph, or LCEL chain, and every tool call, every LLM invocation, every node transition, and every token count flows into LangSmith’s trace view in real time. After the run you can replay the trace, build an eval dataset from it, run your evaluators across it, and configure online scoring pipelines that sample production runs and flag quality regressions. LangSmith is deeply integrated with LangChain, LangGraph, and LCEL, and the @traceable decorator and traceable() wrapper make it straightforward to add to any Python or TypeScript function that is not already a LangChain object. All of this is genuinely useful. LangSmith is not, however, a runtime loop detector: there is no onLoopDetected callback on the tracer, no hook the CallbackManager calls before each node re-executes to ask “has this signature appeared before?”, no in-process mechanism that throws LoopDetectedError when the same tool is called with the same arguments for the third consecutive time, and no pre-call gate that halts the next client.messages.create before it goes out. LangSmith records every step; it does not decide whether the next step should fire. The gap is real and measurable in dollars, and this page explains what closes it.

What LangSmith actually gives you

What LangSmith does not have, and why the gap costs money

The LangSmith trace view for a looping AgentExecutor run is illuminating: you can see every on_tool_start event, every on_tool_end, the tool name, the tool input, the tool output, and the LLM turn that generated each action. If the agent calls search(query="2026 AI safety summit agenda") in tool-use turns 3, 4, 5, 6, and 7 (the same query, the same tool, the same 200-token result each time), that pattern is perfectly visible in the LangSmith trace. The trace is an excellent record of what happened. What LangSmith does not have is a langsmith.check_loop_before_tool_call(history) function you can call inside on_tool_start that would have raised an exception before turn 5’s tool call went out. Callback handlers are designed as observers that log their events and return: by default, if a handler raises inside on_tool_start, LangChain’s callback manager catches the exception, logs it, and lets the run continue. A custom handler can opt in to propagation by setting raise_error = True, but that is a guard you write and maintain yourself, not something the LangSmith tracer provides. There is no langsmith.stop_run(run_id) API that terminates a running agent: LangSmith can mark a run as failed via the feedback API after it completes, but it cannot interrupt one mid-execution. The same gap exists for budget: the LangSmith Python SDK does not expose a synchronous get_current_run_cost() method you can call inside your agent loop. Token counts are in the trace, server-side, after the flush. If you want a synchronous in-process check that says “this run has now spent more than $5, stop before the next LLM call”, you write it yourself: accumulate the cost from each usage block into a local variable, compare, raise. LangSmith will faithfully trace the last generation before you raise, but it is not the mechanism that raised.

This matters for the same reason it matters everywhere in LLM agent infrastructure: the generation that crosses $5 into $6, or the tool call that executes for the fortieth time, is a call that has already been made. The cost is already on the invoice. A tracer that records the fortieth call is useful for the post-mortem; a circuit breaker that prevents the fourth call (and thus the fifth through fortieth) is useful for the budget.
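
The do-it-yourself budget check described above (accumulate, compare, raise) might look like the following minimal sketch. RunBudget and BudgetExceeded are hypothetical names invented for illustration, not part of LangSmith, RunGuard, or any provider SDK, and the default per-token rates are assumptions you would replace with your model’s pricing.

```typescript
// Hypothetical helper, not part of LangSmith or any provider SDK.
interface Usage {
  input_tokens: number;
  output_tokens: number;
}

class BudgetExceeded extends Error {
  spentUsd: number;
  capUsd: number;
  constructor(spentUsd: number, capUsd: number) {
    super(`spent $${spentUsd.toFixed(4)} against a $${capUsd.toFixed(2)} cap`);
    this.name = "BudgetExceeded";
    this.spentUsd = spentUsd;
    this.capUsd = capUsd;
  }
}

class RunBudget {
  private spentUsd = 0;
  private readonly capUsd: number;
  private readonly inputRate: number;
  private readonly outputRate: number;

  // Rates are USD per token; the defaults are assumptions, not quoted prices.
  constructor(capUsd: number, inputRate = 3e-6, outputRate = 15e-6) {
    this.capUsd = capUsd;
    this.inputRate = inputRate;
    this.outputRate = outputRate;
  }

  // Call before every LLM request: throws once spend has passed the cap.
  assertCanSpend(): void {
    if (this.spentUsd > this.capUsd) {
      throw new BudgetExceeded(this.spentUsd, this.capUsd);
    }
  }

  // Call after every response with its usage block; returns the running total.
  record(usage: Usage): number {
    this.spentUsd +=
      usage.input_tokens * this.inputRate + usage.output_tokens * this.outputRate;
    return this.spentUsd;
  }
}
```

In an agent loop, budget.assertCanSpend() would run immediately before the LLM call and budget.record(resp.usage) immediately after it. The call that crosses the cap has still been paid for; the check only stops the one after it.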

LangGraph loop shapes that LangSmith traces clearly and RunGuard prevents

RunGuard alongside LangSmith: the eight-line wrap for a LangGraph step

// Use both: LangSmith for post-run traces, RunGuard for pre-call loop detection.
import Anthropic from "@anthropic-ai/sdk";
import { traceable } from "langsmith/traceable";
import { guard, LoopDetectedError, BudgetExceededError } from "@runguard/sdk";

const client = new Anthropic();

// Inner step: LangSmith traces every call via traceable()
const _step = traceable(
  async (messages) => {
    const resp = await client.messages.create({
      model: "claude-sonnet-4-6", max_tokens: 1024, messages,
    });
    const u = resp.usage;
    const usd = u.input_tokens * 3e-6 + u.output_tokens * 15e-6;
    const tu = resp.content.find(b => b.type === "tool_use");
    return { resp, usd, sig: `sonnet:${tu?.name ?? "end_turn"}:${JSON.stringify(tu?.input ?? {}).slice(0, 64)}` };
  },
  { name: "agent-step", project_name: process.env.LANGCHAIN_PROJECT }
);

// Outer guard: RunGuard trips before the next _step() call if a loop is detected
const guardedStep = guard(_step, {
  signature: (_args, out) => out.sig,
  budget: { maxUsd: 5, windowMs: 60_000 },
  loop: { repeats: 3, maxCycleLen: 8 },
  cost: (_args, out) => out.usd,
  onTrip: (e) => console.log("[runguard]", e.reason, e.message),
});

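// `messages` and `done` belong to your agent loop; they are declared here so the snippet is complete.
const messages = [{ role: "user", content: "Summarise the 2026 AI safety summit agenda." }];
let done = false;

try {
  while (!done) {
    // Assumes guard() passes through whatever the wrapped function returns.
    const { resp } = await guardedStep(messages);
    messages.push({ role: "assistant", content: resp.content });
    done = resp.stop_reason !== "tool_use";
    // On a tool_use stop, run the tool and push its tool_result message here before looping.
  }
} catch (e) {
  if (e instanceof LoopDetectedError)  console.log("halted: loop", e.pattern);
  if (e instanceof BudgetExceededError) console.log("halted: budget", e.spent, "/", e.cap);
}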

The composition is: traceable() wraps the inner function so LangSmith records every call; guard() wraps the traced function so RunGuard checks the signature history and the budget before each call goes out. Because the signature and cost come from the step’s return value, the check before call N uses what calls 1 through N-1 returned. Order of operations: the guard decides first (pre-call), then LangSmith traces inside the call (post-call record). If the breaker opens, the inner traceable() function never executes, so LangSmith records no new span for the halted turn and the trace ends cleanly at the last successful step. LangSmith’s record of the run up to the trip point is intact; RunGuard’s breaker prevented the turn that would have extended it.
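
To make the ordering concrete, here is a minimal sketch of a pre-call gate wrapped around an already-traced function. It is not RunGuard’s implementation: SignatureGate, RepeatedSignatureError, and withPreCallGate are hypothetical names, and the gate only handles the simplest case, the same signature repeated back to back.

```typescript
// Hypothetical sketch of a pre-call gate; NOT RunGuard's implementation.
class RepeatedSignatureError extends Error {
  signature: string;
  count: number;
  constructor(signature: string, count: number) {
    super(`signature "${signature}" repeated ${count} times in a row`);
    this.name = "RepeatedSignatureError";
    this.signature = signature;
    this.count = count;
  }
}

class SignatureGate {
  private last: string | undefined = undefined;
  private streak = 0;
  private readonly repeats: number;

  constructor(repeats: number) {
    this.repeats = repeats;
  }

  // Pre-call decision: looks only at signatures of calls that already returned.
  check(): void {
    if (this.last !== undefined && this.streak >= this.repeats) {
      throw new RepeatedSignatureError(this.last, this.streak);
    }
  }

  // Post-call bookkeeping: extend or reset the run of identical signatures.
  record(sig: string): void {
    this.streak = sig === this.last ? this.streak + 1 : 1;
    this.last = sig;
  }
}

function withPreCallGate<A extends unknown[], R>(
  fn: (...args: A) => Promise<R>,
  signatureOf: (out: R) => string,
  gate: SignatureGate,
): (...args: A) => Promise<R> {
  return async (...args: A): Promise<R> => {
    gate.check(); // throws before fn (and any tracing inside it) runs
    const out = await fn(...args);
    gate.record(signatureOf(out));
    return out;
  };
}
```

Because check() runs before fn is invoked, a tripped gate means the wrapped function, and whatever tracing lives inside it, is never entered for that turn.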

The tracing-vs-guarding distinction from a LangSmith angle

The LangGraph recursion_limit and why it is not the same as a loop detector

LangGraph has a built-in recursion_limit parameter (default 25 for most configurations) that raises GraphRecursionError when the graph has executed more steps than the limit. This is a useful backstop: without it, a looping graph would run until the process is killed or the account balance reaches zero. It is not the same as a loop detector, for three reasons. First, it counts steps, not patterns. A graph that alternates between an LLM node and a tool node (A, B, A, B, A, B...) uses all 25 steps before the limit fires, burning 12 or 13 full LLM calls. A loop detector that recognizes the (A, B) cycle can fire at step 4, the first repeat of the full cycle, after 2 LLM calls; with the repeats: 3 threshold used in the example above it fires at step 6, after 3. Second, recursion_limit is a blunt instrument: it fires regardless of whether the graph is making progress (a long legitimate multi-step research run also burns its step budget) or stuck in a tight cycle (same node, same state, same output). A signature-based loop detector only fires when it detects an actual repetition pattern in the tool-call history, leaving long legitimate runs untouched. Third, the recursion_limit is a hard stop with a non-descriptive error; the LoopDetectedError from RunGuard includes the full cycle pattern (e.pattern), the number of times it repeated (e.repeats), and the index in the window where it was detected, which means your error handler has the information it needs to log a structured event, route to a fallback, or email the on-call engineer with the exact tool-call sequence that triggered the trip. LangSmith’s trace at the time of the LoopDetectedError shows every generation up to the trip point; the e.pattern string from RunGuard tells you exactly which ones formed the cycle. Used together, you get both the precise structured error signal (RunGuard) and the full run context that explains why the loop formed (LangSmith).
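
The step-count versus pattern distinction can be illustrated with a small tail-cycle check over a history of signatures. This is a sketch under stated assumptions, not RunGuard’s or LangGraph’s actual algorithm: findTailCycle and stepsUntilTrip are hypothetical names, and the repeats and maxCycleLen parameters simply mirror the option names used earlier on this page.

```typescript
// Hypothetical sketch; not RunGuard's or LangGraph's implementation.
// Returns the cycle at the tail of `history` if some cycle of length
// 1..maxCycleLen has just occurred `repeats` times back to back, else null.
function findTailCycle(
  history: string[],
  repeats: number,
  maxCycleLen: number,
): string[] | null {
  for (let len = 1; len <= maxCycleLen; len++) {
    const needed = len * repeats;
    if (history.length < needed) break; // longer cycles need even more history
    const tail = history.slice(history.length - needed);
    const cycle = tail.slice(0, len);
    let matches = true;
    for (let i = len; i < needed; i++) {
      if (tail[i] !== cycle[i % len]) {
        matches = false;
        break;
      }
    }
    if (matches) return cycle;
  }
  return null;
}

// Simulates a run that keeps emitting `pattern` and reports how many steps
// execute before either the cycle check or the hard step limit stops it.
function stepsUntilTrip(
  pattern: string[],
  repeats: number,
  maxCycleLen: number,
  hardLimit: number,
): number {
  const history: string[] = [];
  for (let step = 0; step < hardLimit; step++) {
    if (findTailCycle(history, repeats, maxCycleLen) !== null) break;
    history.push(pattern[step % pattern.length]);
  }
  return history.length;
}
```

For an endless A, B alternation, stepsUntilTrip(["A", "B"], 2, 8, 25) stops after 4 steps and stepsUntilTrip(["A", "B"], 3, 8, 25) after 6, while a run of 25 distinct signatures is only ever stopped by the hard limit.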

Using LangSmith and RunGuard together: practical integration points

The LangSmith observability stack and where RunGuard sits in it

LangSmith is the primary tracing and eval layer for teams that build on LangChain and LangGraph. It integrates with the broader LangChain ecosystem: LangChainTracer auto-instruments AgentExecutor and LCEL chains, the LangSmith Hub provides shared prompt versioning, and the LangSmith Python and TypeScript SDKs expose traceable() for any non-LangChain function. The LangSmith dashboard shows cost trends, latency percentiles, eval score distributions, and per-user aggregations. For teams running large LangGraph pipelines in production, LangSmith’s run-level debugging is often the first tool opened when an agent produces unexpected output. RunGuard wraps the innermost function that makes the LLM API call, beneath any chain-level or graph-level tracing, and makes a pre-call decision that LangSmith never sees if the decision is “trip”. The two products are not in the same category: LangSmith is observability infrastructure, RunGuard is safety infrastructure. Adding RunGuard to a LangSmith-instrumented stack does not change what LangSmith records (it records every call that proceeds past the guard) and does not change the guard’s behaviour based on what LangSmith has seen (the guard reads only its own in-process state). The LangGraph infinite loop guard page covers the StateGraph-specific wrap in more detail, including how to share a single guard instance across multiple graph nodes so the loop detector sees the full cross-node signature sequence rather than per-node histories. The LangChain circuit breaker page covers AgentExecutor and LCEL chain wrapping. The Langfuse alternative page explains the same observer-vs-guardian distinction for the Langfuse SDK. If your stack uses both LangSmith and LangGraph, all three pages apply: the LangSmith integration is the annotation layer, the LangGraph wrap is the structural layer, and RunGuard is the pre-call decision layer.

The first loop our SDK caught was ours — same gap, different surface

We built RunGuard while running a Claude Agent SDK session that posts a six-tweet launch thread via the X API once per day. The session had no LangSmith tracing at the time: it was a bespoke script, not a LangChain agent. The first attempt came back HTTP 402 CreditsDepleted. The next day: same error. Sessions three through six: same. Six consecutive sessions, same endpoint, same payload shape, same 402 response, same zero-progress result. This is exactly the pattern a LangSmith trace would have shown clearly: identical post_tweet spans and identical error responses that any human reviewer would immediately recognise as a loop. What was missing was the mechanism that fires before the seventh attempt. At session seven we loaded the six-entry history into our LoopDetector on startup; it found a length-1 cycle of depth 6 in the signature window, opened the breaker before any HTTP call went out, and exited cleanly. The seventh, eighth, and all subsequent sessions have exited the same way: preflight detects the persisted loop history, exit code 4, zero new API calls, zero new cost. The pattern is the same one any tool-call loop produces regardless of whether the loop is inside a LangGraph StateGraph, an AgentExecutor ReAct loop, or a bespoke daily script: same signature, repeated past the threshold, detectable before the next call fires. Read the full dogfood story on the 30-day log.

What this is not

The minimum integration alongside an existing LangSmith setup

One npm i @runguard/sdk (TypeScript) or pip install runguard (Python). One guard() wrap around the function that calls your LLM provider SDK: the same function where you already have the traceable() call for LangSmith, or where the LangChain tracer fires its on_llm_start event. Two new return fields: usd (computed from response.usage and your per-token rate; the same number you already compute for LangSmith cost tracking) and sig (the tool name plus a 64-character slice of the serialized tool input, or "end_turn" if the response was not a tool call). One budget option (maxUsd: 5) and one loop option (repeats: 3, maxCycleLen: 8). That is the entire integration delta. Your existing LangSmith setup (traces, eval datasets, Hub prompts, dashboard alerts) is unchanged. What changes is that the next LLM call after a budget crossing or a loop detection fires a typed exception rather than proceeding. LangSmith records every call up to the trip point; RunGuard prevents the call that would have extended the loop. RunGuard ships as @runguard/sdk on npm and runguard on PyPI. The canonical API surface is documented in llms.txt for LLM-assisted integration. The CrewAI loop detection page, browser-use cost cap page, and AgentKit budget alert page cover the same guard-wrap pattern applied to other agent frameworks. If you use LangSmith with any of those frameworks, the composition is the same: guard(traceable(innerFn)). LangSmith traces what proceeds; RunGuard prevents what shouldn’t.
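
The two return fields can be computed by small pure helpers, sketched below. The names stepSignature, stepCostUsd, and stableStringify are hypothetical and not part of the RunGuard SDK. Sorting object keys before slicing is an added assumption that goes beyond the example earlier on this page; it is there so that two tool calls with the same arguments in a different key order produce the same signature.

```typescript
// Hypothetical helpers for the `sig` and `usd` fields; names are illustrative.
type ContentBlock = { type: string; name?: string; input?: unknown };

// Serialise with sorted object keys so equal inputs give equal strings.
function stableStringify(value: unknown): string {
  if (value === undefined) return "null";
  if (Array.isArray(value)) {
    return `[${value.map((v) => stableStringify(v)).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const parts = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${stableStringify(obj[k])}`);
    return `{${parts.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Model label, then tool name (or "end_turn"), then a slice of the tool input.
function stepSignature(model: string, content: ContentBlock[], sliceLen = 64): string {
  const toolUse = content.find((b) => b.type === "tool_use");
  const name = toolUse?.name ?? "end_turn";
  const input = stableStringify(toolUse?.input ?? {}).slice(0, sliceLen);
  return `${model}:${name}:${input}`;
}

// Cost of one response in USD, given per-token rates you supply.
function stepCostUsd(
  usage: { input_tokens: number; output_tokens: number },
  inputRate: number,
  outputRate: number,
): number {
  return usage.input_tokens * inputRate + usage.output_tokens * outputRate;
}
```

One trade-off of the fixed-length slice: two tool inputs that differ only after the first 64 characters share a signature, so a longer slice or a hash of the full input may suit tools with long arguments.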