Comparison

LangSmith vs RunGuard

This comparison is for AI teams deciding how to instrument their LangChain or LangGraph agents. LangSmith and RunGuard are not substitutes — they act at different points in a call lifecycle. Understanding the distinction prevents a common mistake: buying observability hoping it will catch loops before they bill you.

Quick verdict

Side by side

Feature | LangSmith | RunGuard
Primary role | Observability and eval | Runtime circuit breaker
Fires | Post-call (callbacks) | Pre-call (synchronous)
Can halt a loop mid-run | |
Per-run USD budget cap | |
Context-window overrun guard | |
Works outside LangChain ecosystem | Limited (traceable) | ✓ (any async fn)
Trace replay and session explorer | |
Eval datasets and scoring | |
Prompt hub and versioning | |
Pricing model | Developer free / $39/seat/mo Plus | $0 trial / $19/mo Solo
Best for | Post-run analysis and quality | Preventing runaway runs

Why LangSmith callbacks cannot halt a loop

LangSmith instruments LangChain agents through LangChain's CallbackManager. When you add a LangChainTracer to your AgentExecutor or StateGraph, LangChain fires on_tool_start, on_llm_start, on_agent_action, and similar callbacks at each event. These callbacks report events; they do not gate them. A handler's return value is ignored, so it cannot influence whether the next node executes, and by default any exception a handler raises is caught and logged by LangChain's callback runner rather than propagated to your agent code.

This is by design — callbacks are an observer pattern, not a guard pattern. The gap is structural, not a missing feature to request.
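The difference between the two patterns can be illustrated with a toy model. The sketch below is not LangChain's code: `notify_observers`, `run_with_observers`, `run_with_guard`, and `StopRun` are hypothetical names, and the runner only mimics the behaviour described above, where an exception raised inside a callback is logged instead of reaching the agent.

```python
import logging
from typing import Callable, List

logger = logging.getLogger("toy_callbacks")


class StopRun(Exception):
    """Raised by a check that wants the run to stop (hypothetical)."""


def deny_from_third_step(step: int) -> None:
    # A check that objects to step 3 and every step after it.
    if step >= 3:
        raise StopRun(f"refusing step {step}")


def notify_observers(observers: List[Callable[[int], None]], step: int) -> None:
    # Observer pattern: return values are ignored and handler errors
    # are logged and swallowed, so the caller never sees them.
    for observer in observers:
        try:
            observer(step)
        except Exception as exc:
            logger.warning("callback error ignored: %s", exc)


def run_with_observers(steps: int) -> int:
    executed = 0
    for step in range(1, steps + 1):
        notify_observers([deny_from_third_step], step)
        executed += 1  # the step runs whatever the observer did
    return executed


def run_with_guard(steps: int) -> int:
    executed = 0
    try:
        for step in range(1, steps + 1):
            # Guard pattern: the check runs before the step and its
            # exception propagates, so the step never executes.
            deny_from_third_step(step)
            executed += 1
    except StopRun:
        pass
    return executed
```

In this toy, `run_with_observers(5)` executes all five steps even though the check objects from step 3 onward, while `run_with_guard(5)` stops after two.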

How recursion_limit compares to RunGuard

LangGraph's recursion_limit is a step counter: the graph stops with a GraphRecursionError after N total steps, regardless of what each step does. It is a blunt instrument. With a limit below 30, a legitimate 30-step research workflow is cut off just like a 30-iteration loop.

RunGuard counts pattern repetitions, not total steps. A legitimate 30-step agent with diverse tool calls will not trigger RunGuard. A 4-step loop (call A → call B → call A → call B) triggers RunGuard at the second cycle (step 5). When RunGuard fires, it throws LoopDetectedError with e.pattern (the repeating signature) and e.repeats (how many times it appeared) — actionable information your agent code can catch and log.
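The contrast between a step counter and a repetition counter can be sketched in a few lines. This is a toy model, not RunGuard's or LangGraph's implementation: `run`, `find_repeat`, and `StepLimitError` are hypothetical names, `LoopDetectedError` here is a stand-in carrying only the two fields named above, and the threshold of two repeats and the maximum pattern length of 4 are assumptions. The check in this sketch looks at the calls already made before allowing the next one.

```python
from typing import List, Optional, Tuple


class StepLimitError(Exception):
    """Toy stand-in for a plain step-count limit (hypothetical)."""


class LoopDetectedError(Exception):
    """Toy stand-in exposing the pattern and repeats fields."""

    def __init__(self, pattern: Tuple[str, ...], repeats: int) -> None:
        super().__init__(f"pattern {pattern} repeated {repeats} times")
        self.pattern = pattern
        self.repeats = repeats


def find_repeat(
    history: List[str], max_period: int, min_repeats: int
) -> Optional[Tuple[Tuple[str, ...], int]]:
    # Look for the shortest block that repeats back-to-back at the
    # end of the history at least min_repeats times.
    for period in range(1, max_period + 1):
        if len(history) < 2 * period:
            break
        pattern = tuple(history[-period:])
        repeats = 1
        pos = len(history) - 2 * period
        while pos >= 0 and tuple(history[pos:pos + period]) == pattern:
            repeats += 1
            pos -= period
        if repeats >= min_repeats:
            return pattern, repeats
    return None


def run(
    calls: List[str],
    step_limit: Optional[int] = None,
    max_repeats: Optional[int] = None,
) -> int:
    history: List[str] = []
    for signature in calls:
        # Both checks run before the call is allowed to execute.
        if step_limit is not None and len(history) >= step_limit:
            raise StepLimitError(f"stopped after {len(history)} steps")
        if max_repeats is not None:
            found = find_repeat(history, max_period=4, min_repeats=max_repeats)
            if found is not None:
                raise LoopDetectedError(*found)
        history.append(signature)  # stands in for executing the call
    return len(history)
```

With these assumptions, 30 distinct calls pass the repetition check untouched but are cut off by `step_limit=25`, while an alternating A, B sequence is stopped before its fifth call with `e.pattern == ("A", "B")` and `e.repeats == 2`.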

FAQ

If I use RunGuard, do I still need LangSmith?
Depends on what you need. If you need post-run trace analysis, eval datasets, or prompt versioning, yes — RunGuard provides none of those. If your only concern is preventing runaway loops and budget overruns, RunGuard alone is sufficient. Most production teams want both: RunGuard catches the bad runs before they complete; LangSmith analyzes what got through.
Does RunGuard integrate with LangSmith's trace view?
RunGuard's LoopDetectedError is a standard JavaScript/Python exception. When it propagates out of a traced chain or tool, LangSmith's callback handler is notified (via on_chain_error or on_tool_error) and records it as an error span in the trace view. That span shows which call signature caused the trip, which makes it a useful candidate for your eval dataset.

Get early access

Add the pre-call gate your LangSmith setup is missing

Join the waitlist →