Comparison
LangSmith vs RunGuard
This comparison is for AI teams deciding how to instrument their LangChain or LangGraph agents. LangSmith and RunGuard are not substitutes — they act at different points in a call lifecycle. Understanding the distinction prevents a common mistake: buying observability hoping it will catch loops before they bill you.
Quick verdict
- Choose LangSmith if you need post-run visibility, eval dataset management, prompt versioning, annotation queues, or quality regression tracking. LangSmith is the de-facto observability layer for LangChain/LangGraph agents, and its eval tooling is the most mature available.
- Choose RunGuard if you need to prevent a run from completing when a loop is detected — halt the agent at turn 3 instead of turn 25, cap a run's USD spend before it exceeds your budget, or fire a Slack alert the moment a context-window overrun is imminent. RunGuard is a synchronous pre-call gate, not a logging system.
- Use both if you want observability on runs that complete and prevention on runs that should not. They compose without conflict: traceable(guard(fn)).
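To make the traceable(guard(fn)) ordering concrete, here is a minimal sketch. Note that `toy_guard`, `toy_traceable`, `call_tool`, and the `max_calls` limit are hypothetical stand-ins written for this illustration; they are not the real RunGuard or LangSmith functions and their signatures are assumptions. The point is only the nesting: the inner gate decides before the call, and the outer tracer records whatever happened.

```python
import asyncio


class LoopDetectedError(Exception):
    """Hypothetical stand-in for the error a pre-call gate would raise."""


def toy_guard(fn, max_calls=3):
    # Stand-in pre-call gate: refuses the call once a simple call limit is hit.
    # (A real guard would inspect call patterns or spend, not a bare counter.)
    calls = 0

    async def guarded(*args, **kwargs):
        nonlocal calls
        calls += 1
        if calls > max_calls:
            raise LoopDetectedError(f"blocked before call {calls}")
        return await fn(*args, **kwargs)

    return guarded


def toy_traceable(fn, log):
    # Stand-in tracer: records each outcome, then re-raises any error.
    async def traced(*args, **kwargs):
        try:
            result = await fn(*args, **kwargs)
        except Exception as exc:
            log.append(("error", type(exc).__name__))
            raise
        log.append(("ok", result))
        return result

    return traced


async def call_tool(query):
    # Placeholder for an LLM or tool call.
    return f"result for {query}"


async def main():
    log = []
    step = toy_traceable(toy_guard(call_tool, max_calls=2), log)
    for i in range(3):
        try:
            await step(f"q{i}")
        except LoopDetectedError:
            break  # the third call never reaches call_tool
    return log


trace_log = asyncio.run(main())
```

In this sketch the tracer sees two successful calls followed by one error entry, which mirrors the division of labor described above: the gate stops the call, and the observability layer records that it was stopped.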
Side by side
| | LangSmith | RunGuard |
|---|---|---|
| Primary role | Observability and eval | Runtime circuit breaker |
| Fires | Post-call (callbacks) | Pre-call (synchronous) |
| Can halt a loop mid-run | ✗ | ✓ |
| Per-run USD budget cap | ✗ | ✓ |
| Context-window overrun guard | ✗ | ✓ |
| Works outside LangChain ecosystem | Limited (traceable) | ✓ (any async fn) |
| Trace replay and session explorer | ✓ | ✗ |
| Eval datasets and scoring | ✓ | ✗ |
| Prompt hub and versioning | ✓ | ✗ |
| Pricing model | Developer free / $39/seat/mo Plus | $0 trial / $19/mo Solo |
| Best for | Post-run analysis and quality | Preventing runaway runs |
Why LangSmith callbacks cannot halt a loop
LangSmith instruments LangChain agents through LangChain's CallbackManager. When you add a LangChainTracer to your AgentExecutor or StateGraph, LangChain fires on_tool_start, on_llm_start, on_agent_action, and similar callbacks at each event. These callbacks fire after the event completes. They cannot return a value that influences whether the next node executes, and any exception they raise is caught by LangChain's callback runner rather than propagated to your agent code.
This is by design — callbacks are an observer pattern, not a guard pattern. The gap is structural, not a missing feature to request.
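The observer-versus-guard distinction can be illustrated with a toy model. This is a sketch of the two patterns as described above, not LangChain's actual callback manager; `run_callbacks`, `agent_with_observers`, `agent_with_guard`, and `HaltRun` are hypothetical names invented for the example.

```python
class HaltRun(Exception):
    """Hypothetical error a guard raises to stop the run."""


def run_callbacks(callbacks, event):
    # Observer-style runner: return values are ignored and handler
    # errors are swallowed, so a handler cannot stop the agent.
    for callback in callbacks:
        try:
            callback(event)
        except Exception:
            pass


def agent_with_observers(steps, callbacks):
    executed = []
    for step in steps:
        executed.append(step)           # the step runs first
        run_callbacks(callbacks, step)  # observers are notified afterwards
    return executed


def agent_with_guard(steps, check):
    executed = []
    for step in steps:
        check(step, executed)  # runs before the step and may raise
        executed.append(step)
    return executed


def angry_observer(event):
    raise RuntimeError("stop!")  # swallowed by run_callbacks


def stop_on_repeat(step, executed):
    # Deliberately naive rule for the demo: halt on any repeated step.
    if step in executed:
        raise HaltRun(step)


steps = ["search", "fetch", "search", "fetch"]

observed = agent_with_observers(steps, [angry_observer])

try:
    agent_with_guard(steps, stop_on_repeat)
    guarded_halted_at = None
except HaltRun as exc:
    guarded_halted_at = str(exc)
```

Here the observer raises on every step yet all four steps still execute, while the guard stops the run before the repeated "search" is executed.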
How recursion_limit compares to RunGuard
LangGraph's recursion_limit is a step counter: the graph halts after N total steps regardless of what each step does. It is a blunt instrument. A 30-step research workflow hits the limit as easily as a 30-iteration loop.
RunGuard counts pattern repetitions, not total steps. A legitimate 30-step agent with diverse tool calls will not trigger RunGuard. A 4-step loop (call A → call B → call A → call B) triggers RunGuard at the second cycle (step 5). When RunGuard fires, it throws LoopDetectedError with e.pattern (the repeating signature) and e.repeats (how many times it appeared) — actionable information your agent code can catch and log.
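The difference between counting steps and counting pattern repetitions can be sketched as follows. This is an illustrative detector written under stated assumptions, not RunGuard's algorithm: the `check_call` function, the `max_repeats` and `max_period` parameters, and their default values are all hypothetical, and the `LoopDetectedError` class here is a local stand-in that only mirrors the `pattern` and `repeats` attributes mentioned above.

```python
class LoopDetectedError(Exception):
    """Local stand-in carrying the repeating pattern and its repeat count."""

    def __init__(self, pattern, repeats):
        super().__init__(f"pattern {pattern} repeated {repeats} times")
        self.pattern = pattern
        self.repeats = repeats


def trailing_repeats(history, period):
    # Count how many times the last `period` items repeat
    # back-to-back at the end of `history`.
    pattern = history[-period:]
    count = 0
    end = len(history)
    while end - period >= 0 and history[end - period:end] == pattern:
        count += 1
        end -= period
    return count


def check_call(history, signature, max_repeats=3, max_period=4):
    # Pre-call check: would adding this call complete `max_repeats`
    # consecutive copies of some short pattern? If so, refuse the call.
    candidate = history + [signature]
    for period in range(1, min(max_period, len(candidate) // 2) + 1):
        repeats = trailing_repeats(candidate, period)
        if repeats >= max_repeats:
            raise LoopDetectedError(tuple(candidate[-period:]), repeats)
    history.append(signature)


def run(signatures):
    history = []
    for signature in signatures:
        check_call(history, signature)
    return len(history)


# A 30-step run with diverse calls passes untouched.
completed = run([f"tool_{i}" for i in range(30)])

# An A/B loop trips as soon as the pattern would appear a third time.
try:
    run(["A", "B"] * 15)
    tripped = None
except LoopDetectedError as e:
    tripped = (e.pattern, e.repeats)
```

A plain step counter set to 25 would have stopped the diverse 30-step run and let the A/B loop burn 25 steps first. The pattern-based check does the opposite, and the caught error carries the pattern and repeat count for logging.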
FAQ
- If I use RunGuard, do I still need LangSmith?
- Depends on what you need. If you need post-run trace analysis, eval datasets, or prompt versioning, yes — RunGuard provides none of those. If your only concern is preventing runaway loops and budget overruns, RunGuard alone is sufficient. Most production teams want both: RunGuard catches the bad runs before they complete; LangSmith analyzes what got through.
- Does RunGuard integrate with LangSmith's trace view?
- RunGuard's LoopDetectedError is a standard JavaScript/Python exception. When LangSmith's callback handler catches it (via on_chain_error or on_tool_error), it records it as an error span in the trace view, a useful row in your eval dataset showing which call signature caused the trip.