Bolt.new Cost Control: WebContainer Build Loops, Terminal Error Context Injection, Full-File Snapshot Re-reads, and Native Module Install Spirals

Bolt.new is an AI-powered web application generation platform built by StackBlitz that designs, writes, runs, and deploys full-stack web applications entirely within the browser. The execution environment is WebContainer — StackBlitz's WASM-based Node.js runtime that boots a complete Node.js environment inside a browser tab, with no server-side compute required. A developer describes what they want to build ("a SaaS dashboard with user authentication, a data table, and a Stripe billing page"), and Bolt generates the complete application — React components, TypeScript types, API routes, database schema, environment configuration — then runs it live in a preview iframe, all without leaving the browser.

This architecture creates a different cost profile than cloud-hosted AI coding tools like Replit Agent or Devin. Because WebContainer runs inside the browser, every build step — TypeScript compilation, Vite bundling, npm dependency installation — executes in the WASM runtime using the user's local CPU. But the AI generation side consumes tokens from Bolt.new's API backend regardless of where compute runs. Bolt.new charges in tokens — a composite of AI model input and output consumption across the conversation thread for a project. Pro plan users get 10 million tokens per month; Teams users get 20 million per seat. When a Bolt session runs correctly, a feature addition typically costs 50,000–200,000 tokens. When it enters a loop, the same feature addition can consume 800,000–3,000,000 tokens before the user manually intervenes or hits their monthly ceiling.

The root cause is structural: Bolt.new's generation loop — generate code, trigger WebContainer build, read build output, generate next iteration — accumulates costs at four distinct points that each operate independently of the others. WebContainer build loops occur when Vite's always-on HMR triggers a full TypeScript recompilation after every file the AI writes, so a 20-file refactor with cascading TypeScript errors triggers 40–65 build cycles instead of 20. Terminal error context injection occurs when Bolt injects the full WebContainer terminal output — TypeScript diagnostics, Vite bundler stack traces, npm resolution logs — into the AI prompt at every retry, so three failed build attempts can inject 60,000–100,000 tokens of error text before a single fix is generated. Full-file VFS snapshot re-reads occur because WebContainer's virtual filesystem requires Bolt to snapshot all relevant file contents at every generation step rather than providing incremental diffs, so a small change request on a 30-file project re-sends all 30 files' contents as input tokens. And native module install spirals occur when users request packages with Node.js native add-ons — which can't compile in a WASM environment — causing Bolt to cycle through install attempts and alternative packages, each attempt generating verbose npm output injected into context.

What this post covers: four cost amplification patterns specific to Bolt.new (WebContainer build loops, terminal error context injection, full-file VFS snapshot re-reads, and native module install spirals) and a runtime circuit breaker guard for each. The guards operate at the orchestration layer, giving you token spend ceilings without modifying Bolt's generation behavior for tasks that fit within budget.

Pattern 1: WebContainer Vite Build Loop Amplification

Bolt.new generates application code by writing files into WebContainer's virtual filesystem one at a time. After writing each file that affects the application structure — a TypeScript source file, a component, a route handler, a type definition — WebContainer's Vite dev server detects the file change and triggers Hot Module Replacement (HMR). HMR runs TypeScript type-checking on the modified file and its import graph, bundles the updated module, and refreshes the preview iframe. For a clean, linear feature addition where each file write produces no errors, this is fast and correct: 20 files × one clean build each = 20 build cycles.

The failure mode appears when a structural change — adding a new prop to a shared type, changing a function signature used across multiple components, restructuring a module's exports — introduces TypeScript errors in files that import the changed module. Bolt writes src/types/user.ts with the updated interface. Vite's HMR immediately type-checks all files that import user.ts: three components, two API route handlers, and a Zod validation schema. Five of those six files now have TypeScript errors because they reference the old interface shape. Bolt reads the type errors from WebContainer's terminal output, determines which files need updating, writes the next file, triggers HMR again. The cascade continues: updating UserCard.tsx resolves its own error but triggers a type mismatch in UserList.tsx that imports UserCard's props. A 20-file feature refactor with cascading TypeScript errors can trigger 40–65 HMR build cycles rather than 20.

The compounding factor is the scope of the incremental type-check. When a file changes, TypeScript's incremental checker re-checks every file that depends on it, directly or transitively, not just the changed file itself. A change to src/lib/api.ts imported by 12 components triggers a type-check that spans all 12 component files even if only 3 actually have errors. In a medium-complexity Bolt project where shared utilities are imported broadly, each build cycle takes 8–18 seconds of WebContainer compute and generates 3,000–8,000 tokens of terminal output read by the AI. Across 60 build cycles on a stuck refactor, that's 180,000–480,000 tokens of build output injected into the conversation thread before the refactor completes.

Bolt.new React/TypeScript project, 20-file feature refactor (cascading type errors):
Clean implementation (1 build per file): 20 builds × 5,000 output tokens = 100,000 build output tokens
With cascading type errors (3.2× multiplier): 64 builds × 5,000 output tokens = 320,000 build output tokens
Additional AI generation iterations to fix cascade: +12 generation steps × 15,000 tokens = 180,000 tokens
Total session overhead vs. clean: +400,000 tokens per looping refactor
Pro plan ($20/mo, 10M tokens): cascade overhead = 4% of monthly budget per session
Teams (20 devs × 5 looping refactors/month × 400,000 tokens): 40M tokens/month in build loop overhead alone (2M per seat)
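
The arithmetic above can be reproduced with a small estimator. This is a minimal sketch, not part of Bolt: the function name build_loop_overhead and its parameters are hypothetical, and the inputs are the illustrative figures from the listing rather than measured values.

```python
def build_loop_overhead(
    files: int,
    build_multiplier: float,
    tokens_per_build: int,
    extra_generation_steps: int,
    tokens_per_generation_step: int,
) -> dict:
    """Estimate the token overhead of a cascading refactor versus a clean one."""
    clean_builds = files  # one clean build per file written
    looping_builds = round(files * build_multiplier)
    clean_output = clean_builds * tokens_per_build
    looping_output = looping_builds * tokens_per_build
    regeneration = extra_generation_steps * tokens_per_generation_step
    return {
        "clean_build_tokens": clean_output,
        "looping_build_tokens": looping_output,
        "regeneration_tokens": regeneration,
        # Extra build output plus the extra generation steps spent on the cascade
        "overhead_tokens": (looping_output - clean_output) + regeneration,
    }


estimate = build_loop_overhead(
    files=20,
    build_multiplier=3.2,
    tokens_per_build=5_000,
    extra_generation_steps=12,
    tokens_per_generation_step=15_000,
)
print(estimate["overhead_tokens"])  # 400000
```

Dividing overhead_tokens by a plan allowance gives the budget share; with the 10M Pro figure used in this post, 400,000 tokens is 4%.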

Python — BoltBuildGuard

import hashlib
import re
from dataclasses import dataclass, field

@dataclass
class BoltBuildGuard:
    max_builds_per_session: int = 50
    max_consecutive_failures: int = 6
    max_build_output_tokens: int = 300_000
    error_fingerprint_window: int = 4
    tokens_per_line_estimate: float = 7.5
    _build_count: int = field(default=0, init=False)
    _consecutive_failures: int = field(default=0, init=False)
    _cumulative_output_tokens: int = field(default=0, init=False)
    _recent_error_fingerprints: list = field(default_factory=list, init=False)

    def _fingerprint(self, terminal_output: str) -> str:
        # Normalize line/column positions and quoted identifiers so that
        # structurally identical errors hash the same across cascade iterations.
        normalized = re.sub(r':\d+:\d+', ':L:C', terminal_output)   # path:12:5
        normalized = re.sub(r'\(\d+,\d+\)', '(L,C)', normalized)    # path(12,5)
        normalized = re.sub(r"'[^']*'", "'X'", normalized)[:640]
        return hashlib.sha256(normalized.encode()).hexdigest()[:12]

    def on_build_start(self, trigger_file: str = "") -> None:
        self._build_count += 1
        if self._build_count > self.max_builds_per_session:
            raise RuntimeError(
                f"BoltBuildGuard: build ceiling {self.max_builds_per_session} "
                f"reached (triggered by {trigger_file or 'unknown'}). "
                "Bolt is writing files one at a time and triggering Vite HMR "
                "after each write. Batch all remaining file writes and trigger "
                "a single build rather than incremental per-file builds."
            )

    def on_build_complete(
        self,
        success: bool,
        terminal_output: str,
        trigger_file: str = "",
    ) -> None:
        output_tokens = int(
            len(terminal_output.splitlines()) * self.tokens_per_line_estimate
        )
        self._cumulative_output_tokens += output_tokens

        if self._cumulative_output_tokens > self.max_build_output_tokens:
            raise RuntimeError(
                f"BoltBuildGuard: cumulative build output {self._cumulative_output_tokens:,} tokens "
                f"exceeds ceiling {self.max_build_output_tokens:,} across "
                f"{self._build_count} builds. "
                "Terminal output accumulation is the primary token driver — "
                "truncate error output to last 30 lines per build failure "
                "rather than injecting full terminal logs."
            )

        if not success:
            self._consecutive_failures += 1
            fp = self._fingerprint(terminal_output)
            self._recent_error_fingerprints.append(fp)
            if len(self._recent_error_fingerprints) > self.error_fingerprint_window:
                self._recent_error_fingerprints.pop(0)

            if (
                len(self._recent_error_fingerprints) == self.error_fingerprint_window
                and len(set(self._recent_error_fingerprints)) == 1
            ):
                raise RuntimeError(
                    f"BoltBuildGuard: identical TypeScript error across last "
                    f"{self.error_fingerprint_window} builds "
                    f"(file: {trigger_file or 'unknown'}). "
                    "Agent is in a circular type error loop — the cascade is not "
                    "resolving via local per-file fixes. Require a full type audit "
                    "before writing any additional files."
                )

            if self._consecutive_failures >= self.max_consecutive_failures:
                raise RuntimeError(
                    f"BoltBuildGuard: {self._consecutive_failures} consecutive "
                    f"WebContainer build failures. Halt per-file iteration — "
                    "plan the full type-safe change set before writing any more files."
                )
        else:
            self._consecutive_failures = 0
            self._recent_error_fingerprints.clear()

BoltBuildGuard trips on four conditions: the total build count ceiling (max_builds_per_session=50), the cumulative build output token ceiling (max_build_output_tokens=300_000), the consecutive failure ceiling (max_consecutive_failures=6), and the error fingerprint loop detector, which identifies when the last N build errors are structurally identical after normalizing line and column positions and quoted identifiers, indicating the agent is chasing a circular type cascade without making progress. Call on_build_start(trigger_file) before each WebContainer build and on_build_complete(success, terminal_output, trigger_file) after. The build output token ceiling is the most effective Bolt-specific intervention because WebContainer terminal output is the dominant token driver in looping Bolt sessions, far more so than the AI generation steps themselves.
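
To see why the fingerprint detector catches a cascade that keeps moving to new lines, here is a standalone sketch of the same normalization. The function name error_fingerprint is hypothetical, the sample diagnostics are made up for illustration, and the extra rule for the path(12,5) position format is an assumption about what the terminal may print.

```python
import hashlib
import re


def error_fingerprint(terminal_output: str, prefix_chars: int = 640) -> str:
    """Hash build output after removing positions and quoted identifiers."""
    normalized = re.sub(r":\d+:\d+", ":L:C", terminal_output)   # path:12:5
    normalized = re.sub(r"\(\d+,\d+\)", "(L,C)", normalized)    # path(12,5)
    normalized = re.sub(r"'[^']*'", "'X'", normalized)          # 'fullName' -> 'X'
    return hashlib.sha256(normalized[:prefix_chars].encode()).hexdigest()[:12]


# Illustrative diagnostics: same file and error code, different line and property.
first = ("src/components/UserCard.tsx:14:9 - error TS2339: "
         "Property 'fullName' does not exist on type 'User'.")
second = ("src/components/UserCard.tsx:22:3 - error TS2339: "
          "Property 'displayName' does not exist on type 'User'.")
# A different file and a different error code.
other = ("src/components/UserList.tsx:8:1 - error TS2322: "
         "Type 'string' is not assignable to type 'number'.")

print(error_fingerprint(first) == error_fingerprint(second))  # True
print(error_fingerprint(first) == error_fingerprint(other))   # False
```

Because the file path stays in the normalized text, the same error surfacing in a different file produces a different fingerprint; that case is left to the consecutive failure ceiling.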

Pattern 2: Terminal Error Context Injection into the AI Prompt

When a WebContainer build fails, Bolt.new injects the full terminal output from the failed build into the next AI generation request. The terminal output includes the TypeScript compiler's diagnostic messages, Vite's bundler output, any runtime errors from the Node.js process inside WebContainer, and npm's resolution and installation logs if a package operation was involved. For a simple TypeScript error in a single file, this output is 20–40 lines. For a cascading type error across a 15-component project, or an npm peer dependency conflict across a project with 100+ dependencies, the terminal output can be 3,000–15,000 lines.

The failure mode is cumulative. Bolt does not summarize or truncate the terminal output before injection — it sends the complete terminal log because the full context is required for the AI to correctly diagnose which files need changes and in what order. At the first failed build, 5,000 lines × 7.5 tokens/line = 37,500 tokens of terminal output are added to the context window. At the second failed build, another 37,500 tokens. By the third failed build, the context contains 112,500 tokens of terminal output from three failed builds — before accounting for the file contents, the conversation history, or the AI's reasoning steps. A Bolt session that hits 5 failed builds on a complex TypeScript refactor can inject 150,000–250,000 tokens of error text into the prompt before the first successful build cycle completes.

The Bolt-specific aggravation is that WebContainer captures both the Vite process's stderr and the TypeScript language server's diagnostic output in the same terminal stream. In a local development environment, these are separate processes you can filter independently. In WebContainer, they merge into a single terminal log that includes every file path, every line and column number, every diagnostic code, and every "note:" context line from the TypeScript compiler's error explanation. A single TypeScript strict-mode error in a file with 5 import paths generates 12–18 lines of diagnostic output; a cascade across 15 files generates 180–270 lines from TypeScript alone, before Vite adds its own bundler-layer error framing on top.

Bolt.new session with 5 consecutive failed builds (npm + TypeScript cascade):
Each failed build terminal output: ~4,000 lines × 7.5 tokens/line = 30,000 tokens
5 failed builds: 5 × 30,000 = 150,000 error context tokens
Plus AI generation per retry: 5 × 8,000 tokens = 40,000 tokens
Total error-driven session overhead: 190,000 tokens on error handling alone
If terminal output truncated to 50 lines per failure: 5 × 375 = 1,875 tokens
Savings from truncation: 148,125 tokens = ~99% reduction in error context tokens (78% of the 190,000-token total overhead)
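
The savings figures depend on which denominator is used, so it helps to compute both. The sketch below uses a hypothetical helper, truncation_savings, with the illustrative inputs from the listing above.

```python
def truncation_savings(
    failed_builds: int,
    lines_per_failure: int,
    kept_lines: int,
    generation_tokens_per_retry: int,
    tokens_per_line: float = 7.5,
) -> dict:
    """Compare full terminal-output injection with last-N-lines truncation."""
    full = failed_builds * lines_per_failure * tokens_per_line
    truncated = failed_builds * min(kept_lines, lines_per_failure) * tokens_per_line
    generation = failed_builds * generation_tokens_per_retry
    saved = full - truncated
    return {
        "full_error_tokens": int(full),
        "truncated_error_tokens": int(truncated),
        "saved_tokens": int(saved),
        # Share of the error text itself that truncation removes
        "share_of_error_context": saved / full,
        # Share of error text plus per-retry generation cost
        "share_of_total_overhead": saved / (full + generation),
    }


result = truncation_savings(
    failed_builds=5,
    lines_per_failure=4_000,
    kept_lines=50,
    generation_tokens_per_retry=8_000,
)
print(result["saved_tokens"])  # 148125
```

With these inputs truncation removes 98.75% of the error text, which is about 78% of the combined 190,000-token overhead once per-retry generation is included.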

Python — BoltErrorContextGuard

from dataclasses import dataclass, field

@dataclass
class BoltErrorContextGuard:
    max_error_tokens_per_injection: int = 8_000
    max_cumulative_error_tokens: int = 60_000
    max_retry_attempts: int = 5
    tokens_per_line_estimate: float = 7.5
    truncate_to_lines: int = 50
    _total_error_tokens: int = field(default=0, init=False)
    _retry_count: int = field(default=0, init=False)

    def prepare_error_for_injection(
        self,
        terminal_output: str,
        is_retry: bool = False,
    ) -> str:
        """
        Call this before injecting terminal error output into the AI prompt.
        Returns terminal_output, truncated to the last truncate_to_lines lines
        when it exceeds the per-injection ceiling. Raises RuntimeError if the
        retry ceiling or the cumulative token ceiling is exceeded.
        """
        if is_retry:
            self._retry_count += 1
            if self._retry_count > self.max_retry_attempts:
                raise RuntimeError(
                    f"BoltErrorContextGuard: {self._retry_count} retry attempts "
                    f"(ceiling: {self.max_retry_attempts}). "
                    "Repeated injection of build error context is not resolving the failure. "
                    "Surface the current state to the user for manual diagnosis "
                    "rather than continuing to retry with full error context."
                )

        lines = terminal_output.splitlines()
        raw_tokens = int(len(lines) * self.tokens_per_line_estimate)

        if raw_tokens > self.max_error_tokens_per_injection:
            # Truncate to last N lines — the most recent errors are most actionable
            lines = lines[-self.truncate_to_lines:]
            terminal_output = (
                f"[Error output truncated: {len(lines)} of "
                f"{len(terminal_output.splitlines())} lines shown — "
                f"last {self.truncate_to_lines} lines]\n"
                + "\n".join(lines)
            )
            injected_tokens = int(
                len(terminal_output.splitlines()) * self.tokens_per_line_estimate
            )
        else:
            injected_tokens = raw_tokens

        self._total_error_tokens += injected_tokens

        if self._total_error_tokens > self.max_cumulative_error_tokens:
            raise RuntimeError(
                f"BoltErrorContextGuard: cumulative error context "
                f"{self._total_error_tokens:,} tokens exceeds ceiling "
                f"{self.max_cumulative_error_tokens:,} across "
                f"{self._retry_count} retries. "
                "Error injection is the dominant token driver this session. "
                "Switch to a structured error summary (file path + error code only) "
                "rather than full terminal output injection."
            )

        return terminal_output

    @property
    def summary(self) -> dict:
        return {
            "retry_count": self._retry_count,
            "total_error_tokens": self._total_error_tokens,
            "remaining_error_budget": max(
                0, self.max_cumulative_error_tokens - self._total_error_tokens
            ),
        }

BoltErrorContextGuard enforces three constraints: a retry ceiling (max_retry_attempts=5), a per-injection token ceiling that is handled by truncation rather than an exception, and a cumulative token ceiling. Call prepare_error_for_injection(terminal_output, is_retry) before each injection of WebContainer terminal output into the AI prompt, passing is_retry=True when the agent is explicitly retrying after a failed build. The method returns the terminal output string that should actually be injected: either the original (if within the per-injection ceiling of 8,000 tokens) or the last 50 lines with a truncation notice prepended. The per-injection ceiling prevents a single catastrophic build failure with 15,000 lines of output from consuming the session's error context budget in one shot; the cumulative ceiling catches the slower accumulation pattern where 5–8 moderate failures add up to the same total. When the cumulative ceiling is exceeded, the trip message suggests switching to a structured error summary (just the file path, line number, and TypeScript error code rather than the full diagnostic context), which reduces error injection cost by 80–90% while preserving the information the AI actually uses to plan fixes.
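
A structured error summary of the kind the trip message recommends could look like the following sketch. The function summarize_ts_errors is hypothetical; it assumes the terminal text has already had ANSI color codes stripped, uses POSIX-style paths, and follows one of the two position formats that tsc prints, path(line,col) or path:line:col.

```python
import re

# Matches "path(line,col): error TS1234" and "path:line:col - error TS1234".
_TS_DIAGNOSTIC = re.compile(
    r"^(?P<path>[^\s(:]+)"
    r"(?:\((?P<l1>\d+),\d+\)|:(?P<l2>\d+):\d+)"
    r"\s*(?::|-)\s*error\s+(?P<code>TS\d+)"
)


def summarize_ts_errors(terminal_output: str, max_entries: int = 40) -> str:
    """Reduce compiler output to unique 'path:line TScode' entries."""
    seen = set()
    entries = []
    for line in terminal_output.splitlines():
        match = _TS_DIAGNOSTIC.match(line.strip())
        if not match:
            continue  # context lines, bundler framing, npm logs
        key = (match["path"], match["l1"] or match["l2"], match["code"])
        if key in seen:
            continue  # the same diagnostic is often printed more than once
        seen.add(key)
        entries.append(f"{key[0]}:{key[1]} {key[2]}")
    shown = entries[:max_entries]
    if len(entries) > max_entries:
        shown.append(f"... {len(entries) - max_entries} more diagnostics omitted")
    return "\n".join(shown)


# Illustrative terminal output, not captured from a real session.
sample = """\
src/types/user.ts(4,3): error TS2322: Type 'string' is not assignable to type 'number'.
  note: context line that the summary drops
src/components/UserCard.tsx:14:9 - error TS2339: Property 'fullName' does not exist on type 'User'.
src/components/UserCard.tsx:14:9 - error TS2339: Property 'fullName' does not exist on type 'User'.
"""
print(summarize_ts_errors(sample))
# src/types/user.ts:4 TS2322
# src/components/UserCard.tsx:14 TS2339
```

The summary drops message text entirely, so it suits retries where the agent already has the affected files in context; the first failure may still warrant the truncated raw output.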

Pattern 3: Full-File VFS Snapshot Re-reads at Every Generation Step

Bolt.new's AI receives the current state of the project as context at every generation step. Because WebContainer's virtual filesystem is a snapshot-based in-memory filesystem rather than a stream-based diffing system, Bolt provides the AI with complete file contents for every file relevant to the current task — the component being modified, its imports, the types it references, the API routes it calls — rather than incremental diffs from the last generation step. This is architecturally necessary: the AI needs to see the actual current state of each file to generate a coherent update, since multiple prior generation steps may have modified it.

The cost accumulates as the project grows. On a small project (5 files, 100 lines each), re-sending all file contents at each generation step costs 5 × 100 × 7.5 = 3,750 tokens per step — negligible. On a medium project (30 files, 180 lines average, including TypeScript type files that tend to run long), the same operation costs 30 × 180 × 7.5 = 40,500 tokens per step. For a Bolt session that requires 15 generation steps to build a complex feature — add a new page, wire up the API, handle authentication, write tests — the file content re-reads alone contribute 15 × 40,500 = 607,500 tokens of input context before accounting for conversation history, the user's instructions, or the AI's generated code. At 15 generation steps, that's already 6% of a Pro user's monthly token budget consumed by file content re-transmission for a single session.

The pattern amplifies when the AI revises the same file multiple times. If src/components/Dashboard.tsx is read and rewritten at steps 3, 7, 9, 11, and 14 of a 15-step session, that file's content passes through the model 10 times: (5 reads + 5 writes) × 200 lines × 7.5 tokens/line = 15,000 tokens from one component file across the session, on top of its share of every per-step snapshot. A Dashboard component that is the central hub of the feature being built, the file that receives the most iterative refinement, therefore costs several times more than a stable file of the same size, and its share grows with each additional revision.

Bolt.new session, 30-file project, 15 generation steps:
File context per step: 30 files × 180 avg lines × 7.5 tokens/line = 40,500 tokens
15 steps × 40,500 tokens = 607,500 input tokens in file re-reads
Hot file (Dashboard.tsx, 200 lines, revised 5× at steps 3/7/9/11/14):
10 read+write events × 200 lines × 7.5 = 15,000 tokens from one file
Pro plan token budget: 10M tokens/mo
One complex Bolt session: ~700K tokens = 7% of monthly Pro budget
Teams account: 10 developers × 5 complex sessions/month × 700K tokens = 35M tokens/month (3.5M per seat, 17.5% of a 20M per-seat allowance)
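
The per-step and per-session figures follow directly from file count, average length, and step count. A minimal sketch, assuming the post's 7.5 tokens-per-line estimate and a hypothetical helper named snapshot_cost:

```python
def snapshot_cost(file_lines: dict, steps: int, tokens_per_line: float = 7.5) -> int:
    """Tokens spent re-sending every file in file_lines at each of `steps` steps."""
    per_step = sum(lines * tokens_per_line for lines in file_lines.values())
    return int(per_step * steps)


# Illustrative 30-file project averaging 180 lines per file.
project = {f"src/file_{i}.ts": 180 for i in range(30)}

per_step = snapshot_cost(project, steps=1)
session = snapshot_cost(project, steps=15)
hot_file = int(10 * 200 * 7.5)  # 10 read+write events on a 200-line file

print(per_step)  # 40500
print(session)   # 607500
print(hot_file)  # 15000
```

Passing real line counts from the project tree in place of the synthetic dict gives a quick pre-session estimate of how much of the budget re-reads alone will consume.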

Python — BoltSnapshotGuard

from dataclasses import dataclass, field

@dataclass
class BoltSnapshotGuard:
    max_snapshot_tokens_per_step: int = 60_000
    max_cumulative_snapshot_tokens: int = 800_000
    max_revisions_per_file: int = 6
    max_total_file_revisions: int = 60
    tokens_per_line_estimate: float = 7.5
    _cumulative_snapshot_tokens: int = field(default=0, init=False)
    _step_count: int = field(default=0, init=False)
    _total_file_revisions: int = field(default=0, init=False)
    _revisions_per_file: dict = field(default_factory=dict, init=False)

    def on_generation_step_start(
        self,
        files_in_context: dict[str, int],  # {filepath: line_count}
    ) -> None:
        """Call at the start of each Bolt generation step with all files being sent as context."""
        self._step_count += 1
        step_tokens = sum(
            int(lines * self.tokens_per_line_estimate)
            for lines in files_in_context.values()
        )
        self._cumulative_snapshot_tokens += step_tokens

        if step_tokens > self.max_snapshot_tokens_per_step:
            large_files = sorted(
                [(f, int(l * self.tokens_per_line_estimate))
                 for f, l in files_in_context.items()],
                key=lambda x: x[1],
                reverse=True,
            )[:5]
            raise RuntimeError(
                f"BoltSnapshotGuard: step {self._step_count} snapshot "
                f"{step_tokens:,} tokens exceeds per-step ceiling "
                f"{self.max_snapshot_tokens_per_step:,}. "
                f"Largest files: {large_files}. "
                "Reduce context to only the files directly modified in this step "
                "rather than sending the full project snapshot."
            )

        if self._cumulative_snapshot_tokens > self.max_cumulative_snapshot_tokens:
            raise RuntimeError(
                f"BoltSnapshotGuard: cumulative file snapshot tokens "
                f"{self._cumulative_snapshot_tokens:,} exceed ceiling "
                f"{self.max_cumulative_snapshot_tokens:,} across "
                f"{self._step_count} generation steps. "
                "File re-reads are the dominant input token cost this session. "
                "Scope remaining changes to the fewest files needed and "
                "avoid re-reading stable files at every step."
            )

    def on_file_written(self, filepath: str, line_count: int) -> None:
        """Call each time Bolt writes a file to the WebContainer VFS."""
        self._total_file_revisions += 1
        self._revisions_per_file[filepath] = (
            self._revisions_per_file.get(filepath, 0) + 1
        )

        if self._revisions_per_file[filepath] > self.max_revisions_per_file:
            raise RuntimeError(
                f"BoltSnapshotGuard: file '{filepath}' has been written "
                f"{self._revisions_per_file[filepath]} times "
                f"(ceiling: {self.max_revisions_per_file}). "
                "Repeated rewrites of the same file drive both snapshot token cost "
                "and build loop amplification. Refactor the design rather than "
                "continuing to patch the current implementation."
            )

        if self._total_file_revisions > self.max_total_file_revisions:
            raise RuntimeError(
                f"BoltSnapshotGuard: total file write ceiling "
                f"{self.max_total_file_revisions} reached "
                f"({len(self._revisions_per_file)} unique files written). "
                "Excessive file writes indicate a scope problem — "
                "scope remaining work to the highest-value changes."
            )

    @property
    def hot_files(self) -> list[tuple[str, int]]:
        """Files written more than twice — primary drivers of snapshot token cost."""
        return sorted(
            [(f, c) for f, c in self._revisions_per_file.items() if c > 2],
            key=lambda x: x[1],
            reverse=True,
        )

BoltSnapshotGuard tracks four dimensions: the per-generation-step snapshot token count (max_snapshot_tokens_per_step=60_000), the cumulative snapshot tokens across the session (max_cumulative_snapshot_tokens=800_000), the per-file write count (max_revisions_per_file=6), and the total file write count (max_total_file_revisions=60). Call on_generation_step_start(files_in_context) at the beginning of each Bolt generation step, passing a dict of {filepath: line_count} for all files being included in the AI's context — the guard calculates the token cost and trips if either ceiling is exceeded. Call on_file_written(filepath, line_count) each time Bolt writes a file to WebContainer. Use the hot_files property to identify which files are being rewritten most frequently — these are the primary targets for design stabilization.
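
One way to act on the guard's advice to avoid re-reading stable files is to track content hashes and include only files that changed since they were last sent. This is a sketch under a strong assumption: that your orchestration layer controls which files enter the context and that earlier copies remain visible in the conversation. The class name ChangedFileTracker is hypothetical.

```python
import hashlib


class ChangedFileTracker:
    """Remembers the hash of each file as last sent and returns only changed files."""

    def __init__(self) -> None:
        self._sent_hashes = {}

    def select_for_context(self, files: dict) -> dict:
        """files maps path -> current content; returns the subset worth re-sending."""
        changed = {}
        for path, content in files.items():
            digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
            if self._sent_hashes.get(path) != digest:
                changed[path] = content
                self._sent_hashes[path] = digest
        return changed


tracker = ChangedFileTracker()
files = {"src/App.tsx": "v1", "src/types/user.ts": "v1"}

first = tracker.select_for_context(files)    # nothing sent yet, so both files
files["src/App.tsx"] = "v2"
second = tracker.select_for_context(files)   # only the file that changed

print(sorted(first))   # ['src/App.tsx', 'src/types/user.ts']
print(sorted(second))  # ['src/App.tsx']
```

The line counts of the returned subset can be passed to on_generation_step_start so the guard measures what is actually sent rather than the full project.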

Pattern 4: Native Module Incompatibility Install Spirals in WebContainer

WebContainer's WASM-based Node.js runtime runs entirely inside the browser's JavaScript engine. This means it cannot execute Node.js native add-ons — compiled C++ extensions built via node-gyp or prebuild — because there is no way to compile or load machine-code binaries in a WASM environment. The list of packages that fail to install correctly in WebContainer because they depend on native add-ons is substantial: bcrypt (password hashing), sharp (image processing), canvas (server-side Canvas API), sqlite3 (SQLite), argon2 (password hashing), node-sass (Sass compilation), puppeteer (headless Chrome), fsevents (file system events on macOS), and many cryptography libraries that use native OpenSSL bindings.

When a user asks Bolt to "add authentication with bcrypt password hashing" or "add image upload processing with sharp," Bolt attempts to install the requested package. The install fails — WebContainer's npm either produces a node-gyp compilation error or installs successfully but fails at runtime when the native module attempts to load. Bolt reads the error, recognizes the issue, and tries the most common WASM-compatible alternative: bcryptjs instead of bcrypt, jimp or @squoosh/lib instead of sharp. The alternative installs, but may have a different API that requires updating every file that imported the original package. Bolt writes the updated imports, triggers HMR, and discovers a second issue — the alternative's TypeScript types have incompatible signatures from the original package's types, requiring a third round of changes.

Each install attempt in this cycle consumes 30–90 seconds of WebContainer compute and produces 500–3,000 lines of npm output injected into the AI context. A typical native module substitution spiral — original package fails, first alternative fails for a different reason, second alternative succeeds but needs API adaptation — runs 3–5 npm install invocations and generates 3,000–9,000 lines of npm output across the spiral. For users requesting multiple native-dependent packages (authentication + image processing + PDF generation in a single session), each package can trigger its own independent spiral, with the install output from each spiral stacking in the conversation context.

Bolt.new session, authentication + image processing (native module failures):
bcrypt attempt: npm output 800 lines × 7.5 = 6,000 tokens; runtime fail + AI regen = 10,000 tokens
bcryptjs fallback: install 200 lines + API adaptation generation = 8,000 tokens
sharp attempt: npm output 1,200 lines × 7.5 = 9,000 tokens; fail + AI regen = 12,000 tokens
jimp fallback: install 300 lines + API adaptation (different interface) = 14,000 tokens
jimp typing issues: 400 lines error output + fix generation = 9,000 tokens
Total spiral cost: 68,000 tokens for 2 packages that should cost ~4,000 tokens combined
Overhead multiplier: 17× expected cost for native module substitution
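
The line items above are additive: install output and the regeneration that follows each failure are counted separately. A short tally, using the same illustrative figures, confirms the total and the multiplier.

```python
# Each entry is (description, tokens); values are the illustrative figures above.
spiral_steps = [
    ("bcrypt install output", 6_000),
    ("bcrypt runtime failure + regeneration", 10_000),
    ("bcryptjs install + API adaptation", 8_000),
    ("sharp install output", 9_000),
    ("sharp failure + regeneration", 12_000),
    ("jimp install + API adaptation", 14_000),
    ("jimp typing errors + fix generation", 9_000),
]

total = sum(tokens for _, tokens in spiral_steps)
expected = 4_000  # assumed cost of two clean installs
multiplier = total / expected

print(total)       # 68000
print(multiplier)  # 17.0
```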

Python — BoltNativeModuleGuard

from dataclasses import dataclass, field

# Packages known to require native addons incompatible with WebContainer WASM
WEBCONTAINER_INCOMPATIBLE: frozenset = frozenset({
    "bcrypt", "argon2",
    "sharp", "canvas", "node-canvas",
    "sqlite3", "better-sqlite3",
    "node-sass", "sass-embedded",
    "puppeteer", "puppeteer-core",
    "fsevents",
    "cpu-features",
    "node-gyp",
    "node-addon-api",
    "bindings",
    "nan",
    "kerberos",
    "mongodb-client-encryption",
    "snappy",
    "re2",
})

# Only replacements that avoid native binaries belong here. canvas and
# puppeteer have no entry: @napi-rs/canvas is itself a native binding, and
# playwright needs a browser binary that WebContainer cannot launch.
KNOWN_ALTERNATIVES: dict = {
    "bcrypt": "bcryptjs",
    "argon2": "argon2-browser",
    "sharp": "jimp",
    "sqlite3": "@electric-sql/pglite",  # WASM Postgres, not SQLite: queries may need porting
    "node-sass": "sass",
}

@dataclass
class BoltNativeModuleGuard:
    max_install_attempts: int = 6
    max_install_output_tokens: int = 40_000
    max_npm_output_lines_per_injection: int = 80
    tokens_per_line_estimate: float = 7.5
    _install_count: int = field(default=0, init=False)
    _cumulative_install_tokens: int = field(default=0, init=False)
    _failed_packages: list = field(default_factory=list, init=False)

    def check_package_before_install(self, package_name: str) -> str | None:
        """
        Call before each npm install. Returns a warning string if the package
        is known to be incompatible with WebContainer, or None if safe to proceed.
        """
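        # Strip a trailing @version but keep a leading @scope/ prefix;
        # a plain split("@")[0] returns "" for scoped names like @scope/pkg.
        spec = package_name.strip()
        at = spec.rfind("@")
        base_name = spec[:at] if at > 0 else spec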
        if base_name in WEBCONTAINER_INCOMPATIBLE:
            alternative = KNOWN_ALTERNATIVES.get(base_name)
            warning = (
                f"BoltNativeModuleGuard: '{base_name}' requires native Node.js "
                f"add-ons compiled via node-gyp and cannot run in WebContainer's "
                f"WASM environment."
            )
            if alternative:
                warning += f" Use '{alternative}' instead."
            return warning
        return None

    def on_install_start(self, packages: list[str]) -> None:
        self._install_count += 1
        if self._install_count > self.max_install_attempts:
            raise RuntimeError(
                f"BoltNativeModuleGuard: npm install attempt ceiling "
                f"{self.max_install_attempts} reached "
                f"({self._install_count} attempts, "
                f"failed packages: {self._failed_packages}). "
                "Native module substitution spiral detected — "
                "declare all dependencies upfront with WASM-compatible alternatives "
                "rather than discovering incompatibilities iteratively."
            )

    def on_install_complete(
        self,
        success: bool,
        packages: list[str],
        npm_output: str,
    ) -> str:
        """
        Call after each npm install. Returns (possibly truncated) npm_output
        safe to inject into AI context, and raises if cumulative ceiling exceeded.
        """
        lines = npm_output.splitlines()
        if len(lines) > self.max_npm_output_lines_per_injection:
            # Keep last N lines: npm usually reports the error cause near the end of its output
            lines = lines[-self.max_npm_output_lines_per_injection:]
            npm_output = (
                f"[npm output truncated to last {self.max_npm_output_lines_per_injection} lines]\n"
                + "\n".join(lines)
            )

        injected_tokens = int(len(lines) * self.tokens_per_line_estimate)
        self._cumulative_install_tokens += injected_tokens

        if not success:
            self._failed_packages.extend(packages)

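            # Match on the bare package name; skip the first character when
            # splitting so scoped names ("@scope/pkg@1.2.3") keep their scope.
            def _base(spec: str) -> str:
                spec = spec.strip()
                return spec[:1] + spec[1:].split("@")[0]

            native_failures = [
                p for p in packages
                if _base(p) in WEBCONTAINER_INCOMPATIBLE
            ]
            if native_failures:
                alts = {p: KNOWN_ALTERNATIVES.get(_base(p), "unknown")
                        for p in native_failures}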
                raise RuntimeError(
                    f"BoltNativeModuleGuard: native module install failure detected "
                    f"for {native_failures}. "
                    f"WebContainer cannot compile or load native Node.js add-ons. "
                    f"Switch to WASM-compatible alternatives: {alts}. "
                    "Do not retry with --ignore-scripts or --force — "
                    "the package will install but fail at runtime."
                )

        if self._cumulative_install_tokens > self.max_install_output_tokens:
            raise RuntimeError(
                f"BoltNativeModuleGuard: cumulative npm output "
                f"{self._cumulative_install_tokens:,} tokens exceed ceiling "
                f"{self.max_install_output_tokens:,} across "
                f"{self._install_count} install attempts. "
                "npm output injection is inflating session token cost — "
                "resolve all package dependencies in a single install command "
                "rather than iterative per-package installs."
            )

        return npm_output

BoltNativeModuleGuard addresses the native module problem at three points. Before each install, check_package_before_install(package_name) screens the package against a known incompatibility list and returns a warning with a recommended alternative, which skips the entire install-fail-retry cycle for the most common offenders. At the start of each install, on_install_start(packages) enforces the attempt ceiling (max_install_attempts=6). After the install, on_install_complete(success, packages, npm_output) truncates npm output to the last 80 lines before injection, detects native-module failures by checking which failed packages are in the incompatibility set, and enforces the cumulative install output token ceiling (max_install_output_tokens=40_000). The WEBCONTAINER_INCOMPATIBLE set and the KNOWN_ALTERNATIVES mapping are designed to be extended as you discover additional packages that fail in your Bolt projects: the failure mode itself is stable, but the list of affected packages grows as users request more complex package combinations.
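The pre-screening step can also be run over a whole dependency list before any install is attempted. The sketch below is a minimal illustration, not the article's implementation: INCOMPATIBLE and ALTERNATIVES are hypothetical, abbreviated stand-ins for WEBCONTAINER_INCOMPATIBLE and KNOWN_ALTERNATIVES, and base_package_name and prescreen are assumed helper names. The name parsing also handles scoped packages such as @types/node.

```python
from __future__ import annotations

# Hypothetical, abbreviated stand-ins for the article's
# WEBCONTAINER_INCOMPATIBLE set and KNOWN_ALTERNATIVES mapping.
INCOMPATIBLE = {"bcrypt", "fsevents", "cpu-features"}
ALTERNATIVES = {"bcrypt": "bcryptjs"}  # bcryptjs is a pure-JavaScript implementation


def base_package_name(spec: str) -> str:
    """Strip a version suffix while keeping the leading '@' of scoped names."""
    spec = spec.strip()
    # Skip the first character so '@scope/pkg@1.2.3' splits on the second '@'.
    return spec[:1] + spec[1:].split("@")[0]


def prescreen(specs: list[str]) -> tuple[list[str], dict[str, str | None]]:
    """Split requested packages into installable specs and blocked names."""
    installable: list[str] = []
    blocked: dict[str, str | None] = {}
    for spec in specs:
        name = base_package_name(spec)
        if name in INCOMPATIBLE:
            # None means no known WASM-compatible alternative is recorded.
            blocked[name] = ALTERNATIVES.get(name)
        else:
            installable.append(spec)
    return installable, blocked


ok, blocked = prescreen(
    ["react@18.2.0", "bcrypt@5.1.1", "@types/node@20", "cpu-features"]
)
print(ok)       # ['react@18.2.0', '@types/node@20']
print(blocked)  # {'bcrypt': 'bcryptjs', 'cpu-features': None}
```

Screening the full list in one pass supports the single-install approach the guard's error messages recommend: substitute or drop the blocked names, then install everything that remains in one command.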

Putting It Together: Bolt.new Guard Configuration

Each of the four guards addresses a distinct cost amplification point in Bolt.new's WebContainer generation loop. A complex Bolt session — building a feature that touches many files, requires new packages including at least one with native dependencies, hits TypeScript cascade errors, and requires multiple generation steps — can trigger all four patterns simultaneously. The combined overhead on a 20-file feature addition with a native module substitution spiral and 5 failed builds can be 10–20× the token cost of a clean, linear implementation of the same feature.

Guard	Primary trigger	Key threshold	Trip action
`BoltBuildGuard`	Build count, build output tokens, consecutive failures, circular TypeScript error fingerprint	`max_builds=50`, `max_output_tokens=300K`, `max_failures=6`	Batch remaining file writes, trigger single build; escalate circular errors to full type audit
`BoltErrorContextGuard`	Error tokens per injection, cumulative error tokens, retry attempt count	`max_per_injection=8K`, `max_cumulative=60K`, `max_retries=5`	Truncate to last 50 terminal lines; switch to structured error summary (file:line:code) on cumulative trip
`BoltSnapshotGuard`	Per-step file snapshot tokens, cumulative snapshot tokens, per-file write count, total writes	`max_step=60K`, `max_cumulative=800K`, `per_file=6`	Scope context to directly-modified files; stabilize hot files before writing more
`BoltNativeModuleGuard`	Install attempts, native module detection, npm output lines per injection, cumulative npm output tokens	`max_install_attempts=6`, `max_install_output_tokens=40K`, `max_npm_output_lines_per_injection=80`	Pre-screen packages against incompatibility list; substitute WASM-compatible alternatives before install

Wire all four guards into your Bolt.new orchestration layer — whether you're calling Bolt's API programmatically, building automation on top of StackBlitz's WebContainer primitives, or instrumenting a custom agent that follows the same generate-write-build-check loop pattern. The guards give you observable, configurable token spend ceilings at each cost amplification point, surfacing overruns before they appear in the monthly Bolt.new token usage dashboard.
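One way to wire guards into a generate-write-build-check loop is to treat every guard trip as a hard stop for the remaining steps. The sketch below assumes the guards signal a trip by raising RuntimeError, as BoltNativeModuleGuard does. StubInstallGuard and run_guarded_steps are hypothetical names used only for illustration, and the stub stands in for a real guard so the example is self-contained.

```python
from typing import Callable, Iterable


class StubInstallGuard:
    """Stand-in for a real guard; trips after max_install_attempts installs."""

    def __init__(self, max_install_attempts: int = 2) -> None:
        self.max_install_attempts = max_install_attempts
        self.install_count = 0

    def on_install_start(self, packages: list[str]) -> None:
        self.install_count += 1
        if self.install_count > self.max_install_attempts:
            raise RuntimeError("install attempt ceiling reached")


def run_guarded_steps(steps: Iterable[tuple[str, Callable[[], None]]]) -> list[str]:
    """Run orchestration steps in order and stop at the first guard trip."""
    log: list[str] = []
    for name, step in steps:
        try:
            step()
        except RuntimeError as trip:
            # A tripped guard ends the session instead of feeding another retry.
            log.append(f"halted at {name}: {trip}")
            break
        log.append(f"ok: {name}")
    return log


guard = StubInstallGuard(max_install_attempts=2)
log = run_guarded_steps([
    ("install-1", lambda: guard.on_install_start(["react"])),
    ("install-2", lambda: guard.on_install_start(["zod"])),
    ("install-3", lambda: guard.on_install_start(["bcrypt"])),
    ("build", lambda: None),
])
print("\n".join(log))
# ok: install-1
# ok: install-2
# halted at install-3: install attempt ceiling reached
```

Stopping at the first trip keeps the guard's message as the last thing in the log, so the person or process reviewing the session sees the cause before any further tokens are spent.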

Frequently asked questions

Can I access Bolt.new's token consumption metrics programmatically?

Bolt.new exposes token usage in its UI dashboard at the session level, but does not currently provide a programmatic API for per-step or per-build token consumption. For guard integration, the most reliable approach is to instrument at the orchestration layer using the file line counts and terminal output line counts that your orchestration code already has access to. The token estimates in these guards (7.5 tokens/line for TypeScript with JSX; 6.0 tokens/line for plain JavaScript; 8.5 tokens/line for TypeScript with dense type annotations) are rough approximations that tend to slightly undercount, so use 10 tokens/line if you want a safety margin. If you are calling the underlying AI model API directly as part of a custom agent built on WebContainer, use the token counts reported in the API response (for example, usage.input_tokens where the provider exposes that field) for precise counts rather than line-based estimates.
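The line-based estimate described above fits in a few lines. This is a minimal sketch using the per-line rates quoted in the answer. The dictionary keys, the estimate_tokens name, and the choice to fall back to the 10 tokens/line safety margin for unknown content are assumptions made for illustration.

```python
from __future__ import annotations

# Per-line rates quoted in the article; the key names are hypothetical.
TOKENS_PER_LINE = {
    "tsx": 7.5,       # TypeScript with JSX
    "js": 6.0,        # plain JavaScript
    "ts_dense": 8.5,  # TypeScript with dense type annotations
}
SAFETY_MARGIN_TOKENS_PER_LINE = 10.0


def estimate_tokens(text: str, kind: str | None = None) -> int:
    """Estimate tokens from line count; unknown kinds use the safety margin."""
    line_count = len(text.splitlines())
    rate = TOKENS_PER_LINE.get(kind, SAFETY_MARGIN_TOKENS_PER_LINE)
    return int(line_count * rate)


sample = "const x = 1;\n" * 200  # a 200-line file
print(estimate_tokens(sample, "tsx"))  # 1500
print(estimate_tokens(sample, "js"))   # 1200
print(estimate_tokens(sample))         # 2000
```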

Is the 50-build ceiling too low for large projects with many interdependent files?

The ceiling is configurable and should be scaled to the project scope. For a monorepo with 80+ files in scope, set max_builds_per_session to 100. The more important ceiling is max_consecutive_failures — 6 consecutive build failures without a success indicates a loop regardless of the total build count. A session running 70 builds with no more than 3 consecutive failures at any point is making steady progress; a session that hits 6 consecutive failures after 8 total builds is in a cascade loop. Set the consecutive failure ceiling first and treat the total build count ceiling as a safety net for sessions that never fully stall but accumulate overhead through many small build cycles.
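The relationship between the two ceilings can be sketched as follows. BuildProgressTracker is a hypothetical, simplified class written for this example, not the article's BoltBuildGuard. It reuses the max_builds_per_session and max_consecutive_failures parameter names and checks the failure streak before the total count.

```python
from dataclasses import dataclass, field


@dataclass
class BuildProgressTracker:
    """Hypothetical tracker: stall detection first, total builds as a safety net."""
    max_builds_per_session: int = 50
    max_consecutive_failures: int = 6
    builds: int = field(default=0, init=False)
    consecutive_failures: int = field(default=0, init=False)

    def on_build_complete(self, success: bool) -> None:
        self.builds += 1
        # A successful build resets the streak; a failure extends it.
        self.consecutive_failures = 0 if success else self.consecutive_failures + 1
        if self.consecutive_failures >= self.max_consecutive_failures:
            raise RuntimeError(
                f"{self.consecutive_failures} consecutive build failures "
                f"after {self.builds} builds: likely cascade loop"
            )
        if self.builds > self.max_builds_per_session:
            raise RuntimeError(
                f"build ceiling {self.max_builds_per_session} exceeded"
            )


# Steady progress: 70 builds, never more than 3 failures in a row.
steady = BuildProgressTracker(max_builds_per_session=100)
for i in range(70):
    steady.on_build_complete(success=(i % 4 == 3))

# Cascade loop: 2 successes, then 6 straight failures trips on build 8.
stalled = BuildProgressTracker()
try:
    for ok in [True, True] + [False] * 6:
        stalled.on_build_complete(success=ok)
except RuntimeError as trip:
    print(trip)  # 6 consecutive build failures after 8 builds: likely cascade loop
```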

Why does the native module incompatibility list include packages like fsevents and cpu-features?

fsevents is a macOS-specific native file system events package and cpu-features is a native CPU detection library. Neither can run in WebContainer's WASM environment. They frequently appear as transitive dependencies pulled in by other packages (fsevents appears in many build tools that include optional macOS optimizations; cpu-features appears in some cryptography packages). In WebContainer, npm either fails to install them or silently falls back to the JavaScript implementation, but the install output often includes warnings about native compilation failure. If those warnings are injected into the AI context, they can confuse the model into thinking the primary package installation also failed. Listing these packages lets BoltNativeModuleGuard recognize them by name. Note that on_install_complete, as shown above, raises when a failed package is on the list and does not strip warning lines itself, so dropping their warnings from the injected output is a separate filtering step to add in your orchestration layer. Extend the list as you observe additional transitive native dependencies generating false-positive error context in your Bolt sessions.
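A warning filter for transitive native dependencies could look like the sketch below. It is an assumption-laden illustration rather than part of the guard: TRANSITIVE_NATIVE and filter_transitive_native_warnings are hypothetical names, the matching is a simple substring check, and the sample lines are illustrative rather than verbatim npm output.

```python
# Hypothetical list of transitive native packages whose warnings are noise.
TRANSITIVE_NATIVE = frozenset({"fsevents", "cpu-features"})


def filter_transitive_native_warnings(
    npm_output: str, names: frozenset = TRANSITIVE_NATIVE
) -> str:
    """Drop warning lines that mention a known transitive native package."""
    kept = []
    for line in npm_output.splitlines():
        lowered = line.lower()
        is_warning = "warn" in lowered
        mentions_native = any(name in lowered for name in names)
        # Only warnings are dropped; error lines always pass through.
        if is_warning and mentions_native:
            continue
        kept.append(line)
    return "\n".join(kept)


# Illustrative lines, not verbatim npm output.
sample = "\n".join([
    "npm WARN optional SKIPPING OPTIONAL DEPENDENCY: fsevents@2.3.2",
    "added 212 packages in 9s",
    "npm ERR! code E404",
])
print(filter_transitive_native_warnings(sample))
# added 212 packages in 9s
# npm ERR! code E404
```

Running the filter before truncating to the last N lines means the retained lines are spent on output that reflects the primary install result.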

How should I handle the BoltSnapshotGuard's per-step file context ceiling for large projects?

The 60,000-token per-step ceiling is calibrated for a project of roughly 25–30 files at 200 lines each, with some headroom (30 files × 200 lines × 7.5 tokens/line is 45,000 tokens). For larger projects, raise the ceiling proportionally or, more effectively, reduce the number of files sent as context per step by scoping the context to only the files directly relevant to the current change. If Bolt is generating a new API route, the context should include the route file, its shared types, and the router registration, not the entire project. Selective context inclusion is the highest-leverage optimization for large Bolt projects: reducing a 30-file snapshot to a 6-file relevant scope cuts the per-step snapshot cost from 40,500 tokens to 8,100 tokens (about 1,350 tokens per file, or roughly 180 lines at 7.5 tokens/line), an 80% reduction that typically costs little or nothing in generation accuracy for narrowly scoped tasks.
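The snapshot arithmetic can be checked with a short calculation. The sketch assumes 30 files averaging 180 lines at 7.5 tokens/line, which is the file size implied by the 40,500-token figure. The file paths and the snapshot_tokens helper are hypothetical.

```python
from __future__ import annotations

TOKENS_PER_LINE = 7.5  # the article's estimate for TypeScript with JSX


def snapshot_tokens(line_counts: dict[str, int], scope: set[str] | None = None) -> int:
    """Estimate snapshot tokens for all files, or only those in scope."""
    if scope is None:
        selected = line_counts.values()
    else:
        selected = [n for path, n in line_counts.items() if path in scope]
    return int(sum(selected) * TOKENS_PER_LINE)


# Hypothetical project: 30 files averaging 180 lines each.
project = {f"src/file_{i}.tsx": 180 for i in range(30)}
# Hypothetical relevant scope: the 6 files the change actually touches.
relevant = {f"src/file_{i}.tsx" for i in range(6)}

full = snapshot_tokens(project)
scoped = snapshot_tokens(project, relevant)
print(full, scoped, f"{1 - scoped / full:.0%}")  # 40500 8100 80%
```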

Do these guards apply to other browser-based AI coding tools like StackBlitz's other products or CodeSandbox AI?

Yes. The four patterns apply to any AI coding tool that combines AI generation with a browser-based WebContainer or a similar sandboxed execution environment. BoltBuildGuard and BoltNativeModuleGuard apply directly to any StackBlitz-based product using WebContainer's WASM runtime. BoltErrorContextGuard generalizes to any tool that injects terminal output into AI prompts without truncation. BoltSnapshotGuard applies to any agent that sends full file contents at each generation step rather than diffs, whether it runs in the browser or not; that can include tools such as Replit Agent and Cursor when they re-send whole files, as well as custom agents built on WebContainer primitives. The guard class names are specific to Bolt.new for clarity, but the patterns generalize: the build and native-module guards to platforms that run code in-browser, and the error-context and snapshot guards to any generate-build-check agent loop.

