Make (Integromat) AI Agent Cost Control: Router Branch Multiplication, Iterator Fan-Out, and Data Store Loops
Make's pricing model is built around operations: one module processing one bundle equals one operation. A simple two-module scenario (webhook trigger → HTTP request) uses two operations per run, one for each module. That math stays clean for linear automations. The moment you introduce AI into a Make scenario, the operation count stops being predictable.
AI steps change Make's operation model in ways that are easy to miss. Suppose an HTTP module calling an LLM API returns a response containing a list of items, say five extracted entities. An Iterator module fans that list out into five bundles, each processed by every downstream module. With four downstream modules, a single scenario run consumes 1 (AI call) + 5×4 (iterator fan-out) = 21 operations, not the 5 (one AI call plus four downstream modules) you might have estimated from the module count, and the trigger adds to that total. Now add a Router before the AI step whose filters match on two branches. Make's Router fires all matching branches, not just the first match, so both branches run independently, each consuming its full downstream operation count.
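The arithmetic above can be captured in a few lines. This is a minimal sketch, not part of Make's tooling: the function name `estimate_operations` and its parameters are illustrative assumptions, and it counts only the modules named in the example (no trigger, no Iterator module).

```python
def estimate_operations(upstream_modules: int, items: int, downstream_modules: int) -> int:
    """Estimate operations for one run: upstream modules run once,
    and every downstream module runs once per item emitted by the Iterator."""
    return upstream_modules + items * downstream_modules


# The example from the text: 1 AI call, 5 items, 4 downstream modules.
print(estimate_operations(1, 5, 4))   # 21
# The naive estimate treats the run as linear (a single item).
print(estimate_operations(1, 1, 4))   # 5
```

Plugging in the largest list length you have actually observed, rather than the typical one, gives a more honest worst case.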
Four failure modes make Make particularly tricky for AI workloads:
- Router branch multiplication: Make's Router fires every branch whose filter condition matches, running all downstream modules in parallel; two matching branches on a 6-module scenario double the operation count per run without any warning in the scenario editor.
- Iterator fan-out amplification: AI modules that produce lists (entity extraction, task decomposition, classification with multiple results) feed Iterator modules that multiply downstream operations by list length; a variable-length AI output causes variable and unbounded operation consumption.
- Instant trigger flood: Make's instant triggers (webhooks) fire immediately on every inbound event with no built-in rate limiting; a flood of inbound requests starts concurrent scenario runs that consume operations at full speed before any quota alert fires.
- Data Store self-trigger loops: a scenario triggered by a Make Data Store "Watch Records" module that also writes back to the same Data Store creates a circular trigger chain invisible from the scenario editor; each processed record creates a new record that re-triggers the scenario.
Failure Mode 1 — Router Branch Multiplication
Make's Router module is not a switch statement. It is a parallel dispatcher: every branch whose filter condition evaluates to true receives the bundle and runs independently. If three branches all match the incoming bundle, all three run — consuming operations for every module in every branch. This is the most common billing surprise for teams migrating from Zapier, where Paths routes to exactly one branch.
In AI scenarios, Router is frequently used to classify an LLM response and dispatch to different downstream handlers based on the classification. A Router with four branches — "high priority," "medium priority," "low priority," and "unclassified" — will have at most one branch match per run if the classification output is mutually exclusive. But if your LLM classification prompt is underspecified and returns a response that matches both "high priority" (contains the word "urgent") and "unclassified" (does not contain a clear category marker), both branches fire. Both run their downstream modules. You get billed for both.
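One way to catch overlapping filters before they reach production is to replay sample classifier outputs against the branch conditions offline. The sketch below rests on assumptions: the four predicates are hypothetical Python stand-ins for Make filter conditions (which are configured in the scenario editor, not in code), and `find_overlaps` is an illustrative helper name.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for the four branch filters described above.
BRANCH_FILTERS: Dict[str, Callable[[str], bool]] = {
    "high": lambda text: "urgent" in text.lower(),
    "medium": lambda text: "category: medium" in text.lower(),
    "low": lambda text: "category: low" in text.lower(),
    "unclassified": lambda text: "category:" not in text.lower(),
}


def find_overlaps(samples: List[str]) -> Dict[str, List[str]]:
    """Return every sample that matches more than one branch, with the branch names."""
    overlaps = {}
    for sample in samples:
        matched = [name for name, matches in BRANCH_FILTERS.items() if matches(sample)]
        if len(matched) > 1:
            overlaps[sample] = matched
    return overlaps


samples = [
    "category: low",
    "This looks urgent but I cannot pick a category.",
]
for sample, branches in find_overlaps(samples).items():
    print(f"{sample!r} matches {branches}")
```

Here the second sample matches both "high" and "unclassified", which is exactly the double-billing case described above. Running real LLM outputs through a check like this shows which filters need tightening.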
The operation cost of Router branch multiplication compounds with downstream module depth. A scenario with a Router feeding three branches of 5 modules each produces up to 15 downstream operations per run if all three branches match, versus 5 if exactly one matches. A scenario that runs 200 times per day therefore consumes between 1,000 and 3,000 downstream operations per day, depending on how many branches match each time. Against the 10,000 operations per month on Make's Core plan, that is the difference between exhausting the allocation in ten days and exhausting it in just over three.
The Router rule: Make's Router fires all matching branches, not just the first. Treat each additional matching branch as a separate scenario run with the same downstream operation count. Design branch filters to be mutually exclusive or count the operation cost of multiple simultaneous matches.
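To see how quickly extra matching branches eat a quota, the figures from this section (200 runs per day, 5 modules per branch, 10,000 operations per month) can be run through a small calculation. This is an illustrative sketch; `days_until_exhausted` is a hypothetical helper, and it counts only operations downstream of the Router.

```python
def days_until_exhausted(monthly_quota: int, runs_per_day: int,
                         modules_per_branch: int, matching_branches: int) -> float:
    """Days until the quota is used up, counting only Router-downstream operations."""
    ops_per_day = runs_per_day * modules_per_branch * matching_branches
    return monthly_quota / ops_per_day


for branches in (1, 2, 3):
    days = days_until_exhausted(10_000, 200, 5, branches)
    print(f"{branches} matching branch(es): quota lasts {days:.1f} days")
```

With these inputs the quota lasts 10.0, 5.0, and 3.3 days for one, two, and three matching branches respectively.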
The guard pattern for Router branch multiplication is an operation-count webhook at the start of each branch that tracks how many branches are executing concurrently for the same source bundle. When multiple branches fire simultaneously, an external counter detects the parallel execution and can abort downstream expensive modules on branches beyond a configured ceiling:
import time
import sqlite3
import threading
from flask import Flask, request, jsonify

app = Flask(__name__)
db_lock = threading.Lock()
DB_PATH = "make_router_guard.db"


def init_db():
    with sqlite3.connect(DB_PATH) as conn:
        conn.execute("""
            CREATE TABLE IF NOT EXISTS branch_executions (
                bundle_key TEXT NOT NULL,
                branch_name TEXT NOT NULL,
                started_at REAL,
                PRIMARY KEY (bundle_key, branch_name)
            )
        """)
        conn.execute("""
            CREATE TABLE IF NOT EXISTS branch_config (
                key TEXT PRIMARY KEY,
                max_concurrent_branches INTEGER DEFAULT 2,
                bundle_ttl_seconds INTEGER DEFAULT 300
            )
        """)
        conn.execute(
            "INSERT OR IGNORE INTO branch_config (key) VALUES ('default')"
        )


class MakeRouterGuard:
    """
    Tracks how many Router branches are executing concurrently for the same
    source bundle. Aborts branches beyond the configured ceiling to prevent
    unexpected operation multiplication from overlapping filter conditions.
    """

    @staticmethod
    def check_and_record(bundle_key: str, branch_name: str) -> dict:
        now = time.time()
        with db_lock:
            with sqlite3.connect(DB_PATH) as conn:
                config = conn.execute(
                    "SELECT max_concurrent_branches, bundle_ttl_seconds "
                    "FROM branch_config WHERE key = 'default'"
                ).fetchone()
                max_branches = config[0] if config else 2
                ttl = config[1] if config else 300

                # Clean up stale entries from prior bundles
                conn.execute(
                    "DELETE FROM branch_executions WHERE started_at < ?",
                    (now - ttl,)
                )

                # Count branches already running for this bundle
                active_count = conn.execute(
                    "SELECT COUNT(*) FROM branch_executions WHERE bundle_key = ?",
                    (bundle_key,)
                ).fetchone()[0]

                if active_count >= max_branches:
                    active_branches = conn.execute(
                        "SELECT branch_name FROM branch_executions WHERE bundle_key = ?",
                        (bundle_key,)
                    ).fetchall()
                    active_names = [r[0] for r in active_branches]
                    return {
                        "allow": False,
                        "reason": "router_branch_ceiling_exceeded",
                        "bundle_key": bundle_key,
                        "branch_name": branch_name,
                        "active_branches": active_names,
                        "active_count": active_count,
                        "ceiling": max_branches,
                        "message": (
                            f"Router branch ceiling exceeded for bundle {bundle_key!r}: "
                            f"{active_count} branches already executing "
                            f"(ceiling: {max_branches}). "
                            f"Active branches: {active_names}. "
                            "Stop this branch to prevent unexpected operation multiplication. "
                            "Review Router filter conditions for overlapping matches."
                        ),
                    }

                # Record this branch as executing
                conn.execute(
                    "INSERT OR REPLACE INTO branch_executions "
                    "(bundle_key, branch_name, started_at) VALUES (?, ?, ?)",
                    (bundle_key, branch_name, now)
                )
                return {
                    "allow": True,
                    "bundle_key": bundle_key,
                    "branch_name": branch_name,
                    "active_count": active_count + 1,
                    "ceiling": max_branches,
                }

    @staticmethod
    def mark_complete(bundle_key: str, branch_name: str) -> dict:
        with db_lock:
            with sqlite3.connect(DB_PATH) as conn:
                conn.execute(
                    "DELETE FROM branch_executions "
                    "WHERE bundle_key = ? AND branch_name = ?",
                    (bundle_key, branch_name)
                )
        return {"ok": True, "bundle_key": bundle_key, "branch_name": branch_name}


@app.route("/router/check", methods=["POST"])
def router_check():
    data = request.get_json(force=True)
    result = MakeRouterGuard.check_and_record(
        bundle_key=data.get("bundle_key", ""),
        branch_name=data.get("branch_name", ""),
    )
    return jsonify(result), 200 if result["allow"] else 429


@app.route("/router/complete", methods=["POST"])
def router_complete():
    data = request.get_json(force=True)
    return jsonify(MakeRouterGuard.mark_complete(
        bundle_key=data.get("bundle_key", ""),
        branch_name=data.get("branch_name", ""),
    ))


if __name__ == "__main__":
    init_db()
    app.run(port=8080)
Wire an HTTP module as the first module in each Router branch. Pass a bundle_key derived from the trigger payload (a unique event ID, a timestamp plus source hash, or Make's built-in {{1.id}} from the trigger module) and the branch_name for that branch. Use a Make Filter immediately after the HTTP module to stop the branch if allow is false. Wire a second HTTP call to /router/complete as the last module in each branch so the branch slot is released when the branch finishes. Set max_concurrent_branches (stored in the branch_config table, where it defaults to 2) to the number of branches your scenario design intentionally fires simultaneously, typically 1 if branches are meant to be mutually exclusive.
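If the trigger payload carries no unique event ID, a deterministic hash of the payload is one way to derive a bundle_key that comes out identical in every branch of the same run. This is a minimal sketch under assumptions: `derive_bundle_key` is a hypothetical helper that would run wherever you assemble the key, and the payload fields are invented for illustration.

```python
import hashlib
import json


def derive_bundle_key(payload: dict, source: str = "") -> str:
    """Hash the payload in canonical form so that key order does not change the result."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(f"{source}|{canonical}".encode("utf-8")).hexdigest()
    return digest[:16]


a = derive_bundle_key({"ticket": 42, "body": "urgent"}, source="helpdesk")
b = derive_bundle_key({"body": "urgent", "ticket": 42}, source="helpdesk")
print(a == b)  # True: same content, different key order
```

Identical payloads that arrive within the guard's TTL would share a key, so include a timestamp or event ID field in the payload when legitimate duplicates are possible.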
Failure Mode 2 — Iterator Fan-Out Amplification
Make's Iterator module takes a collection (an array from a previous module's output) and emits one bundle per collection item, each of which flows through all downstream modules independently. This is the correct tool for processing a list of items — but when the list is produced by an AI module, the list length is variable and potentially unbounded.
Consider a document processing scenario: a trigger receives a PDF, an HTTP module sends it to an LLM API requesting entity extraction, and the LLM returns a JSON array of extracted entities. An Iterator splits the array into individual bundles, each processed through four downstream modules (a Make Data Store lookup, an HTTP call to an external API, a text aggregation step, and a Data Store write). If the LLM extracts 20 entities from a long document, the scenario consumes 1 (trigger) + 1 (HTTP LLM call) + 20×4 (iterator fan-out) = 82 operations for a single document. If you sized the scenario for 10 entities per document (1 + 1 + 10×4 = 42 operations), that run costs roughly twice your estimate, and the gap widens linearly as documents get longer.
The failure mode is not that Iterator is wrong to use — it is the right tool. The failure mode is that the AI module has no knowledge of Make's billing model and will return as many items as the task warrants. A prompt asking for "all entities" on a dense legal document might return 80 items. Without an explicit cap, the Iterator processes all 80, and you discover the overconsumption in the next billing cycle when the plan is exhausted.
The fan-out rule: Operations from an Iterator = (items in array) × (downstream module count). At 10 downstream modules and an average of 15 items per AI response, each scenario run costs 150+ operations. Measure actual array lengths across 20 real inputs before estimating monthly operation needs — the typical underestimate is 5–10×.
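Once array lengths from real inputs have been collected, a short script can turn them into per-run operation figures. The sketch below is illustrative only: the sample lengths are made up, `fan_out_report` is a hypothetical helper, and the 90th percentile uses the simple nearest-rank method.

```python
import math
from typing import List


def percentile(values: List[int], pct: int) -> int:
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def fan_out_report(array_lengths: List[int], downstream_modules: int) -> dict:
    """Summarize Iterator fan-out cost for a set of observed array lengths."""
    mean_len = sum(array_lengths) / len(array_lengths)
    p90_len = percentile(array_lengths, 90)
    return {
        "mean_items": round(mean_len, 1),
        "p90_items": p90_len,
        "mean_ops_per_run": round(mean_len * downstream_modules),
        "p90_ops_per_run": p90_len * downstream_modules,
    }


# Made-up array lengths from 20 test runs.
lengths = [4, 6, 7, 7, 8, 9, 9, 10, 10, 11, 12, 12, 13, 14, 15, 18, 22, 31, 47, 75]
print(fan_out_report(lengths, downstream_modules=10))
```

In this invented sample the mean is 17 items but the 90th percentile is 31, so a budget built on the average would understate the expensive runs by almost half.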
The guard pattern for iterator fan-out is a two-stage cap: a prompt-level ceiling (instruct the LLM to return at most N items) combined with a Make Array Aggregator or a pre-iterator HTTP check that measures array length before the Iterator fires. The HTTP check approach works without modifying your LLM prompt and provides visibility into truncation events:
import time
import sqlite3
import threading
from flask import Flask, request, jsonify

app = Flask(__name__)
db_lock = threading.Lock()
DB_PATH = "make_iterator_guard.db"


def init_db():
    with sqlite3.connect(DB_PATH) as conn:
        conn.execute("""
            CREATE TABLE IF NOT EXISTS iterator_events (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                scenario_id TEXT,
                run_id TEXT,
                original_count INTEGER,
                truncated_to INTEGER,
                truncated INTEGER DEFAULT 0,
                recorded_at REAL
            )
        """)


class MakeIteratorGuard:
    """
    Caps the array length fed into a Make Iterator module to prevent
    unbounded operation fan-out from variable-length AI responses.
    """
    MAX_ITEMS = 15   # ceiling: change to match your ops budget
    WARN_ABOVE = 10  # log a warning when array exceeds this

    @staticmethod
    def check_and_cap(items: list, scenario_id: str = "", run_id: str = "") -> dict:
        now = time.time()
        original_count = len(items)
        truncated = original_count > MakeIteratorGuard.MAX_ITEMS
        if truncated:
            capped_items = items[:MakeIteratorGuard.MAX_ITEMS]
            truncated_to = MakeIteratorGuard.MAX_ITEMS
        else:
            capped_items = items
            truncated_to = original_count

        with db_lock:
            with sqlite3.connect(DB_PATH) as conn:
                conn.execute("""
                    INSERT INTO iterator_events
                    (scenario_id, run_id, original_count, truncated_to, truncated, recorded_at)
                    VALUES (?, ?, ?, ?, ?, ?)
                """, (scenario_id, run_id, original_count, truncated_to, int(truncated), now))

        response = {
            "items": capped_items,
            "original_count": original_count,
            "item_count": len(capped_items),
            "truncated": truncated,
            "max_items": MakeIteratorGuard.MAX_ITEMS,
        }
        if truncated:
            response["warning"] = (
                f"AI response returned {original_count} items; "
                f"truncated to {MakeIteratorGuard.MAX_ITEMS} to cap Iterator fan-out. "
                f"Operations saved: {(original_count - MakeIteratorGuard.MAX_ITEMS)} × downstream_module_count. "
                "Adjust MAX_ITEMS or prompt to control truncation behavior."
            )
        elif original_count > MakeIteratorGuard.WARN_ABOVE:
            response["warning"] = (
                f"AI response returned {original_count} items (above warn threshold "
                f"of {MakeIteratorGuard.WARN_ABOVE}). Monitor operation consumption."
            )
        return response

    @staticmethod
    def stats(scenario_id: str = "") -> dict:
        with db_lock:
            with sqlite3.connect(DB_PATH) as conn:
                if scenario_id:
                    rows = conn.execute(
                        "SELECT original_count, truncated_to, truncated FROM iterator_events "
                        "WHERE scenario_id = ? ORDER BY recorded_at DESC LIMIT 100",
                        (scenario_id,)
                    ).fetchall()
                else:
                    rows = conn.execute(
                        "SELECT original_count, truncated_to, truncated FROM iterator_events "
                        "ORDER BY recorded_at DESC LIMIT 100"
                    ).fetchall()
        if not rows:
            return {"events": 0}
        orig_counts = [r[0] for r in rows]
        truncations = [r for r in rows if r[2]]
        return {
            "events": len(rows),
            "avg_original_count": round(sum(orig_counts) / len(orig_counts), 1),
            "max_original_count": max(orig_counts),
            "truncation_rate": round(len(truncations) / len(rows), 3),
            "truncation_count": len(truncations),
        }


@app.route("/iterator/check", methods=["POST"])
def iterator_check():
    data = request.get_json(force=True)
    items = data.get("items", [])
    if not isinstance(items, list):
        return jsonify({"error": "items must be an array"}), 400
    result = MakeIteratorGuard.check_and_cap(
        items=items,
        scenario_id=data.get("scenario_id", ""),
        run_id=data.get("run_id", ""),
    )
    return jsonify(result)


@app.route("/iterator/stats", methods=["GET"])
def iterator_stats():
    scenario_id = request.args.get("scenario_id", "")
    return jsonify(MakeIteratorGuard.stats(scenario_id))


if __name__ == "__main__":
    init_db()
    app.run(port=8081)
Add an HTTP module after your LLM HTTP call and before the Iterator module. Pass the array from the LLM response as items. The response contains a capped items array — map this to the Iterator's input array field instead of the raw LLM output. Set MAX_ITEMS in the guard to your per-run operation budget divided by your downstream module count, with a 20% buffer for retries. Poll /iterator/stats weekly to see your actual average and maximum array lengths — adjust both MAX_ITEMS and your LLM prompt's explicit count ceiling once you have real distribution data.
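The sizing rule for MAX_ITEMS can be worked through as a small calculation. This is a sketch under assumptions: `max_items_for_budget` is a hypothetical helper, `fixed_ops` stands for the modules that run once per scenario run (trigger, LLM call, and so on), and the buffer is expressed as an integer percentage to keep the arithmetic exact.

```python
def max_items_for_budget(ops_budget_per_run: int, downstream_modules: int,
                         fixed_ops: int = 0, retry_buffer_pct: int = 20) -> int:
    """Largest array length whose fan-out fits the per-run budget after
    reserving a share of the budget for retries."""
    usable = ops_budget_per_run * (100 - retry_buffer_pct) // 100 - fixed_ops
    return max(0, usable // downstream_modules)


print(max_items_for_budget(100, 4))               # 20
print(max_items_for_budget(100, 4, fixed_ops=2))  # 19
```

A budget of 100 operations per run with four downstream modules leaves room for 20 items once the 20% buffer is reserved, or 19 if the trigger and LLM call are counted against the same budget.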
Failure Mode 3 — Instant Trigger Flood
Make distinguishes between scheduled triggers (scenarios that poll on a fixed interval — 15 minutes on free plans, 1 minute on Core+) and instant triggers (webhook-triggered scenarios that fire immediately when an inbound HTTP request arrives). Scheduled triggers have a natural rate ceiling: however many triggers fire in the polling window get processed at the next poll cycle. Instant triggers have no such ceiling.
When you build an AI scenario triggered by a webhook — an inbound form submission, a customer support request from an external chat tool, a notification from an external API — every inbound event immediately starts a new scenario run. If 300 form submissions arrive in a two-minute window (from a marketing campaign that just launched), 300 scenario runs start in parallel. Each run executes all modules. If each run consumes 20 operations (one HTTP call to your LLM API + Iterator fan-out + downstream processing), 300 simultaneous runs consume 6,000 operations in two minutes — potentially more than your entire monthly allocation.
Make's operation counting runs in near real-time, but quota enforcement has a lag. The platform typically does not hard-stop scenario execution mid-run when the monthly ceiling is crossed — it continues processing and bills at overage rates, alerting you via email after the fact. For teams on Core plan (10,000 operations/month), a single two-minute burst from a webhook flood can consume 60% of the monthly allocation before any alert fires.
The instant trigger rule: Instant triggers have no built-in rate limit. Your scenario runs immediately for every inbound request. Measure peak inbound request rate, not average rate, when estimating operation consumption — a marketing email blast can generate 10–100× average request volume in minutes.
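The gap between average and peak traffic is easy to quantify. The sketch below reuses the figures from this section (20 operations per run, a 10,000-operation monthly quota); `burst_impact` is a hypothetical helper name, and the four-request "quiet" window is an invented comparison point.

```python
def burst_impact(requests_in_window: int, ops_per_run: int, monthly_quota: int) -> dict:
    """Operations a window of requests consumes and the share of the monthly quota it uses."""
    ops = requests_in_window * ops_per_run
    return {"operations": ops, "quota_share_pct": round(100 * ops / monthly_quota, 1)}


# A quiet two-minute window versus the 300-request burst described above.
print(burst_impact(4, 20, 10_000))    # {'operations': 80, 'quota_share_pct': 0.8}
print(burst_impact(300, 20, 10_000))  # {'operations': 6000, 'quota_share_pct': 60.0}
```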
The guard pattern for instant trigger floods is a rate limiter deployed in front of Make's webhook URL. One option uses Make's own webhook response module to acknowledge receipt while queuing execution. The alternative shown here is a rate-limiting proxy endpoint that forwards requests within the ceiling and rejects the excess with a 429, which prevents unbounded parallel scenario execution:
import hashlib
import os
import sqlite3
import threading
import time
import urllib.request
from flask import Flask, request, jsonify, Response

app = Flask(__name__)
db_lock = threading.Lock()
DB_PATH = "make_trigger_guard.db"

# Your Make webhook URL; the guard acts as a proxy in front of it
MAKE_WEBHOOK_URL = ""  # set via environment variable in production


def init_db():
    with sqlite3.connect(DB_PATH) as conn:
        conn.execute("""
            CREATE TABLE IF NOT EXISTS trigger_events (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                event_key TEXT,
                started_at REAL,
                source TEXT
            )
        """)
        conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_started_at ON trigger_events (started_at)"
        )


class MakeTriggerGuard:
    """
    Sliding window rate limiter for Make instant webhook triggers.
    Acts as a proxy: allowed requests are forwarded to Make's webhook URL.
    Excess requests are rejected with a 429; queuing and replay are left
    to the caller.
    """
    MAX_PER_HOUR = 30   # max scenario runs per rolling hour
    MAX_PER_DAY = 200   # max scenario runs per rolling day
    WINDOW_HOUR = 3600
    WINDOW_DAY = 86400

    @staticmethod
    def check_and_record(event_key: str, source: str = "") -> dict:
        now = time.time()
        with db_lock:
            with sqlite3.connect(DB_PATH) as conn:
                # Count runs in rolling windows
                hour_count = conn.execute(
                    "SELECT COUNT(*) FROM trigger_events WHERE started_at > ?",
                    (now - MakeTriggerGuard.WINDOW_HOUR,)
                ).fetchone()[0]
                day_count = conn.execute(
                    "SELECT COUNT(*) FROM trigger_events WHERE started_at > ?",
                    (now - MakeTriggerGuard.WINDOW_DAY,)
                ).fetchone()[0]

                if hour_count >= MakeTriggerGuard.MAX_PER_HOUR:
                    # The window reopens when its oldest recorded run ages out
                    oldest = conn.execute(
                        "SELECT MIN(started_at) FROM trigger_events WHERE started_at > ?",
                        (now - MakeTriggerGuard.WINDOW_HOUR,)
                    ).fetchone()[0]
                    retry_after = int(oldest + MakeTriggerGuard.WINDOW_HOUR - now)
                    return {
                        "allow": False,
                        "reason": "hourly_ceiling",
                        "runs_in_last_hour": hour_count,
                        "ceiling": MakeTriggerGuard.MAX_PER_HOUR,
                        "retry_after_seconds": retry_after,
                        "message": (
                            f"Hourly scenario run ceiling reached: {hour_count} runs "
                            f"in the last hour (ceiling: {MakeTriggerGuard.MAX_PER_HOUR}). "
                            f"New triggers blocked for {retry_after}s."
                        ),
                    }

                if day_count >= MakeTriggerGuard.MAX_PER_DAY:
                    return {
                        "allow": False,
                        "reason": "daily_ceiling",
                        "runs_today": day_count,
                        "ceiling": MakeTriggerGuard.MAX_PER_DAY,
                        "message": (
                            f"Daily scenario run ceiling reached: {day_count} runs "
                            f"(ceiling: {MakeTriggerGuard.MAX_PER_DAY}). "
                            "Blocking further triggers to preserve monthly operation quota."
                        ),
                    }

                # Allow and record
                conn.execute(
                    "INSERT INTO trigger_events (event_key, started_at, source) "
                    "VALUES (?, ?, ?)",
                    (event_key, now, source)
                )
                # Prune entries older than 2 days
                conn.execute(
                    "DELETE FROM trigger_events WHERE started_at < ?",
                    (now - MakeTriggerGuard.WINDOW_DAY * 2,)
                )
                return {
                    "allow": True,
                    "runs_in_last_hour": hour_count + 1,
                    "runs_today": day_count + 1,
                }


@app.route("/trigger/proxy", methods=["POST"])
def trigger_proxy():
    """
    Proxy endpoint: check rate limit, then forward allowed requests to Make.
    Replace your Make webhook URL with this endpoint in external systems.
    """
    make_url = os.environ.get("MAKE_WEBHOOK_URL", MAKE_WEBHOOK_URL)
    payload = request.get_data()
    source = request.headers.get("X-Source", "")

    # Derive a stable event key from the payload. It is stored for auditing
    # only; this guard does not deduplicate on it.
    event_key = hashlib.sha256(payload).hexdigest()[:16]

    result = MakeTriggerGuard.check_and_record(event_key, source)
    if not result["allow"]:
        return jsonify(result), 429

    if not make_url:
        # Guard-only mode (no proxy target configured): return allow result
        return jsonify(result), 200

    # Forward to Make
    try:
        req = urllib.request.Request(
            make_url,
            data=payload,
            headers={"Content-Type": request.content_type or "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(req, timeout=10) as resp:
            make_body = resp.read()
            make_status = resp.status
    except Exception as e:
        return jsonify({"error": f"Make webhook forward failed: {e}"}), 502

    return Response(make_body, status=make_status,
                    content_type="application/json")


@app.route("/trigger/stats", methods=["GET"])
def trigger_stats():
    now = time.time()
    with db_lock:
        with sqlite3.connect(DB_PATH) as conn:
            hour_count = conn.execute(
                "SELECT COUNT(*) FROM trigger_events WHERE started_at > ?",
                (now - MakeTriggerGuard.WINDOW_HOUR,)
            ).fetchone()[0]
            day_count = conn.execute(
                "SELECT COUNT(*) FROM trigger_events WHERE started_at > ?",
                (now - MakeTriggerGuard.WINDOW_DAY,)
            ).fetchone()[0]
    return jsonify({
        "runs_in_last_hour": hour_count,
        "runs_today": day_count,
        "hourly_ceiling": MakeTriggerGuard.MAX_PER_HOUR,
        "daily_ceiling": MakeTriggerGuard.MAX_PER_DAY,
    })


if __name__ == "__main__":
    init_db()
    app.run(port=8082)
Deploy this as a small service on any host your event sources can reach; Make itself is cloud-hosted, so the proxy sits between those sources and Make. Replace the Make webhook URL in your external trigger sources (your form, your external API, your chat tool) with https://your-server/trigger/proxy. Set the MAKE_WEBHOOK_URL environment variable to your actual Make webhook URL. Requests within the rate limit are forwarded to Make immediately. Requests that exceed the hourly ceiling receive a 429 with a retry_after_seconds field (the daily-ceiling response carries no retry hint). Configure your inbound source to respect this where possible, or queue excess requests and replay them after the window reopens. To size MAX_PER_HOUR, divide your plan's monthly operation count by 730 (hours per month) to get an hourly operation allowance, divide that by your per-run operation cost, and keep a 25% buffer.
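Sizing the hourly ceiling is a budget calculation: the hourly operation allowance divided by the cost of one run. The sketch below works it through for a 10,000-operation plan and 20 operations per run; `max_runs_per_hour` is a hypothetical helper, and the 25% buffer is passed as an integer percentage.

```python
HOURS_PER_MONTH = 730


def max_runs_per_hour(monthly_quota: int, ops_per_run: int, buffer_pct: int = 25) -> float:
    """Sustained hourly run ceiling that keeps a full month of traffic inside the quota."""
    usable_ops = monthly_quota * (100 - buffer_pct) / 100
    return usable_ops / HOURS_PER_MONTH / ops_per_run


rate = max_runs_per_hour(10_000, 20)
print(f"{rate:.2f} runs/hour, about {rate * 24:.0f} runs/day")
```

Under these assumptions the sustainable rate is roughly 0.51 runs per hour, or about 12 runs per day. That is far below the guard's defaults of 30 per hour and 200 per day, so those defaults are better read as burst limits; the daily ceiling is the natural place to encode the monthly budget.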
Failure Mode 4 — Data Store Self-Trigger Loops
Make's Data Store module provides a simple key-value store that scenarios can read and write. The "Watch Data Store Records" trigger fires when new records are added to a Data Store. This is a convenient pattern for creating processing pipelines: one scenario (the "producer") writes records to a Data Store when new work arrives, and a second scenario (the "consumer") watches the same Data Store and processes each record with an AI module.
The loop emerges when the consumer writes its AI-processed output back to the same Data Store it watches, or to a Data Store watched by the producer, closing the circuit. The consumer processes record A, generates AI output, writes the output as record B to the same Data Store. The "Watch" trigger fires on record B. The consumer processes record B, generates AI output, writes record C. The cycle continues indefinitely, each iteration consuming the full downstream module operation count plus the LLM token cost.
Make does not automatically detect this pattern. From the scenario editor, the consumer scenario only shows its own trigger and modules — it has no visibility into the chain of writes that other scenarios or itself are creating. Teams discover the loop when the Data Store grows unexpectedly large, the operation count spikes, and the LLM API bills increase simultaneously.
AI scenarios are particularly prone to this pattern because AI output is typically richer than the input — the AI adds fields, generates new text, transforms structure — making the output record look meaningfully different from the input record to a downstream filter. Both records match the same trigger because both are "new records in this Data Store," regardless of their content.
The loop signature: Data Store record count grows continuously with no human-initiated writes. The consumer scenario fires repeatedly in the history view. Operation count and LLM token cost both rise at the same rate. The fix requires a provenance field on every AI-written record — not a filter on record content, because the AI output may legitimately match any content filter.
The guard pattern for Data Store self-trigger loops is a hop count field written to every record that the AI scenario creates. The consumer scenario's first module checks the hop count before executing any AI steps, and stops if the hop count exceeds the configured ceiling:
import time
import sqlite3
import threading
from flask import Flask, request, jsonify

app = Flask(__name__)
db_lock = threading.Lock()
DB_PATH = "make_datastore_guard.db"
MAX_HOP_COUNT = 1  # AI-processed records should not re-trigger AI processing


def init_db():
    with sqlite3.connect(DB_PATH) as conn:
        conn.execute("""
            CREATE TABLE IF NOT EXISTS record_provenance (
                record_key TEXT PRIMARY KEY,
                created_by TEXT,
                scenario_id TEXT,
                hop_count INTEGER DEFAULT 0,
                created_at REAL,
                last_updated_at REAL
            )
        """)


class MakeDataStoreGuard:
    """
    Tracks provenance of Make Data Store records created by AI scenarios.
    Prevents self-trigger loops by checking hop_count before AI execution.
    """

    @staticmethod
    def tag_record(record_key: str, scenario_id: str,
                   created_by: str = "ai_scenario", parent_key: str = "") -> dict:
        now = time.time()

        # Inherit hop count from parent if this record is derived from another
        parent_hops = 0
        if parent_key:
            with db_lock:
                with sqlite3.connect(DB_PATH) as conn:
                    row = conn.execute(
                        "SELECT hop_count FROM record_provenance WHERE record_key = ?",
                        (parent_key,)
                    ).fetchone()
                    if row:
                        parent_hops = row[0]
        hop_count = parent_hops + 1 if parent_key else 0

        with db_lock:
            with sqlite3.connect(DB_PATH) as conn:
                conn.execute("""
                    INSERT OR REPLACE INTO record_provenance
                    (record_key, created_by, scenario_id, hop_count,
                     created_at, last_updated_at)
                    VALUES (?, ?, ?, ?, ?, ?)
                """, (record_key, created_by, scenario_id, hop_count, now, now))
        return {
            "record_key": record_key,
            "hop_count": hop_count,
            "scenario_id": scenario_id,
            "tagged": True,
        }

    @staticmethod
    def check_record(record_key: str) -> dict:
        with db_lock:
            with sqlite3.connect(DB_PATH) as conn:
                row = conn.execute(
                    "SELECT created_by, scenario_id, hop_count, created_at "
                    "FROM record_provenance WHERE record_key = ?",
                    (record_key,)
                ).fetchone()
        if not row:
            # No provenance tag: record is from a human or untagged source
            return {"allow": True, "provenance": "human_or_untagged"}

        created_by, scenario_id, hop_count, created_at = row
        if hop_count >= MAX_HOP_COUNT:
            return {
                "allow": False,
                "reason": "data_store_loop_detected",
                "record_key": record_key,
                "created_by": created_by,
                "scenario_id": scenario_id,
                "hop_count": hop_count,
                "max_hop_count": MAX_HOP_COUNT,
                "message": (
                    f"Data Store loop detected: record {record_key!r} was created by "
                    f"scenario {scenario_id!r} (hop {hop_count}). "
                    f"Blocking AI processing: this record is AI-generated output, "
                    f"not a human-initiated trigger. "
                    "Review Data Store trigger conditions to prevent circular writes."
                ),
            }
        return {
            "allow": True,
            "provenance": created_by,
            "scenario_id": scenario_id,
            "hop_count": hop_count,
            "warning": (
                f"Record is scenario-created (hop {hop_count}). "
                "Watching for loop patterns."
            ) if hop_count > 0 else None,
        }

    @staticmethod
    def loop_stats() -> dict:
        with db_lock:
            with sqlite3.connect(DB_PATH) as conn:
                blocked = conn.execute(
                    "SELECT COUNT(*) FROM record_provenance WHERE hop_count >= ?",
                    (MAX_HOP_COUNT,)
                ).fetchone()[0]
                total = conn.execute(
                    "SELECT COUNT(*) FROM record_provenance"
                ).fetchone()[0]
                max_hop = conn.execute(
                    "SELECT MAX(hop_count) FROM record_provenance"
                ).fetchone()[0] or 0
        return {
            "total_tagged_records": total,
            "would_be_blocked": blocked,
            "max_hop_count_seen": max_hop,
            "ceiling": MAX_HOP_COUNT,
        }


@app.route("/datastore/tag", methods=["POST"])
def tag_record():
    data = request.get_json(force=True)
    return jsonify(MakeDataStoreGuard.tag_record(
        record_key=data.get("record_key", ""),
        scenario_id=data.get("scenario_id", ""),
        created_by=data.get("created_by", "ai_scenario"),
        parent_key=data.get("parent_key", ""),
    ))


@app.route("/datastore/check", methods=["POST"])
def check_record():
    data = request.get_json(force=True)
    result = MakeDataStoreGuard.check_record(data.get("record_key", ""))
    return jsonify(result), 200 if result["allow"] else 429


@app.route("/datastore/stats", methods=["GET"])
def loop_stats():
    return jsonify(MakeDataStoreGuard.loop_stats())


if __name__ == "__main__":
    init_db()
    app.run(port=8083)
Add a /datastore/check HTTP call as the first module in your AI consumer scenario (after the Watch Data Store trigger, before any AI modules). Pass the triggered record's key as record_key. Use a Make Filter after the HTTP module to stop the scenario if allow is false. When your AI scenario writes output to a Data Store, add a /datastore/tag HTTP call immediately before or after the Data Store "Add a Record" module: pass the new record's key as record_key, your scenario's ID as scenario_id, and the input record's key as parent_key. Always supply parent_key for AI-written output, because tag_record assigns hop_count 0 when parent_key is empty and a record with hop count 0 passes the check. Set MAX_HOP_COUNT = 1 when AI output records should never re-trigger AI processing. Set it to 2 if you have a legitimate two-hop pipeline where one AI step produces input for a second AI step, but the second step's output must not re-trigger the first.
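The effect of the hop ceiling is easiest to see in a toy simulation with no Make or HTTP involved. The sketch below is an in-memory model under assumptions: `run_consumer` is a hypothetical function, a plain dict stands in for the Data Store, and `max_runs` exists only to stop the unguarded loop.

```python
from typing import Dict

MAX_HOP_COUNT = 1  # same ceiling as the guard above


def run_consumer(store: Dict[str, dict], guard_enabled: bool, max_runs: int = 10) -> int:
    """Simulate a consumer that writes its output back to the store it watches.
    Returns how many times the AI step ran before the queue drained
    (or before max_runs cut off a runaway loop)."""
    queue = list(store)  # records waiting to trigger the consumer
    ai_runs = 0
    while queue and ai_runs < max_runs:
        key = queue.pop(0)
        hops = store[key]["hop_count"]
        if guard_enabled and hops >= MAX_HOP_COUNT:
            continue  # AI-generated record: skip the AI step
        ai_runs += 1
        child_key = f"{key}>ai"
        store[child_key] = {"hop_count": hops + 1}  # provenance travels with the record
        queue.append(child_key)  # the write re-triggers the watcher
    return ai_runs


print(run_consumer({"A": {"hop_count": 0}}, guard_enabled=False))  # 10 (cut off by max_runs)
print(run_consumer({"A": {"hop_count": 0}}, guard_enabled=True))   # 1
```

Without the guard the loop only ends because the simulation caps it; with the guard the AI step runs once for the human-created record and never for its own output.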
State Table: Four Failure Modes at a Glance
| Failure Mode | Guard Class | Ceiling | What to Watch |
|---|---|---|---|
| Router branch multiplication: all matching branches run in parallel | MakeRouterGuard | max_concurrent_branches (code default: 2; set to 1 for mutually exclusive branches) | Scenario history for same bundle ID appearing in multiple branches; overlapping filter conditions |
| Iterator fan-out amplification: AI list length × downstream modules | MakeIteratorGuard | 15 items per AI response (adjust per ops budget) | Average and max array length from AI responses; truncation rate from /iterator/stats |
| Instant trigger flood: webhook burst exhausts monthly quota | MakeTriggerGuard | 30 runs/hour, 200 runs/day (adjust per plan) | Hourly trigger rate vs. plan's ops ceiling; /trigger/stats before and after marketing events |
| Data Store self-trigger loop: AI output re-triggers AI processing | MakeDataStoreGuard | 1 hop max (AI output must not re-trigger AI) | Data Store record count growth rate; scenario history for repeated runs on same record family |
Checklist: Before Deploying Make AI Scenarios
- Count operations per scenario run with real inputs. Run 20 test executions with production-representative payloads. Record the actual operation count from Make's scenario history (not your estimate). AI modules produce variable outputs — the 90th-percentile run is the one that determines your realistic ceiling.
- Audit all Router branch filter conditions for mutual exclusivity. For each Router, write down the filter condition for each branch. Identify any input that could match more than one branch simultaneously. If an overlap is possible, either make the filters mutually exclusive (add NOT (condition_A) to branch B's filter) or deploy the Router branch guard and configure max_concurrent_branches to the intended number of parallel branches.
- Explicitly cap AI list outputs in your LLM prompt. Any prompt that generates a list should include an explicit count ceiling: "Return at most 10 items," "Limit your response to 5 entities," "Do not include more than 8 tasks." Combine this with the Iterator guard as a defensive second layer: the prompt sets expectations, the guard enforces the billing ceiling.
- Use scheduled triggers for scenarios that don't need real-time response. If your AI processing doesn't require sub-minute latency, switch from an instant webhook trigger to a scheduled trigger with a polling interval. Scheduled triggers batch inbound events into the polling window, naturally smoothing burst traffic. Instant triggers are necessary only when the triggering system cannot retry and needs an immediate acknowledgment.
- Tag every Data Store record your AI scenario writes. Make this a non-negotiable part of your scenario architecture from day one. Even if you don't think you have a loop today, a new scenario added by a teammate that watches the same Data Store will thank you for the provenance tags when it avoids reprocessing AI-generated records.
- Enable Make's operation usage alerts. Make sends email alerts at 80% and 100% of monthly operation usage. Enable both under Organization → Billing. For scenarios running more than 10 times per day, set a more granular alert by running a scheduled Make scenario that calls Make's operations API and sends a Slack message if daily consumption exceeds your daily operation budget.
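The first and last checklist items reduce to two small calculations, sketched below. The function names are hypothetical, and the sketch assumes you have already copied per-run operation counts out of Make's scenario history and obtained today's consumption by some other means; it makes no call to Make's API.

```python
import math


def percentile_nearest_rank(values: list, pct: float) -> int:
    """Nearest-rank percentile of observed per-run operation counts."""
    if not values:
        raise ValueError("need at least one observed run")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def daily_budget_exceeded(ops_today: int, monthly_quota: int,
                          days_in_period: int = 30) -> bool:
    """True when today's consumption is above an even daily share of the quota."""
    return ops_today > monthly_quota / days_in_period


# 20 test executions: most runs are cheap, a few fan out widely.
observed = [9] * 10 + [13] * 6 + [21] * 3 + [45]
print(percentile_nearest_rank(observed, 90))               # 21
print(daily_budget_exceeded(400, monthly_quota=10_000))    # True
```

Planning around the 90th-percentile figure rather than the mean keeps the occasional wide fan-out from dominating the month.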
FAQ
How is Make's Router different from Zapier's Paths when it comes to operation billing?
Zapier Paths routes a Zap to exactly one matching path — the first matching filter in order, and no other. Make's Router sends the bundle to every branch whose filter condition matches, simultaneously. This means a Make Router with three branches where two filters overlap on the same input will fire both matching branches, consuming operations for all modules in both. For AI scenarios where LLM output classification can be ambiguous, Zapier's exclusive-routing behavior is safer; Make's parallel routing requires explicit mutual exclusion in filter design or an external branch counter guard to prevent unexpected operation multiplication.
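To make the difference concrete, the sketch below counts branch operations for one bundle under the two routing behaviors described above: all-match routing and exclusive first-match routing. The branch filters and module counts are invented for illustration, and only modules on the branches are counted.

```python
from typing import Callable

Bundle = dict
# (filter predicate, number of modules on that branch); hypothetical shape
Branch = tuple[Callable[[Bundle], bool], int]


def ops_all_match(bundle: Bundle, branches: list) -> int:
    """Operations when every branch whose filter matches is executed."""
    return sum(modules for matches, modules in branches if matches(bundle))


def ops_first_match(bundle: Bundle, branches: list) -> int:
    """Operations when only the first matching branch is executed."""
    for matches, modules in branches:
        if matches(bundle):
            return modules
    return 0


branches = [
    (lambda b: b["label"] == "billing", 6),   # billing workflow
    (lambda b: b["confidence"] < 0.6, 6),     # low-confidence review
    (lambda b: b["label"] == "sales", 4),     # sales workflow
]
bundle = {"label": "billing", "confidence": 0.55}

print(ops_all_match(bundle, branches))    # 12: both overlapping branches run
print(ops_first_match(bundle, branches))  # 6: only the first match runs
```

An ambiguous LLM classification like the one in this bundle is exactly the case where overlapping filters double the cost.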
Does Make's built-in error handling retry failed AI HTTP calls, and does that cost additional operations?
Yes, retries can happen, and every retried module costs operations. The handler names are easy to mix up, though. Of Make's error handlers (Ignore, Resume, Commit, Rollback, Break), Break is the one that retries: it stores the failed run as an incomplete execution and, when automatic completion is enabled, re-runs it from the failed module after a configured interval, up to a configured number of attempts. Each attempt consumes one operation for the retried module plus operations for any downstream modules that run afterward. Resume does not retry; it substitutes a fallback output for the failed module and lets the scenario continue. Rollback does not retry either; it stops the run and reverts changes in modules that support transactions. When an AI HTTP module fails due to a rate limit, timeout, or 5xx from the LLM API, Make follows whichever handler you've attached. To avoid retry-based operation inflation on AI HTTP modules, keep Break's attempt count low or turn off automatic completion, then implement your own retry logic in the scenario via a queue pattern or scheduled reconciliation run.
Can I use Make's built-in operation consumption view to detect loop patterns before they exhaust quota?
Make's dashboard shows total operations consumed in the current billing period and per-scenario operation counts in the scenario history. You can identify a looping scenario by sorting the scenario history by run count — a loop will show an abnormally high run count compared to other scenarios. However, the built-in view doesn't show run velocity (how fast a scenario is firing) or cross-scenario relationships (which scenario triggered which Data Store record that triggered another scenario). The Data Store guard's provenance system fills this gap by tracking which scenario created which records, making it possible to trace the loop chain without needing access to Make's internal dependency graph.
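One way to picture the provenance idea is a record that carries its creating scenario, its parent record, and a count of AI hops in its ancestry. The sketch below is an assumption-heavy illustration: the Record fields, the in-memory dict standing in for a Data Store, and the function names are all hypothetical and are not the guard's actual schema.

```python
from dataclasses import dataclass
from typing import Optional

MAX_AI_HOPS = 1  # AI output must not re-trigger AI processing


@dataclass
class Record:
    key: str
    created_by: str                   # scenario that wrote the record
    ai_hops: int = 0                  # AI steps in this record's ancestry
    parent_key: Optional[str] = None  # record this one was derived from


def should_process(record: Record) -> bool:
    """Allow AI processing only for records not already produced by an AI hop."""
    return record.ai_hops < MAX_AI_HOPS


def derive(parent: Record, key: str, scenario: str) -> Record:
    """Create an AI-generated child record with provenance tags filled in."""
    return Record(key=key, created_by=scenario,
                  ai_hops=parent.ai_hops + 1, parent_key=parent.key)


def trace_chain(store: dict, key: str) -> list:
    """Walk parent links to list (key, scenario) pairs, newest first."""
    chain, seen = [], set()
    current = store.get(key)
    while current is not None and current.key not in seen:
        seen.add(current.key)  # stop if parent links ever form a cycle
        chain.append((current.key, current.created_by))
        current = store.get(current.parent_key) if current.parent_key else None
    return chain


source = Record(key="r1", created_by="crm_import")
summary = derive(source, key="r2", scenario="ai_summarizer")
store = {r.key: r for r in (source, summary)}

print(should_process(source))    # True: no AI hop yet
print(should_process(summary))   # False: already one AI hop deep
print(trace_chain(store, "r2"))  # [('r2', 'ai_summarizer'), ('r1', 'crm_import')]
```

The chain returned by trace_chain is the cross-scenario relationship that the built-in view does not expose.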
What's the right place in a Make AI scenario to put an operation budget check — at the start or inline between modules?
Put the primary budget check as the first module after the trigger, before any AI or data-fetching modules. This stops the scenario before it consumes any significant operations. For scenarios with an Iterator fan-out, add a second check between the AI module and the Iterator module — this is where you cap the array length before it multiplies downstream. A scenario that does a budget check only at the start will still overconsume operations when the Iterator fans out more items than expected; the inline check catches the variable cost before it materializes.
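The two checkpoints can be sketched as a pair of functions a guard service might expose. The names, the default cap of 15, and the idea of passing ops_used and ops_budget in from outside are assumptions made for illustration.

```python
def start_check(ops_used: int, ops_budget: int, min_ops_per_run: int) -> bool:
    """First module after the trigger: proceed only if a minimal run still fits."""
    return ops_used + min_ops_per_run <= ops_budget


def inline_cap(items: list, ops_used: int, ops_budget: int,
               downstream_modules: int, hard_cap: int = 15) -> list:
    """Between the AI module and the Iterator: trim the array to what is affordable."""
    if downstream_modules <= 0:
        raise ValueError("downstream_modules must be positive")
    remaining = max(ops_budget - ops_used, 0)
    affordable = remaining // downstream_modules  # items the remaining budget covers
    return items[:min(hard_cap, affordable)]


entities = [f"entity-{i}" for i in range(20)]  # AI returned more than expected

print(start_check(ops_used=960, ops_budget=1000, min_ops_per_run=21))  # True
# 40 ops remain and each item costs 4 downstream ops, so 10 items pass through.
print(len(inline_cap(entities, ops_used=960, ops_budget=1000,
                     downstream_modules=4)))                           # 10
```

In this example the start check passes, yet without the inline cap the 20-item array would have cost 80 downstream operations against a remaining budget of 40.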
How does RunGuard integrate with Make scenarios?
RunGuard's SDK is designed for agents you host directly in Python or TypeScript — it guards the inner loop of AI tool calls before they make downstream requests. For Make, which runs your scenarios in its own infrastructure, the webhook-based guard endpoints in this post are the practical integration path: each guard runs as a small Flask or FastAPI service you host, and Make's HTTP module calls them at the key control points (before Router branches, before Iterator fan-out, at the trigger proxy, at Data Store writes). RunGuard's BudgetTracker and LoopDetector primitives map directly to the operation ceiling and hop count patterns above — you're implementing the same trip-before-bill logic, adapted for Make's per-operation billing model rather than per-token LLM billing.