June 20, 2026 Power Automate AI Builder Microsoft Cost Control

Microsoft Power Automate AI Builder Cost Control: Apply to Each Fan-Out, Parallel Branch Multiplication, Child Flow Recursion, and Trigger Overlap

Microsoft Power Automate's AI Builder charges per AI credit consumed, not per flow run. Each AI Builder action inside a flow (an AI Prompt, a document-processing model, a sentiment classification, a text extraction) consumes a fixed number of AI credits from your tenant's monthly allocation. A typical AI Prompt action costs between 1 and 10 credits depending on the model tier. The credit pool is shared across your tenant: one runaway flow can exhaust the budget for every other AI Builder-enabled flow in the organization.

Four structural properties of Power Automate amplify credit consumption beyond what's visible when designing a flow in the low-code editor:

Apply to Each fan-out — the Apply to Each loop iterates over array outputs from previous actions (SharePoint list items, Dataverse query results, Excel rows, email attachments). If an AI Builder action is inside the loop, credit consumption equals items × credits per action. A SharePoint list with 2,000 documents × 3 credits per AI Prompt = 6,000 credits from a single flow run.
Parallel branch multiplication — Power Automate's parallel branch control runs all branches simultaneously. Each branch with an AI Builder action runs its action independently. Credit consumption scales linearly with branch count regardless of whether the downstream logic actually uses all branch outputs.
Child flow recursion — the Run a Child Flow action allows parent flows to invoke other flows as subroutines. When a child flow also calls an AI Builder action and the trigger condition is evaluated again at completion, conditional re-triggers can create recursive credit-burning chains that are invisible in either the parent or child flow's run history in isolation.
Scheduled trigger overlap — when a scheduled flow's execution time exceeds the trigger interval, Power Automate queues additional instances. If each queued instance calls AI Builder, credit burn compounds with the queue depth. A flow scheduled every 15 minutes that takes 20 minutes to run due to a large Apply to Each generates a permanently growing run queue, each instance burning the same credit load.
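
The four properties above compound multiplicatively. As a rough sketch (the function name and its parameters are illustrative and not part of any Power Automate API), a worst-case estimate for one flow can be written as items × credits per action × AI branches × overlapping runs:

```python
def worst_case_credits(items: int, credits_per_action: int,
                       ai_branches: int = 1, overlapping_runs: int = 1) -> int:
    """Upper-bound estimate of AI credits burned, assuming every loop
    iteration runs every AI Builder branch in every overlapping run."""
    return items * credits_per_action * ai_branches * overlapping_runs


# The fan-out example from the list above: 2,000 items at 3 credits each.
print(worst_case_credits(2000, 3))  # 6000

# Hypothetical combination: the same loop with 3 parallel AI branches
# and 2 overlapping scheduled runs.
print(worst_case_credits(2000, 3, ai_branches=3, overlapping_runs=2))  # 36000
```

The estimate is deliberately pessimistic: it is meant for sizing a ceiling, not for predicting an invoice.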

Failure Mode 1 — Apply to Each Fan-Out

Apply to Each is Power Automate's iteration construct. It takes an array input — the output of a SharePoint Get items action, a Dataverse List rows action, an Excel List rows present in a table action, or any expression that evaluates to an array — and executes the loop body once for each element. The loop body can contain any combination of actions, including AI Builder AI Prompt actions.

The credit cost of an AI Builder AI Prompt action is set at design time by the prompt template's model tier. A standard AI Prompt using the default GPT model tier costs approximately 1 credit per invocation. A premium model tier prompt costs more. The per-invocation cost does not change based on the length of the input text within the model's context window — you pay the same credits whether the prompt processes 50 words or 5,000 words of input. This flat-rate-per-call structure means credit consumption is determined entirely by loop iteration count, not by the complexity or size of the work being done.

The dangerous scenario is a flow triggered by a scheduled action or a change event on a large data source. A SharePoint list with 5,000 items queried via Get items (with no filter) and passed into Apply to Each with an AI Builder Classify Text action in the body generates 5,000 AI Builder calls from a single flow trigger. At the standard credit rate, that depletes a 5,000-credit monthly allocation in one execution. On enterprise plans with larger credit pools, the same pattern runs undetected until the credit dashboard shows unexpected consumption at the end of the billing period.

The fan-out rule: Never place an AI Builder action inside an Apply to Each without a credit pre-check that caps total iterations before the loop starts. Count the input array length before entering Apply to Each and short-circuit if the count exceeds your per-run credit budget. The count action is free — the AI Builder call is not.
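
When the count does exceed the budget, one option is to split the work across several runs rather than abandon it. The sketch below is a minimal illustration of that sizing arithmetic; `plan_batches` and its parameters are hypothetical names, and the 200-credit ceiling is only an example value:

```python
def plan_batches(item_count: int, credits_per_item: int,
                 max_credits_per_run: int) -> list[int]:
    """Split item_count into batch sizes (in items) so that no single
    run spends more than max_credits_per_run credits."""
    if item_count < 0:
        raise ValueError("item_count must not be negative")
    if credits_per_item <= 0 or max_credits_per_run < credits_per_item:
        raise ValueError("the per-run budget must cover at least one item")
    items_per_run = max_credits_per_run // credits_per_item
    full_batches, remainder = divmod(item_count, items_per_run)
    return [items_per_run] * full_batches + ([remainder] if remainder else [])


# 5,000 items at 1 credit each under a 200-credit per-run ceiling.
batches = plan_batches(item_count=5000, credits_per_item=1, max_credits_per_run=200)
print(len(batches), batches[0])  # 25 200

# 2,000 items at 3 credits each: 66 items fit in each 200-credit run.
batches = plan_batches(item_count=2000, credits_per_item=3, max_credits_per_run=200)
print(len(batches), batches[-1])  # 31 20
```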

Power Automate does not expose a native credit pre-check action. The guard runs as an HTTP call to an external endpoint that tracks per-flow credit consumption before the loop starts. The endpoint checks whether the proposed iteration count would exceed the run budget and returns a go/no-go decision before a single credit is consumed:

Python — Per-flow Apply to Each credit pre-check endpoint (Flask)

import time
import sqlite3
import threading
from flask import Flask, request, jsonify

app = Flask(__name__)
db_lock = threading.Lock()
DB_PATH = "pa_credit_guard.db"

# Credits per AI Builder action (adjust to your actual model tier)
CREDITS_PER_AI_BUILDER_CALL = 1

# Per-flow-run ceiling: max AI Builder calls allowed in a single run
MAX_CALLS_PER_RUN = 200

# Per-tenant daily ceiling across all flows
MAX_CALLS_PER_TENANT_DAY = 2000

TENANT_ID = "your-tenant-id"

def init_db():
    with sqlite3.connect(DB_PATH) as conn:
        conn.execute("""
            CREATE TABLE IF NOT EXISTS flow_run_credits (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                tenant_id TEXT NOT NULL,
                flow_id TEXT NOT NULL,
                run_id TEXT NOT NULL,
                ai_calls_approved INTEGER DEFAULT 0,
                ai_calls_used INTEGER DEFAULT 0,
                started_at REAL,
                finished_at REAL
            )
        """)
        conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_flow_run "
            "ON flow_run_credits (tenant_id, flow_id, run_id)"
        )

class ApplyToEachGuard:
    """
    Call check() at the start of a flow, before Apply to Each begins.
    Pass the item_count (array length) and credits_per_item.
    Returns allow=False if the proposed run would exceed per-run or per-day limits.
    """

    @staticmethod
    def check(tenant_id: str, flow_id: str, run_id: str,
              item_count: int, credits_per_item: int = CREDITS_PER_AI_BUILDER_CALL) -> dict:
        proposed_credits = item_count * credits_per_item
        now = time.time()
        day_start = now - 86400  # rolling 24-hour window

        with db_lock:
            with sqlite3.connect(DB_PATH) as conn:
                # Check per-run ceiling
                if proposed_credits > MAX_CALLS_PER_RUN:
                    return {
                        "allow": False,
                        "reason": "per_run_ceiling_exceeded",
                        "flow_id": flow_id,
                        "item_count": item_count,
                        "proposed_credits": proposed_credits,
                        "ceiling": MAX_CALLS_PER_RUN,
                        "message": (
                            f"Flow {flow_id!r} Apply to Each would consume {proposed_credits} AI credits "
                            f"({item_count} items × {credits_per_item} credits each). "
                            f"Per-run ceiling is {MAX_CALLS_PER_RUN}. "
                            "Add a filter to the Get items action before Apply to Each to reduce "
                            "the array size, or split into batches across multiple flow runs."
                        ),
                    }

                # Check per-tenant daily ceiling
                tenant_day_total = conn.execute(
                    "SELECT COALESCE(SUM(ai_calls_approved), 0) FROM flow_run_credits "
                    "WHERE tenant_id = ? AND started_at > ?",
                    (tenant_id, day_start)
                ).fetchone()[0]

                if tenant_day_total + proposed_credits > MAX_CALLS_PER_TENANT_DAY:
                    return {
                        "allow": False,
                        "reason": "tenant_daily_ceiling_exceeded",
                        "flow_id": flow_id,
                        "tenant_day_total": tenant_day_total,
                        "proposed_credits": proposed_credits,
                        "ceiling": MAX_CALLS_PER_TENANT_DAY,
                        "message": (
                            f"Tenant {tenant_id!r} has used {tenant_day_total} AI Builder credits "
                            f"in the last 24 hours. Approving {proposed_credits} more would exceed "
                            f"the daily ceiling of {MAX_CALLS_PER_TENANT_DAY}. "
                            "Defer this flow run or reduce item_count with a filter."
                        ),
                    }

                # Approve and record
                conn.execute(
                    "INSERT INTO flow_run_credits "
                    "(tenant_id, flow_id, run_id, ai_calls_approved, started_at) "
                    "VALUES (?, ?, ?, ?, ?)",
                    (tenant_id, flow_id, run_id, proposed_credits, now)
                )
                return {
                    "allow": True,
                    "flow_id": flow_id,
                    "run_id": run_id,
                    "approved_credits": proposed_credits,
                    "tenant_day_total": tenant_day_total + proposed_credits,
                }

@app.route("/guard/apply-to-each", methods=["POST"])
def apply_to_each_guard():
    data = request.get_json(force=True)
    result = ApplyToEachGuard.check(
        tenant_id=data.get("tenant_id", TENANT_ID),
        flow_id=data.get("flow_id", ""),
        run_id=data.get("run_id", ""),
        item_count=int(data.get("item_count", 0)),
        credits_per_item=int(data.get("credits_per_item", CREDITS_PER_AI_BUILDER_CALL)),
    )
    return jsonify(result), 200 if result["allow"] else 429

if __name__ == "__main__":
    init_db()
    app.run(port=8090)

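Wire this endpoint as an HTTP action at the start of your Power Automate flow, before the Apply to Each. Pass the output length from your Get items action as item_count. Use a Condition action after the HTTP call: if the response body's allow field is false, terminate the flow with a Terminate action in "Cancelled" state and send a notification. If allow is true, proceed to Apply to Each. Because the endpoint returns 429 on a denial, the HTTP action itself is marked as failed, so set the Condition's Configure run after to include "has failed"; otherwise the Condition is skipped and the flow simply fails. Consider also setting the HTTP action's retry policy to None, since the default policy retries 429 responses. The overhead of one HTTP call before the loop (50–100ms) is negligible compared to the cost of 2,000 unguarded AI Builder calls.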
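
For reference, the request body and the branching decision can be sketched in plain Python. This is only an illustration of what the HTTP and Condition actions do; `build_guard_payload` and `next_step` are hypothetical helpers, and the identifiers in the usage lines are placeholders:

```python
import json


def build_guard_payload(tenant_id: str, flow_id: str, run_id: str,
                        items: list, credits_per_item: int = 1) -> str:
    """JSON body for POST /guard/apply-to-each.
    item_count is the length of the array that Apply to Each would iterate."""
    return json.dumps({
        "tenant_id": tenant_id,
        "flow_id": flow_id,
        "run_id": run_id,
        "item_count": len(items),
        "credits_per_item": credits_per_item,
    })


def next_step(status_code: int, body: dict) -> str:
    """Mirror of the Condition action: only an explicit allow=true proceeds.
    Any other status or body is treated as a denial."""
    if status_code == 200 and body.get("allow") is True:
        return "apply_to_each"
    return "terminate_cancelled"


print(build_guard_payload("your-tenant-id", "flow-123", "run-456", ["a", "b", "c"]))
print(next_step(200, {"allow": True}))
print(next_step(429, {"allow": False, "reason": "per_run_ceiling_exceeded"}))
```

Treating every response other than an explicit approval as a denial means an unreachable guard stops the loop instead of letting it run unguarded.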

Failure Mode 2 — Parallel Branch Multiplication

Power Automate's parallel branch control creates multiple branches that execute simultaneously. Unlike a Switch or Condition action (which selects one branch based on a condition), a parallel branch runs every branch regardless of any condition. All branches complete before the flow continues to the next action. Each branch is independent and can contain any action, including AI Builder actions.

The credit multiplication pattern emerges when teams use parallel branches to process the same input with multiple AI Builder models simultaneously, for example extracting sentiment, extracting entities, and classifying the document type all in parallel. Three parallel AI Builder calls consume exactly the same credits as three sequential calls, which is triple the cost of a single call, but the flow runs faster. The performance gain feels like an obvious win because the flow finishes in the time of the slowest branch rather than the sum of all branches, and that makes the tripled per-run cost easy to overlook.

The hidden cost surface is execution frequency. A flow with 3 parallel AI Builder branches triggered by a SharePoint item creation event will fire all 3 branches on every new SharePoint item. In a library where 50 documents are uploaded in a batch (drag and drop in SharePoint), Power Automate fires one flow instance per document. Fifty concurrent flow instances × 3 parallel AI Builder calls = 150 AI credits from a single batch upload. In a high-volume SharePoint environment, this pattern can exhaust a tenant's monthly credit allocation during a single large file migration.

The multiplication rule: Parallel branch count is a credit multiplier. If your trigger fires on high-volume events (SharePoint item creation, OneDrive file upload, email arrival), your worst-case credit burn is branch count × trigger volume × credits per call. Measure the maximum plausible trigger rate for your event before adding parallel AI Builder branches, and wire a global flow-execution rate limiter at the trigger level, not the branch level.
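
The arithmetic behind the rule is small enough to check by hand. The sketch below uses illustrative function names; the 10,000-file migration is a hypothetical volume, and the cap of 10 executions per window is just an example limiter setting:

```python
def burst_credits(trigger_events: int, ai_branches: int,
                  credits_per_call: int = 1) -> int:
    """Credits burned when every trigger event runs every AI branch."""
    return trigger_events * ai_branches * credits_per_call


def capped_credits_per_window(max_executions: int, ai_branches: int,
                              credits_per_call: int = 1) -> int:
    """Upper bound per window once flow executions are rate limited."""
    return max_executions * ai_branches * credits_per_call


# 50-document batch upload, 3 parallel AI branches, 1 credit per call.
print(burst_credits(50, 3))  # 150

# Hypothetical 10,000-file migration through the same flow.
print(burst_credits(10_000, 3))  # 30000

# With executions capped at 10 per window, the burn per window is bounded.
print(capped_credits_per_window(10, 3))  # 30
```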

The rate limiter for parallel branch flows operates at the flow-execution level: it controls how many times the flow can execute within a time window, independently of how many branches each execution contains. By capping executions, you cap total AI Builder calls as a function of execution count × branch count:

Python — Flow execution rate limiter for parallel-branch AI Builder flows (Flask)

import time
import sqlite3
import threading
from flask import Flask, request, jsonify

app = Flask(__name__)
db_lock = threading.Lock()
DB_PATH = "pa_parallel_guard.db"

# How many flow executions are allowed per window (per trigger type)
MAX_EXECUTIONS_PER_WINDOW = 10
WINDOW_SECONDS = 600  # 10-minute sliding window

def init_db():
    with sqlite3.connect(DB_PATH) as conn:
        conn.execute("""
            CREATE TABLE IF NOT EXISTS flow_executions (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                flow_id TEXT NOT NULL,
                trigger_type TEXT NOT NULL,
                run_id TEXT,
                executed_at REAL,
                branch_count INTEGER DEFAULT 1,
                ai_calls_per_branch INTEGER DEFAULT 1
            )
        """)
        conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_flow_trigger "
            "ON flow_executions (flow_id, trigger_type, executed_at)"
        )

class ParallelBranchGuard:

    @staticmethod
    def check(flow_id: str, trigger_type: str, run_id: str,
              branch_count: int, ai_calls_per_branch: int = 1) -> dict:
        now = time.time()
        window_start = now - WINDOW_SECONDS
        with db_lock:
            with sqlite3.connect(DB_PATH) as conn:
                recent_count = conn.execute(
                    "SELECT COUNT(*) FROM flow_executions "
                    "WHERE flow_id = ? AND trigger_type = ? AND executed_at > ?",
                    (flow_id, trigger_type, window_start)
                ).fetchone()[0]

                if recent_count >= MAX_EXECUTIONS_PER_WINDOW:
                    oldest = conn.execute(
                        "SELECT executed_at FROM flow_executions "
                        "WHERE flow_id = ? AND trigger_type = ? AND executed_at > ? "
                        "ORDER BY executed_at ASC LIMIT 1",
                        (flow_id, trigger_type, window_start)
                    ).fetchone()
                    retry_after = int(WINDOW_SECONDS - (now - oldest[0])) if oldest else WINDOW_SECONDS
                    return {
                        "allow": False,
                        "reason": "execution_rate_ceiling",
                        "flow_id": flow_id,
                        "trigger_type": trigger_type,
                        "executions_in_window": recent_count,
                        "ceiling": MAX_EXECUTIONS_PER_WINDOW,
                        "ai_credits_blocked": branch_count * ai_calls_per_branch,
                        "retry_after_seconds": retry_after,
                        "message": (
                            f"Flow {flow_id!r} has executed {recent_count} times in the last "
                            f"{WINDOW_SECONDS}s (ceiling: {MAX_EXECUTIONS_PER_WINDOW}). "
                            f"Each execution burns {branch_count} × {ai_calls_per_branch} = "
                            f"{branch_count * ai_calls_per_branch} AI Builder credits. "
                            "Throttling this execution. If triggered by a batch upload, "
                            "consider a scheduled batching pattern instead of per-item triggers."
                        ),
                    }

                conn.execute(
                    "INSERT INTO flow_executions "
                    "(flow_id, trigger_type, run_id, executed_at, branch_count, ai_calls_per_branch) "
                    "VALUES (?, ?, ?, ?, ?, ?)",
                    (flow_id, trigger_type, run_id, now, branch_count, ai_calls_per_branch)
                )
                return {
                    "allow": True,
                    "flow_id": flow_id,
                    "executions_in_window": recent_count + 1,
                    "ai_credits_this_run": branch_count * ai_calls_per_branch,
                }

@app.route("/guard/parallel-branch", methods=["POST"])
def parallel_branch_guard():
    data = request.get_json(force=True)
    result = ParallelBranchGuard.check(
        flow_id=data.get("flow_id", ""),
        trigger_type=data.get("trigger_type", ""),
        run_id=data.get("run_id", ""),
        branch_count=int(data.get("branch_count", 1)),
        ai_calls_per_branch=int(data.get("ai_calls_per_branch", 1)),
    )
    return jsonify(result), 200 if result["allow"] else 429

if __name__ == "__main__":
    init_db()
    app.run(port=8091)

Call this endpoint as the first HTTP action in every parallel-branch AI Builder flow. Pass the flow_id, the trigger_type (the event that fired the flow), and the branch_count (the number of parallel branches that contain AI Builder actions). Terminate on a 429 response before the parallel branches execute; because a 429 marks the HTTP action as failed, the Terminate action must be configured to run after the HTTP action has failed. This ensures that burst trigger events (a large batch upload, a SharePoint migration, a bulk email import) do not multiply into an uncontrolled credit burn proportional to both the item count and the branch count.

Failure Mode 3 — Child Flow Recursion

Power Automate's Run a Child Flow action allows any flow to invoke another flow as a subroutine and receive its output as structured data. The parent flow passes inputs to the child flow, the child flow runs its logic (which can include AI Builder actions), and the child flow returns outputs to the parent. This is the recommended Power Automate pattern for reusable logic: put shared AI processing in a child flow and call it from multiple parent flows.

The recursion failure mode emerges when a child flow's output is used to update the same data source that triggers the parent flow. A parent flow triggered by a SharePoint item update runs an AI Builder action via a child flow to generate a summary field. The child flow's output is written back to the same SharePoint item's Summary column. The Summary column write triggers the parent flow again — because it is still an item update on the same list, and the trigger does not distinguish between which column changed. The parent calls the child flow again to re-generate the summary for the now-updated item, which writes the same summary back, which triggers the parent again.

The recursion runs until Power Automate's circuit breaker on infinite loop detection terminates the flow (typically after 30–60 minutes and hundreds of child flow invocations). In a 45-minute recursion cycle with a child flow that calls a 3-credit AI Builder action, a single misconfigured trigger generates 180+ child flow invocations and 540+ AI credits before the platform terminates the loop. With multiple items in the list simultaneously updating, multiply accordingly.
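
The figures above can be reproduced with a short calculation. It assumes each trigger, child-flow, write-back cycle takes about 15 seconds, which is the rate implied by 180 invocations in 45 minutes; the function name is illustrative:

```python
def recursion_burn(loop_minutes: int, seconds_per_cycle: int,
                   credits_per_call: int) -> tuple[int, int]:
    """Return (child flow invocations, AI credits) for a recursion that
    runs for loop_minutes before the platform stops it."""
    invocations = (loop_minutes * 60) // seconds_per_cycle
    return invocations, invocations * credits_per_call


# 45-minute loop, one cycle every 15 seconds (assumed), 3 credits per call.
print(recursion_burn(45, 15, 3))  # (180, 540)

# The same loop running on 20 list items at once.
invocations, credits = recursion_burn(45, 15, 3)
print(credits * 20)  # 10800
```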

The recursion rule: Any child flow that writes back to the triggering data source of its parent flow must include a guard column or a hash-based idempotency check. Before writing output, check whether the current item already contains the AI-generated value. If the value matches the output, skip the write entirely. No write = no trigger = no recursion.

The idempotency guard for child flow recursion runs inside the child flow before the write-back action. It computes a hash of the AI Builder output and compares it against the hash stored on the item from the previous run. An identical hash means the update is redundant — skip the write and return without triggering the parent flow again:

Python — Child flow write-back idempotency endpoint (Flask)

import time
import hashlib
import sqlite3
import threading
from flask import Flask, request, jsonify

app = Flask(__name__)
db_lock = threading.Lock()
DB_PATH = "pa_child_flow_guard.db"

# How long to retain hash records (longer than your recursion detection window)
HASH_TTL_SECONDS = 3600  # 1 hour

def init_db():
    with sqlite3.connect(DB_PATH) as conn:
        conn.execute("""
            CREATE TABLE IF NOT EXISTS item_output_hashes (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                data_source TEXT NOT NULL,
                item_id TEXT NOT NULL,
                output_field TEXT NOT NULL,
                output_hash TEXT NOT NULL,
                recorded_at REAL
            )
        """)
        conn.execute(
            "CREATE UNIQUE INDEX IF NOT EXISTS idx_item_field "
            "ON item_output_hashes (data_source, item_id, output_field)"
        )

class ChildFlowWriteBackGuard:
    """
    Call check() before writing AI Builder output back to a data source.
    If the output hash matches the previously recorded hash, the write is
    redundant and will re-trigger the parent flow — skip it.
    """

    @staticmethod
    def check(data_source: str, item_id: str, output_field: str,
              output_value: str) -> dict:
        output_hash = hashlib.sha256(output_value.encode()).hexdigest()
        now = time.time()
        expiry = now - HASH_TTL_SECONDS

        with db_lock:
            with sqlite3.connect(DB_PATH) as conn:
                # Clean up expired records
                conn.execute(
                    "DELETE FROM item_output_hashes WHERE recorded_at < ?",
                    (expiry,)
                )
                existing = conn.execute(
                    "SELECT output_hash, recorded_at FROM item_output_hashes "
                    "WHERE data_source = ? AND item_id = ? AND output_field = ?",
                    (data_source, item_id, output_field)
                ).fetchone()

                if existing and existing[0] == output_hash:
                    return {
                        "should_write": False,
                        "reason": "output_unchanged",
                        "data_source": data_source,
                        "item_id": item_id,
                        "output_field": output_field,
                        "message": (
                            f"AI Builder output for {data_source!r} item {item_id!r} "
                            f"field {output_field!r} has not changed since last write "
                            f"({int(now - existing[1])}s ago). "
                            "Skipping write-back to prevent recursive parent flow trigger. "
                            "If you expect the output to differ, check whether your AI Builder "
                            "prompt uses a deterministic temperature setting."
                        ),
                    }

                # New or changed output — record hash and allow write
                conn.execute(
                    "INSERT OR REPLACE INTO item_output_hashes "
                    "(data_source, item_id, output_field, output_hash, recorded_at) "
                    "VALUES (?, ?, ?, ?, ?)",
                    (data_source, item_id, output_field, output_hash, now)
                )
                return {
                    "should_write": True,
                    "data_source": data_source,
                    "item_id": item_id,
                    "output_field": output_field,
                    "output_hash": output_hash,
                }

@app.route("/guard/child-flow-writeback", methods=["POST"])
def child_flow_writeback():
    data = request.get_json(force=True)
    result = ChildFlowWriteBackGuard.check(
        data_source=data.get("data_source", ""),
        item_id=data.get("item_id", ""),
        output_field=data.get("output_field", ""),
        output_value=data.get("output_value", ""),
    )
    return jsonify(result), 200

if __name__ == "__main__":
    init_db()
    app.run(port=8092)

In the child flow, add an HTTP action after the AI Builder action that calls /guard/child-flow-writeback with the data source identifier, the item ID, the output field name, and the AI Builder output text. Follow it with a Condition that checks body('HTTP_guard')['should_write']: if false, skip the update action and end the child flow without writing back. The parent flow never sees a trigger event because the data source item never changes. The recursion circuit breaks at the first idempotent iteration.

This guard relies on deterministic AI Builder output — if your AI Prompt uses a high temperature setting that produces different outputs on every call for the same input, the hash will never match and the guard will always allow writes. Set temperature to 0 in AI Builder prompts used for structured field generation (summaries, classifications, extractions) where you want stable, reproducible outputs. Reserve non-zero temperatures for creative generation tasks where the parent trigger is not the data source being written to.
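
A small simulation shows why determinism matters. The sketch below is a simplified stand-in for the write-back loop on a single item: `generate` plays the role of the AI Builder call, and the 50-cycle cap stands in for whatever eventually stops the loop. Both are assumptions made for illustration.

```python
import hashlib
import itertools


def simulate_loop(generate, max_cycles: int = 50) -> dict:
    """Simulate parent -> child -> write-back cycles for one item.
    The loop ends as soon as the output hash equals the hash of the
    previously written value, because the write is then skipped."""
    last_hash = None
    ai_calls = writes = 0
    for _ in range(max_cycles):
        output = generate()
        ai_calls += 1
        output_hash = hashlib.sha256(output.encode()).hexdigest()
        if output_hash == last_hash:
            break  # write skipped, so no new trigger fires
        last_hash = output_hash
        writes += 1
    return {"ai_calls": ai_calls, "writes": writes}


def deterministic() -> str:
    return "Quarterly report summary"


counter = itertools.count()


def varying() -> str:
    # Different text on every call, like a high-temperature prompt.
    return f"Quarterly report summary v{next(counter)}"


print(simulate_loop(deterministic))  # {'ai_calls': 2, 'writes': 1}
print(simulate_loop(varying))        # {'ai_calls': 50, 'writes': 50}
```

With stable output the loop costs one extra AI call before it stops; with varying output the guard never fires and the loop runs until something else ends it.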

Failure Mode 4 — Scheduled Trigger Overlap

Power Automate's scheduled (Recurrence) trigger fires a flow on a fixed schedule — every 15 minutes, every hour, every day. The flow executes and completes before the next scheduled firing in the common case. The edge case that creates a cost trap is when a flow's execution time exceeds the recurrence interval.

This happens most commonly with Apply to Each loops processing large data sets. A flow scheduled to run every 30 minutes processes a growing SharePoint list. When the list grows beyond the point where the flow can complete in 30 minutes, the next scheduled trigger fires before the current run finishes. Power Automate queues the new instance and starts it as soon as the current one finishes or, in some configurations, runs both concurrently. Both instances process the same list, both burn the same AI Builder credits, and the backlog keeps growing for as long as run time exceeds the interval.

A flow that takes 40 minutes to run on a 30-minute schedule falls 10 minutes further behind with every run. If runs execute one at a time, then after 3 hours six instances have been triggered but only four have completed: one is still executing and one is waiting, and the backlog grows by roughly one more instance every 2 hours. If the underlying data volume causing the slowdown persists (for example, a document ingestion backlog), the queue continues to grow. Each queued instance independently burns the full credit load of the AI Builder actions inside it once it runs. The result is credit consumption proportional to queue depth × credits per run, where queue depth grows linearly with time until the data volume drops.
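
A short simulation makes the backlog growth concrete. It assumes runs execute strictly one at a time in trigger order, which is only one of the configurations mentioned above; the function name and the returned fields are illustrative:

```python
def simulate_serial_schedule(interval_min: int, run_min: int,
                             horizon_min: int) -> dict:
    """Triggers fire at t = 0, interval, 2*interval, ... before the horizon.
    Each run starts when it is triggered or when the previous run ends,
    whichever is later. Reports the state of all runs at the horizon."""
    trigger_times = range(0, horizon_min, interval_min)
    completed = running = waiting = 0
    free_at = 0  # minute at which the previous run finishes
    for fired_at in trigger_times:
        start = max(free_at, fired_at)
        if start >= horizon_min:
            waiting += 1  # triggered but not yet started
            continue
        free_at = start + run_min
        if free_at <= horizon_min:
            completed += 1
        else:
            running += 1
    return {"triggered": len(trigger_times), "completed": completed,
            "running": running, "waiting": waiting}


# 40-minute runs on a 30-minute schedule, observed after 3 hours.
print(simulate_serial_schedule(30, 40, 180))
# {'triggered': 6, 'completed': 4, 'running': 1, 'waiting': 1}

# The same flow observed after 12 hours.
print(simulate_serial_schedule(30, 40, 720))
# {'triggered': 24, 'completed': 18, 'running': 0, 'waiting': 6}
```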

The overlap rule: A scheduled flow with AI Builder actions must include an execution lock — a record that marks the flow as "currently running" and prevents new scheduled instances from starting their AI Builder work while another instance is already executing. The lock is checked at flow start and released at flow end. Overlapping instances that see the lock held terminate without consuming any AI Builder credits.

The execution lock endpoint uses a SQLite-backed mutex with a maximum lock duration (a watchdog timeout) to recover from flows that terminate abnormally without releasing the lock:

Python — Scheduled flow execution lock to prevent trigger overlap (Flask)

import time
import sqlite3
import threading
from flask import Flask, request, jsonify

app = Flask(__name__)
db_lock = threading.Lock()
DB_PATH = "pa_schedule_lock.db"

# Maximum time a lock can be held before it auto-expires (watchdog)
# Set this to 2× your expected maximum flow execution time
LOCK_MAX_SECONDS = 3600  # 1 hour

def init_db():
    with sqlite3.connect(DB_PATH) as conn:
        conn.execute("""
            CREATE TABLE IF NOT EXISTS flow_locks (
                flow_id TEXT PRIMARY KEY,
                run_id TEXT NOT NULL,
                acquired_at REAL NOT NULL,
                released_at REAL
            )
        """)

class ScheduledFlowLock:

    @staticmethod
    def acquire(flow_id: str, run_id: str) -> dict:
        now = time.time()
        expired_before = now - LOCK_MAX_SECONDS
        with db_lock:
            with sqlite3.connect(DB_PATH) as conn:
                existing = conn.execute(
                    "SELECT run_id, acquired_at, released_at FROM flow_locks "
                    "WHERE flow_id = ?",
                    (flow_id,)
                ).fetchone()

                if existing:
                    held_run_id, acquired_at, released_at = existing
                    is_released = released_at is not None
                    is_expired = acquired_at < expired_before

                    if not is_released and not is_expired:
                        running_for = int(now - acquired_at)
                        return {
                            "acquired": False,
                            "reason": "flow_already_running",
                            "flow_id": flow_id,
                            "held_by_run_id": held_run_id,
                            "running_for_seconds": running_for,
                            "lock_expires_in_seconds": int(LOCK_MAX_SECONDS - running_for),
                            "message": (
                                f"Flow {flow_id!r} is already running (run {held_run_id!r}, "
                                f"started {running_for}s ago). "
                                "This scheduled instance will terminate without consuming "
                                "AI Builder credits. "
                                "If this message persists beyond the expected run duration, "
                                f"the lock auto-expires after {LOCK_MAX_SECONDS}s."
                            ),
                        }

                # No lock, or lock released, or lock expired — acquire
                conn.execute(
                    "INSERT OR REPLACE INTO flow_locks "
                    "(flow_id, run_id, acquired_at, released_at) "
                    "VALUES (?, ?, ?, NULL)",
                    (flow_id, run_id, now)
                )
                return {
                    "acquired": True,
                    "flow_id": flow_id,
                    "run_id": run_id,
                    "lock_expires_at": now + LOCK_MAX_SECONDS,
                }

    @staticmethod
    def release(flow_id: str, run_id: str) -> dict:
        now = time.time()
        with db_lock:
            with sqlite3.connect(DB_PATH) as conn:
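                cursor = conn.execute(
                    "UPDATE flow_locks SET released_at = ? "
                    "WHERE flow_id = ? AND run_id = ?",
                    (now, flow_id, run_id)
                )
                # rowcount is 0 when no lock row exists for this flow_id and
                # run_id (for example, the lock expired and another run took it over)
                return {"released": cursor.rowcount > 0, "flow_id": flow_id, "run_id": run_id}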

@app.route("/lock/acquire", methods=["POST"])
def acquire_lock():
    data = request.get_json(force=True)
    result = ScheduledFlowLock.acquire(
        flow_id=data.get("flow_id", ""),
        run_id=data.get("run_id", ""),
    )
    return jsonify(result), 200 if result["acquired"] else 409

@app.route("/lock/release", methods=["POST"])
def release_lock():
    data = request.get_json(force=True)
    result = ScheduledFlowLock.release(
        flow_id=data.get("flow_id", ""),
        run_id=data.get("run_id", ""),
    )
    return jsonify(result), 200

if __name__ == "__main__":
    init_db()
    app.run(port=8093)

Wire the /lock/acquire endpoint as the first HTTP action in every scheduled AI Builder flow. Pass the flow ID (use the Power Automate expression workflow()['name']) and the run ID (workflow()['run']['name']). After the HTTP call, add a Condition: if the response body's acquired field is false, terminate the flow immediately in "Cancelled" state. The endpoint answers a refused lock with 409, which marks the HTTP action as failed, so configure the Condition to run after the HTTP action has failed as well as after it succeeds. If acquired is true, proceed with the rest of the flow logic. At the very end of the flow, including after any error branches using the Configure run after setting, add a second HTTP action calling /lock/release. The released_at timestamp is the signal that the next scheduled instance can acquire the lock and proceed.
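
To see the lock rules in isolation, the sketch below re-implements the same acquire and release logic in memory, with the clock passed in as a parameter so the watchdog can be exercised without waiting. `InMemoryFlowLock` is a hypothetical teaching aid, not a replacement for the endpoint above, and the flow and run names are placeholders.

```python
class InMemoryFlowLock:
    """Acquire/release rules of the lock endpoint, minus Flask and SQLite."""

    def __init__(self, max_seconds: float = 3600):
        self.max_seconds = max_seconds
        self.locks = {}  # flow_id -> (run_id, acquired_at, released)

    def acquire(self, flow_id: str, run_id: str, now: float) -> bool:
        held = self.locks.get(flow_id)
        if held:
            _, acquired_at, released = held
            # Refuse only while the lock is unreleased and not yet expired.
            if not released and now - acquired_at <= self.max_seconds:
                return False
        self.locks[flow_id] = (run_id, now, False)
        return True

    def release(self, flow_id: str, run_id: str) -> bool:
        held = self.locks.get(flow_id)
        if held and held[0] == run_id:
            self.locks[flow_id] = (held[0], held[1], True)
            return True
        return False  # this run does not hold the lock


lock = InMemoryFlowLock(max_seconds=3600)
print(lock.acquire("daily-summary", "run-1", now=0))     # True
print(lock.acquire("daily-summary", "run-2", now=1800))  # False: run-1 still holds it
print(lock.release("daily-summary", "run-1"))            # True
print(lock.acquire("daily-summary", "run-3", now=1900))  # True
# run-3 ends abnormally and never releases; the watchdog frees the lock.
print(lock.acquire("daily-summary", "run-4", now=5501))  # True: 3601s have passed
print(lock.release("daily-summary", "run-3"))            # False: run-4 holds it now
```

The last line shows why release is keyed on the run ID: a late release from an expired run must not free a lock that another run has since acquired.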

State Table

Failure mode	Guard class	Ceiling / trigger	What to watch
Apply to Each fan-out: item count × AI Builder calls exhausts credits in one run	`ApplyToEachGuard`	200 AI calls per run; 2,000 per day per tenant	Proposed credits vs. per-run ceiling; tenant daily total approaching monthly allocation
Parallel branch multiplication: branch count multiplies credits on every trigger	`ParallelBranchGuard`	10 flow executions per 10-minute window	Execution rate per trigger type; spikes correlate with batch upload events
Child flow recursion: write-back to trigger source re-fires parent indefinitely	`ChildFlowWriteBackGuard`	Hash match = skip write (idempotency)	should_write=false rate; high rate = recursion loop active
Scheduled trigger overlap: queue depth grows when run time exceeds interval	`ScheduledFlowLock`	1 concurrent instance; watchdog after 3600s	acquired=false rate on lock endpoint; persistent false = flow duration > interval

Checklist Before Going Live

Count your Apply to Each input before the loop, not after. Use the length() expression on the array output from your Get items or List rows action. If the count can exceed your per-run credit budget in any plausible scenario (list growth, unbounded query), add a filter query to the Get items action to limit results before the loop, not inside the loop.
Profile your flow's execution time at 2× expected data volume before choosing a recurrence interval. If your flow processes 500 items in 20 minutes today, assume it will process 1,000 items in 40 minutes within 3 months. Set your recurrence interval to at least 60 minutes with the execution lock guard in place, not 30 minutes assuming current volume holds.
Audit every child flow that writes back to a SharePoint list, Dataverse table, or any other event-sourced data store. For each write-back, ask: does the parent flow's trigger fire on this data source? If yes, add the hash-based idempotency check before every write. The SharePoint Modified date column is not a reliable deduplication signal — it updates on every write regardless of whether the written value changed.
Add Configure run after settings to the lock release HTTP action. The lock release must execute whether the flow succeeds, fails, skips, or times out. In Power Automate, Configure run after allows an action to run after a preceding action in any outcome state. Without this, a failed flow will hold the execution lock until the watchdog timeout expires, blocking all scheduled instances during that window.
Monitor AI Builder credit consumption per flow in the Power Platform admin center. The AI Builder activity report shows credit consumption by flow. Set a tenant-level alert when consumption reaches 70% of the monthly allocation — not 90%, because flows with high Apply to Each iteration counts can exhaust the remaining 30% in a single run before the next monitoring cycle.
Use Power Automate's built-in concurrency control as a first layer. On scheduled flows, the Concurrency Control setting in the trigger can limit concurrent runs to 1 — this is a platform-native lock that does not require an external endpoint. Use it as a belt-and-suspenders layer alongside the execution lock guard. The platform concurrency control does not expose queue depth metrics; the external guard endpoint does, which is why both are needed.
Set AI Builder prompt temperature to 0 for all write-back flows. A non-deterministic prompt produces different output on each call for the same input, making the hash-based idempotency check ineffective. At temperature 0, re-running the AI Builder action on the same item is far more likely to produce the same output, which the hash check will correctly identify as unchanged and skip. Occasional variation is still possible even at temperature 0, so keep the hash check as the guard and treat temperature 0 as what keeps it effective.
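The interval-sizing item in the checklist above is simple arithmetic, and a small helper makes the assumptions explicit. This is a sketch only: the function name, the 1.5× headroom factor (inferred from the 40-minute projection and 60-minute recommendation), and the rounding to 15-minute steps are illustrative assumptions, and it presumes run time scales linearly with item count.

```python
import math


def recommended_interval_minutes(
    current_minutes: float,
    growth_factor: float = 2.0,  # profile at 2x expected data volume
    headroom: float = 1.5,       # assumed safety margin over projected run time
    round_to: int = 15,          # assumed: round up to a 15-minute step
) -> int:
    """Suggest a recurrence interval that stays above the projected run time."""
    projected = current_minutes * growth_factor  # e.g. 20 min -> 40 min
    padded = projected * headroom                # e.g. 40 min -> 60 min
    return math.ceil(padded / round_to) * round_to


# The checklist example: 500 items in 20 minutes today.
print(recommended_interval_minutes(20))
```

If measured run time grows faster than linearly with item count, profile at the larger volume directly rather than relying on the growth factor.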

FAQ

How does Power Automate AI Builder billing differ from Microsoft Copilot Studio, which we already have in our tenant?

Copilot Studio uses a separate message-based billing model — you pay per message resolved by a Copilot Studio bot, where each conversation turn counts as one message. AI Builder inside Power Automate flows uses an AI credit model — each AI Builder action invocation consumes a fixed number of credits from your tenant's Microsoft Power Platform AI Builder credit allocation. The two credit pools are separate. Exhausting your AI Builder credits from a runaway Power Automate flow does not affect Copilot Studio message capacity, and vice versa. Both share the same Power Platform admin center dashboard but draw from different license entitlements.

Does Power Automate have any native protection against runaway Apply to Each loops consuming AI Builder credits?

Power Automate has a flow execution timeout (30 days maximum, configurable lower) and a platform-level throttling policy that can slow or suspend flows consuming excessive resources. Neither of these targets AI Builder credit consumption specifically. The platform's flow run timeout will eventually stop a runaway Apply to Each loop, but not before potentially consuming thousands of credits. The per-flow credit cap must be implemented externally via the guard endpoint pattern described above, as there is no native "max AI Builder calls per flow run" setting in the Power Automate designer.

We use the AI Builder custom model (trained on our documents) rather than the pre-built AI Prompt action. Do the same failure modes apply?

Yes. AI Builder custom models (custom document processing, custom object detection, custom classification models) also consume AI credits per inference call, at a rate that varies by model type and the data volume processed per call. Custom document processing models typically consume more credits per call than AI Prompt actions because they process the entire document through the model rather than a text prompt. All four failure modes — Apply to Each fan-out, parallel branch multiplication, child flow recursion, and scheduled trigger overlap — apply identically to custom model inference calls inside flows. The guard endpoint patterns above are model-agnostic; pass the correct credits_per_item for your specific model tier.

Can I replicate these guard patterns entirely within Power Automate using native actions rather than an external HTTP endpoint?

Partially. Power Automate's native actions can implement basic guards using SharePoint or Dataverse as state storage. A SharePoint list can serve as a distributed lock table (acquire lock = create a list item, release lock = delete it). A Dataverse row can store the hash for the child flow write-back idempotency check. The limitation is that the check-then-create sequence built from native SharePoint and Dataverse actions is not atomic: two concurrent flow instances can both read "no lock exists" and both proceed to create a lock item, so both instances continue. An external endpoint backed by SQLite with a threading lock provides the atomic compare-and-set semantics that a read followed by a write through the native connectors cannot. For low-volume flows where exact atomicity is not critical, native SharePoint-based guards are a reasonable starting point before investing in an external guard service.
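To make the atomicity point concrete, here is a minimal sketch of an acquire step that cannot be split into a separate read and write. It uses a different technique from the threading lock in the guard service above: a primary-key constraint with INSERT OR IGNORE, so the database itself decides the winner. The table and function names are hypothetical, and the in-memory database is for illustration only.

```python
import sqlite3


def make_lock_table() -> sqlite3.Connection:
    """Create an in-memory lock table with one row allowed per flow_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE locks (flow_id TEXT PRIMARY KEY, run_id TEXT NOT NULL)"
    )
    return conn


def try_acquire(conn: sqlite3.Connection, flow_id: str, run_id: str) -> bool:
    """Insert the lock row; the primary key rejects a second holder."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO locks (flow_id, run_id) VALUES (?, ?)",
            (flow_id, run_id),
        )
    return cur.rowcount == 1  # 1 = row inserted, 0 = lock already held


def release(conn: sqlite3.Connection, flow_id: str, run_id: str) -> bool:
    """Delete the lock row, but only if this run is the holder."""
    with conn:
        cur = conn.execute(
            "DELETE FROM locks WHERE flow_id = ? AND run_id = ?",
            (flow_id, run_id),
        )
    return cur.rowcount == 1
```

There is no moment at which two callers can both observe "no lock exists" and both succeed, which is the gap in the read-then-create sequence described above.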

How do I integrate RunGuard's SDK with Power Automate AI Builder flows?

RunGuard's Python SDK can back the guard endpoints described in this post. Deploy RunGuard's BudgetTracker as the engine behind the Apply to Each guard endpoint — call tracker.record(session_id=run_id, tokens=item_count) for each approved loop and tracker.check(session_id=run_id) to enforce the per-run ceiling. The LoopDetector covers the child flow recursion scenario when the recursive pattern involves repeated identical AI Builder calls — the detector trips on repeated tool-call patterns within a session ID. Wire these as HTTP action calls at the start of your Power Automate flows and handle the 429 responses with Terminate actions. Install with pip install runguard and host on any server reachable from your Power Automate environment's HTTP connector allowlist.

Catch these loops at runtime, not in your credit dashboard

RunGuard is a circuit-breaker SDK for AI agents and automation flows. Wire it once, get loop detection + budget enforcement + alerts when any breaker trips. Works in Python and TypeScript.
