The Four Rules of AI Economics (Runrate Framework)

8 min read · Updated 2026-05-02


Every finance leader navigates the business with a set of economic principles: revenue recognition rules, SG&A allocation methods, capital budgeting hurdle rates. These principles are portable — they work across industries, companies, and time. AI finance is new enough that most companies don't have a framework yet. Here are the four rules that should anchor every AI economics conversation. Think of them as the laws of motion for AI spending.

Rule 1: Token Cost Is the Visible 10%; The Iceberg Is 90% Hidden

This is the master rule. Everything else flows from it.

When your finance team pulls the OpenAI invoice, they see token spend. Last month: 50 million input tokens at $0.005/1K = $250, plus 20 million output tokens at $0.015/1K = $300. Total: $550. The spreadsheet gets updated, the number gets reported to the board, and everyone feels good about how tightly AI spend is controlled.

This is the tip of the AI Cost Iceberg.

The iceberg underneath looks like this: much of those 50 million input tokens is repeated system prompts and context you resend on every inference — 30% of your token cost is context you could have cached or eliminated. The 20 million output tokens required a human to review them for accuracy (a claims examiner spends 5 minutes on each high-value claim). That human review costs $40,000/month — roughly 73x the entire monthly token bill. Some inferences fail and retry, multiplying token cost by 1.5x. Your observability infrastructure (logging, dashboards, audit trails) costs $5,000/month. The vector database storing embeddings costs $2,000/month. The evaluation dataset to test model accuracy costs $3,000/month.

The true cost: $825 (tokens, including the 1.5x retry multiplier) + $40,000 (review) + $5,000 (infra) + $2,000 (vector DB) + $3,000 (evaluation) = $50,825/month.

Your visible budget was $550. Your actual cost was $50,825. You were off by 92x.
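The iceberg arithmetic can be sketched as a small model. This is illustrative only — the component names and dollar figures are the made-up numbers from the example above, not real rates:

```python
def true_monthly_cost(token_cost, retry_multiplier, hidden_costs):
    """Visible token spend scaled by retry overhead, plus hidden line items."""
    return token_cost * retry_multiplier + sum(hidden_costs.values())

hidden = {
    "human_review": 40_000,   # claims examiners checking outputs
    "observability": 5_000,   # logging, dashboards, audit trails
    "vector_db": 2_000,       # embedding storage
    "evaluation": 3_000,      # accuracy test datasets
}

total = true_monthly_cost(token_cost=550, retry_multiplier=1.5, hidden_costs=hidden)
print(total)        # 50825.0
print(total / 550)  # ≈ 92.4 — the visible invoice is barely 1% of true cost
```

Swap in your own line items; the point is that the hidden dictionary, not the token parameter, dominates the total.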

This is the AI Cost Iceberg, and it explains why 88% of enterprises use AI but only 5.5% are "AI high performers" (McKinsey 2025). Most teams are budgeting the tip and getting blindsided by the hidden mass underneath.

The rule in practice: when you see an AI line item, assume it's 10% of true cost and work backward. If the vendor charges $1,000/month, the true cost is probably $9,000-$15,000/month when you account for review, retry, and infrastructure. If this seems outrageous, that's because your organization doesn't have cost attribution yet. You're still in Stage 1 of the Maturity Curve: Invisible.

Rule 2: Multi-Step Amplification Compounds Your Spend (The Retry Tax)

Here's a hidden multiplier that catches every team: retries and loops.

An AI agent doesn't work like a human. If a human claims adjuster can't parse a document, they ask a colleague or move on. An AI agent doesn't have judgment. It either succeeds or fails. If it fails (hallucination, formatting error, timeout), the system retries.

Let's model it. You deploy an AI agent to process insurance claims. The agent extracts key facts, checks them against rules, and flags exceptions. On its first attempt, it succeeds 85% of the time. On retry (if it fails), it succeeds 90% of the time because it sees the error. The effective success rate: 85% + (15% × 90%) = 98.5%. To achieve that 98.5%, you've run the inference 1.15 times per claim on average (15% of claims get processed twice). Cost multiplier: 1.15x.

But wait. The agent also needs to call external tools. When processing a claim, it calls your policy database to check coverage. That API call has a timeout; if it fails, the system retries. The policy API succeeds 95% of the time on the first try, 99% after a retry, so you're making 1.05x policy calls per inference (5% get retried). Cost multiplier: 1.05x.

The agent also generates a summary that needs human review. The reviewer catches errors 10% of the time and asks the agent to re-analyze, so 10% of cases re-run the full analysis. Cost multiplier: 1.10x.

Chain these together: 1.15x × 1.05x × 1.10x ≈ 1.33x. You're paying 33% more than your headline token cost suggests because of multi-step amplification.
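In code, the compounding looks like this — a sketch that assumes each failed step gets exactly one retry, using the illustrative success rates from the example:

```python
def retry_multiplier(first_try_success_rate):
    """Expected runs per item when every failure gets exactly one retry."""
    return 1 + (1 - first_try_success_rate)

agent_runs = retry_multiplier(0.85)    # 1.15x inference runs
policy_calls = retry_multiplier(0.95)  # 1.05x tool calls
human_rework = 1.10                    # 10% of cases re-run after review

amplification = agent_runs * policy_calls * human_rework
print(round(amplification, 2))  # 1.33 — ~33% over the headline token cost
```

Each multiplier looks small in isolation; it's the product that quietly inflates the bill.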

At scale, this is devastating. Klarna's $0.19 per ticket includes retry overhead. Without retry mitigation (guardrails, validation, observability), your true cost per ticket would be $0.25-$0.35.

The rule in practice: measure your retry rate obsessively. Every 10% of failed inferences that retry costs you 10% more. Your observability infrastructure should highlight retry cost as a line item. Then attack it: better guardrails reduce retries, semantic validation reduces loops, caching repeated calls cuts API multipliers. Runrate's observability helps you see this; most teams don't until their bill shocks them.

Rule 3: Cost-Per-Outcome Beats Cost-Per-Token (The Unit Economics Truth)

This is the transition from Stage 2 (Tracked) to Stage 4 (Optimized) on the Maturity Curve, and it's where CFOs earn their keep.

A vendor tells you: "Use our AI model at $0.005 per token." An engineer hears: "cheap, efficient, deploy widely." A CFO should hear: "that's not the relevant metric."

The relevant metric is cost per outcome. In a customer service context: cost per resolved ticket. In insurance: cost per adjudicated claim. In healthcare: cost per diagnosed case. In lending: cost per approved loan.

Let's say you deploy AI to support a 12-person customer service team. The AI costs $0.002 per token. Your team processes 1,000 tickets per month. Of those, the AI handles 600 completely (fully resolved). The human team handles the remaining 400.

Cost of the AI portion: 600 tickets × roughly 500 tokens per ticket × $0.002 per token = $600/month in token cost.

True cost of AI portion (with review, retry, infra): $600 × 10 = $6,000/month.

Cost per AI-resolved ticket: $6,000 ÷ 600 = $10 per ticket.

But wait. The human team still costs $144,000/month (12 people × $12K/month payroll, all-in). And they now handle only 400 tickets/month instead of 1,000, because the easy cases went to the AI and only the hard ones remain. Human throughput went down.

Human cost per ticket for the 400 they handle: $144,000 ÷ 400 = $360 per ticket.

Your blended cost per ticket (AI + human): (600 × $10 + 400 × $360) ÷ 1,000 = ($6,000 + $144,000) ÷ 1,000 = $150 per ticket.

Without AI: 1,000 tickets ÷ (12 people × 20 work days × 8 hours) = 0.52 tickets per hour. At $75/hour loaded labor cost, that's about $144 per ticket.

With AI: $150 per ticket.

You're worse off. You added $6,000/month of AI cost and removed none of the human cost, so the AI eroded margin instead of building it.

What went wrong? You optimized for cost-per-token (the model looked cheap at $0.002) instead of cost-per-outcome (cost per resolved ticket went up). The AI absorbed the easy cases a human could have cleared in 10 minutes, leaving the team with only the slow, expensive ones. You should have deployed AI on the hard cases that take a human 90 minutes, not the easy ones.

The rule in practice: every AI investment should have a cost-per-outcome target. "This claim processing AI should cost under $5 per claim" (vs. the $25 human cost). "This support agent should resolve tickets at $0.15" (vs. the $144 human cost). If you can't define cost-per-outcome, your AI project is a cost center, not a profit center.
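A cost-per-outcome target can be checked mechanically. The figures below are hypothetical targets for an imagined claims pipeline, not benchmarks:

```python
def cost_per_outcome(true_monthly_cost, outcomes_per_month):
    """Fully loaded monthly cost divided by completed work items."""
    return true_monthly_cost / outcomes_per_month

# Hypothetical: $10K/month true AI cost, 4,000 claims adjudicated per month
ai_cost = cost_per_outcome(true_monthly_cost=10_000, outcomes_per_month=4_000)
human_cost = 25.0  # assumed manual cost per claim

print(ai_cost)               # 2.5
print(ai_cost < 5)           # True — meets a "$5 per claim" target
print(human_cost / ai_cost)  # 10.0 — savings multiple vs. manual
```

Note that the numerator must be the true (iceberg) cost, not the token invoice — otherwise the comparison flatters the AI.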

Rule 4: Untracked AI Becomes Shadow AI (The Governance Tax)

Shadow AI is the silent killer of AI unit economics.

It starts innocently. A finance analyst discovers ChatGPT and uses it to draft quarterly summaries (unapproved, untracked). A support manager signs up for a third-party AI vendor to help with ticket routing (unapproved). A product team uses a personal OpenAI API key to prototype features (unapproved). Individual credit card charges for AI tools start appearing across expense reports.

After three months, your finance team has no idea how much AI you're actually spending. Your IT team has no visibility into what data is being sent to third-party APIs. Your legal team doesn't know whether you're violating data residency requirements. You're at Stage 1 of the Maturity Curve: Invisible.

The hidden cost of shadow AI: governance overhead. Your compliance team needs to audit it. Your security team needs to assess data leakage risk. Your finance team needs to rebuild the chart of accounts to track it. Your CTO needs to evaluate whether the tools integrate with your stack. You're spending money to clean up after untracked spending.

But worse, shadow AI is where the margin erosion happens. Individual contributors running AI on ChatGPT produce outputs that get merged into your product without any cost tracking. You think a feature cost 200 engineering hours; it actually cost 200 engineering hours + $50,000 in API calls that nobody counted. Your unit economics are wrong.

Mavvrik's core insight (and Runrate's too): 40% of enterprise AI spend is untracked. That's not lost spend — it's hidden spend. It's happening, but the finance team doesn't see it.

The rule in practice: implement mandatory cost attribution before you scale AI. Every AI API call should be logged against a cost center, project, and business unit. It should require a purchase order or allocation code, just like buying cloud infrastructure. You need a cost gateway — a system that sits between your team and the AI API and enforces governance.
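A cost gateway can start as something very simple. This is a minimal sketch of the enforcement pattern — `call_model` is a placeholder for any real AI API client, and the allocation codes are made up:

```python
from collections import defaultdict

spend_by_cost_center = defaultdict(float)

def call_model(prompt):
    """Placeholder for a real AI API client; returns (response, cost in dollars)."""
    return f"response to: {prompt}", 0.002

def gated_call(prompt, cost_center=None, project=None):
    """Reject any AI call without allocation codes; log spend per cost center."""
    if not cost_center or not project:
        raise PermissionError("AI call rejected: missing cost center or project code")
    response, cost = call_model(prompt)
    spend_by_cost_center[cost_center] += cost
    return response

gated_call("summarize claim #4821", cost_center="CC-CLAIMS", project="claims-ai")
print(dict(spend_by_cost_center))  # {'CC-CLAIMS': 0.002}
```

The point is the refusal path: an ungated call fails loudly instead of becoming untracked shadow spend.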

This is Stage 3-4 of the Maturity Curve, and it's where Runrate adds the most value. You move from "we don't know what we spent" to "we know exactly what we spent and why."

Putting the Four Rules Together: The AI Economics Checklist

Here's how a CFO should think about AI economics in 2026:

  1. Assume the visible cost is 10% of true cost. When you budget AI, multiply the token cost by 8-10x to get to the true cost including review, infra, and overhead.

  2. Measure your retry rate and multi-step amplification. What percentage of inferences retry? What's your API failure rate? How many multi-hop decisions does each outcome require? These are your leverage points for cost reduction.

  3. Define cost-per-outcome for every AI project. Don't ask "how cheap is the model?" Ask "how much does this save per work item vs. the manual alternative?" If the answer is "it doesn't save anything," kill the project.

  4. Implement cost governance immediately. Track every AI API call and require cost codes. Visibility is how you recover the 40% of AI spend that would otherwise go untracked as shadow AI.

The CFO who masters these four rules will be the one who explains to the board in 2027: "Our AI spend grew from $2M to $6M, but because we implemented cost attribution, we know that $5M of that went to work items with positive ROI, and $1M is candidate for reoptimization. We're at Stage 4 of the Maturity Curve, and we're investing $500K to move to Stage 5 governance."

The CFO who ignores these rules will explain: "We spent $6M on AI and we're not sure if it made money."

Curious where your team sits on the 5-Stage AI Cost Maturity Curve? Take the 15-question self-assessment and get a personalized report on your path to work-item-level cost attribution and the governance model that turns AI from a cost center into a P&L multiplier.

