The AI Cost Iceberg (Runrate framework): visible API spend is the 10% of cost you see; hidden inference, storage, observability, retries, and human review are the 90% you don't.
In the next five years, every large enterprise will have a silent war between engineering teams and finance teams over how to measure AI cost. Engineers will say cost-per-token. CFOs will say cost-per-outcome. The framing you choose determines who controls the AI budget and how much the company actually spends on AI.
The Engineer's Frame: Cost Per Token
Cost-per-token is the engineer's natural frame. It's precise, measurable, and vendor-neutral. OpenAI charges $10 per 1 million input tokens and $30 per 1 million output tokens for GPT-4 Turbo. That's a concrete number. A 1,000-token input request costs $0.01. If you process 1 billion input tokens per month, your cost is $10,000. Simple arithmetic.
From an engineering standpoint, cost-per-token is the right metric. It maps directly onto model selection, prompt optimization, and upgrade decisions. If you switch from GPT-4 ($0.03 per 1K input tokens) to GPT-4 Turbo ($0.01 per 1K input tokens), you know exactly how much you're saving. If you cut your average prompt from 2,000 tokens to 1,500 tokens, you know you're cutting prompt cost by 25%.
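The arithmetic is simple enough to check in a few lines of Python. A minimal sketch using the list prices quoted above; your negotiated rates will differ:

```python
# Token-cost arithmetic for the figures above. Rates are published
# list prices used for illustration, not anyone's contract rates.
GPT4_TURBO_INPUT = 10.00 / 1_000_000  # $ per input token ($10 / 1M)
GPT4_INPUT = 30.00 / 1_000_000        # $ per input token ($30 / 1M)

# One 1,000-token input request:
print(f"Per request: ${1_000 * GPT4_TURBO_INPUT:.4f}")            # $0.0100

# One billion input tokens per month:
monthly_tokens = 1_000_000_000
print(f"Per month:   ${monthly_tokens * GPT4_TURBO_INPUT:,.0f}")  # $10,000

# Switching GPT-4 -> GPT-4 Turbo at the same volume:
saved = monthly_tokens * (GPT4_INPUT - GPT4_TURBO_INPUT)
print(f"Model switch saves: ${saved:,.0f}/month")                 # $20,000

# Trimming the average prompt from 2,000 to 1,500 tokens:
print(f"Prompt trim: {1 - 1_500 / 2_000:.0%} off prompt cost")    # 25%
```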
Cost-per-token is engineer-relevant, precisely measurable, and directly actionable.
The CFO's Frame: Cost Per Outcome
Cost-per-outcome is the CFO's natural frame. It answers the question: "What did it cost to process this customer ticket, adjudicate this claim, or originate this loan?" It includes not just tokens but integrations, infrastructure, human review, and opportunity cost.
Klarna's AI customer service agent costs $0.19 per resolved ticket. That's cost-per-outcome. It includes API cost, but also:
- The infrastructure to route tickets
- The integrations to look up customer history
- The vector database to search prior solutions
- The human escalation path for complex issues
- The training and evaluation to improve accuracy
You can't break down that $0.19 into constituent parts because they're bound together operationally. The system costs $0.19 to produce one resolved ticket.
Cost-per-outcome is CFO-relevant, operations-relevant, and financially meaningful.
Why the Framing Matters
The framing determines the incentive structure:
Cost-per-token framing:
- Incentive: Minimize tokens per request
- Technique: Reduce prompt length, use cheaper models, limit context
- Result: Engineers optimize for efficiency, sometimes at the expense of quality
- Control: Engineering team owns the budget and optimization
Cost-per-outcome framing:
- Incentive: Minimize cost per resolved ticket, claim, or application
- Technique: Improve accuracy (fewer escalations), reduce retries (better integrations), optimize human review time
- Result: Organization optimizes for business value, not token efficiency
- Control: Finance team owns the budget and ties it to operational outcomes
These are fundamentally different optimization targets. An engineer optimizing for cost-per-token might reduce context length, which hurts accuracy, which increases human review time, which actually increases total cost. But the engineer's dashboard shows lower token cost, so they think they're winning.
A CFO optimizing for cost-per-outcome would increase context length (higher token cost) to improve accuracy (lower human review cost), because the end-to-end math is better.
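A toy version of that end-to-end math, with hypothetical per-ticket numbers:

```python
# Hypothetical per-ticket costs for two configurations (illustrative only).
short_context = {"tokens": 0.04, "human_review": 0.09}  # cheap tokens, more escalations
long_context = {"tokens": 0.06, "human_review": 0.04}   # pricier tokens, fewer escalations

for name, costs in (("short context", short_context), ("long context", long_context)):
    print(f"{name}: ${sum(costs.values()):.2f} per resolved ticket")
# short context: $0.13 -- wins on the token dashboard
# long context:  $0.10 -- wins on the end-to-end math
```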
The Benchmarking Problem
Here's why the framing battle matters in practice:
When Klarna says "our agent costs $0.19 per resolved ticket," they're using cost-per-outcome framing. But they don't break down the $0.19 into components. A CFO wants to know: is $0.19 expensive or cheap? They can't tell without comparable benchmarks.
When an engineer says "our model uses 1,500 tokens per request at $0.10 per 1K tokens, so each request costs $0.15 in tokens," they're using cost-per-token framing. But that tells you nothing about whether the request actually resolves the customer's problem, whether it requires human escalation, or what the true business cost is.
- Klarna: $0.19 per resolved ticket (business outcome)
- Engineer: $0.15 in tokens per request (technical input)
These are incomparable numbers. The CFO thinks the engineer's estimate is conservative; the engineer thinks the CFO's benchmark is inflated. Neither is wrong; they're measuring different things.
The Iceberg Resolves the Conflict
The AI Cost Iceberg provides the bridge between the two framings:
- Cost-per-token: $0.05 per request (visible tip)
- Cost-per-outcome: $0.19 per resolved ticket (full iceberg)
The iceberg explains where the $0.14 difference comes from:
- Integrations: $0.03
- Retries: $0.02
- Human escalation: $0.06
- Infrastructure and observability: $0.03
Once you're explicit about these hidden costs, the engineer and the CFO can have a productive conversation. The engineer might say: "I can reduce token cost from $0.05 to $0.04 by using a cheaper model." The CFO can reply: "But if that reduces accuracy and increases human escalations from $0.06 to $0.09, the total cost goes up to $0.21. Don't do it."
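That conversation, as a sketch over the illustrative figures above:

```python
# The iceberg bridge, using the illustrative per-ticket figures above.
baseline = {
    "tokens": 0.05,              # the visible tip
    "integrations": 0.03,
    "retries": 0.02,
    "human_escalation": 0.06,
    "infra_observability": 0.03,
}
print(f"Cost-per-outcome: ${sum(baseline.values()):.2f}")         # $0.19

# The engineer's proposal: a cheaper model saves a cent in tokens
# but, if accuracy drops, adds three cents in human escalation.
cheaper_model = {**baseline, "tokens": 0.04, "human_escalation": 0.09}
print(f"With cheaper model: ${sum(cheaper_model.values()):.2f}")  # $0.21
```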
The Framing Battle in Practice
In a typical enterprise:
Year 1: Engineering team builds an agent. They quote cost-per-token. CFO budgets based on token cost. Agent is deployed.
Year 2: Actual spend is 5–10x higher than budget because of hidden costs. CFO is shocked. Engineering team claims the CFO didn't understand how much infrastructure costs. Finance team claims engineering underestimated.
Year 3: CFO insists on cost-per-outcome reporting. Engineering team resists because it "doesn't account for efficiency gains." Battle ensues.
Year 4: Organization implements work-item-level cost attribution. Every agent outcome is tagged with its full cost. Engineering team and finance team finally have a shared language.
The organization that skips to Year 4 wins. Everyone else wastes time in Years 1–3 arguing about how to measure cost.
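What work-item-level attribution looks like in practice is roughly this. A minimal sketch; the field names are hypothetical, not Runrate's schema or any particular vendor's:

```python
from dataclasses import dataclass, field

@dataclass
class WorkItemCost:
    """One agent outcome, tagged with every cost layer that produced it.
    Illustrative sketch only; field names are hypothetical."""
    work_item_id: str
    outcome: str  # e.g. "ticket_resolved", "claim_adjudicated"
    costs: dict[str, float] = field(default_factory=dict)  # layer -> dollars

    @property
    def cost_per_outcome(self) -> float:
        # The number engineering and finance can finally agree on.
        return sum(self.costs.values())

ticket = WorkItemCost(
    work_item_id="T-1042",
    outcome="ticket_resolved",
    costs={"tokens": 0.05, "integrations": 0.03, "retries": 0.02,
           "human_escalation": 0.06, "infra_observability": 0.03},
)
print(f"{ticket.work_item_id}: ${ticket.cost_per_outcome:.2f} per {ticket.outcome}")
```

Aggregate records like these per agent and per outcome type, and cost-per-token and cost-per-outcome become two views of the same data.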
The Vendor's Game
Vendors deliberately exploit this framing ambiguity:
- To engineers: "Our agent uses state-of-the-art model optimization. Token cost is $0.02 per request."
- To CFOs: "Our agent resolves customer issues at $0.50 per ticket."
Both statements can be true, but they measure different things. The engineer benchmarks a proof-of-concept at $0.02 in tokens per request. The CFO deploys it and sees $0.50 cost-per-outcome. Neither party was lied to; they're just measuring different layers of the iceberg.
When evaluating vendors, insist on cost-per-outcome using the exact definition: "What is the all-in cost to produce one resolved outcome, including API, infrastructure, integrations, human review, and compliance overhead?"
The Future: Outcome-Based Pricing
The next generation of AI vendors will price by outcome, not by token. Instead of charging per million tokens, they'll charge $0.50 per resolved ticket. This forces the vendor to absorb the hidden costs and incentivizes them to optimize the full stack: API, retries, human review, everything.
Outcome-based pricing aligns vendor incentives with CFO incentives: both are optimizing for cost-per-outcome. The vendor that can deliver resolved tickets more cheaply than its competitors wins the business.
Token-based pricing aligns vendor incentives with engineer incentives but misaligns with CFO incentives. That's why the framing battle exists: different parts of the org are optimizing different metrics.
What to Do Next
When you hear an agent cost estimate, immediately ask: "Is that cost-per-token or cost-per-outcome?" If it's cost-per-token, ask what the outcome is and what the full cost is including all hidden layers. If it's cost-per-outcome, ask what's included, and in particular whether human review time is counted.
Use the AI Cost Iceberg to translate between the two framings. Once you're explicit about hidden costs, the engineer and the CFO can optimize together instead of at cross-purposes.
For a deeper walkthrough of cost attribution and how to surface cost-per-outcome across your entire agent fleet, request the CFO Field Guide or a demo with Runrate.
Want to see this in your stack?
Book a 30-minute walkthrough with a Runrate founder.