Cost Per Work Item: A Unifying Metric for AI Economics

6 min read · Updated 2026-05-02


The FinOps Foundation established "cost per unit of work" as the north-star KPI for cloud-native economics. The principle is simple: instead of measuring infrastructure cost in dollars or cloud credits, measure it against business outcomes. Runrate extends this principle to the agentic enterprise, operationalizing it as cost per work item.

A work item is the atomic unit of business outcome: a resolved customer support ticket, an adjudicated insurance claim, a processed loan application, a completed sales qualification, a resolved employee onboarding request. The metric is the fully loaded cost to produce that work item, including all API, infrastructure, human review, and integration costs.
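As a minimal sketch, the computation is just the sum of monthly cost components divided by monthly volume. The component names and dollar figures below are assumptions chosen to line up with the support example later in this article, not a Runrate schema:

```python
# Minimal sketch of a fully loaded cost-per-work-item calculation.
# Component names and dollar figures are illustrative assumptions.

def cost_per_work_item(monthly_costs: dict[str, float], items_per_month: int) -> float:
    """Sum every monthly cost component and divide by work items produced."""
    return sum(monthly_costs.values()) / items_per_month

support_costs = {
    "llm_api": 10_000.00,              # input, output, and cached tokens
    "embeddings_vector_db": 2_000.00,
    "integrations": 2_500.00,          # CRM and ticketing API calls
    "observability_retries": 1_500.00,
    "human_review": 600.00,            # reviewer time on escalated tickets
}

print(f"${cost_per_work_item(support_costs, 50_000):.2f} per ticket")  # $0.33
```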

Why unifying under one metric matters: a mid-market company might have three AI agents operating in parallel (customer service, claims, HR onboarding), managed by three different teams and billed to three different P&L lines. Without a unifying cost framework, you can't tell whether the customer service agent (at $0.52 per ticket) is a better or worse investment than the claims agent (at $0.68 per claim), or whether you should invest more in one and less in the other. Cost per work item lets you compare across business units using the same denominator.

How Cost Per Work Item Differs from Token Cost

Most companies trying to measure AI cost start with token counting: how many input tokens, output tokens, and cached tokens did the LLM consume? This is cloud cost accounting for AI. It answers "what did we pay OpenAI?" but not "what did we get for it?"

Token cost is inherently incomplete because:

  1. It ignores hidden costs. The AI Cost Iceberg shows that visible API spend is about 10% of true cost. Embeddings, vector database storage, human review, retries, and observability are not measured in tokens, but they are part of the cost.

  2. It's disconnected from business outcomes. A work item might involve 5,000 input tokens (expensive) or 500 input tokens (cheap), but the business value is the same: the work got done. Cost per work item anchors to outcomes, not inputs.

  3. It rewards false efficiency. You can optimize for a low token count by using cheaper models, shorter prompts, or fewer API calls, but if quality suffers and human review time increases, you've just moved cost from the API bill to the labor line (see the sketch below). Cost per work item catches this because human review is included.
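To make the third point concrete, here is a toy comparison, with all figures assumed for illustration: halving the API bill while tripling the share of items that need human review can raise, not lower, the true cost per work item.

```python
# Toy illustration of "false efficiency". All figures are assumptions.

def blended_cost(api_cost: float, review_rate: float, review_cost: float) -> float:
    """Per-item cost including the expected human review overhead."""
    return api_cost + review_rate * review_cost

# Capable model: higher token bill, fewer escalations to human reviewers.
capable = blended_cost(api_cost=0.32, review_rate=0.15, review_cost=0.80)
# Cheaper model: half the token bill, but three times the review rate.
cheaper = blended_cost(api_cost=0.16, review_rate=0.45, review_cost=0.80)

print(f"capable: ${capable:.2f}/item vs. cheaper: ${cheaper:.2f}/item")
# capable: $0.44/item vs. cheaper: $0.52/item
```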

A McKinsey finding from their State of AI 2025 report supports this: 88% of organizations now use AI in at least one function, but only 39% see EBIT impact. The companies seeing impact are the ones measuring cost per outcome, not cost per token.

Three Examples Across Verticals

Customer service (SaaS): Cost per ticket resolved.

  • Volume: 50,000 tickets per month
  • Cost per ticket: $0.32 (API, embeddings, integrations, observability)
  • Human review: 15% of tickets, adding $0.08 per reviewed ticket
  • Blended cost per ticket: $0.32 * 0.85 + ($0.32 + $0.08) * 0.15 = $0.332, or roughly $0.33 per ticket
  • Monthly cost: $16,600. Savings vs. human support (at $1.00/ticket): $33,400/month.

Healthcare claims (insurance): Cost per claim adjudicated.

  • Volume: 100,000 claims per month
  • Cost per claim: $0.45 (API, fraud detection, embedding lookups, compliance review at 40% rate)
  • Monthly cost: $45,000. Savings vs. human adjuster (at $1.40/claim): $95,000/month.

Loan origination (financial services): Cost per application processed.

  • Volume: 8,000 applications per month
  • Cost per application: $2.10 (API, document OCR, credit check integrations, human compliance review at 30% rate, underwriting at 5% rate for edge cases)
  • Monthly cost: $16,800. Savings vs. human loan processor (at $6.00/application): $31,200/month (all three examples are recomputed in the sketch below).
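The arithmetic is identical across verticals; a short sketch plugging in the figures above makes the pattern explicit (the function and field names are illustrative):

```python
# Recompute the three worked examples above. Figures come from the article.

def monthly_economics(volume: int, ai_cost: float, human_cost: float) -> tuple[float, float]:
    """Return (monthly AI cost, monthly savings vs. the human baseline)."""
    return volume * ai_cost, volume * (human_cost - ai_cost)

verticals = {
    "support tickets":   (50_000, 0.332, 1.00),
    "insurance claims":  (100_000, 0.45, 1.40),
    "loan applications": (8_000, 2.10, 6.00),
}

for name, (volume, ai_cost, human_cost) in verticals.items():
    cost, savings = monthly_economics(volume, ai_cost, human_cost)
    print(f"{name}: ${cost:,.0f}/month, saves ${savings:,.0f}/month")
# support tickets: $16,600/month, saves $33,400/month
# insurance claims: $45,000/month, saves $95,000/month
# loan applications: $16,800/month, saves $31,200/month
```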

Notice that the cost per work item varies dramatically: customer service at $0.33, claims at $0.45, loan origination at $2.10. This isn't because one is "more advanced." It's because loan origination has higher regulatory review requirements, more complex integrations, and more edge cases requiring human judgment. Cost per work item captures that complexity.

Extending FinOps Language to Agentic Economics

The FinOps Foundation uses cost per unit of work to hold cloud infrastructure teams accountable. The insight is: don't optimize for cloud cost reduction in isolation; optimize for the ratio of cost to business outcome. A project that costs $50K in cloud and delivers $500K in revenue is a win. A project that costs $5K in cloud but delivers only $10K in revenue is the weaker investment, despite the smaller bill: a 2:1 return versus 10:1.

Runrate applies the same logic to AI spend. Don't optimize for token cost or API bill reduction; optimize for cost per work item relative to quality and margin. If your customer service AI costs $0.52 per ticket but delivers a 90% customer satisfaction score and a 5% repeat-ticket rate, and the human baseline is $1.50 per ticket at 85% satisfaction and 12% repeat-ticket rate, the AI is more efficient even at higher cost.
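One way to formalize that comparison is to quality-adjust the cost. This is a sketch under an assumed definition, treating a repeat ticket as a work item that was not durably resolved; the figures come from the example above:

```python
# Sketch: quality-adjusted cost, assuming a repeat ticket means the work item
# was not actually resolved on first contact.

def cost_per_resolved(cost_per_ticket: float, repeat_rate: float) -> float:
    """Effective cost per ticket that stays resolved (no repeat contact)."""
    return cost_per_ticket / (1 - repeat_rate)

ai = cost_per_resolved(0.52, repeat_rate=0.05)     # ~$0.55
human = cost_per_resolved(1.50, repeat_rate=0.12)  # ~$1.70

print(f"AI: ${ai:.2f}, human: ${human:.2f} per durably resolved ticket")
```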

This shift from cost minimization to cost-per-outcome management is what separates AI high performers (the 5.5% of organizations attributing significant EBIT impact to AI, per McKinsey) from everyone else. They're thinking about work-item economics, not token economics.

Building a Unified Cost Dashboard

Once you're measuring cost per work item across your AI operations, you can build a unified dashboard:

| Work Item Type | Volume/Month | Cost Per Item | Total Cost | YoY Change | Margin Impact |
| --- | --- | --- | --- | --- | --- |
| Support tickets | 50,000 | $0.33 | $16,600 | +8% | +2.1% |
| Insurance claims | 100,000 | $0.45 | $45,000 | +12% | -1.8% |
| Loan applications | 8,000 | $2.10 | $16,800 | -5% | +0.9% |
| HR onboarding | 1,200 | $1.50 | $1,800 | +3% | +0.2% |
| Total | 159,200 | $0.50 | $80,200 | +8.5% | +0.4% |
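The blended total row is volume-weighted rather than a simple average of the per-item costs; a sketch of the derivation, using the rows above:

```python
# Derive the dashboard's total row: blended cost per item is volume-weighted.
# Figures match the table above.

rows = [
    ("Support tickets",   50_000, 0.332),
    ("Insurance claims", 100_000, 0.45),
    ("Loan applications",  8_000, 2.10),
    ("HR onboarding",      1_200, 1.50),
]

total_volume = sum(v for _, v, _ in rows)    # 159,200 work items
total_cost = sum(v * c for _, v, c in rows)  # $80,200
print(f"blended: ${total_cost / total_volume:.2f} per work item")  # $0.50
```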

This view lets you allocate capital: if insurance claim costs are up 12% year over year and eroding margin by 1.8%, while support tickets are delivering +2.1% margin impact, you know where to invest (support) and where to constrain and optimize (claims processing).

The beauty of this metric is that it's anchored in business reality, not in technical metrics. You're not optimizing for "tokens per dollar" or "API calls per second." You're optimizing for outcomes. This shift is what separates companies that are running AI as a cost center from companies that are running AI as a business.

The Benchmarking Question

How do you know if your cost per work item is "good"? There are a few reference points:

For customer support, Klarna's $0.19 per ticket is the benchmark, but that's after years of optimization and at massive scale. A reasonable first target for most support organizations is $0.50-$0.75 per ticket.

For claims, industry benchmarks vary by claims type (health, auto, workers' comp have different complexity), but a cost per claim in the $0.50-$1.50 range is reasonable. If you're at $3.00, you're either processing highly complex claims or leaving optimization opportunity on the table.

For loan origination, typical cost per application is $2-$4, depending on whether it's a simple auto loan (lower cost) or a complex commercial real estate underwriting (higher cost).

But the best benchmark is your own baseline. Where are you today? Can you cut cost per work item by 20% through optimization? 30%? The companies that move fastest are the ones that establish a baseline, set a target, and measure progress monthly.
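A minimal sketch of that baseline-and-target discipline, with the 20% target and the monthly readings assumed for illustration:

```python
# Track progress against a self-set baseline. The target and monthly
# readings below are assumptions for illustration.

baseline = 0.45           # cost per claim at the start of the program
target = baseline * 0.80  # a 20% reduction goal

monthly_readings = [0.45, 0.43, 0.41, 0.38]
for month, cost in enumerate(monthly_readings, start=1):
    change = (cost - baseline) / baseline
    status = "target hit" if cost <= target else f"{change:+.0%} vs. baseline"
    print(f"month {month}: ${cost:.2f}/claim ({status})")
```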

Cost per work item is the unifying language for AI economics. Learn more about how to allocate these costs across business units and customers in the articles on chargeback and showback and multi-tenant allocation, or return to the pillar article on AI cost attribution.

