How to Allocate AI Costs to a Customer (A Step-By-Step)

6 min read · Updated 2026-05-02

Runrate Framework

AI Workforce P&L

Treat AI agents like employees: cost structure, productivity target, and retirement trigger per agent.

Read the full framework →

Runrate Framework

The AI Cost Iceberg

Visible API spend (10%) vs hidden inference, storage, observability, retries, human review (90%).

Read the full framework →

Allocating AI costs to individual customers is the foundation of SaaS unit economics and pricing strategy. Once you know what it costs to serve each customer with AI, you can price fairly, protect margin, and identify which customers are profitable. Here's the step-by-step process.

Step 1: Instrument Every AI Call with Customer Metadata

Every API call to an LLM provider must include customer-level metadata so you can trace costs backward.

When your application makes an API call to Claude or GPT-4, include:

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    messages=[...],
    metadata={
        "customer_id": "cust_12345",
        "customer_name": "Acme Corp",
        "work_item_id": "ticket_67890",
        "workspace_id": "workspace_abc",
        "user_id": "user_xyz"
    }
)

The metadata fields you need:

  • customer_id: Unique identifier for the customer (required)
  • workspace_id: For multi-workspace customers, which workspace is this for? (optional)
  • work_item_id: The specific ticket, request, or task this relates to (required)
  • user_id: Which user in the customer's account triggered this? (optional)

This metadata doesn't go to OpenAI or Anthropic; it stays in your application logs. You use it to connect API costs back to customers.

Step 2: Log Every API Call with Cost and Metadata

When the API call returns, capture:

  • The customer_id and workspace_id from the metadata
  • The actual tokens consumed (input tokens, output tokens, cache tokens)
  • The cost calculated from the provider's pricing
  • The timestamp

Create a cost ledger table (in your data warehouse, analytics database, or Runrate):

| timestamp | customer_id | workspace_id | work_item_id | model | input_tokens | output_tokens | cost_usd | category | | --- | --- | --- | --- | --- | --- | --- | --- | --- | | 2025-05-02T10:15:32Z | cust_12345 | ws_abc | ticket_67890 | claude-3-5-sonnet | 1200 | 450 | $0.0256 | api | | 2025-05-02T10:16:10Z | cust_12345 | ws_abc | ticket_67890 | claude-3-5-sonnet | 800 | 200 | $0.0160 | api | | 2025-05-02T10:20:45Z | cust_12345 | ws_abc | ticket_67890 | embedding-api | 5000 | - | $0.0001 | embedding | | 2025-05-02T10:25:00Z | cust_12345 | ws_abc | ticket_67890 | human-review | 30-min review | - | $0.25 | human |

Every row is a transaction. Over a month, you'll have hundreds of thousands of rows (one row per API call, per customer, per work item).

Step 3: Aggregate by Customer

Sum all costs by customer_id over a time period (daily, weekly, monthly):

SELECT
  customer_id,
  DATE_TRUNC('month', timestamp) as month,
  SUM(cost_usd) as total_ai_cost,
  SUM(input_tokens) as total_input_tokens,
  SUM(output_tokens) as total_output_tokens,
  COUNT(*) as api_calls
FROM cost_ledger
GROUP BY customer_id, month
ORDER BY total_ai_cost DESC;

Result:

| customer_id | month | total_ai_cost | total_input_tokens | total_output_tokens | api_calls | | --- | --- | --- | --- | --- | --- | | cust_67890 | 2025-05 | $3,412.50 | 2,100,000 | 450,000 | 12,340 | | cust_12345 | 2025-05 | $2,156.20 | 1,200,000 | 280,000 | 7,850 | | cust_54321 | 2025-05 | $890.35 | 600,000 | 150,000 | 3,200 | | cust_11111 | 2025-05 | $245.80 | 180,000 | 60,000 | 890 |

Now you can see per-customer AI cost.

Step 4: Add Non-API Costs

Raw API cost is only part of the AI Cost Iceberg. For each customer, also allocate:

Embedding storage: If you store embeddings in Pinecone or Weaviate, calculate the per-customer storage cost. If Pinecone costs $500/month and customer cust_12345 has 2M vectors out of a total 10M vectors, allocate (2M / 10M) × $500 = $100.

Observability: If Langfuse costs $300/month and customer cust_12345 generated 8,000 log events out of 40,000 total, allocate (8,000 / 40,000) × $300 = $60.

Human review time: If a customer's requests require human review (compliance, quality checking), log the time and calculate cost. If a reviewer costs $60/hour and cust_12345 had 10 hours of review time, allocate $600.

Third-party integrations: If your agent calls Stripe, Twilio, or other third-party APIs on behalf of the customer, log those costs and include them.

Updated per-customer cost for cust_12345 in May:

| Cost Category | Amount | | --- | --- | | API (Claude, GPT-4) | $1,800 | | Embeddings | $120 | | Vector storage (allocated) | $100 | | Observability (allocated) | $60 | | Human review | $600 | | Third-party integrations | $80 | | Total | $2,760 |

This is the fully loaded cost to serve cust_12345 with AI in May.

Step 5: Normalize Against Business Metrics

Now divide by the customer's work-item volume to get cost per outcome:

  • Customer cust_12345 processed 6,200 AI-assisted support tickets in May
  • Total AI cost: $2,760
  • Cost per ticket: $2,760 / 6,200 = $0.445 per ticket

You can also normalize by:

  • Revenue: Customer's monthly subscription revenue is $3,200. AI cost as % of revenue: $2,760 / $3,200 = 86% (high!)
  • Margin: Customer subscription margin is 75% ($2,400). AI cost vs. margin: $2,760 vs. $2,400 (AI eats the entire margin)
  • Profit: Customer is unprofitable because AI cost exceeds margin

Step 6: Rank Customers by Profitability

Once you have per-customer AI costs normalized, rank them:

| Rank | Customer | Subscription Revenue | AI Cost | Margin | Profitability | | --- | --- | --- | --- | --- | --- | | 1 | cust_67890 | $12,000 | $3,412 | $9,000 | $5,588 (62%) | | 2 | cust_12345 | $3,200 | $2,760 | $2,400 | -$360 (NEGATIVE) | | 3 | cust_54321 | $5,500 | $890 | $4,125 | $3,235 (78%) | | 4 | cust_11111 | $1,800 | $246 | $1,350 | $1,104 (82%) |

Now you know: cust_12345 is unprofitable because AI cost is too high relative to their subscription revenue. You need to take action.

Step 7: Make a Decision

For each customer where AI cost exceeds profitability targets:

  1. Tier them up: Move them to a higher-priced plan that includes AI features (or charges separately for AI).
  2. Restrict AI features: Limit their AI usage (e.g., "AI available for Pro and Enterprise plans only").
  3. Implement usage-based pricing: Charge them a per-ticket or per-interaction fee on top of their subscription.
  4. Optimize: Work with them to reduce AI usage or improve efficiency (better prompts, smaller context windows).
  5. Sunset: If none of the above work, discontinue service or move them to a self-service, lower-touch tier.

The goal isn't to maximize revenue from AI; it's to ensure AI doesn't destroy unit economics.

Automation and Continuous Monitoring

Once you have this process in place, automate it:

  1. Daily dashboard: Track total AI cost, cost per customer, cost per work item.
  2. Weekly alert: Flag any customer whose AI cost spiked by >20% from their baseline.
  3. Monthly reconciliation: Reconcile your cost ledger against provider invoices to ensure accuracy.
  4. Quarterly review: Review customer profitability and pricing strategy.

This is where Runrate comes in: we automate steps 1-6 (instrumentation, aggregation, allocation, normalization) so your finance team doesn't have to. You focus on step 7 (decision-making).

To learn more about setting pricing strategy and protecting SaaS margin, return to the article on multi-tenant AI cost allocation or the pillar article on AI cost attribution.

Want to see this in your stack?

Book a 30-minute walkthrough with a Runrate founder.

Get a Demo

Was this article helpful?