Runrate Framework
One of the earliest and most common blockers to AI cost attribution is the shared API key trap. An engineering team, eager to experiment, spins up a single OpenAI or Anthropic API key and hands it out to multiple services or teams. It's convenient: one key, one billing account, one place to monitor spend. But it's an attribution disaster.
When multiple agents or services share a single API key, all their API calls are commingled on the invoice. If your customer support AI, your sales AI, and your claims AI all use the same key, the monthly bill shows total spend but not what each agent consumed. You can't answer "what did the support AI cost?" without estimating or sampling.
This seems like a billing problem. It's actually an attribution and governance problem.
Why Shared Keys Destroy Attribution
Cost attribution depends on tracing costs backward from the bill to the work item. Here's how that normally flows:
- Customer support agent processes a ticket (ticket ID: #12345)
- Agent makes 3 API calls to Claude, consuming 2,400 input tokens + 800 output tokens
- The provider's invoice shows the usage and cost
- Finance maps the invoice cost back to ticket #12345
- Cost per ticket is calculated
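The arithmetic behind that last step is straightforward once tokens are attributed. A minimal sketch in Python, using illustrative per-token rates (real pricing varies by model, provider, and over time):

```python
# Illustrative per-token rates in USD per token; check your provider's
# current price sheet before using numbers like these in production.
RATES = {
    "claude-3-5-sonnet": {"input": 3.00 / 1_000_000, "output": 15.00 / 1_000_000},
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of the API usage attributed to one work item at the assumed rates."""
    rate = RATES[model]
    return input_tokens * rate["input"] + output_tokens * rate["output"]

# Ticket #12345: 3 calls totaling 2,400 input + 800 output tokens.
ticket_cost = call_cost("claude-3-5-sonnet", 2400, 800)
print(f"${ticket_cost:.4f}")  # → $0.0192
```

The calculation only works because the tokens were traced to a single ticket first; with a shared key, the inputs to this function are unknowable.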
With a shared API key:
- Customer support agent, sales AI, and claims AI all process work items
- All three agents make API calls to OpenAI using the same key
- OpenAI invoice shows total usage and cost (no breakdown by caller)
- Finance cannot map the cost back to individual agents or work items
- Cost per ticket becomes a guess or an average
This breaks every level of attribution: you can't track agent-level cost, you can't track business-unit cost, and you can't track work-item cost.
More critically, a shared key means no one team owns the budget. If support AI burn spikes, is it because volume increased, because the model got more expensive, or because the sales team started using the same key for a new agent? You can't tell.
The Architecture Fix: One Key Per Agent (or Team)
The solution is to enforce one API key per agent, or one API key per team (depending on your governance model). This creates cost isolation.
If you have three agents (support, sales, claims), use three API keys:
- OPENAI_KEY_SUPPORT: assigned to the support team
- OPENAI_KEY_SALES: assigned to the sales team
- OPENAI_KEY_CLAIMS: assigned to the claims team
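In code, the routing can be as simple as resolving the key from the team name. A sketch, assuming the keys live in environment variables named as above and that you want to fail loudly rather than silently fall back to a shared key:

```python
import os

# One environment variable per team, matching the key names in the text.
KEY_ENV_VARS = {
    "support": "OPENAI_KEY_SUPPORT",
    "sales": "OPENAI_KEY_SALES",
    "claims": "OPENAI_KEY_CLAIMS",
}

def api_key_for(team: str) -> str:
    """Resolve the isolated API key for a team; refuse to run without one."""
    var = KEY_ENV_VARS[team]
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"{var} is not set; refusing to fall back to a shared key")
    return key
```

Raising instead of falling back is deliberate: a missing key should block the deploy, not quietly commingle a new agent's spend with someone else's.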
OpenAI and Anthropic both support this pattern: keys can be scoped to a project (OpenAI calls these projects; Anthropic calls them workspaces), and each one gets its own usage metrics and billing line.
With this setup:
- Support team's OpenAI bill: $8,500/month
- Sales team's OpenAI bill: $2,100/month
- Claims team's OpenAI bill: $14,200/month
- Total: $24,800/month
Now each team owns a budget, and finance can trace each team's cost.
The Instrumentation Problem: From Cost per Team to Cost per Work Item
One API key per agent solves cost isolation across teams, but it doesn't automatically give you cost per work item. You still need to instrument every API call with metadata so you can trace it back to the originating ticket, claim, or application.
Here's what that instrumentation looks like:
API Call:

```json
{
  "model": "claude-3-5-sonnet",
  "messages": [...],
  "metadata": {
    "ticket_id": "SUP-12345",
    "customer_id": "CUST-67890",
    "agent_id": "support-v2",
    "timestamp": "2025-05-02T14:32:15Z"
  }
}
```
When the provider returns the usage (e.g., 1,200 input tokens, 400 output tokens), you calculate the cost and store a record:

Cost Record:

```json
{
  "ticket_id": "SUP-12345",
  "customer_id": "CUST-67890",
  "agent_id": "support-v2",
  "api": "anthropic",
  "cost": "$0.048",
  "tokens": 1600,
  "timestamp": "2025-05-02T14:32:15Z"
}
```
Now you have a cost ledger where every row is a work item. You can aggregate by ticket, by customer, by agent, by day, or by any dimension you want.
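With records shaped like the one above, each aggregation is a group-by over the ledger. A sketch in plain Python (the field names follow the cost record; the rows themselves are made up for illustration):

```python
from collections import defaultdict

# Hypothetical ledger rows shaped like the cost record above.
ledger = [
    {"ticket_id": "SUP-12345", "agent_id": "support-v2", "cost": 0.048},
    {"ticket_id": "SUP-12345", "agent_id": "support-v2", "cost": 0.021},
    {"ticket_id": "CLM-00017", "agent_id": "claims-v1", "cost": 0.105},
]

def total_by(dimension: str, rows):
    """Sum cost over any attribution dimension (ticket, agent, customer, day)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row["cost"]
    return dict(totals)

print(total_by("ticket_id", ledger))  # cost per work item
print(total_by("agent_id", ledger))   # cost per agent
```

In practice this lives in a warehouse query rather than application code, but the shape is the same: one row per attributed call, summed along whichever dimension finance asks about.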
The catch: this requires discipline in your API call architecture. Every agent needs to emit this metadata with every call. If one team forgets to add the ticket_id field, those costs become unattributable.
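One way to enforce that discipline is a thin validation layer that rejects any call missing the required attribution fields. A sketch, not tied to any particular SDK; the required field list is an assumption you would tailor to your own schema:

```python
# Fields every call must carry for cost attribution (adjust to your schema).
REQUIRED_FIELDS = ("ticket_id", "customer_id", "agent_id")

def validate_metadata(metadata: dict) -> dict:
    """Reject a call whose metadata is missing a required attribution field."""
    missing = [f for f in REQUIRED_FIELDS if not metadata.get(f)]
    if missing:
        raise ValueError(f"unattributable call: missing {', '.join(missing)}")
    return metadata

# A compliant call passes through unchanged; a "test" call without a
# ticket_id is rejected before it ever reaches the provider.
validate_metadata({
    "ticket_id": "SUP-12345",
    "customer_id": "CUST-67890",
    "agent_id": "support-v2",
})
```

Putting this check in the client wrapper that every agent uses means a forgotten ticket_id fails fast in development instead of surfacing months later as unmapped spend.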
Common Pitfalls
Pitfall 1: Shared keys across multiple environments. A development team and a production team share a key to avoid managing separate credentials. Development experiments are now mixed into production cost tracking. Don't do this. Use separate keys for dev and prod.
Pitfall 2: Shared keys for "quick experiments." A team spins up a shared key for a short-term project and forgets to retire it. Six months later, there's still a shared key with unknown usage from unknown sources. Enforce key retirement policies: every key should have an owner and an expiration date.
Pitfall 3: Metadata bloat. A team instruments every API call with metadata, but adds so much metadata (user ID, session ID, request ID, geolocation, device type, etc.) that the overhead exceeds the value. Keep metadata lean: focus on what you need for cost attribution (work item ID, agent ID, customer ID, timestamp).
Pitfall 4: Shared keys because "we use Anthropic for some stuff and OpenAI for other stuff." Using multiple providers is not a reason to share keys. Keep one key per provider per agent: an Anthropic key for support, an OpenAI key for support, an Anthropic key for claims, an OpenAI key for claims. No mixing.
Anti-Patterns: When Key Isolation Goes Wrong
Even with good governance, teams sometimes create problems:
Anti-pattern 1: API key per feature, not per agent. A team spins up a key for "customer support," a key for "chat," a key for "summarization," and a key for "recommendations"—all for the same customer service agent. Now you have 4 keys for 1 agent and your cost is split across them. This defeats the purpose of isolation.
Anti-pattern 2: Shared key that's "supposed to be temporary." A team creates a shared key for a proof-of-concept ("we'll migrate to individual keys later"). The PoC succeeds, moves to production, and three years later you're still using the shared key. By then, migrating to individual keys is harder.
Anti-pattern 3: Metadata that's incomplete. Your logging requires ticket_id and customer_id on every call, but a team occasionally makes calls without them (thinking they're "internal" or "test" calls). Those unmapped costs become noise in your ledger.
Anti-pattern 4: Different metadata standards across teams. One team logs ticket_id, another logs ticket_uuid, another logs ticket_num. Your aggregation layer has to normalize these, which is fragile.
The antidote: establish clear standards upfront. One key per agent. One metadata schema. Code validation that rejects non-compliant calls. Auditing to catch drift.
Governance: Enforcing Key Isolation
To make this work at scale, you need governance:
- Require one key per agent (or team). Make it a policy. New agents = new keys.
- Enforce metadata with code. If your SDK or internal API doesn't require metadata on every call, add a validation layer that rejects calls without it.
- Audit key usage monthly. Are there keys that haven't been used? Keys with sudden spikes? Keys with no metadata? These are signals of problems.
- Publish cost per agent to engineering. Make the cost visible to the teams that own the agents. Teams that see their cost tend to optimize it.
- Retire unused keys. Every key should have an owner and an expiration date. Keys that haven't been used in 30 days should be retired.
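The monthly audit can start as a simple script over per-key usage data. A sketch with made-up fields; the real shape depends on your provider's usage export or your own cost ledger:

```python
from datetime import date

# Hypothetical per-key summary; real data would come from your provider's
# usage export or your cost ledger.
keys = [
    {"name": "OPENAI_KEY_SUPPORT", "owner": "support", "last_used": date(2025, 5, 1)},
    {"name": "OPENAI_KEY_POC", "owner": None, "last_used": date(2025, 2, 14)},
]

def audit(keys, today: date, stale_after_days: int = 30):
    """Flag keys that are ownerless or unused past the retirement window."""
    findings = []
    for k in keys:
        if k["owner"] is None:
            findings.append(f"{k['name']}: no owner")
        if (today - k["last_used"]).days > stale_after_days:
            findings.append(f"{k['name']}: unused for >{stale_after_days} days")
    return findings

for finding in audit(keys, today=date(2025, 5, 2)):
    print(finding)
```

Running this on a schedule and posting the findings where the owning teams see them covers the "audit monthly" and "retire unused keys" policies with a few dozen lines of code.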
The shared API key problem isn't really a problem with shared keys (tools are tools). It's a visibility and governance problem. Runrate solves this by enforcing key isolation, metadata instrumentation, and cost ledger aggregation across your entire AI infrastructure.
For more detail on how to implement work-item-level attribution at scale, see the article on allocating AI costs to a customer or return to the pillar article on AI cost attribution.
Want to see this in your stack?
Book a 30-minute walkthrough with a Runrate founder.