You're in a board meeting and someone says "our generative AI spend on inference is being compressed by cheaper tokens per context window, so we need to shift our strategy toward post-training reasoning models." You nod. You have no idea what they said. This glossary is your cheat sheet. Here are the 50 terms every finance leader needs to understand to read an AI budget, evaluate a vendor, or sit through a technical conversation without nodding.
Cost & Billing Terms (1-10)
1. Token — The granular unit of text that AI models process. Roughly 0.75 words per token. You're charged per token consumed.
Why CFOs care: Tokens are how vendors bill you. More tokens in your prompts or longer outputs = higher bills.
2. Prompt — The instruction or question you send to the AI model.
Why CFOs care: Longer prompts cost more. System prompts you reuse (guardrails, context) multiply cost across every transaction.
3. Completion — The AI-generated response to your prompt.
Why CFOs care: Output tokens typically cost 3-5x more than input tokens. Long completions drive up the bill.
4. Context Window — The maximum amount of text a model can see at once. Claude 3.5 Sonnet: 200K tokens. GPT-4: 128K tokens.
Why CFOs care: Larger context windows tempt teams to dump entire documents in for analysis. Unnecessary context = wasted token spend.
5. Cost-Per-1K-Tokens — The vendor's pricing unit. GPT-4o input costs $0.005/1K. Claude 3.5 Sonnet costs $0.003/1K input.
Why CFOs care: This is the only pricing metric your vendor will give you. You need to translate it to cost-per-outcome.
6. Cost-Per-Request — Your actual operational cost per inference or transaction. If you spend $10K/month running 1M inferences, cost-per-request is $0.01.
Why CFOs care: This is the number that matters for your P&L. Token cost is the denominator; business value is the numerator.
7. Cost-Per-Outcome — The true cost of delivering a business result. Cost to resolve one customer support ticket, underwrite one claim, or process one loan application.
Why CFOs care: This is the number that determines whether AI creates margin or erodes it. From token cost to cost-per-outcome is the CFO's job.
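The chain from terms 5-7 is simple arithmetic, and it's worth making explicit. The sketch below walks from vendor token prices to cost-per-request to cost-per-outcome; the prices, token counts, and calls-per-ticket are illustrative assumptions, not real quotes.

```python
# Illustrative translation: vendor token pricing -> cost-per-request -> cost-per-outcome.
# All prices and volumes below are assumptions for the example.

INPUT_PRICE_PER_1K = 0.005   # $/1K input tokens (GPT-4o-class, per term 5)
OUTPUT_PRICE_PER_1K = 0.015  # $/1K output tokens (output costs more, per term 3)

def cost_per_request(input_tokens: int, output_tokens: int) -> float:
    """Cost of a single inference call."""
    return (input_tokens / 1000) * INPUT_PRICE_PER_1K \
         + (output_tokens / 1000) * OUTPUT_PRICE_PER_1K

def cost_per_outcome(requests_per_outcome: int, input_tokens: int,
                     output_tokens: int) -> float:
    """Cost to deliver one business result (e.g., one resolved ticket),
    assuming each outcome takes several model calls."""
    return requests_per_outcome * cost_per_request(input_tokens, output_tokens)

# A support ticket that takes 4 model calls of ~2K tokens in / 500 out each:
print(round(cost_per_request(2000, 500), 4))    # → 0.0175 per call
print(round(cost_per_outcome(4, 2000, 500), 4))  # → 0.07 per ticket
```

Note that the per-call number looks harmless; it's the calls-per-outcome multiplier (and the volume behind it) that decides whether the P&L works.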
8. AI Bill Shock — When your AI bill doubles or triples month-over-month because someone deployed an agent that runs prompts in a loop.
Why CFOs care: It's preventable if you have observability and cost controls. Without them, it's the budget killer of 2026.
9. Shadow AI — AI spend you don't know about. Team members running GPT-4 on ChatGPT. Finance analysts querying Claude. Untracked API keys.
Why CFOs care: MIT NANDA research shows 95% of AI pilots fail to deliver P&L impact. Shadow AI is 40% of enterprise AI spend.
10. Cost Attribution — Assigning the actual cost of an AI inference to the business unit, customer, or work item that caused it.
Why CFOs care: Without it, you can't answer "did this AI really make money?" Runrate solves this.
Technical Architecture Terms (11-20)
11. LLM (Large Language Model) — An AI model trained on billions of words. GPT-4, Claude 3.5 Sonnet, Gemini are LLMs.
Why CFOs care: Most of your AI spend is on LLM inference. Understanding which LLM you're using drives vendor negotiations.
12. Fine-Tuning — Customizing a pre-trained model on your proprietary data. Takes an existing GPT-4 and adjusts it toward your domain.
Why CFOs care: It's a one-time training cost. Usually cheaper than building your own model, but more expensive than vanilla prompting.
13. Embedding — A numerical representation of text, useful for semantic search or storing customer context. Embeddings cost less than running full inference.
Why CFOs care: Embeddings are cheaper than LLM calls but your vector database costs add up at scale.
14. Vector Database — Storage for embeddings, used for retrieval-augmented generation. Pinecone, Weaviate, Qdrant, or vector search in Postgres.
Why CFOs care: Storing millions of embeddings costs money. This is part of the hidden AI Cost Iceberg.
15. RAG (Retrieval-Augmented Generation) — The pattern of retrieving relevant documents, then passing them to the LLM to answer a question.
Why CFOs care: RAG reduces hallucinations but increases cost (vector search + embedding storage + larger context windows). Worth it if it improves accuracy.
16. Hallucination — When the AI makes up plausible-sounding information that's false.
Why CFOs care: Hallucinations require human review. Human review time is expensive. This is the largest hidden cost in the iceberg.
17. Guardrails — Rules or constraints you apply to the model to prevent bad outputs (refusing to answer certain topics, enforcing format).
Why CFOs care: Good guardrails reduce human review time, which is expensive. Bad guardrails miss real errors.
18. AI Gateway — Infrastructure that routes requests to the right model, enforces rate limits, caches responses, and logs usage. Examples: LiteLLM, Portkey; observability layers like Langfuse often sit alongside them.
Why CFOs care: Gateways add cost but prevent bill shock and reduce token waste. ROI is usually positive.
19. Observability — Logging and monitoring what your AI system is doing. Which models, which tokens, which errors, which decisions.
Why CFOs care: Without observability, you don't know where your money is going. This is the foundation of cost attribution.
20. Evaluation — Testing whether your AI system's outputs are accurate, safe, and useful. Benchmark datasets, human scoring, automated metrics.
Why CFOs care: Evaluation is expensive (human time, compute cost, data labeling). It's hidden cost in the iceberg. Skip it and your AI fails silently.
AI Operations Terms (21-30)
21. Agent — An AI system that can take actions autonomously (make API calls, modify data, use tools) rather than just generate text.
Why CFOs care: Agents are more valuable but more expensive because they run in loops and make tool calls.
22. Agentic AI — The operational pattern of deploying agents as business processes rather than chatbots. Agents that adjudicate claims, qualify leads, or triage tickets autonomously.
Why CFOs care: This is where AI margin is made or lost. Agents require cost-per-outcome thinking, not cost-per-token.
23. Agent Runtime — The infrastructure that orchestrates agents, manages state, handles retries, and logs activity. Examples: the Claude Agent SDK, LangGraph, CrewAI.
Why CFOs care: Runtime overhead adds 10-20% to token cost. It's non-negotiable for production agents.
24. MCP (Model Context Protocol) — A standard for connecting models to external tools and data sources. Lets Claude see your Salesforce, run database queries, etc.
Why CFOs care: MCP makes agents more powerful but also more dangerous (more tool calls, more retries, more cost).
25. Tool Use — When an AI calls a third-party API or tool (Stripe charge, Twilio SMS, Salesforce lookup) to complete a task.
Why CFOs care: Tool calls cost real money. A $0.002 AI response that triggers a Stripe API call might cost $0.10 in downstream fees.
26. Function Calling — The ability of an LLM to invoke structured functions. GPT-4's function calling, Claude's tool_use block.
Why CFOs care: Function calling makes agents reliable, but it increases token cost (structured outputs are verbose).
27. Prompt Caching — The vendor's ability to serve repeated prompt prefixes from cache at a discount. OpenAI discounts cached input tokens by 50%; Anthropic's cache reads cost roughly 10% of the regular input rate.
Why CFOs care: With reused system prompts, caching can cut inference cost by 10-40%. This is your first optimization lever.
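To see why caching is the first lever, run the arithmetic on a reused system prompt. The volumes, prompt size, price, and discount below are illustrative assumptions; plug in your own contract numbers.

```python
# Rough monthly savings from prompt caching on a reused system prompt.
# All figures are illustrative assumptions, not vendor quotes.

def caching_savings(requests_per_month: int,
                    system_prompt_tokens: int,
                    input_price_per_1k: float,
                    cached_discount: float) -> float:
    """Savings when the shared system prompt is served from cache.
    cached_discount=0.5 means cached tokens bill at 50% of the normal rate."""
    full = requests_per_month * (system_prompt_tokens / 1000) * input_price_per_1k
    cached = full * cached_discount
    return full - cached

# 1M requests/month, a 3K-token system prompt, $0.005/1K input, 50% discount:
print(caching_savings(1_000_000, 3000, 0.005, 0.5))  # → 7500.0 per month
```

The savings scale with how much of each prompt is boilerplate you repeat on every call, which is why bloated system prompts are both a cost problem and a caching opportunity.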
28. Batch Inference — Running inferences in batches (overnight, off-peak) at a discount, typically 50%. OpenAI's Batch API, Anthropic's Message Batches API.
Why CFOs care: Not all inferences need to be real-time. Batching can cut your bill significantly if your use cases allow delays.
29. Flex Tokens / Provisioned Throughput — Commitment pricing from vendors. Pay monthly for guaranteed capacity at a discount vs. pay-as-you-go.
Why CFOs care: If you have predictable volume, commitment pricing saves 20-40%. It locks in your costs and prevents bill shock.
30. Rate Limit (TPM/RPM) — Tokens-per-minute and requests-per-minute caps enforced by the vendor to prevent abuse. TPM 80K = max 80,000 tokens/minute.
Why CFOs care: Hit rate limits and your system fails. Raising limits costs more. This is part of AI procurement.
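A rate limit translates directly into system capacity, which is the number procurement should negotiate against. A minimal sketch, with the per-request token count assumed for illustration:

```python
# Back-of-envelope capacity check against a vendor TPM cap.
# The tokens-per-request figure is an assumption for the example.

def max_requests_per_minute(tpm_limit: int, avg_tokens_per_request: int) -> int:
    """How many requests per minute fit under the tokens-per-minute cap."""
    return tpm_limit // avg_tokens_per_request

# An 80K TPM limit with ~2.5K tokens per request (prompt + completion):
print(max_requests_per_minute(80_000, 2_500))  # → 32 requests/minute
```

If peak demand exceeds that number, you either pay for a higher tier or your product queues and fails; both are procurement line items, not engineering details.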
AI Finance & Governance Terms (31-40)
31. LLMOps — Operations for large language models. Like DevOps but for AI. Versioning models, managing prompts, handling failures, logging costs.
Why CFOs care: Good LLMOps prevents bill shock. Bad LLMOps loses money.
32. Model Registry — Central place to version and deploy models. Hugging Face Model Hub, vendor-specific registries, internal registries.
Why CFOs care: Using the right model for the right task drives cost efficiency. A model registry enforces this discipline.
33. Semantic Cache — Cache based on meaning rather than exact text match. If you ask the same question two ways, semantic cache recognizes it as a repeat.
Why CFOs care: Semantic caching reduces token waste. Fewer repeated inferences = lower bills.
34. AI Procurement — Evaluating and contracting AI vendors. RFP, pricing negotiation, implementation.
Why CFOs care: Most companies negotiate contracts at a $10K/month minimum. Runrate's field guide covers the conversation.
35. AI POC (Pilot) — Proof-of-concept AI projects. You build a prototype to validate business case before scaling.
Why CFOs care: 95% of AI POCs fail to deliver value (MIT NANDA). The CFO's job is to fund winners and kill losers fast.
36. AI ROI — Return on investment from AI. (Benefit - Cost) / Cost. If AI saves $100K/month and costs $20K/month, ROI is 400%.
Why CFOs care: This is the only metric that matters for the board. If you can't calculate it, your AI project won't get approved.
37. Payback Period — Time to recover the cost of an AI investment. If cost is $100K and savings are $20K/month, payback is 5 months.
Why CFOs care: Payback period under 12 months is usually acceptable. Longer and the board gets nervous.
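The ROI and payback definitions in terms 36-37 reduce to two one-line formulas; the sketch below reproduces the article's own worked numbers.

```python
# ROI and payback-period arithmetic from terms 36 and 37.

def ai_roi(monthly_benefit: float, monthly_cost: float) -> float:
    """(Benefit - Cost) / Cost, expressed as a percentage."""
    return (monthly_benefit - monthly_cost) / monthly_cost * 100

def payback_months(upfront_cost: float, monthly_savings: float) -> float:
    """Months to recover an upfront investment."""
    return upfront_cost / monthly_savings

print(ai_roi(100_000, 20_000))        # → 400.0 (% — matches term 36)
print(payback_months(100_000, 20_000))  # → 5.0 months (matches term 37)
```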
38. AI Unit Economics — Cost and revenue per AI-driven transaction. E.g., cost-per-resolved-ticket, cost-per-underwritten-claim, cost-per-approved-loan.
Why CFOs care: Unit economics determine scalability. If AI resolution costs more than manual, it's a value-destruction project.
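The scalability test in term 38 is a per-unit comparison, and it only works if the AI figure is fully loaded — including the escalations that fall back to humans. A sketch with illustrative assumptions (every dollar amount and the escalation rate are made up for the example):

```python
# Unit-economics check: is AI resolution actually cheaper than manual?
# All figures are illustrative assumptions, not benchmarks.

def ai_cost_per_ticket(inference_cost: float, escalation_rate: float,
                       human_review_cost: float) -> float:
    """Blended cost per ticket: every ticket pays inference; a fraction
    escalates to a human and also pays the review cost."""
    return inference_cost + escalation_rate * human_review_cost

ai = ai_cost_per_ticket(0.07, 0.15, 6.00)  # $0.07 inference, 15% escalate at $6
manual = 4.50                              # fully manual handling cost
print(round(ai, 2), ai < manual)           # → 0.97 True
```

The escalation term is where most optimistic business cases go wrong: quote the $0.07 and omit the $0.90 of blended human review, and the unit economics look 13x better than they are.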
39. FinOps for AI — Applying FinOps principles (Cost Optimization, Business Alignment, Governance) to AI spend.
Why CFOs care: FinOps Foundation now covers AI. Aligning with FinOps standards is the path to governance maturity.
40. AI Cost Attribution — Assigning the cost of each AI inference to the business unit, team, or customer that caused it. This is Runrate's core.
Why CFOs care: Without attribution, you can't optimize. With attribution, you can identify money-losing workflows and fix them.
AI Finance Impact Terms (41-50)
41. EBITDA — Earnings before interest, tax, depreciation, amortization. Your operating profit. AI costs hit EBITDA directly.
Why CFOs care: High EBITDA margins are valued by acquirers and PE. If AI erodes EBITDA, you've got a problem.
42. AI Line Item — Budget code for "generative AI spend." Most CFOs track it as a separate P&L line for transparency.
Why CFOs care: You need to budget it separately so you can defend the spend to the board.
43. AI Margin Compression — When AI adoption drives costs faster than revenue growth, squeezing operating margin. "We deployed AI agents and gross margin dropped 3%."
Why CFOs care: This is the biggest AI risk in 2026. McKinsey State of AI 2025 finds only 5.5% of orgs are "AI high performers" with EBIT impact.
44. AI Chargeback — Billing internal business units for the AI infrastructure they use. Finance team building a claims bot pays a "chargeback" for inference cost.
Why CFOs care: Chargebacks align incentives. Teams that feel the cost make cheaper, more efficient workflows.
45. AI Gross Margin — Revenue minus cost of goods sold, where COGS includes AI cost. If you serve customers via AI, the AI cost is part of COGS.
Why CFOs care: Gross margin is the metric that kills or validates AI projects. If AI-driven gross margin is negative, the project is a long-term loss.
46. Total Cost of Ownership (TCO) — Not just token cost, but the full cost of operating an AI system: infrastructure, observability, training, governance, human review, vendor fees.
Why CFOs care: Most CFOs underestimate TCO by 300%. This is the AI Cost Iceberg in spreadsheet form.
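The iceberg in spreadsheet form is just a sum over cost categories. The categories below follow the TCO definition in term 46; every dollar amount is an illustrative assumption, not a benchmark.

```python
# TCO sketch: the visible token bill is one line among many.
# Categories follow term 46; amounts are illustrative assumptions.

monthly_tco = {
    "api_tokens": 10_000,        # the visible line item
    "vector_db_storage": 1_500,
    "observability": 2_000,
    "retries_and_errors": 1_200,
    "evaluation": 2_500,
    "human_review": 12_000,
    "engineering_time": 8_000,
}

total = sum(monthly_tco.values())
visible_share = monthly_tco["api_tokens"] / total
print(total)                   # → 37200
print(f"{visible_share:.0%}")  # → 27%
```

In this sketch the token bill is barely a quarter of true monthly cost — which is the mechanism behind the underestimation the article describes.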
47. AI Margin Leverage — The multiplier effect of automating high-margin workflows. If an agent handles 100x a human's volume at half the per-unit cost, margin scales with volume.
Why CFOs care: This is where AI creates real value. Find workflows with high leverage and the CFO's case is made.
48. AI Spend Velocity — How fast AI spend is growing month-over-month. "Our AI spend is growing 25% MoM and we have no controls."
Why CFOs care: Velocity without governance is how you get bill shock. You need cost controls and attribution before velocity.
49. Maturity Model — Runrate's 5-stage framework: Invisible → Tracked → Allocated → Optimized → Governed. Most enterprises are at stage 1-2.
Why CFOs care: Knowing your stage tells you what controls you need to implement next.
50. AI Cost as Payroll Equivalent — Runrate's principle: treat AI agents like employees. Each has a timecard (work done), attribution (P&L line), cost structure (per-use vs. salary-equivalent), and retirement trigger.
Why CFOs care: This reframes AI from "cool technology" to "operational asset" — which is how the board thinks about every other headcount line.
These 50 terms are your minimum viable vocabulary. Bookmark this page. When someone says "we need to optimize our agent orchestration to reduce prompt caching misses," you now know to ask: "Is this reducing our cost per outcome or just moving costs around?"
The CFO's job is to translate vendor jargon into business impact. Use these definitions to start that translation. If you're building the CFO's case for AI cost attribution, the 40-page CFO Field Guide to AI Costs walks through the line-item model and the board-deck talking points.
Go deeper with the field guide.
A step-by-step PDF for implementing AI cost attribution.