AI Agent Cost Calculator

Scenario	Cost per item	Monthly cost	Change vs base
Base case	$0.305	$15,265	—
Review rate +6 pts (12% → 18%)	$0.410	$20,515	+34.4%
Cheaper model (Claude Haiku 4)	$0.294	$14,704	−3.7%
Third-party calls −30%	$0.299	$14,965	−2.0%

The AI Agent Cost Calculator estimates the fully loaded cost of running a specific AI agent in production: not just the API bill, but also human review overhead, third-party API calls, infrastructure, and operational expenses.

What the calculator inputs are

Agent configuration:

Model choice (Claude Sonnet 3.5, GPT-4o, Gemini 2.5, etc.)
Average input tokens per query: e.g., 2,500
Average output tokens per query: e.g., 750
Queries per month: e.g., 50,000 (tickets, claims, applications)

Operational parameters:

Human review rate (%): e.g., 12% of work items need human verification
Human review time (minutes): e.g., 3 minutes per reviewed item
Human review cost (hourly): e.g., $35/hour (your actual cost for senior staff to review)
Failure/retry rate (%): e.g., 2% of queries need retries
Third-party API calls per query: e.g., average of 0.5 Stripe/Twilio/Salesforce calls per work item

Infrastructure and operations:

Vector database storage (annual): e.g., $8,000/year for semantic search
Logging, observability, and monitoring (annual): e.g., $6,000/year
Prompt engineering and optimization (annual): e.g., $12,000/year
On-call engineering time (annual): e.g., $10,000/year

Calculator outputs:

API cost (inference only): queries × (input tokens × input price + output tokens × output price) ÷ 1,000,000 = $/month, with token counts per query and prices per million tokens
Human review cost: (queries × human review %) × (review time ÷ 60) × hourly cost = $/month
Retry overhead: (queries × retry rate) × (API cost per query) = $/month
Third-party API cost: queries × avg calls per query × cost per call = $/month
Infrastructure monthly allocation: (annual infrastructure cost) ÷ 12 = $/month
Total monthly cost: sum of all above
Cost per work item: total monthly cost ÷ queries
Annual cost: total monthly cost × 12
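
The output formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual implementation: the `AgentInputs` class, its field names, and the `monthly_costs` function are hypothetical names chosen for this example, and prices are assumed to be quoted per million tokens.

```python
from dataclasses import dataclass


@dataclass
class AgentInputs:
    # All field names are illustrative, not taken from the calculator itself.
    queries_per_month: int
    input_tokens: int               # average input tokens per query
    output_tokens: int              # average output tokens per query
    input_price_per_m: float        # $ per million input tokens
    output_price_per_m: float       # $ per million output tokens
    review_rate: float              # fraction of items reviewed, e.g. 0.12
    review_minutes: float           # minutes per reviewed item
    review_hourly_cost: float       # $ per reviewer hour
    retry_rate: float               # fraction of queries retried, e.g. 0.02
    third_party_calls: float        # average third-party calls per query
    third_party_cost_per_call: float
    annual_infrastructure: float    # sum of all annual infrastructure lines


def monthly_costs(a: AgentInputs) -> dict[str, float]:
    """Return each monthly line item plus the total and the per-item cost."""
    api_per_query = (
        a.input_tokens * a.input_price_per_m
        + a.output_tokens * a.output_price_per_m
    ) / 1_000_000
    costs = {
        "api": a.queries_per_month * api_per_query,
        "human_review": a.queries_per_month * a.review_rate
        * (a.review_minutes / 60) * a.review_hourly_cost,
        "retry": a.queries_per_month * a.retry_rate * api_per_query,
        "third_party": a.queries_per_month * a.third_party_calls
        * a.third_party_cost_per_call,
        "infrastructure": a.annual_infrastructure / 12,
    }
    # The sum is evaluated before "total" is added, so it covers the five items.
    costs["total"] = sum(costs.values())
    costs["per_item"] = costs["total"] / a.queries_per_month
    return costs


# Inputs from the customer service walkthrough in this article.
csr = AgentInputs(
    queries_per_month=50_000, input_tokens=2_000, output_tokens=600,
    input_price_per_m=3.0, output_price_per_m=15.0,
    review_rate=0.12, review_minutes=3, review_hourly_cost=35,
    retry_rate=0.02, third_party_calls=0.2, third_party_cost_per_call=0.10,
    annual_infrastructure=8_000 + 6_000 + 12_000 + 10_000,
)
result = monthly_costs(csr)
for name, value in result.items():
    print(f"{name}: {value:,.3f}")
```

Keeping every line item in one dictionary makes it easy to see which component dominates, and annual cost is simply `result["total"] * 12`.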

Example walkthrough: Customer service agent

Agent configuration:

Model: Claude Sonnet 3.5 ($3 per million input tokens, $15 per million output tokens)
Input tokens per ticket: 2,000 (customer message + context)
Output tokens per ticket: 600 (agent response)
Tickets per month: 50,000

API cost calculation:

Input: (50,000 × 2,000) ÷ 1,000,000 × $3 = $300/month
Output: (50,000 × 600) ÷ 1,000,000 × $15 = $450/month
Total API cost: $750/month

Operational parameters:

Human review rate: 12% (6,000 tickets/month)
Review time: 3 minutes per ticket
Review hourly cost: $35/hour (senior CSR doing QA)
Human review hours: 6,000 × 3 ÷ 60 = 300 hours/month
Human review cost: 300 × $35 = $10,500/month

Failure/retry:

Retry rate: 2% of queries
Cost per query: $750 ÷ 50,000 = $0.015
Retry cost: 1,000 × $0.015 = $15/month

Third-party APIs:

Average 0.2 Stripe verification calls per ticket (once per 5 tickets)
Stripe verification cost: ~$0.10 per call
Third-party cost: 50,000 × 0.2 × $0.10 = $1,000/month

Infrastructure (monthly allocation):

Vector database: $8,000 ÷ 12 = $667/month
Logging and observability: $6,000 ÷ 12 = $500/month
Prompt engineering: $12,000 ÷ 12 = $1,000/month
On-call engineering: $10,000 ÷ 12 = $833/month
Total infrastructure: $3,000/month

Total monthly cost: $750 (API) + $10,500 (human review) + $15 (retry) + $1,000 (third-party) + $3,000 (infrastructure) = $15,265/month

Cost per work item: $15,265 ÷ 50,000 = $0.305 per ticket

Annual cost: $15,265 × 12 = $183,180/year (for one agent handling 600,000 tickets/year)

Example 2: Claims adjudication agent

Agent configuration:

Model: Claude Opus 4.6 ($15 per million input tokens, $75 per million output tokens)
Input tokens per claim: 8,000 (claim details, policy, history, supporting docs)
Output tokens per claim: 1,500 (decision and justification)
Claims per month: 8,000

API cost calculation:

Input: (8,000 × 8,000) ÷ 1,000,000 × $15 = $960/month
Output: (8,000 × 1,500) ÷ 1,000,000 × $75 = $900/month
Total API cost: $1,860/month

Operational parameters:

Human review rate: 25% (2,000 claims/month, higher complexity = higher review)
Review time: 5 minutes per claim (more complex than support)
Review hourly cost: $50/hour (experienced adjudicator for QA)
Human review hours: 2,000 × 5 ÷ 60 ≈ 167 hours/month (166.7 before rounding)
Human review cost: 167 × $50 = $8,350/month

Failure/retry:

Retry rate: 3% (slightly higher for complex claims)
Cost per claim: $1,860 ÷ 8,000 = $0.233
Retry cost: 240 × $0.233 = $56/month

Third-party APIs:

1.5 verification calls per claim (MVR, medical records, policy database)
Average cost: $3 per verification
Third-party cost: 8,000 × 1.5 × $3 = $36,000/month

Infrastructure:

Vector database (larger): $15,000 ÷ 12 = $1,250/month
Compliance logging: $10,000 ÷ 12 = $833/month
Prompt engineering: $15,000 ÷ 12 = $1,250/month
On-call engineering: $12,000 ÷ 12 = $1,000/month
Total infrastructure: $4,333/month

Total monthly cost: $1,860 (API) + $8,350 (human review) + $56 (retry) + $36,000 (third-party) + $4,333 (infrastructure) = $50,599/month

Cost per work item: $50,599 ÷ 8,000 = $6.32 per claim

Annual cost: $50,599 × 12 = $607,188/year (for one agent handling 96,000 claims/year)

Why the calculator matters

The calculator reveals the true cost structure of an agent. Most teams see the API bill ($750/month in the first example, $1,860 in the second) and think that's the cost. In reality:

In the CSR example: API is only 5% of total cost. Human review alone is about 69%, and everything other than the API bill adds up to 95%.
In the claims example: API is 3.7% of total cost. Third-party APIs alone are about 71%, and together with human review they make up about 88% of total cost.

This is the AI Cost Iceberg in action. The visible API spend (the tip) is misleading; the hidden costs (infrastructure, human review, third-party APIs) are where most of the money goes.
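
To check these proportions for your own agent, a short script along the following lines can rank each line item by its share of the total. It is only a sketch: `cost_shares` is a hypothetical helper name, and the dollar figures are the monthly line items from the two walkthroughs.

```python
def cost_shares(line_items: dict[str, float]) -> dict[str, float]:
    """Return each line item's share of the total, as a percentage."""
    total = sum(line_items.values())
    return {name: 100 * cost / total for name, cost in line_items.items()}


# Monthly line items copied from the two walkthroughs.
csr = {"api": 750, "human_review": 10_500, "retry": 15,
       "third_party": 1_000, "infrastructure": 3_000}
claims = {"api": 1_860, "human_review": 8_350, "retry": 56,
          "third_party": 36_000, "infrastructure": 4_333}

for label, items in (("CSR", csr), ("Claims", claims)):
    shares = cost_shares(items)
    # Largest share first, so the dominant cost driver is obvious.
    ranked = sorted(shares.items(), key=lambda kv: kv[1], reverse=True)
    print(label + ": " + ", ".join(f"{name} {pct:.1f}%" for name, pct in ranked))
```

Sorting by share puts the dominant cost driver first, which is usually the first lever worth investigating.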

How to benchmark your results

CSR/support agent:

Industry range: $0.40–$1.20 per ticket (fully loaded)
If you're above $1.20, your human review rate or infrastructure overhead is high. Investigate.
If you're below $0.40, verify that your human review rate isn't too low (accuracy might be suffering).

Claims adjudication agent:

Industry range: $2.00–$4.50 per claim (fully loaded)
If you're above $4.50, you have too many third-party API calls or too much human review. Consider process redesign.
If you're below $2.00, you might be under-reviewing (compliance risk).

Loan origination agent:

Industry range: $1.50–$3.50 per application (fully loaded)
If you're above $3.50, your KYC verification overhead is high. Consider batching or better tooling.

Back-office workflow (general admin, data entry):

Industry range: $0.80–$2.00 per unit
This is more variable depending on what "unit" means. Calibrate to your specific workflow.
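
The benchmark ranges above can be turned into a simple lookup. The following is a rough sketch rather than part of the calculator: the dictionary keys and the `benchmark_position` function are hypothetical names, and the ranges are the ones quoted in this section.

```python
# Fully loaded cost-per-item ranges quoted in this section.
# The keys are illustrative labels, not official category names.
BENCHMARKS = {
    "support": (0.40, 1.20),
    "claims": (2.00, 4.50),
    "loan_origination": (1.50, 3.50),
    "back_office": (0.80, 2.00),
}


def benchmark_position(agent_type: str, cost_per_item: float) -> str:
    """Classify a cost per item as 'below', 'within', or 'above' its range."""
    low, high = BENCHMARKS[agent_type]
    if cost_per_item < low:
        return "below"
    if cost_per_item > high:
        return "above"
    return "within"


print(benchmark_position("support", 0.305))  # below
print(benchmark_position("claims", 6.32))    # above
```

Both worked examples fall outside their ranges: the support agent comes in below the range, so its review rate deserves a second look, while the claims agent lands above it, driven mainly by third-party verification calls.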

Sensitivity analysis in the calculator

The calculator also shows sensitivity on key assumptions:

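If the human review rate rises by 6 percentage points:

CSR example (12% → 18%): review cost rises from $10,500 to $15,750/month, so cost per ticket goes from $0.305 to $0.410 (+34%). That sits at the low end of the $0.40–$1.20 industry range.
Claims example (25% → 31%): review cost rises from $8,350 to about $10,350/month, so cost per claim goes from $6.32 to about $6.57 (+4%). Review is a small lever here because third-party calls dominate the total.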

If you upgrade to Claude Opus 4.6 (more capable, higher cost):

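CSR example: at the Opus prices used in the claims example ($15 and $75 per million input and output tokens), API cost goes from $0.015 to $0.075 per ticket. Including retries, net cost per ticket goes from $0.305 to about $0.37 (+20%). Worth it only if the accuracy gain reduces human review enough to offset the increase.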

If you can reduce third-party API calls by 30%:

Claims example: third-party cost drops from $36K to $25,200. Total cost drops from $50,599 to $39,799. Cost per claim drops from $6.32 to $4.97 (−21%). Major win.

The calculator helps you understand which levers matter most for cost optimization.
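
A sensitivity check like this amounts to recomputing the cost per item with one input overridden. The sketch below assumes a simplified cost function; `cost_per_item`, `sensitivity`, and the dictionary keys are hypothetical names, and the inputs are those of the customer service walkthrough.

```python
def cost_per_item(p: dict[str, float]) -> float:
    """Fully loaded monthly cost divided by monthly queries."""
    q = p["queries"]
    api = q * p["api_cost_per_query"]
    review = q * p["review_rate"] * p["review_minutes"] / 60 * p["review_hourly"]
    retry = q * p["retry_rate"] * p["api_cost_per_query"]
    third_party = q * p["calls_per_query"] * p["cost_per_call"]
    return (api + review + retry + third_party + p["infra_monthly"]) / q


def sensitivity(base: dict[str, float],
                changes: dict[str, float]) -> tuple[float, float]:
    """Return (new cost per item, percent change vs base) after overrides."""
    before = cost_per_item(base)
    after = cost_per_item({**base, **changes})
    return after, 100 * (after - before) / before


# Customer service walkthrough inputs.
csr = {"queries": 50_000, "api_cost_per_query": 0.015, "review_rate": 0.12,
       "review_minutes": 3, "review_hourly": 35, "retry_rate": 0.02,
       "calls_per_query": 0.2, "cost_per_call": 0.10, "infra_monthly": 3_000}

scenarios = {
    "review rate 12% -> 18%": {"review_rate": 0.18},
    "third-party calls -30%": {"calls_per_query": 0.2 * 0.7},
}
for name, changes in scenarios.items():
    new_cost, pct = sensitivity(csr, changes)
    print(f"{name}: ${new_cost:.3f} per ticket ({pct:+.1f}%)")
```

With these inputs, the two scenarios line up with the rows of the scenario table at the top of this article: roughly $0.41 per ticket (+34.4%) for the higher review rate and $0.299 (about −2%) for fewer third-party calls.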

What to do next

Input your own agent configuration. What's your model choice? What's your actual human review rate? What's the true cost of third-party verification in your domain? Once you have a realistic cost-per-outcome number, you can benchmark against peers, set improvement targets, and evaluate whether the agent deployment is profitable.

When you're ready to see what work-item-level AI cost attribution looks like in your stack, talk to Runrate for a 15-minute demo.
