An SLO is a service-level objective. For cloud infrastructure, it means "uptime of 99.9%" or "latency under 200ms." For AI agents, it means "cost per unit of work of $X ± Y%."

An SLO is not a target; it's a commitment. Teams hit targets when they try hard. They defend SLOs because they're accountable. The difference matters. Once you have an SLO, anomalies stop being "interesting data points" and start being violations. Violations trigger investigation. Investigation drives optimization. This is why organizations with AI cost SLOs have 30-40% better cost discipline than organizations without them. Most mid-market companies don't have SLOs yet. They should.
What an AI cost SLO looks like
Format: "Agent X runs at $Y per unit of work, with an acceptable variance range of ± Z%."
Examples:
- Support escalation agent: $0.40 per ticket ± 15% (acceptable range: $0.34-$0.46)
- Claims processor: $5.00 per claim ± 12% (acceptable range: $4.40-$5.60)
- Lead qualification: $12 per lead ± 20% (acceptable range: $9.60-$14.40)
The SLO has three components:
1. The target ($Y per unit). This is based on: your baseline cost per unit (what you're actually running at today), your margin model (what the business can afford), and benchmarks (what competitors or adjacent use cases achieve).
2. The variance band (± Z%). This is your acceptable range. It accounts for normal variation (volume changes, seasonal spikes, input variability) without triggering false alarms. 10-15% is typical for mature agents. 15-25% is typical for new agents (more volatility while they're ramping). 5% is too tight (you'll have constant alerts). 30%+ is too loose (you'll miss real problems).
3. The measurement window. When do you measure? Daily, weekly, or monthly? Daily is too granular (one bad day doesn't mean the SLO is violated). Weekly is reasonable (gives you a trend). Monthly is too slow (you miss course-correction opportunities).
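The three components above can be sketched as a small data structure. This is a minimal illustration, not a real library API; `CostSLO` and its method names are assumptions.

```python
# A minimal sketch of the three SLO components as a data structure.
# `CostSLO` and its methods are illustrative names, not a real library API.
from dataclasses import dataclass

@dataclass
class CostSLO:
    target: float           # $Y per unit of work
    variance_pct: float     # Z: the +/- band, as a percentage
    window: str = "weekly"  # measurement window

    @property
    def lower(self) -> float:
        return self.target * (1 - self.variance_pct / 100)

    @property
    def upper(self) -> float:
        return self.target * (1 + self.variance_pct / 100)

    def within(self, cost_per_unit: float) -> bool:
        return self.lower <= cost_per_unit <= self.upper

# The support escalation agent from the examples: $0.40 per ticket +/- 15%
slo = CostSLO(target=0.40, variance_pct=15)
print(f"{slo.lower:.2f}-{slo.upper:.2f}")  # 0.34-0.46
print(slo.within(0.44))                    # True
```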
How to set an SLO: A worked example
You have a support escalation agent. Let's set its SLO.
Step 1: Measure current cost per ticket. Over the last 60 days, your agent processed 45,000 tickets. Total cost was $18,000 (including API cost, vector DB, observability, and allocations). Cost per ticket: $18,000 / 45,000 = $0.40.
Step 2: Establish your margin model. How much is the business willing to spend to resolve a support ticket? You know:
- Manual baseline: a fully loaded human support agent costs $75k/year (salary, benefits, training, and overhead) and processes 50 tickets/day * 250 working days = 12,500 tickets/year. Cost per ticket: $75k / 12.5k = $6 per ticket.
- Your AI agent at $0.40/ticket is saving $5.60 per ticket.
- You need to keep at least 50% of that margin: $2.80 saved per ticket = target cost of at most $3.20/ticket.
- So your business can afford AI agents up to $3.20/ticket. You're running at $0.40. You have buffer.
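The margin model in steps 1-2 reduces to one short calculation. A sketch using the worked example's figures; the function name and parameters are illustrative.

```python
# Sketch of the margin model in steps 1-2. Figures come from the worked
# example; the function name and parameters are illustrative.
def affordable_ceiling(manual_cost_per_unit: float,
                       current_ai_cost: float,
                       margin_retention: float) -> float:
    """Highest cost per unit the business can afford while keeping
    `margin_retention` of today's savings vs. the manual baseline."""
    savings = manual_cost_per_unit - current_ai_cost   # $6.00 - $0.40 = $5.60
    must_keep = margin_retention * savings             # 50% of $5.60 = $2.80
    return manual_cost_per_unit - must_keep            # $6.00 - $2.80 = $3.20

manual = 75_000 / 12_500  # $6 per ticket, manual baseline
print(f"{affordable_ceiling(manual, 0.40, 0.50):.2f}")  # 3.20
```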
Step 3: Set the target based on economics and benchmarks. Your $0.40 baseline is solid, but you want to leave room for scale and model improvements. Set your target at $0.42/ticket. (Slightly higher than baseline, but you have room to increase if volume spikes or costs change.) Your margin savings is $75k / 12.5k - $0.42 = $5.58 per ticket. That's 93% margin retention. Good.
Step 4: Set the variance band. Support agents have stable load (can forecast volume pretty well) but can see spikes during peak season or after big marketing campaigns. Set the band at ± 15%: acceptable range is $0.357 to $0.483 per ticket.
Step 5: Define the measurement window and escalation. Measure weekly (every Monday, compare the prior week's cost per ticket against the SLO). If one day is above the range, no alarm. If the week average is above the range, flag it. If it's above the range for two weeks, escalate to the engineering team and CFO.
SLO: Support agent costs $0.42 ± 15% per ticket, measured weekly. Escalate after 2 weeks above SLO.
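The measurement-and-escalation loop from step 5 might look like this sketch. The weekly cost figures are hypothetical; function and variable names are illustrative.

```python
# Sketch of the weekly SLO check from step 5: flag a week above the band,
# escalate after two consecutive weeks. Weekly figures are hypothetical.
TARGET, BAND = 0.42, 0.15
UPPER = TARGET * (1 + BAND)  # ~$0.483 per ticket

def should_escalate(weekly_cost_per_ticket: list[float]) -> bool:
    """True if the weekly average exceeded the SLO two weeks in a row."""
    consecutive = 0
    for week_avg in weekly_cost_per_ticket:
        consecutive = consecutive + 1 if week_avg > UPPER else 0
        if consecutive >= 2:
            return True
    return False

print(should_escalate([0.41, 0.50, 0.44]))        # False: one bad week, flag only
print(should_escalate([0.41, 0.50, 0.52, 0.45]))  # True: two weeks in a row
```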
Why variance matters more than the target
The target number ($0.40, $5.00, $12) is less important than the variance band. Here's why:
A conservative team might set a target of $0.50 per ticket (above their actual $0.40) with a ± 10% band: acceptable range $0.45-$0.55. If they hit $0.48, they're below target and feel good. If they hit $0.56, they're above SLO and panic.

An aggressive team might set a target of $0.35 per ticket (below their actual $0.40) with a ± 25% band: acceptable range roughly $0.26-$0.44. If they hit $0.40, they're within SLO but unhappy. If they hit $0.43, they're still within SLO and continue.

Both teams end up tolerating day-to-day costs in the low-to-mid $0.40s, but with different psychology. The conservative team feels pressure to improve below the original target. The aggressive team accepts that $0.40-$0.44 is normal and focuses on optimizing elsewhere.
The lesson: don't set the target too tight. Set it where the team actually performs, then set a variance band that allows normal fluctuation. This prevents alert fatigue and lets teams focus on real problems.
SLOs should be tied to margin, not just cost
Don't set AI cost SLOs in a vacuum. Tie them to the business case.
Example: You're launching a new AI lead-qualification agent. The business case: "this agent will score 3,000 leads/year at $12/lead (estimated cost, so $36k/year), saving 1 FTE in lead qualification ($80k/year), netting $44k/year in margin improvement." The cost SLO should be: "$12 per lead ± 20%."
If the agent runs at $14.40 (the top of the SLO range), you're still netting $36.8k/year in margin. Acceptable. If it runs at $15.60 (30% over target), you're netting $33.2k/year. Still acceptable. If it runs at $20, you're netting $20k/year, and you need to ask: does the margin still make sense, or should we kill it?
By tying SLOs to margin, you avoid the trap of optimizing for cost when the real issue is business case. Some agents should cost more if they're solving expensive problems.
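One way to encode "tie the SLO to margin" is to check net margin rather than raw cost per unit. A sketch; the annual lead volume and FTE-savings figures below are illustrative assumptions.

```python
# Sketch of a margin-tied SLO check. The lead volume and FTE savings
# used below are illustrative assumptions, not real benchmarks.
def net_annual_margin(cost_per_lead: float,
                      leads_per_year: float,
                      fte_savings: float) -> float:
    """Annual margin improvement: FTE savings minus total agent spend."""
    return fte_savings - cost_per_lead * leads_per_year

# Assume 3,000 leads/year against an $80k/year FTE saving:
for cost in (12.00, 14.40, 20.00):
    print(f"${cost:.2f}/lead -> ${net_annual_margin(cost, 3_000, 80_000):,.0f}/yr")
# $12.00 -> $44,000/yr; $14.40 -> $36,800/yr; $20.00 -> $20,000/yr
```

The decision then becomes "is $20k/year of margin still worth running this agent?" rather than "is $20/lead too expensive?", which is the right question.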
Three escalation patterns
Pattern 1: Exceeding SLO for one week. Yellow alert. Investigate. Is it a known business event (peak season, new customer, data quality issue)? Is it an infrastructure issue (cache miss, vector DB spike)? Usually the answer is known and expected. Document and move on.
Pattern 2: Exceeding SLO for 2+ weeks. Orange alert. Escalate to the team and CFO. Root cause analysis required. Possible corrective actions:
- Adjust the SLO (was it too aggressive?)
- Optimize the agent (tuning, model change, prompt adjustment)
- Accept the new baseline (circumstances have changed)
Pattern 3: Exceeding SLO by 50% or trending worse. Red alert. Pause the agent immediately. Investigate before restarting.
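The three patterns collapse into a small classifier. A sketch; the thresholds mirror the patterns above, and the function name and the reading of "exceeding by 50%" (latest weekly cost at 1.5x the band's upper edge) are assumptions.

```python
# Sketch mapping the three escalation patterns to alert levels.
# Thresholds mirror the patterns above; names are illustrative.
def alert_level(weeks_over_slo: int, cost_vs_upper_ratio: float) -> str:
    """weeks_over_slo: consecutive weeks the weekly average exceeded the band.
    cost_vs_upper_ratio: latest weekly cost divided by the band's upper edge."""
    if cost_vs_upper_ratio >= 1.5:  # pattern 3: 50%+ over -> pause the agent
        return "red"
    if weeks_over_slo >= 2:         # pattern 2: 2+ weeks -> escalate, do RCA
        return "orange"
    if weeks_over_slo == 1:         # pattern 1: investigate and document
        return "yellow"
    return "ok"

print(alert_level(1, 1.10))  # yellow
print(alert_level(2, 1.20))  # orange
print(alert_level(1, 1.60))  # red
```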
How SLOs create accountability
Here's the magic of SLOs: they create shared language between product, engineering, finance, and the CFO.
Without SLOs: "The support agent is expensive this month." Vague. Who owns it? What's the plan?
With SLOs: "Support agent exceeded SLO twice this quarter. Week of March 3, cost per ticket was $0.49 (SLO: $0.42 ± 15%, upper bound $0.483). Root cause: new customer with high-complexity documents. Addressed by implementing document classification. Week of April 7, cost per ticket was $0.51 (SLO: $0.42 ± 15%). Root cause: retry storm due to API rate limiting. Fixed by implementing exponential backoff." This is clear, actionable, and measurable.
Teams that have SLOs hit them 80% of the time. Teams that don't have SLOs drift. Set the SLO, measure weekly, escalate violations, and watch cost discipline improve.
Moving from stage 3 to stage 4 of the maturity curve
The 5-Stage AI Cost Maturity Curve has a big jump between stage 3 (Allocated) and stage 4 (Optimized). The difference is SLOs. With SLOs, cost becomes something the team actively manages, not something finance passively tracks. Ready to implement cost SLOs for your AI agents? Book a demo to see how Runrate automates SLO tracking and escalation.
Want to see this in your stack?
Book a 30-minute walkthrough with a Runrate founder.