Runrate Framework
The AI Cost Iceberg
Visible API spend (10%) vs. hidden inference, storage, observability, retries, and human review (90%).
Read the full framework →

When a portfolio company's operations team proposes adopting an AI vendor—whether it's a customer service platform, a claims processing system, or a workflow automation tool—your instinct as an operating partner is correct: slow down and ask hard questions. But which questions matter? The vendors will demo beautiful interfaces and promise efficiency gains. What you need is a structured evaluation framework that separates marketing from operational reality.
This article walks through a vendor-neutral rubric for evaluating AI vendor proposals across five dimensions: cost transparency, attribution capability, lock-in risk, integration depth, and vendor stability.
The Five Evaluation Dimensions
Dimension 1: Cost Transparency (Weight: 25%)
The vendor must be able to tell you exactly what you're paying for and why. This is where most vendors fail.
Ask the vendor:
- "What is your all-in monthly cost for processing 50,000 claims per month?" Demand a detailed breakdown: API inference costs, retries and error handling, observability and logging, human-in-the-loop review costs, data storage, and any platform or setup fees.
- "How does your pricing scale?" Does cost per unit of work go down with volume (good), stay flat (acceptable), or go up (red flag)?
- "Can you give me three reference customers and their monthly cost for a similar workflow?" A vendor confident in their economics will provide this. If they hedge ("it depends too much on your specific setup"), they are hiding something.
- "Are there any hidden costs I should anticipate?" Ask about retry penalties (if the AI fails, does it retry and incur additional cost?), vector database storage for long-running workflows, fine-tuning datasets, or compliance and audit logging overhead.
Scoring rubric:
- 5/5: Vendor provides an itemized cost breakdown and reference customer costs, and is honest about how pricing scales. Cost is transparent and benchmarked.
- 4/5: Vendor provides a breakdown and reference costs but hedges somewhat on how pricing scales.
- 3/5: Vendor provides a ballpark estimate but is unable or unwilling to share specific reference data.
- 2/5: Vendor quotes "starts at $X per month" with vague details.
- 1/5: Vendor claims "custom pricing, let's talk to sales."
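To pressure-test a quote, rebuild the number yourself. Here is a minimal sketch of an all-in cost-per-claim model in Python; the line items mirror the breakdown above, and every dollar figure is an illustrative assumption to replace with the vendor's actual quote.

```python
# Illustrative all-in cost-per-claim model. All dollar figures are
# assumptions for the sake of the example -- substitute the vendor's
# actual quoted line items.

MONTHLY_VOLUME = 50_000  # claims per month, from the scenario above

line_items = {
    "api_inference": 0.042,        # per-claim model inference cost
    "retries_and_errors": 0.006,   # expected retry overhead per claim
    "observability_logging": 0.004,
    "human_review": 0.031,         # blended cost of escalated claims
    "data_storage": 0.002,
}
platform_fee = 4_000.00  # flat monthly platform/setup amortization

variable_cost = sum(line_items.values())  # $ per claim
all_in_monthly = variable_cost * MONTHLY_VOLUME + platform_fee
cost_per_claim = all_in_monthly / MONTHLY_VOLUME

print(f"Variable cost per claim: ${variable_cost:.3f}")
print(f"All-in monthly cost:     ${all_in_monthly:,.0f}")
print(f"All-in cost per claim:   ${cost_per_claim:.3f}")
```

If the vendor's quoted per-claim price is far below the variable cost you can reconstruct this way, ask which line item is being absorbed (or deferred) elsewhere.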
Dimension 2: Attribution Capability (Weight: 25%)
Can the vendor tie the cost of running their system back to the specific work outcomes your team cares about? This is the cornerstone of AI cost attribution and the AI Cost Iceberg framework.
Ask the vendor:
- "Can you show me a cost-per-claim or cost-per-ticket dashboard for a customer workflow?" They should be able to demonstrate real data showing exactly how much it costs to process one unit of work.
- "Do you log inference cost, retry cost, and human review cost separately?" If the vendor lumps everything into one line item, you cannot optimize.
- "Can you integrate with our finance system or provide cost data via API?" The vendor should be able to feed cost attribution data into your accounting system daily, not monthly. If they only provide annual invoices, you cannot see anomalies or cost spikes in real time.
- "What happens if a workflow fails and triggers a retry?" Does the retry cost get billed as a separate line item, or is it bundled? You need to know.
Scoring rubric:
- 5/5: Vendor provides real-time cost per outcome dashboards with granular line-item breakdown and API access for integration.
- 4/5: Vendor provides monthly cost-per-outcome reporting with detailed breakdown.
- 3/5: Vendor provides cost visibility but with a 30+ day reporting lag.
- 2/5: Vendor provides aggregated monthly costs with limited breakdown.
- 1/5: Vendor does not offer cost attribution; you must estimate based on API bills.
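To make "granular, per-work-item attribution" concrete, here is a hypothetical record shape a vendor's cost API might return. The schema and field names are illustrative assumptions, not any real vendor's API.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class WorkItemCost:
    """One work item's fully attributed cost.
    Hypothetical schema -- field names are illustrative assumptions."""
    work_item_id: str        # e.g. a claim or ticket ID
    processed_on: date
    inference_usd: float     # model API spend for this item
    retry_usd: float         # cost of failed attempts, billed separately
    human_review_usd: float  # escalation/review labor, if any
    storage_usd: float       # logging, vector store, audit trail

    @property
    def total_usd(self) -> float:
        return (self.inference_usd + self.retry_usd
                + self.human_review_usd + self.storage_usd)

# A vendor scoring 5/5 on attribution can hand you rows like this
# daily, via API, so retry and review costs are never hidden:
item = WorkItemCost("CLM-10482", date(2025, 3, 14), 0.041, 0.012, 0.00, 0.003)
print(f"{item.work_item_id}: ${item.total_usd:.3f} all-in")
```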
Dimension 3: Lock-In Risk (Weight: 20%)
How portable is the AI workflow the vendor is building for you? If you want to switch vendors in 18 months, how painful is that process?
Ask the vendor:
- "Are you running commodity APIs (OpenAI, Anthropic, Google) under the hood, or proprietary models?" Commodity APIs = portable. Proprietary models = locked in.
- "What proprietary training or fine-tuning will you do on our data?" If the vendor invests in training a model on your claims data, you own the risk if you switch vendors. Get clarity on data ownership and portability.
- "What's the onboarding and migration cost to switch platforms?" Honest vendors will say "it's a 4–6 week effort; here's the cost." If they dodge the question, assume high lock-in.
- "Can we access our data and workflow definitions if we leave?" You should be able to export all your training data, validated claim outcomes, and workflow definitions and port them to another vendor.
- "What is the contract termination cost?" Some vendors charge exit fees for early termination. Make sure it's in the contract and reasonable.
Scoring rubric:
- 5/5: Commodity APIs, data fully portable, minimal migration cost, no termination fees.
- 4/5: Mixture of commodity and proprietary components; data mostly portable; migration cost < 2 months of savings.
- 3/5: Some proprietary components; partial data portability; migration would take 2–3 months.
- 2/5: Heavy proprietary components; limited data portability; migration cost is substantial.
- 1/5: Fully proprietary; data trapped; vendor owns the workflow.
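The "migration cost < 2 months of savings" threshold in the rubric is just a breakeven test. A minimal sketch, with every input an assumption you would replace with your own numbers:

```python
# Lock-in breakeven sketch. All inputs are illustrative assumptions.

monthly_savings_usd = 60_000   # what the AI workflow saves per month
migration_cost_usd = 95_000    # one-time cost to switch vendors
termination_fee_usd = 10_000   # early-exit fee, if any (check the contract)

months_to_recover = (migration_cost_usd + termination_fee_usd) / monthly_savings_usd
print(f"Switching pays for itself in {months_to_recover:.1f} months")

# Rough rubric mapping: under ~2 months of savings supports a 4/5 or
# better; several months of savings signals meaningful lock-in.
```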
Dimension 4: Integration Depth (Weight: 15%)
How well does the vendor's system integrate with your existing tech stack? Does it require rework of your workflows, or does it plug in?
Ask the vendor:
- "Can you integrate with our case management system / claims platform / CRM via API?" (Name the specific tools your team uses.)
- "Do you support our authentication standard?" (Single sign-on, LDAP, API keys, OAuth?)
- "What's the implementation timeline?" Estimates under 8 weeks are realistic. Estimates over 12 weeks suggest deep integration work.
- "Do you require custom development on our side?" If yes, budget 2–3 FTEs for 3 months. If the vendor claims they can do it all, ask for a reference customer who had zero custom dev.
Scoring rubric:
- 5/5: Plug-and-play integration with your stack, pre-built connectors for your tools, 4-week implementation.
- 4/5: Good API support, 8-week implementation, minimal custom dev.
- 3/5: Basic API support, 10–12 week implementation, some custom development.
- 2/5: Limited integration, 12+ week timeline, heavy custom development.
- 1/5: Point solution; rework of your workflow required.
Dimension 5: Vendor Stability (Weight: 15%)
Is this vendor going to exist in three years? What is their financial health, governance, and customer base?
Ask the vendor (or research):
- "How many enterprise customers do you have, and what's your total ARR?" A vendor with $50M+ ARR and 500+ customers is safer than a vendor with $5M ARR and 20 customers.
- "Who are your major investors?" Venture-backed is fine; underfunded is risky.
- "What's your historical customer churn rate?" Below 10% annual churn is healthy. Above 15% is a warning sign.
- "Do you have a roadmap for your product, and can you share your engineering investment priorities?" Vendors betting on the right technologies are safer.
- "Have you raised capital in the last 12 months?" If not, and they're still a young company, profitability is uncertain.
Scoring rubric:
- 5/5: $50M+ ARR, 500+ customers, well-funded, sub-10% churn, clear roadmap.
- 4/5: $20M+ ARR, 200+ customers, funded, <12% churn.
- 3/5: $10M+ ARR, 100+ customers, early-stage risk present.
- 2/5: <$10M ARR, <100 customers, high failure risk.
- 1/5: Undisclosed financials, high churn, uncertain future.
Scoring and Decision Framework
Create a weighted scorecard:
| Dimension | Weight | Your Score | Weighted Score |
| --- | --- | --- | --- |
| Cost Transparency | 25% | __ / 5 | __ |
| Attribution Capability | 25% | __ / 5 | __ |
| Lock-In Risk | 20% | __ / 5 | __ |
| Integration Depth | 15% | __ / 5 | __ |
| Vendor Stability | 15% | __ / 5 | __ |
| Total | 100% | | __ / 5 |
Decision rule:
- 4.2–5.0: Recommend proceeding. This is a strong vendor.
- 3.5–4.1: Proceed with caution. You are carrying moderate risk; negotiate hard on lock-in and cost terms.
- 2.5–3.4: Do a deep reference check before proceeding, and consider alternatives.
- Below 2.5: Do not recommend. The vendor has too many weaknesses.
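The scorecard arithmetic is simple enough to automate across every vendor you evaluate. Here is a minimal sketch of the weighting and decision rule above; the example ratings are illustrative.

```python
# Weighted vendor scorecard, mirroring the table and decision rule above.
WEIGHTS = {
    "cost_transparency": 0.25,
    "attribution": 0.25,
    "lock_in_risk": 0.20,
    "integration_depth": 0.15,
    "vendor_stability": 0.15,
}

def weighted_score(scores: dict[str, float]) -> float:
    """scores maps each dimension to a 1-5 rating; returns a total out of 5."""
    return sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS)

def decision(total: float) -> str:
    if total >= 4.2:
        return "Recommend proceeding"
    if total >= 3.5:
        return "Proceed with caution; negotiate lock-in and cost terms"
    if total >= 2.5:
        return "Deep reference check; consider alternatives"
    return "Do not recommend"

# Example vendor (illustrative ratings):
scores = {"cost_transparency": 4, "attribution": 5, "lock_in_risk": 3,
          "integration_depth": 4, "vendor_stability": 3}
total = weighted_score(scores)
print(f"{total:.2f} / 5 -> {decision(total)}")
```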
Red Flags in Every Category
Cost Transparency red flags:
- Vendor will not provide reference customer costs
- Pricing changes based on "your use case" every time you ask
- Significant gap between vendor's quoted cost and actual customer cost
Attribution red flags:
- Vendor does not track cost by work item
- Vendor does not provide cost data more frequently than quarterly
- Cost attribution is opaque or requires custom reporting
Lock-in red flags:
- Vendor claims you cannot export your data
- Training or fine-tuning is mandatory and "improvements cannot be ported"
- Contract has high early termination fees
Integration red flags:
- Vendor requires you to rebuild your workflows in their system
- Implementation timeline is 6+ months
- Vendor points to outlier reference customers with fast implementations
Stability red flags:
- Vendor will not disclose ARR or customer count
- Churn rate is over 15%
- Vendor is heavily dependent on a single large customer
The Negotiation
Once you've scored vendors, negotiate on the three dimensions where you have leverage:
- Cost and volume discounts. Tell the vendor: "If you beat [competitor]'s cost by 15%, we can guarantee $X per month volume for two years."
- Contract terms. Push for annual price lock-in, 90-day termination notice, and data portability guarantees in the contract.
- SLAs and support. If this workflow is business-critical, demand uptime SLAs, response time commitments, and dedicated support.
Most vendors will negotiate on all three. If a vendor refuses to budge on any of them, treat that as a signal in itself: they may not need your business, or they may not believe in their product for the long term.
Beyond the Vendor: Operationalizing the Selection
Finally, after you've selected a vendor, the work is just beginning. Establish a 90-day post-implementation review with the operations team: Did cost per outcome land where we expected? Is the workflow stable? Are there integration issues? The best vendor-selection decisions are validated by operational reality. For a deeper dive on structuring AI vendor governance across your portfolio and connecting vendor selection to exit narratives, see the PE Operating Partner AI Playbook. When you're ready to see how work-item-level cost attribution helps you validate vendor economics against actual spend, request a demo.
Want to see this in your stack?
Book a 30-minute walkthrough with a Runrate founder.