Legal document review is the original "AI will eliminate cost" narrative in enterprise software. A first-year associate reviewing a 10-page contract for key terms takes about 45 minutes, at a cost of roughly $3.75 per page. An AI system (Harvey, Spellbook, LexisNexis AI-Assisted Research, Casetext) that does the same review in 30 seconds costs $0.15–$0.50 per page. The math says 94% cost reduction. The reality is messier, because document review has two phases, mechanical extraction and judgment, and AI handles only the first.
The work-item economics of legal AI
Legal AI cost models vary by use case. The unit of work is typically per-page-reviewed or per-document-reviewed, not per-hour-billed.
Contract review for M&A diligence: An associate reviews a vendor contract (say, a 15-page software license agreement). Task: identify key commercial terms (price, term length, renewal terms), identify risk terms (liability caps, IP indemnification, data security requirements), and flag deviations from the company's standard terms. Time: 60–90 minutes; fully loaded cost: $40–$60 per contract.
AI (Harvey, Spellbook) can extract commercial terms, identify missing standard clauses, and flag known-risky language in 3–5 minutes. Cost: $2–$5 per contract. But—and this is critical—the AI flags risk; a partner still reads the flags and makes the judgment call. The AI doesn't eliminate the human; it shifts the human's work from "read every page slowly" to "review AI flags and argue about interpretation."
Real time saved: 70–80% of the reading time (the mechanical extraction). Real cost saved: 30–50% of the total legal cost (because judgment and sign-off still require partners, who are expensive).
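The gap between time saved and cost saved is simple arithmetic: AI compresses the extraction phase, but the judgment phase still runs at partner rates. Here is a minimal sketch of that blended math; every rate, time, and the per-contract AI cost is an illustrative assumption, not a benchmark.

```python
# Hybrid contract review: AI accelerates extraction; judgment stays human.
# Every figure below is an illustrative assumption, not a benchmark.

def blended_savings(extract_hrs, judge_hrs, extract_rate, judge_rate,
                    ai_speedup, ai_cost):
    """Compare fully manual review to AI-assisted review where only the
    mechanical extraction phase is accelerated."""
    manual = extract_hrs * extract_rate + judge_hrs * judge_rate
    hybrid = ((extract_hrs / ai_speedup) * extract_rate
              + judge_hrs * judge_rate + ai_cost)
    return manual, hybrid, 1 - hybrid / manual

manual, hybrid, saved = blended_savings(
    extract_hrs=1.0,   # associate reading/extraction time per contract
    judge_hrs=0.5,     # partner time reviewing the flags
    extract_rate=40,   # assumed fully loaded associate cost per hour
    judge_rate=150,    # assumed fully loaded partner cost per hour
    ai_speedup=15,     # 60-90 minutes of reading compressed to 3-5
    ai_cost=3.50,      # assumed AI cost per contract
)
print(f"manual ${manual:.2f} -> hybrid ${hybrid:.2f}, {saved:.0%} saved")
```

With these inputs the reading time falls by more than 90%, but total cost falls only about 29%: the partner's half hour dominates the bill.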
Litigation document review: A team reviews 100,000+ documents (emails, contracts, meeting notes) to find documents responsive to discovery. First-level review: young associates at $150–$200/hour reading documents, tagging responsive vs. non-responsive, identifying key issues. Cost per document: $0.50–$2 depending on document length and complexity. AI (Relativity's AI-Assisted Review, Kroll Ontrack, native TAR tools) can tag documents at $0.05–$0.15 per document with 90%+ accuracy.
Here's the gotcha: 90% accuracy on 100,000 documents means 10,000 mislabeled documents. If 1% of the corpus is truly responsive and you mislabel 10% of those 1,000 responsive documents, you've missed 100 of them and potentially violated your discovery obligations. The legal risk of AI error is effectively uninsurable. So most firms run a hybrid process: AI does first-pass tagging, then a subset of AI-tagged documents (maybe 30%) goes to human review as a confidence check. Real cost reduction: 40–60%, not 90%.
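A back-of-the-envelope version of that error math, plus the hybrid review cost, makes the 40–60% figure concrete. The per-document costs and re-check fraction below are illustrative assumptions:

```python
# Discovery review: expected misses at a given accuracy, plus hybrid cost.
# Per-document costs and the re-check fraction are illustrative assumptions.
docs = 100_000
responsive_rate = 0.01      # 1% of the corpus is truly responsive
ai_error_rate = 0.10        # i.e., 90% tagging accuracy

responsive = docs * responsive_rate
print(f"expected mislabeled responsive docs: {responsive * ai_error_rate:.0f}")

# Hybrid: AI first-pass on everything, humans re-check a 30% subset.
ai_cost, human_cost = 0.10, 1.00     # assumed per-document midpoints
recheck_fraction = 0.30
manual_total = docs * human_cost
hybrid_total = docs * ai_cost + docs * recheck_fraction * human_cost
print(f"manual ${manual_total:,.0f} vs hybrid ${hybrid_total:,.0f}, "
      f"{1 - hybrid_total / manual_total:.0%} saved")
```

At these midpoints the hybrid process saves 60%; push the re-check fraction up or the AI's accuracy down and the savings erode toward the bottom of the range.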
Contract generation and revision: AI that drafts employment agreements, NDAs, or sales contracts from a template runs at $0.25–$2 per draft depending on complexity. A junior associate drafting the same contract takes 2–4 hours (cost: $50–$200). The acceleration is real, but the cost avoidance depends on the AI output being good enough to hand straight to senior review, and it often isn't. If the partner spends 30 minutes editing the AI draft instead of 120 minutes writing from scratch, time saved is ~75% but cost saved is ~50%, because the minutes that remain are the expensive, judgment-heavy minutes.
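One way that 75%-versus-50% split arises: drafting from scratch mixes cheap mechanical minutes with expensive judgment minutes, while editing an AI draft is nearly all judgment. A sketch under assumed rates (both rates and the AI draft cost are hypothetical):

```python
# Time saved vs. cost saved in draft-then-edit (all figures assumed).
blended_rate = 100 / 60    # $/min: from-scratch drafting mixes typing and judgment
judgment_rate = 180 / 60   # $/min: editing an AI draft is pure judgment time
ai_draft_cost = 10         # assumed per-draft AI cost

scratch = 120 * blended_rate                 # 120 minutes -> $200
hybrid = 30 * judgment_rate + ai_draft_cost  # 30 minutes + AI -> $100

print(f"time saved: {1 - 30 / 120:.0%}")          # 75%
print(f"cost saved: {1 - hybrid / scratch:.0%}")  # 50%
```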
Where legal AI works and where it fails
AI wins: extracting specific data points from documents (party names, effective dates, payment terms in contracts; plaintiff/defendant, court name, judgment amount in litigation discovery). Pattern matching over large document sets. Entity recognition. Clause identification (find all non-compete clauses across 500 vendor contracts).
AI struggles: nuance and context. A phrase like "indemnification extends to third-party claims arising out of the customer's use of the software" needs interpretation. Does that cover the case where the customer misconfigured the software? What if the third party sues for something partially caused by the software and partially by the customer's actions? Those are judgment calls. An LLM will make a guess; the right answer depends on regulatory context, prior case law, and the specific commercial relationship.
AI's real risk: confidently wrong. An AI system that tells you a contract has a $1M liability cap when it actually says $100k isn't just inefficient; it's dangerous. You rely on AI output and miss a key term. The LLM's fluency makes it sound authoritative even when it's hallucinating.
This is why elite firms (top-50 AmLaw) are adopting AI for labor-intensive, low-judgment work (document discovery, data extraction, contract term compilation) but not for judgment-heavy work (deal strategy, risk assessment, litigation strategy).
The vendor landscape for legal AI
Harvey (OpenAI-backed, built on OpenAI models) positions itself as the "AI partner for lawyers." Spellbook focuses on contract drafting and review inside Microsoft Word. Casetext (now part of Thomson Reuters) provides AI-assisted legal research. LexisNexis and Westlaw both have AI research layers. Relativity (the e-discovery standard) now bundles AI tagging.
Pricing models vary: some vendors charge per-document, some per-user-per-month, some per query. Harvey charges per-question or per-document-reviewed. Spellbook charges $30–$100/month per user. Casetext charges based on query volume.
The wedge is differentiation on accuracy. A vendor claiming 95% accuracy on contract term extraction is meaningfully better than one claiming 85%. But most vendors don't publish accuracy benchmarks against human lawyers. You're buying based on marketing, not empirical performance.
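If vendors won't publish benchmarks, you can run your own: have lawyers hand-label a small gold set of contracts, run the vendor's extraction over the same set, and score field-level agreement. A minimal sketch follows; the document IDs, field names, and values are hypothetical.

```python
# Score a vendor's term extraction against lawyer-labeled ground truth.
# Document IDs, field names, and values below are hypothetical.
from collections import Counter

gold = {    # lawyer-reviewed ground truth: doc id -> key terms
    "msa-001": {"liability_cap": "$1M", "renewal": "auto", "term": "24mo"},
    "msa-002": {"liability_cap": "$100k", "renewal": "manual", "term": "12mo"},
}
vendor = {  # the same fields as extracted by the tool under evaluation
    "msa-001": {"liability_cap": "$1M", "renewal": "auto", "term": "12mo"},
    "msa-002": {"liability_cap": "$1M", "renewal": "manual", "term": "12mo"},
}

counts = Counter()
for doc_id, truth in gold.items():
    for field, value in truth.items():
        counts["total"] += 1
        counts["correct"] += (vendor.get(doc_id, {}).get(field) == value)

print(f"field-level accuracy: {counts['correct'] / counts['total']:.0%} "
      f"({counts['correct']}/{counts['total']})")
```

Note the second contract: the tool read a $100k liability cap as $1M, exactly the confidently-wrong failure described above. Even a modest gold set starts to separate marketing claims from measured performance.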
The cost attribution challenge in legal
Law firms don't measure work-item economics the way other industries do. They measure revenue (billable hours) and utilization rate (hours billed ÷ hours available), not cost per deliverable. A partner billing 2,000 hours per year at $350/hour is $700k revenue, regardless of how many contracts she reviewed or how efficiently.
For in-house legal teams, the problem is different: they're often one cost center buried in "corporate overhead." Finance sees "general counsel salary" and "outside counsel cost," but doesn't break down what work is being done. So AI arrives as an operational decision with no cost baseline to measure it against. If you don't know what it costs to review 100 contracts today, you can't measure savings when you deploy AI.
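Building that baseline can be mechanical: pull a year of matters, bucket spend (inside time plus outside counsel) by task type, and divide by volume. A minimal sketch with hypothetical task names and figures:

```python
# Derive a cost-per-work-item baseline from a year of legal spend.
# Task names, spend, and volumes are hypothetical placeholders.
annual = {
    # task: (total annual cost incl. outside counsel, items completed)
    "vendor contract review": (180_000, 3_000),
    "nda turnaround":         (40_000,  1_600),
    "discovery first pass":   (250_000, 200_000),
}

for task, (spend, volume) in annual.items():
    print(f"{task:24s} ${spend / volume:>8,.2f} per item "
          f"({volume:,} items, ${spend:,} total)")
```

Crude is fine; the point is a defensible per-item number you can re-measure after the AI deployment.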
The second challenge: risk avoidance doesn't show up in accounting. If AI contract review prevents one missed liability cap and saves the company $500k in unexpected exposure, that's not recorded as AI ROI; it's recorded as a non-event. You prevented a disaster, but there's no financial entry for it.
Legal AI cost benchmark table
| Task | Work unit | Manual cost | AI-assisted cost | Risk of error | Payback horizon |
| --- | --- | --- | --- | --- | --- |
| M&A contract review | 1 contract (15 pages) | $40–$60 | $2–$5 | Low-medium | 12–18 mo |
| Litigation discovery review | 1 document | $0.50–$2 | $0.05–$0.15 | Medium-high | 6–12 mo |
| Contract term extraction | 1 contract | $20–$40 | $1–$3 | Low | 6–12 mo |
| Legal research query | 1 research question | $50–$200 | $5–$30 | Medium | 12–24 mo |
| Contract generation | 1 standard agreement | $100–$300 | $10–$50 | Medium (editing required) | 12–18 mo |
| Due diligence compliance | 1 company (100+ docs) | $2,000–$5,000 | $500–$1,500 | Medium | 6–12 mo |
The COO playbook for legal AI
- Start with high-volume, low-judgment work. If you have 500 vendor contracts in your system and need to identify all non-compete clauses, that's a perfect AI task. If you need to evaluate a complex M&A deal with a novel structure, that's not. Know the difference.
- Measure your current legal cost by task, not by role. How much do you spend on contract review per year? How many contracts? Cost per contract. This is your baseline. If you can't articulate it, you can't measure AI ROI.
- Run a parallel-process pilot. For 100 contracts, use AI to extract terms and simultaneously have a junior associate do the same. Compare extraction accuracy and time. If the AI is 90%+ accurate and 10x faster, expand to a larger set. If it is 70% accurate, the error rate is too high to trust.
- Establish accuracy thresholds by task. Contract review (low judgment): AI must be 95%+ accurate on term extraction. Litigation review (higher judgment): AI can be 85%+ accurate, with humans double-checking flagged documents. Legal research: AI can be 70%+ accurate, with a human still verifying sources.
- Use AI for labor shifting, not headcount elimination. Don't promise that AI will eliminate a junior associate headcount. Promise that it will let one junior associate do the work of two, freeing her for higher-judgment work or reducing outside counsel spend. Headcount elimination is risky in legal (you might need the capacity in six months); labor shifting is credible.
- Document the AI decision path for audit and liability. If AI reviews a contract and misses a term, can your firm prove that the AI was properly configured and that the human reviewer followed a proper QA process? You need audit trails: which AI model reviewed which document, what confidence score it produced, and what the human reviewer verified. This is part of your implementation cost.
- Monitor error rates quarterly. After launch, check a random sample of AI decisions for accuracy (a minimal sampling sketch follows this list). If the error rate creeps up (model drift, configuration drift), escalate. Most vendors don't proactively warn you; you have to monitor.
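Here is a minimal sketch of that quarterly check, assuming you log every AI decision and can get lawyer-verified labels for a random sample; the log format, sample size, and alert threshold are illustrative assumptions:

```python
# Quarterly drift check: sample logged AI decisions, compare to lawyer review.
# Log format, sample size, and the alert threshold are illustrative assumptions.
import random

def quarterly_error_check(decision_log, verify, sample_size=200,
                          alert_threshold=0.05):
    """decision_log: list of (doc_id, ai_label) pairs from production.
    verify: callable doc_id -> lawyer-verified label for that document."""
    sample = random.sample(decision_log, min(sample_size, len(decision_log)))
    errors = sum(1 for doc_id, ai_label in sample if verify(doc_id) != ai_label)
    rate = errors / len(sample)
    if rate > alert_threshold:
        print(f"ALERT: sampled error rate {rate:.1%} exceeds "
              f"{alert_threshold:.0%}; investigate drift and escalate")
    return rate
```

Track the sampled rate quarter over quarter; a rising trend matters more than any single reading.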
For COOs at law firms and in-house legal teams, AI payback is clearest in high-volume extraction tasks (contract term identification, discovery tagging) with 12–18 month payback cycles. For core client-facing legal work (strategy, negotiation, judgment), AI is an augmentation tool that accelerates speed but doesn't eliminate headcount. To map your specific legal work-item costs and where AI intervention matters most, talk to Runrate to establish cost attribution across your legal operations.