Usage-Based Pricing for AI Products: A Product Manager's Playbook

By Institute of AI PM·14 min read·Jun 2, 2026

TL;DR

Usage-based pricing (UBP) charges customers for what they consume — tokens, queries, documents, minutes — rather than for access. It's the dominant pricing model for AI infrastructure (OpenAI, Anthropic, ElevenLabs) and is increasingly the right model for AI-native products where cost scales with usage. The critical decisions: what's the billing unit, how do you protect gross margin, and how do you prevent customer sticker shock from destroying trust. This playbook covers the billing unit choice, tier structure, gross margin management, and the failure modes that turn usage-based pricing from an advantage into a liability.

What Usage-Based Pricing Is — and What It Isn't

Usage-based pricing (also called consumption pricing or metered billing) means customers pay in proportion to how much they use the product — not for access to it. This is fundamentally different from per-seat SaaS, where pricing is decoupled from usage volume. Understanding this distinction clarifies when UBP is the right choice and when it's not.

Usage-based pricing

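OpenAI (per token, e.g. $0.002/1K tokens), ElevenLabs (per character, e.g. $0.30/1K characters), Midjourney (per image), AWS Bedrock (per inference call). Prices are illustrative; they vary by model and plan and change often.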

Cost scales with usage. Revenue scales with usage. Gross margin is roughly constant as volume grows. Works when your marginal cost is significant and traceable.

Per-seat pricing

GitHub Copilot Business ($19/user/month), Notion AI ($8/user/month), most enterprise SaaS with AI features bundled

Revenue scales with users regardless of usage. Gross margin improves with volume (fixed infrastructure costs amortized). Works when your marginal cost per user is low regardless of how much they use it.

Outcome-based pricing

Pay per qualified lead generated, per contract clause caught, per customer issue resolved without human escalation

Revenue scales with value delivered. Highest value alignment — customers only pay when you succeed. Requires robust outcome measurement and attribution, which most products don't have at launch.

Hybrid (increasingly common in 2026)

Platform seat fee + consumption overage, free tier + pay-as-you-go above limit, commit contract + overage

Most mature AI products in 2026 use hybrids: a base commitment that ensures predictable revenue, plus a consumption component that captures upside from heavy users and keeps their bills in proportion to their usage.

The most important decision at product launch is whether your cost structure maps to usage-based billing. If your marginal cost per request is meaningful and variable (LLM inference is the clearest example), usage-based pricing protects your gross margin at scale. If your marginal cost is near-zero regardless of usage volume, per-seat is safer and simpler.
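To make that test concrete, here is a minimal sketch comparing gross margin under per-seat and metered pricing as usage grows. The seat price, per-request cost, and metered price are hypothetical placeholders, not figures from any vendor.

```python
def gross_margin(revenue: float, cost: float) -> float:
    """Gross margin as a fraction of revenue."""
    return (revenue - cost) / revenue


def per_seat_margin(seat_price: float, requests: int, cost_per_request: float) -> float:
    # Revenue is fixed per seat; cost grows with every request.
    return gross_margin(seat_price, requests * cost_per_request)


def usage_margin(price_per_request: float, cost_per_request: float) -> float:
    # Revenue and cost both scale with requests, so volume cancels out.
    return gross_margin(price_per_request, cost_per_request)


if __name__ == "__main__":
    SEAT_PRICE = 19.00        # hypothetical monthly seat price
    COST_PER_REQUEST = 0.01   # hypothetical marginal inference cost
    PRICE_PER_REQUEST = 0.03  # hypothetical metered price

    for requests in (100, 1_000, 3_000):
        seat = per_seat_margin(SEAT_PRICE, requests, COST_PER_REQUEST)
        metered = usage_margin(PRICE_PER_REQUEST, COST_PER_REQUEST)
        print(f"{requests:>5} requests/month: per-seat {seat:.1%}, metered {metered:.1%}")
```

With these placeholder numbers the per-seat margin turns negative somewhere between 1,000 and 3,000 requests per month, while the metered margin does not move. If your real marginal cost is close to zero, the per-seat line stays flat too, and the simpler model wins.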

Choosing Your Billing Unit: The Decision That Shapes Everything

The billing unit is the most consequential usage-based pricing decision. It determines whether customers understand what they're paying for, whether they can predict their bills, and whether your pricing scales with value or just with activity. Getting it wrong is hard to fix without a major repricing event.

Unit: Tokens

Used by: LLM API providers (OpenAI, Anthropic, Google)

Pros: Maps directly to infrastructure cost. Fine-grained. Industry standard for AI infrastructure.

Cons: Opaque to business buyers — most can't estimate token consumption. Leads to bill anxiety. Not a value metric (a short, high-value response uses fewer tokens than a long, low-value one).

Right for infrastructure/API products. Wrong for most end-user products.

Unit: Documents / tasks

Used by: AI contract review, AI email response, AI research report

Pros: Maps to the unit of value the customer cares about. Predictable for customers. Easy to reason about in procurement.

Cons: Documents vary enormously in complexity and cost. A 5-page NDA and a 200-page acquisition agreement are not the same 'document' but may be priced the same.

Best for most AI workflow products. Consider tiering by document complexity if variance is high.
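One way to tier by complexity is to bucket documents by page count, so the NDA and the acquisition agreement land in different price bands. The sketch below assumes page count is a reasonable proxy for cost; the tier names, page thresholds, and prices are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComplexityTier:
    name: str
    max_pages: int    # inclusive upper bound for this tier
    price_usd: float  # price per document in this tier


# Hypothetical tiers: thresholds and prices are placeholders.
TIERS = (
    ComplexityTier("standard", 10, 0.20),
    ComplexityTier("long", 50, 0.80),
    ComplexityTier("complex", 250, 3.00),
)


def price_document(pages: int) -> ComplexityTier:
    """Return the first tier whose page limit covers the document."""
    if pages < 1:
        raise ValueError("a document must have at least one page")
    for tier in TIERS:
        if pages <= tier.max_pages:
            return tier
    raise ValueError("document exceeds the largest tier; quote it manually")


if __name__ == "__main__":
    for pages in (5, 200):
        tier = price_document(pages)
        print(f"{pages} pages: {tier.name} tier at ${tier.price_usd:.2f}")
```

Three tiers keep the price list readable for procurement while stopping the largest documents from being billed like the smallest.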

Unit: Minutes / hours (time-based)

Used by: AI voice agents, AI meeting analysis, AI audio transcription

Pros: Natural unit for voice and video workflows. Customers understand it intuitively.

Cons: Doesn't map well to value (a 30-minute meeting that produces one insight vs a 30-minute meeting that produces a deal are the same price). Incentivizes shorter usage.

Right for voice/audio/video products. Consider outcome bonuses on top.

Unit: API calls / queries

Used by: AI search, AI classification, embedding generation

Pros: Simple to instrument and bill. Easy for technical buyers to estimate.

Cons: Not a value metric. A query that returns a life-changing insight and a query that returns nothing are the same price.

Right for infrastructure-layer products. For application-layer products, move to a task unit.

Designing the Tier and Commit Structure

Pure pay-as-you-go is the simplest usage-based model but rarely the right long-term structure. Most AI products in 2026 use a tiered or commit-plus-overage structure that gives customers cost predictability while capturing upside from high-volume users.

Free tier with hard limits

Essential for discovery. Set limits that let individual users validate your product without budget approval. Set them low enough that teams hitting the limit need to upgrade, but not so low that users hit the limit before they've seen the value.

Prepaid credits (commit discounts)

Customers buy a block of usage upfront at a discount (e.g., $500 in credits for 20% off pay-as-you-go rates). Creates predictable COGS on your side, reduces churn (customers who have already paid for credits have a reason to keep using the product, whereas an unused subscription is easy to cancel), and moves procurement conversations from 'is this a good deal' to 'how much should we commit to'.

Overage pricing

Usage above a committed tier is billed at a higher per-unit rate. Drives commit upgrades. But if overages are a surprise, they destroy trust. Always notify customers at 80% of their commit and give them one-click upgrade options. Silent overages are a retention killer.
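The "drives commit upgrades" effect is easy to see with a small invoice calculation. This is a sketch under assumed numbers: the two plans, their included units, and their rates are hypothetical, and amounts are kept in integer cents to avoid rounding surprises.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CommitPlan:
    name: str
    included_units: int      # units covered by the commit fee
    commit_fee_cents: int    # charged regardless of usage
    overage_rate_cents: int  # charged per unit above the commit


def invoice_cents(plan: CommitPlan, used_units: int) -> int:
    """Commit fee plus overage on any units beyond the included amount."""
    overage_units = max(0, used_units - plan.included_units)
    return plan.commit_fee_cents + overage_units * plan.overage_rate_cents


def cheapest_plan(plans: Sequence[CommitPlan], used_units: int) -> CommitPlan:
    """The plan that minimizes the invoice for a given usage level."""
    return min(plans, key=lambda plan: invoice_cents(plan, used_units))


# Hypothetical plans: starter overage is priced above its effective commit rate.
STARTER = CommitPlan("starter", 10_000, 150_000, 20)
GROWTH = CommitPlan("growth", 25_000, 300_000, 16)

if __name__ == "__main__":
    for used in (12_000, 20_000):
        best = cheapest_plan((STARTER, GROWTH), used)
        print(f"{used} units: {best.name} at ${invoice_cents(best, used) / 100:,.2f}")
```

With these placeholder plans the break-even sits at 17,500 units. A customer trending past that point is the one who should see the upgrade prompt before the overage invoice arrives.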

Seat floor plus consumption

A minimum seat fee ensures predictable base revenue; usage billing captures value from power users. This hybrid is increasingly standard for enterprise AI products in 2026: it satisfies procurement's need for predictable annual contracts while letting usage upside flow to you.

Design principle

Structure pricing so that customers who get the most value pay the most, not customers who use the most compute. These are often not the same. Suppose your 10 highest-value customers generate 100K tokens each and get 10x the value of your 100 lowest-value customers, who generate 50K tokens each. Token-based billing then charges your best customers only 2x more for 10x the value.
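The mismatch in that example can be checked with a few lines of arithmetic. The sketch assumes a hypothetical token price and hypothetical monthly value figures in a 10:1 ratio; only the token volumes come from the example above.

```python
def monthly_bill(tokens: int, price_per_1k_tokens: float) -> float:
    """Token-based bill for one customer."""
    return tokens / 1000 * price_per_1k_tokens


def bill_per_value_dollar(tokens: int, price_per_1k_tokens: float, value_usd: float) -> float:
    """How much the customer pays per dollar of value received."""
    return monthly_bill(tokens, price_per_1k_tokens) / value_usd


if __name__ == "__main__":
    PRICE_PER_1K = 0.01  # hypothetical token price

    # Hypothetical value figures: high-value customers get 10x the value.
    high = bill_per_value_dollar(100_000, PRICE_PER_1K, value_usd=10_000)
    low = bill_per_value_dollar(50_000, PRICE_PER_1K, value_usd=1_000)

    bill_ratio = monthly_bill(100_000, PRICE_PER_1K) / monthly_bill(50_000, PRICE_PER_1K)
    print(f"bill ratio (high vs low): {bill_ratio:.1f}x")
    print(f"price per value dollar (low vs high): {low / high:.1f}x")
```

Under these assumptions the low-value customers pay several times more per dollar of value than the high-value ones, which is the inversion the design principle warns about.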

Common Failure Modes in Usage-Based AI Pricing

Usage-based pricing amplifies both wins and mistakes. Here are the failure modes that come up repeatedly in AI PM post-mortems.

Wrong billing unit — charging for inputs, not outcomes

Billing per token when your users think in documents, or per API call when they care about successful classifications. The billing unit should be the unit of value, not the unit of compute. When it isn't, the bill tracks compute rather than value delivered, so customers who extract the most value per unit of compute pay the least for it. That is inverted value capture.

No cost safeguards — the $10K surprise bill

A poorly written prompt that loops, a batch job that runs over an unexpectedly large corpus, a developer testing in production: all of these produce unexpectedly large bills. Without hard spend caps, customers who receive such a bill will often churn, regardless of who caused the issue. Implement soft alerts at 80% and hard caps at 100% of any configured budget. Make this a default-on feature, not a buried setting.
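A minimal sketch of the 80% soft alert and 100% hard cap might look like the following. The class and method names are hypothetical, and a production version would also need persistence, concurrency control, and a notification channel.

```python
from enum import Enum


class BudgetStatus(Enum):
    OK = "ok"
    SOFT_ALERT = "soft_alert"  # at or above the alert threshold
    HARD_CAP = "hard_cap"      # budget fully consumed


class BudgetGuard:
    """Tracks spend against a configured budget, in integer cents."""

    def __init__(self, budget_cents: int, soft_alert_pct: int = 80) -> None:
        if budget_cents <= 0:
            raise ValueError("budget must be positive")
        self.budget_cents = budget_cents
        self.soft_alert_pct = soft_alert_pct
        self.spent_cents = 0

    def status(self) -> BudgetStatus:
        if self.spent_cents >= self.budget_cents:
            return BudgetStatus.HARD_CAP
        # Integer comparison avoids floating-point edge cases at exactly 80%.
        if self.spent_cents * 100 >= self.soft_alert_pct * self.budget_cents:
            return BudgetStatus.SOFT_ALERT
        return BudgetStatus.OK

    def try_charge(self, cost_cents: int) -> bool:
        """Record the charge only if it keeps total spend within the budget."""
        if self.spent_cents + cost_cents > self.budget_cents:
            return False  # the caller should block the request, not bill it
        self.spent_cents += cost_cents
        return True


if __name__ == "__main__":
    guard = BudgetGuard(budget_cents=10_000)
    guard.try_charge(8_000)
    print(guard.status())            # soft alert: notify the customer
    print(guard.try_charge(2_500))   # rejected: would exceed the cap
```

Rejecting the charge before it is recorded is the design choice that matters here: the runaway loop stops at the cap instead of generating an invoice you later have to refund.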

No free tier for discovery

Requiring a credit card before a user can try the product filters out most potential champions. Individual users who discover your product can't get budget approval without first demonstrating value — but they can't demonstrate value without access. Free tiers with limits are not a cost center; they're a sales channel.

Pricing below infrastructure cost

Many AI products in 2023–2025 launched with usage-based pricing set below their actual LLM API cost — subsidizing customer usage with VC money. When funding tightens, they reprice and lose customers. Price above your cost from day one, even if it means fewer early users. You can always lower prices; repricing up destroys trust.

Enterprise customers can't budget for unpredictable usage

Procurement teams at large companies need annual budget commitments. Pure pay-as-you-go with no commit option loses enterprise deals that would have converted with a commit structure. For enterprise sales, always offer a committed annual contract, even if individual users are billed on a metered basis.

Gross Margin and the Unit Economics of Usage-Based AI

Usage-based pricing exposes your unit economics directly. Every customer interaction has a traceable cost, and if your revenue per unit doesn't exceed your cost per unit, you lose money on every customer at scale. Here's how to think about this before you set prices.

1. The cost stack per billable unit

For a document processing product billed per document, your cost stack includes: LLM API cost (input + output tokens), infrastructure overhead, storage, and amortized engineering. A 10-page contract review might cost $0.08–$0.15 in LLM API fees. If you charge $0.20/document, your gross margin on LLM fees alone is 25–60%, before the rest of the stack is counted: acceptable but thin. Price for where your costs will be in 12 months as you optimize, not where they are today.

2. The compression path

AI inference costs are falling ~50–70% per year as hardware improves and smaller models get better. Usage-based pricing benefits from this: if you can hold your price per unit constant while your cost per unit falls, gross margin expands over time. This differs from traditional SaaS, where gross margin is relatively stable. Model your pricing with a cost-reduction roadmap.

3. Volume discounting must be gated on commit, not on usage

Giving large discounts to customers who simply use a lot without committing erodes your margin without securing revenue. Volume discounts should come with commitments (annual contracts, prepaid credits), not with retroactive volume rebates based on consumption.

4. Watch for the 'power user subsidy'

In usage-based products, power users can generate more cost than revenue if pricing isn't carefully calibrated. A developer running nightly batch jobs at the lowest tier, for example, can cost more to serve than the tier brings in. Identify your cost-to-serve per customer cohort early, and reprice tiers where you're subsidizing heavy users.
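The arithmetic behind the cost stack and the compression path can be sketched as follows. It uses the $0.20 price and the $0.08–$0.15 LLM fee range from the contract review example and counts LLM fees only; the assumption that price stays fixed while cost falls at a steady annual rate is a simplification.

```python
def gross_margin(price: float, cost: float) -> float:
    """Gross margin per billable unit, as a fraction of price."""
    return (price - cost) / price


def projected_cost(cost_today: float, annual_decline: float, years: int) -> float:
    """Unit cost after compounding a constant annual decline."""
    return cost_today * (1 - annual_decline) ** years


if __name__ == "__main__":
    PRICE = 0.20  # per document, from the example above

    for cost in (0.08, 0.15):
        print(f"cost ${cost:.2f}: margin today {gross_margin(PRICE, cost):.1%}")

    # Worst-case cost today, projected one year out at each end of the range.
    for decline in (0.50, 0.70):
        future = projected_cost(0.15, decline, years=1)
        print(f"{decline:.0%} decline: cost ${future:.3f}, "
              f"margin {gross_margin(PRICE, future):.1%}")
```

Because infrastructure, storage, and engineering are left out, treat the margins this prints as an upper bound on what you will actually see.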

The target gross margin for a standalone AI product is 60–75% at scale. If your current gross margin is below 40%, you have a pricing or cost problem that usage-based pricing alone won't fix. Usage-based pricing is a revenue alignment mechanism — it doesn't substitute for modeling your cost structure and pricing above it.
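A simple cohort check against those thresholds might look like this. The cohort names and figures are invented for illustration; the 40% and 60% cutoffs are the ones stated above.

```python
PROBLEM_BELOW = 0.40  # below this, pricing or cost needs attention
TARGET_FLOOR = 0.60   # lower end of the 60-75% target range


def classify_margin(revenue: float, cost_to_serve: float) -> str:
    """Label a cohort by where its gross margin falls."""
    if revenue <= 0:
        raise ValueError("revenue must be positive to compute a margin")
    margin = (revenue - cost_to_serve) / revenue
    if margin < PROBLEM_BELOW:
        return "problem"
    if margin < TARGET_FLOOR:
        return "below_target"
    return "on_target"


if __name__ == "__main__":
    # Hypothetical monthly (revenue, cost-to-serve) per cohort, in USD.
    cohorts = {
        "casual": (4_000.0, 1_200.0),
        "regular": (10_000.0, 5_000.0),
        "batch_power_users": (2_000.0, 2_600.0),
    }
    for name, (revenue, cost) in cohorts.items():
        print(f"{name}: {classify_margin(revenue, cost)}")
```

Running this per cohort rather than on the blended total is what surfaces a power user subsidy; a healthy average can hide one cohort that costs more than it pays.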