AI Vendor RFP Template: 50+ Questions to Ask Before Signing
TL;DR
Most AI procurement decisions get made on a vibe-check demo and a price quote. Six months later you discover the model retrains on your data, latency triples at peak, and there is no documented exit. This RFP template gives you 50+ specific questions across seven categories. Send it before pricing discussions — the answers will tell you who is enterprise-ready and who is not.
Section 1: Company & Viability
You are about to depend on this vendor for a system in your production stack. AI startups die. Find out if this one will be alive in 24 months.
Funding stage, last round size, and current monthly burn
Why it matters: Sub-12-month runway is a procurement red flag. Ask for the runway number, not the round size.
Total revenue and number of paying enterprise customers above $100K ARR
Why it matters: Logo lists are marketing. Paid contract counts at your tier are signal.
Customer concentration: % of revenue from top 3 customers
Why it matters: >40% concentration means losing one customer destabilizes the company you depend on.
Headcount split: engineering vs. go-to-market
Why it matters: AI startups with <40% engineering headcount usually cannot ship the roadmap they promised.
Acquisition or shutdown clause: what happens to our data and contract if you are acquired?
Why it matters: Get the assignment and termination rights in writing before pricing discussions start.
Three reference customers we can call without your team on the line
Why it matters: Vendor-curated calls are theater. Insist on direct access.
Production uptime track record over the last 12 months, with incident reports
Why it matters: Past incidents predict future ones better than any pitch deck.
Section 2: Security & Compliance
Current certifications: SOC 2 Type II, ISO 27001, HIPAA, FedRAMP, PCI-DSS
Why it matters: Type II (not Type I) is the bar. Ask for the most recent report, not just the badge.
Penetration test cadence and most recent third-party report
Why it matters: Annual third-party pen test minimum. Ask to read the executive summary under NDA.
Sub-processors list, including model providers (OpenAI, Anthropic, AWS Bedrock, etc.)
Why it matters: Your data flows through every sub-processor. They all need to clear your privacy review.
Data residency options: US, EU, region-locked deployments
Why it matters: Required for GDPR, Schrems II, and many regulated industries.
Encryption: at rest, in transit, and key management (BYOK or HYOK supported?)
Why it matters: BYOK is the negotiation point. Vendors who refuse it are not enterprise-ready.
SSO support (SAML, OIDC) and SCIM provisioning
Why it matters: Without SSO and SCIM, identity drift and orphaned accounts are inevitable.
Audit logging: what events are logged, retention period, export format
Why it matters: If logs cannot be exported to your SIEM, you have a compliance gap on day one.
Incident notification SLA in writing
Why it matters: Many contracts say 'reasonable notice.' Demand a specific number — 24 or 72 hours.
Section 3: Model & Performance
What underlying model(s) power the product? Frontier API, fine-tune, or in-house?
Why it matters: If they pass through GPT-4o, you are paying a markup. Decide if the wrapper is worth it.
Model versioning: how is the model pinned, and how are upgrades rolled out?
Why it matters: Silent model swaps break your evals. Pinning + opt-in upgrades is the bar.
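In practice, pinning means your requests name an exact dated model version rather than a floating "latest" alias. A minimal sketch of the difference — the model identifiers here are hypothetical, check your vendor's actual naming scheme:

```python
# Sketch: pin an exact model version instead of a floating alias.
# Model names below are illustrative placeholders, not real identifiers.

FLOATING_ALIAS = "vendor-model-latest"      # upgrades silently under you
PINNED_VERSION = "vendor-model-2025-01-15"  # changes only when you opt in

def build_request(prompt: str, model: str = PINNED_VERSION) -> dict:
    """Every call names the pinned version explicitly."""
    return {"model": model, "prompt": prompt}

print(build_request("hello")["model"])
```

If the vendor only exposes the floating alias, your evals can silently break on their upgrade schedule, not yours.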
Latency: p50, p95, and p99 at our expected volume, broken out by request type
Why it matters: Averages hide outages. p99 is what your users actually feel.
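You can verify latency claims yourself during a pilot: log per-request timings and compute the percentiles directly. A minimal sketch using nearest-rank percentiles — the sample latencies are illustrative, not any vendor's real numbers:

```python
# Sketch: compute p50/p95/p99 from per-request latencies (milliseconds).
# Sample data is illustrative; one slow outlier dominates the tail.

def percentile(samples, p):
    """Nearest-rank percentile for p in [0, 100]."""
    ordered = sorted(samples)
    k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[k]

latencies_ms = [120, 135, 140, 150, 155, 160, 180, 210, 450, 1900]

for p in (50, 95, 99):
    print(f"p{p}: {percentile(latencies_ms, p)} ms")
```

Note how the p50 here looks healthy while the tail is an order of magnitude worse — exactly what an average-only SLA hides.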
Quality benchmarks on your internal evals (not just MMLU/HumanEval)
Why it matters: Public benchmarks are gamed. Make them run your eval set as part of the RFP.
Hallucination rate methodology and most recent measurement
Why it matters: If they cannot define how they measure hallucination, they are not measuring it.
Multi-modal capabilities and roadmap (vision, voice, structured output)
Why it matters: Determines whether your future use cases will force you onto a second vendor.
Failover behavior when the underlying model provider is down
Why it matters: OpenAI outages happen. Your vendor needs a degraded-mode answer.
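A credible degraded-mode answer usually looks like primary/fallback routing: when the primary provider errors out, the request is served by a secondary model and flagged, rather than failing hard. A hypothetical sketch — `call_primary` and `call_fallback` are stand-ins for real provider SDK calls:

```python
# Sketch: route to a fallback provider when the primary is down.
# Both provider calls are simulated stand-ins, not real SDK functions.

class ProviderDown(Exception):
    pass

def call_primary(prompt: str) -> str:
    raise ProviderDown("primary provider outage")  # simulate an outage

def call_fallback(prompt: str) -> str:
    return f"[fallback model] answer to: {prompt}"

def complete(prompt: str) -> str:
    try:
        return call_primary(prompt)
    except ProviderDown:
        # Degraded mode: serve from the secondary provider and label the
        # response, instead of returning a hard error to the user.
        return call_fallback(prompt)

print(complete("summarize this contract"))
```

Ask the vendor which of these layers they own: if their only answer is "we wait for the provider," that is your outage too.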
Section 4: Data Handling
Will our prompts, completions, or files be used to train any model — yours or a third party's?
Why it matters: Default-on training is the single biggest hidden risk. Get an explicit no in writing.
Data retention: how long is our data stored, and how do we configure zero retention?
Why it matters: Zero-retention mode should be available. Default 30-day retention is acceptable; default-forever is not.
Data deletion: SLA for deletion requests and verification mechanism
Why it matters: GDPR Article 17 requires verifiable deletion, not a vendor email saying 'done.'
PII detection and redaction: built-in or our responsibility?
Why it matters: Determines who carries the liability when PII leaks into prompts.
Customer-isolated tenancy or shared infrastructure?
Why it matters: Shared infra increases prompt-leakage and noisy-neighbor risk.
Cross-border data flow: where is data processed and stored?
Why it matters: Required for Schrems II, EU AI Act, and most regional privacy laws.
Section 5: SLA & Support
Uptime SLA — exact percentage and the credits formula
Why it matters: 99.9% sounds nice. Read the credit cap — many SLAs cap credits at one month of fees.
Support tiers, response SLAs by severity, and 24/7 coverage
Why it matters: P1 response under 1 hour and named TAM for >$100K ARR is the enterprise bar.
Status page URL and historical incident transparency
Why it matters: If their status page is empty, they hide incidents. Ask for the internal incident log.
Maintenance window policy and notification lead time
Why it matters: Surprise maintenance during your peak hour is a customer-trust event.
Rate limits and burst behavior under spike traffic
Why it matters: Define what happens at 10x baseline. Hard 429s vs. graceful queuing matters a lot.
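Hard 429s versus graceful queuing is worth testing in a pilot, and your side of the contract is a retry policy. A minimal client-side sketch, assuming a status-code API — `fake_request` simulates a vendor endpoint that rejects the first two calls:

```python
import time

# Sketch: retry on HTTP 429 with exponential backoff.
# fake_request simulates a rate-limited vendor API, not a real endpoint.

attempts = {"n": 0}

def fake_request() -> int:
    attempts["n"] += 1
    return 429 if attempts["n"] <= 2 else 200

def call_with_backoff(max_retries: int = 5, base_delay: float = 0.01) -> int:
    for attempt in range(max_retries):
        status = fake_request()
        if status != 429:
            return status
        time.sleep(base_delay * (2 ** attempt))  # 10ms, 20ms, 40ms, ...
    raise RuntimeError("rate limited after retries")

print(call_with_backoff())  # → 200 after two retried 429s
```

The RFP question is which side carries this logic: client-side backoff you write, or server-side queuing they guarantee.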
Section 6: Pricing & Commercials
Pricing model: per-seat, per-token, per-request, or hybrid
Why it matters: Per-token costs explode with long contexts. Per-seat hides usage risk. Know which trap you are picking.
Volume discount tiers and the price ramp at 2x, 5x, 10x baseline
Why it matters: Most pricing pages stop at tier 3. Get the model for your year-3 forecast.
Annual price cap on increases at renewal
Why it matters: 5–7% cap is standard. Without one, you are the renewal hostage.
Overage billing and the cost ceiling we can pre-set
Why it matters: A bug that hits the API in a loop should not generate a six-figure bill. Demand a hard cap.
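To see why a hard cap matters, model what a retry-loop bug does to a per-token bill over a weekend. Every number below is an illustrative assumption, not any vendor's real rate:

```python
# Sketch: illustrative cost of a runaway loop under per-token pricing.
# All figures are assumptions for the arithmetic, not real vendor rates.

price_per_1k_tokens = 0.01    # assumed blended $/1K tokens
tokens_per_request = 4_000    # assumed prompt + completion size
requests_per_second = 50      # one buggy service in a tight loop
hours_before_noticed = 48     # ships Friday evening, found Monday

requests = requests_per_second * 3600 * hours_before_noticed
total_tokens = requests * tokens_per_request
cost = total_tokens / 1000 * price_per_1k_tokens

print(f"${cost:,.0f}")  # six figures under these assumptions
```

A pre-set spend ceiling turns this from a surprise invoice into a throttled service and an alert.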
Multi-year discounts and the early termination penalty if we exit
Why it matters: Often the same number with opposite signs. Negotiate both, not one.
What is included in the base price vs. paid add-ons (logging, evals, audit, BYOK)
Why it matters: Vendors hide enterprise must-haves behind add-on SKUs. Itemize before signing.
Section 7: Exit & Portability
The exit clause is where most AI contracts fail. Negotiate it before you sign — you have zero leverage after.
Data export: format, scope, and timeline post-termination
Why it matters: JSON or CSV in <30 days. 'On request' with no SLA is a lock-in clause.
Prompt and fine-tune portability: can we take our prompts, evals, and fine-tunes elsewhere?
Why it matters: If fine-tunes are bound to their proprietary base model, you have rebuilding work in any migration.
Termination for convenience clause and notice period
Why it matters: 30–60 day notice is fair. 12-month auto-renewal with a 60-day window is a trap.
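The trap in a 12-month auto-renewal with a 60-day notice window is that your real decision date lands around month ten, not month twelve. A quick sketch with illustrative dates:

```python
from datetime import date, timedelta

# Sketch: when a 12-month auto-renewal with a 60-day notice window
# actually forces the exit decision. Dates are illustrative.

signing = date(2025, 1, 1)
renewal = signing + timedelta(days=365)           # auto-renews here
notice_deadline = renewal - timedelta(days=60)    # real decision date

print(notice_deadline)  # → 2025-11-02
```

Put the notice deadline on the calendar at signing; missing it by a day commits you to another full year.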
Transition assistance: hours of professional services included post-termination
Why it matters: Negotiate this in writing before signing. After signing, it is billed at $400/hr.
Data deletion certification post-exit
Why it matters: Required for SOC 2 and most regulated industries.