Structured Outputs & Function Calling: How to Make AI Do What You Actually Need
TL;DR
Unstructured AI text is hard to use in real products. Structured outputs and function calling are the techniques that turn LLM responses into reliable, parseable data your application can actually act on. This guide explains how they work, when to use each, and how to design reliable AI pipelines that don't break when the model formats something unexpectedly.
Three Approaches to Structured Output
Approach 1: Prompt-Based Structuring
Tell the model to return JSON in your system prompt. Simple, works with any model, but models sometimes add explanatory text anyway.
When to use: Prototyping, low-stakes use cases, models without native structured output support.
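A minimal sketch of the prompt-based approach: ask for JSON in the system prompt, then parse defensively, since the model may wrap the JSON in explanatory text anyway. The prompt wording and the `extract_json` helper here are illustrative, not from any particular SDK.

```python
import json

SYSTEM_PROMPT = (
    "You are a support-ticket classifier. Respond ONLY with a JSON object "
    'of the form {"category": string, "urgency": "low"|"medium"|"high"}.'
)

def extract_json(text: str) -> dict:
    """Pull the first JSON object out of a response that may include prose."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    return json.loads(text[start : end + 1])

# Models often prepend explanations even when told not to:
raw = 'Sure! Here is the classification:\n{"category": "billing", "urgency": "high"}'
parsed = extract_json(raw)
```

Defensive parsing like this is exactly the fragility the native approaches below remove.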
Approach 2: Native Structured Outputs
OpenAI and Anthropic both support structured output modes that constrain the model's generation to match a defined schema. In strict mode, the model cannot produce output that doesn't conform.
When to use: Production systems, any pipeline where downstream code depends on specific fields.
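As an illustration, an OpenAI-style `response_format` payload wraps a standard JSON Schema like the one below (field names follow OpenAI's structured outputs convention; other providers use different request shapes, and the schema itself is a made-up example):

```python
# JSON Schema the API enforces; in strict mode the generated tokens
# are constrained so the output always validates against it.
ticket_schema = {
    "type": "json_schema",
    "json_schema": {
        "name": "ticket_classification",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "category": {
                    "type": "string",
                    "enum": ["billing", "bug", "feature_request"],
                },
                "urgency": {"type": "string", "enum": ["low", "medium", "high"]},
                "summary": {"type": "string"},
            },
            "required": ["category", "urgency", "summary"],
            "additionalProperties": False,
        },
    },
}
```

You would pass this as the `response_format` argument of a chat completion request; strict mode typically requires every property to be listed in `required` and `additionalProperties` set to `False`.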
Approach 3: Function Calling / Tool Use
Function calling goes beyond structured output: instead of just returning data, the model can decide to call an external function with structured arguments. This is the foundation of AI agents.
When to use: Customer service bots, multi-step agents, any feature that takes real-world actions.
Designing Good Tool Schemas
Tool design is where AI PMs add real value. A poorly designed tool schema leads to models calling tools incorrectly, hallucinating arguments, or choosing the wrong tool entirely.
Name tools by what they do, not how they work
`get_user_subscription_status` not `query_billing_db`. The model uses the name to decide when to call the tool.
Write descriptions as if explaining to a smart colleague
Bad: "Gets data". Good: "Returns the current subscription tier, billing date, and payment status for a user. Call this when the user asks about their plan."
Make parameters unambiguous
If a parameter accepts multiple formats, specify the exact format. Models will choose inconsistently if you don't.
Constrain enum values whenever possible
Use `"action": "escalate" | "respond" | "close"` instead of a free-form string. Enums dramatically reduce hallucination in argument values.
Keep tools focused
One tool per discrete capability. Don't create a `process_customer_request` tool that does five things; create five tools. Model routing improves with focused tools.
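Putting the guidelines above together, here is a sketch of two OpenAI-style function schemas (the tool names, descriptions, and parameters are hypothetical examples; Anthropic's tool format is similar):

```python
# Named for what it does, with a colleague-level description.
subscription_tool = {
    "type": "function",
    "function": {
        "name": "get_user_subscription_status",
        "description": (
            "Returns the current subscription tier, billing date, and payment "
            "status for a user. Call this when the user asks about their plan."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "user_id": {
                    "type": "string",
                    "description": "Internal user ID, e.g. 'usr_123'. Never an email address.",
                },
            },
            "required": ["user_id"],
        },
    },
}

# One focused capability, with the action constrained to an enum.
triage_tool = {
    "type": "function",
    "function": {
        "name": "triage_ticket",
        "description": "Decides what happens next to an open support ticket.",
        "parameters": {
            "type": "object",
            "properties": {
                "action": {
                    "type": "string",
                    "enum": ["escalate", "respond", "close"],
                },
            },
            "required": ["action"],
        },
    },
}
```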
Common Structured Output Patterns
Extraction Pipeline
Input: unstructured text (support ticket, document, email). Output: structured fields. Use case: routing, classification, data enrichment.
Validation with Fallback
AI generates output → validate against schema → if invalid, retry with error feedback injected into the prompt.
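A bare-bones sketch of this loop, where `call_model` and `validate` are stand-ins for your SDK call and schema validator:

```python
import json

def generate_with_retry(call_model, prompt, validate, max_retries=3):
    """Call the model, validate the output, and feed errors back on failure."""
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_retries):
        raw = call_model(messages)
        try:
            data = json.loads(raw)
            validate(data)  # should raise ValueError on schema mismatch
            return data
        except (json.JSONDecodeError, ValueError) as err:
            # Inject the error so the model can self-correct on the next try.
            messages.append({"role": "assistant", "content": raw})
            messages.append({
                "role": "user",
                "content": f"Your output was invalid: {err}. Return corrected JSON only.",
            })
    raise RuntimeError(f"no valid output after {max_retries} attempts")
```

The key move is appending the failed output plus the parser's error message back into the conversation, rather than retrying the same prompt blind.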
Multi-step Tool Orchestration
Model calls tool A → gets result → calls tool B with the result from A → synthesizes a final response. This is how complex agents work.
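The orchestration pattern can be sketched as a loop: keep executing whatever tool the model requests, appending results to the conversation, until the model returns a final answer. The message shapes here are simplified stand-ins for a real provider's tool-call format.

```python
def run_agent(call_model, tools, user_message, max_steps=5):
    """Minimal orchestration loop: execute tool calls until the model answers."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_steps):
        reply = call_model(messages)  # dict: either a tool call or final text
        if reply.get("tool_call") is None:
            return reply["content"]   # final synthesized response
        name = reply["tool_call"]["name"]
        args = reply["tool_call"]["arguments"]
        result = tools[name](**args)  # execute tool A, then B, ...
        messages.append({"role": "tool", "name": name, "content": result})
    raise RuntimeError("agent exceeded max steps")
```

The `max_steps` cap matters: without it, a confused model can loop on tool calls indefinitely.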
Streaming Structured Output
Stream JSON as it's generated. Libraries like Instructor (Python) and zod-stream (TypeScript) handle partial JSON validation.
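Those libraries repair partial JSON so you get usable intermediate objects; a bare-bones version that only yields once the accumulated buffer parses looks like this (the simulated chunks are made up):

```python
import json

def stream_json(chunks):
    """Yield a parsed object as soon as the streamed JSON becomes complete."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        try:
            yield json.loads(buffer)  # succeeds only once the JSON closes
        except json.JSONDecodeError:
            continue                  # partial JSON: wait for more chunks

chunks = ['{"status": ', '"done", ', '"items": [1, 2]}']
results = list(stream_json(chunks))
```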
Apply These Concepts in the AI PM Masterclass
You'll build real agentic pipelines using structured outputs and function calling — live, with a Salesforce Sr. Director PM.
Error Handling: The Part Most Tutorials Skip
Retry with feedback
If the model returns malformed JSON, send the output back along with the parser's error message and ask for a corrected response.
Graceful degradation
Define which fields are required vs. optional. If required fields are missing, fail loudly. If optional fields are missing, use defaults and log.
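A sketch of that split, with hypothetical field names: required fields raise, optional fields get logged defaults.

```python
REQUIRED = {"category", "urgency"}
DEFAULTS = {"summary": "", "sentiment": "neutral"}  # optional fields

def normalize(data: dict) -> dict:
    """Fail loudly on missing required fields; default and log optional ones."""
    missing = REQUIRED - data.keys()
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    for field, default in DEFAULTS.items():
        if field not in data:
            print(f"warning: defaulting missing optional field {field!r}")
            data[field] = default
    return data
```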
Max retry limits
Set 2–3 retries maximum. If the model can't produce valid structured output by then, fall back to a simpler path or route to human review.
Monitor parse failure rate
Track the percentage of responses that fail schema validation. Above 2% indicates a prompt engineering problem. A sudden spike indicates a model behavior change.
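A minimal rolling monitor for that metric might look like this (window size and threshold are example values, with the 2% threshold taken from the guidance above):

```python
from collections import deque

class ParseFailureMonitor:
    """Track schema-validation failure rate over the last N responses."""

    def __init__(self, window=1000, alert_threshold=0.02):
        self.outcomes = deque(maxlen=window)
        self.alert_threshold = alert_threshold

    def record(self, parsed_ok: bool):
        self.outcomes.append(parsed_ok)

    @property
    def failure_rate(self) -> float:
        if not self.outcomes:
            return 0.0
        return self.outcomes.count(False) / len(self.outcomes)

    def should_alert(self) -> bool:
        return self.failure_rate > self.alert_threshold
```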
Function Calling for AI Agents: The PM View
Tool selection is probabilistic
The model chooses which tool to call based on the conversation and tool descriptions. If two tools are similarly described, the model may choose unpredictably. Design tools to be clearly differentiated.
Parallel tool calling
Modern models can call multiple tools simultaneously in a single turn. Design your systems to handle parallel tool results — it dramatically speeds up multi-step workflows.
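One way to handle a parallel turn is to execute all of the model's tool calls concurrently and return results keyed by call id, so each result can be matched back to the tool call that produced it. The tool-call dict shape here is a simplified stand-in.

```python
from concurrent.futures import ThreadPoolExecutor

def execute_parallel_calls(tool_calls, tools):
    """Run every tool call from a single model turn concurrently."""
    with ThreadPoolExecutor() as pool:
        futures = {
            call["id"]: pool.submit(tools[call["name"]], **call["arguments"])
            for call in tool_calls
        }
        # Keyed by call id so results can be paired with the model's
        # tool_call ids in the follow-up message.
        return {call_id: f.result() for call_id, f in futures.items()}
```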
Tool call logging is essential
Log every tool call: which tool, what arguments, what was returned, and the model's subsequent response. This is your primary debugging surface for agentic behavior.
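A sketch of one such log record, with `sink` standing in for whatever logging backend you use:

```python
import json
import time
import uuid

def log_tool_call(tool_name, arguments, result, model_response, sink):
    """Append one structured record per tool call for later debugging."""
    sink.append(json.dumps({
        "id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "tool": tool_name,
        "arguments": arguments,
        "result": result,
        "model_response": model_response,
    }))
```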
Constrain destructive actions
Tools that write, delete, or send (emails, messages, payments) should require an explicit confirmation step or human approval. Never give an agent unconfirmed destructive capabilities in production.
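A simple gate for this, with hypothetical tool names: destructive tools return a pending-confirmation record instead of executing until an explicit flag is set by a human or confirmation step.

```python
DESTRUCTIVE_TOOLS = {"send_email", "delete_record", "issue_refund"}

def execute_tool(name, arguments, tools, confirmed=False):
    """Require explicit confirmation before running destructive tools."""
    if name in DESTRUCTIVE_TOOLS and not confirmed:
        # Surface the pending action for approval instead of running it.
        return {"status": "pending_confirmation", "tool": name, "arguments": arguments}
    return {"status": "done", "result": tools[name](**arguments)}
```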
Build Real Agentic Pipelines in the AI PM Masterclass
Structured output design and function calling architecture are covered in the AI PM Masterclass. You'll build real agentic pipelines using these patterns.