TECHNICAL DEEP DIVE

Structured Outputs & Function Calling: How to Make AI Do What You Actually Need

By Institute of AI PM · 13 min read · Mar 22, 2026

TL;DR

Unstructured AI text is hard to use in real products. Structured outputs and function calling are the techniques that turn LLM responses into reliable, parseable data your application can actually act on. This guide explains how they work, when to use each, and how to design reliable AI pipelines that don't break when the model formats something unexpectedly.

Three Approaches to Structured Output

Approach 1: Prompt-Based Structuring

Tell the model to return JSON in your system prompt. Simple, works with any model, but models sometimes add explanatory text anyway.

Pros: Simple, universal. Cons: Not guaranteed; requires robust error handling.

When to use: Prototyping, low-stakes use cases, models without native structured output support.
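Because prompt-based structuring isn't guaranteed, your parsing code has to tolerate the model wrapping JSON in a markdown fence or surrounding it with explanatory text. Here is a minimal defensive parser sketch (the function name and fallback strategy are illustrative, not from any particular library):

```python
import json
import re

def parse_json_response(raw: str) -> dict:
    """Extract a JSON object from a model response that may include
    explanatory text or a markdown code fence around the JSON."""
    # Prefer the contents of a ```json fence if one is present.
    fence = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", raw, re.DOTALL)
    candidate = fence.group(1) if fence else raw
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        # Fall back to the outermost {...} span in the text.
        start, end = candidate.find("{"), candidate.rfind("}")
        if start != -1 and end > start:
            return json.loads(candidate[start:end + 1])
        raise
```

Even with a parser like this, treat prompt-based structuring as best-effort: log every parse failure so you know when to graduate to a native structured output mode.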

Approach 2: Native Structured Outputs

OpenAI and Anthropic both support structured output modes that constrain the model's generation to match a defined schema. The model literally cannot produce output that doesn't conform.

Pros: Eliminates parsing failures, simpler error handling. Cons: Slight latency overhead; requires schema upfront.

When to use: Production systems, any pipeline where downstream code depends on specific fields.
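To make this concrete, here is what a structured-output request body can look like, sketched in the shape of OpenAI's Chat Completions `response_format` with a strict JSON Schema (the model name is a placeholder, and exact field names vary by provider and API version, so verify against current docs before shipping):

```python
# JSON Schema the model's output must conform to. With strict mode,
# generation is constrained so nonconforming output cannot be produced.
ticket_schema = {
    "type": "object",
    "properties": {
        "category": {"type": "string", "enum": ["billing", "bug", "how_to"]},
        "urgency": {"type": "integer", "minimum": 1, "maximum": 5},
        "summary": {"type": "string"},
    },
    "required": ["category", "urgency", "summary"],
    "additionalProperties": False,
}

request_body = {
    "model": "gpt-4o",  # placeholder model name
    "messages": [
        {"role": "user", "content": "My invoice is wrong, please fix it today!"}
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {"name": "ticket", "strict": True, "schema": ticket_schema},
    },
}
```

The schema doubles as documentation for downstream engineers: every field your code reads is declared, typed, and required up front.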

Approach 3: Function Calling / Tool Use

Goes beyond structured output — lets the model decide to call an external function with structured arguments. This is the foundation of AI agents.

Pros: Enables agentic workflows, real-world actions. Cons: More complex; requires careful safety design.

When to use: Customer service bots, multi-step agents, any feature that takes real-world actions.
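The moving parts are a tool definition the model sees and a dispatcher your application runs. The sketch below follows the OpenAI-style tool wrapper (other providers use slightly different keys); the handler is a stub standing in for a real lookup:

```python
import json

# Tool definition sent to the model. The name and description are what
# the model uses to decide when to call it.
tools = [{
    "type": "function",
    "function": {
        "name": "get_user_subscription_status",
        "description": (
            "Returns the current subscription tier, billing date, and "
            "payment status for a user. Call this when the user asks "
            "about their plan."
        ),
        "parameters": {
            "type": "object",
            "properties": {"user_id": {"type": "string"}},
            "required": ["user_id"],
        },
    },
}]

# Stubbed handler; in production this would hit your billing service.
def get_user_subscription_status(user_id: str) -> dict:
    return {"user_id": user_id, "tier": "pro", "status": "active"}

def dispatch(tool_name: str, arguments_json: str) -> dict:
    """Route a model tool call (name + JSON-encoded args) to real code."""
    handlers = {"get_user_subscription_status": get_user_subscription_status}
    return handlers[tool_name](**json.loads(arguments_json))
```

Note the split in responsibility: the model only proposes a call; your dispatcher decides whether and how to execute it, which is where safety checks belong.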

Designing Good Tool Schemas

Tool design is where AI PMs add real value. A poorly designed tool schema leads to models calling tools incorrectly, hallucinating arguments, or choosing the wrong tool entirely.

1. Name tools by what they do, not how they work

`get_user_subscription_status` not `query_billing_db`. The model uses the name to decide when to call the tool.

2. Write descriptions as if explaining to a smart colleague

Bad: "Gets data". Good: "Returns the current subscription tier, billing date, and payment status for a user. Call this when the user asks about their plan."

3. Make parameters unambiguous

If a parameter accepts multiple formats, specify the exact format. Models will choose inconsistently if you don't.

4. Constrain enum values whenever possible

Use `"action": "escalate" | "respond" | "close"` instead of a free-form string. Enums dramatically reduce hallucinated argument values.

5. Keep tools focused

One tool per discrete capability. Don't create `process_customer_request` that does five things — create five tools. Model routing improves with focused tools.
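Enum and required-field constraints also pay off at runtime: you can check the model's arguments against the schema before executing anything. A small validator sketch (hand-rolled here for illustration; in production you would more likely use a library such as `jsonschema` or Pydantic):

```python
def validate_args(schema: dict, args: dict) -> list[str]:
    """Check tool-call arguments against a JSON-schema fragment,
    returning a list of human-readable errors (empty when valid)."""
    errors = []
    for name in schema.get("required", []):
        if name not in args:
            errors.append(f"missing required argument: {name}")
    for name, value in args.items():
        spec = schema["properties"].get(name)
        if spec is None:
            errors.append(f"unexpected argument: {name}")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name} must be one of {spec['enum']}, got {value!r}")
    return errors

# Hypothetical triage tool using the enum from the tip above.
triage_schema = {
    "type": "object",
    "properties": {"action": {"enum": ["escalate", "respond", "close"]}},
    "required": ["action"],
}
```

The error strings are deliberately human-readable so they can be fed straight back to the model as retry feedback.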

Common Structured Output Patterns

Extraction Pipeline

Input: unstructured text (support ticket, document, email). Output: structured fields. Use case: routing, classification, data enrichment.

Validation with Fallback

AI generates output → validate against schema → if invalid, retry with error feedback injected into the prompt.
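This pattern can be sketched as a small retry loop. `call_model` and `validate` are placeholders you supply: `call_model(feedback)` performs the LLM call (with any prior error feedback appended to the prompt) and `validate(data)` returns a list of error strings, empty when the data is valid:

```python
import json

MAX_RETRIES = 3

def generate_with_retry(call_model, validate, max_retries: int = MAX_RETRIES):
    """Generate -> parse -> validate, re-prompting with the specific
    error on each failure, up to max_retries attempts."""
    feedback = None
    for _ in range(max_retries):
        raw = call_model(feedback)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            feedback = f"Invalid JSON: {exc}. Return only valid JSON."
            continue
        errors = validate(data)
        if not errors:
            return data
        feedback = "Fix these problems and return only JSON: " + "; ".join(errors)
    raise ValueError("model failed validation after retries; route to fallback")
```

The key design choice is that feedback is specific: telling the model exactly which field failed converges far faster than simply asking it to try again.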

Multi-step Tool Orchestration

Model calls tool A → gets result → calls tool B with the result from A → synthesizes a final response. This is how complex agents work.
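Stripped of any particular SDK, the orchestration is just a loop: ask the model what to do next, execute the tool it names, feed the result back, and stop when it produces a final answer. A minimal sketch, with an invented message format standing in for a real provider API:

```python
def agent_loop(model, tools: dict, max_steps: int = 5):
    """Run model/tool turns until the model returns a final answer.

    model(messages) must return either
      {"type": "tool", "name": ..., "args": {...}} or
      {"type": "final", "content": ...}  (a simplified stand-in for
    real provider response objects).
    """
    messages = []
    for _ in range(max_steps):
        reply = model(messages)
        if reply["type"] == "final":
            return reply["content"]
        # Tool call: execute it and append the result for the next turn.
        result = tools[reply["name"]](**reply["args"])
        messages.append({"tool": reply["name"], "result": result})
    raise RuntimeError("agent exceeded max steps")
```

The `max_steps` cap matters in production: without it, a confused model can loop on tool calls indefinitely and run up latency and cost.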

Streaming Structured Output

Stream JSON as it's generated. Libraries like Instructor (Python) and zod-stream (TypeScript) handle partial JSON validation.


Error Handling: The Part Most Tutorials Skip

Retry with feedback

If the model returns malformed JSON or fails schema validation, send the output back with the specific error message and ask it to correct it. Specific feedback converges much faster than a bare "try again."

Graceful degradation

Define which fields are required vs. optional. If required fields are missing, fail loudly. If optional fields are missing, use defaults and log.

Max retry limits

Set 2–3 retries maximum. If the model can't produce valid structured output after 3 tries, fall back or route to human review.

Monitor parse failure rate

Track the percentage of responses that fail schema validation. Above 2% indicates a prompt engineering problem. A sudden spike indicates a model behavior change.

Function Calling for AI Agents: The PM View

Tool selection is probabilistic

The model chooses which tool to call based on the conversation and tool descriptions. If two tools are similarly described, the model may choose unpredictably. Design tools to be clearly differentiated.

Parallel tool calling

Modern models can call multiple tools simultaneously in a single turn. Design your systems to handle parallel tool results — it dramatically speeds up multi-step workflows.
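Handling a parallel batch means executing each call, keeping results keyed by call id so they can be returned to the model in the right slots. A small sketch using a thread pool (the call/handler shapes are illustrative, not a specific provider's format):

```python
from concurrent.futures import ThreadPoolExecutor

def run_tool_calls(calls: list[dict], handlers: dict) -> dict:
    """Execute a batch of tool calls concurrently; return results keyed
    by call id, ready to attach as tool-result messages."""
    def run_one(call):
        return call["id"], handlers[call["name"]](**call["args"])
    with ThreadPoolExecutor() as pool:
        return dict(pool.map(run_one, calls))
```

Threads are a reasonable default here because tool calls are usually I/O-bound (database queries, HTTP requests); for async codebases the same shape works with `asyncio.gather`.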

Tool call logging is essential

Log every tool call: which tool, what arguments, what was returned, and the model's subsequent response. This is your primary debugging surface for agentic behavior.

Constrain destructive actions

Tools that write, delete, or send (emails, messages, payments) should require an explicit confirmation step or human approval. Never give an agent unconfirmed destructive capabilities in production.

Build Real Agentic Pipelines in the AI PM Masterclass

Structured output design and function calling architecture are covered in the AI PM Masterclass. You'll build real agentic pipelines using these patterns.