How to Practice User Research for AI Products Without a Team
By Institute of AI PM · 13 min read · May 2, 2026
TL;DR
AI products surface user research challenges that traditional software does not — users cannot always articulate what a good AI output looks like, trust formation happens differently with probabilistic systems, and mental models for AI features are still forming. Practicing user research for AI products does not require a UXR team, a recruitment budget, or an existing product. It requires five guerrilla methods, a synthesis framework, and the discipline to turn findings into product decisions. This guide gives you all three, plus a plan for building a research portfolio that hiring managers notice.
Why User Research Skills Matter More for AI Products
For traditional software, user research validates whether a feature solves a known problem. For AI products, user research must also uncover problems users cannot name, calibrate trust in systems users do not understand, and discover failure modes that only surface in natural usage. This makes research more important and harder to do well.
Users Cannot Evaluate AI Quality Reliably
Ask a user whether a search result is relevant and they can tell you. Ask a user whether an AI-generated summary is accurate and they often cannot — because evaluating the summary requires reading the source material, which defeats the purpose. AI product research must design evaluation tasks that reveal quality perception without relying on users to be ground-truth judges. This is a fundamentally different research challenge.
Trust Calibration Is the Core UX Challenge
The ideal state for an AI product is calibrated trust — users trust the AI when it is right and question it when it is wrong. In practice, most users either over-trust (accept everything without checking) or under-trust (check everything, eliminating the time savings). Research must measure where users fall on this spectrum and identify the design interventions that move them toward calibration. No traditional PM framework teaches this.
Failure Discovery Requires Observation, Not Surveys
You will never discover the most important AI failure modes through surveys or interviews alone. Users do not report the subtly wrong AI output — they silently correct it, or worse, accept it. The failures that matter most are the ones users do not notice. Research for AI products must include behavioral observation — watching users interact with AI outputs in real tasks and flagging the moments they should have caught an error but did not.
The 5 Guerrilla Research Methods You Can Practice Solo
You do not need participants on payroll or a usability lab. Each of these methods can be practiced with friends, family, cohort members, or even strangers — and each produces genuine insights about how users interact with AI products.
1. Think-Aloud Observation With Existing AI Products
Ask someone to use an AI product (ChatGPT, Google's AI Overviews, Grammarly, Notion AI) while narrating their thought process. You are not testing the product — you are studying how a real person forms expectations, reacts to outputs, decides whether to trust or edit, and handles errors. Record the session (with permission) and note three things: where they hesitated, where they accepted output without checking, and where they expressed surprise. Do this five times with different people and you will have more insight into AI trust calibration than most PMs get from months of analytics.
2. Wizard-of-Oz AI Prototyping
Build a fake AI experience where you manually generate the 'AI output' behind the scenes. Use a Google Form as the input, manually craft responses that simulate AI behavior (including deliberate errors), and send them via email or Slack. Then interview the participant about their experience. Did they notice the errors? How did the errors affect their trust? Would they use this feature again? This method lets you test AI product concepts before any model exists — and it teaches you to think about the AI experience from the user's perspective, not the model's.
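A small amount of pre-planning keeps a Wizard-of-Oz study honest: decide which simulated responses carry deliberate errors before any sessions run, and log which participant saw which error so the follow-up interview can probe whether they noticed. The sketch below is a minimal, hypothetical way to do that in Python; the response texts, error labels, and participant IDs are all illustrative assumptions, not part of any particular study.

```python
import random

# Hypothetical pre-planned "AI" responses for a Wizard-of-Oz study.
# Each entry is a canned output plus a flag for a deliberately seeded error.
RESPONSE_PLAN = [
    {"id": "r1", "text": "Summary of the Q3 report ...", "seeded_error": None},
    {"id": "r2", "text": "Summary with a figure that is off by 10x ...", "seeded_error": "wrong_number"},
    {"id": "r3", "text": "Summary attributing a quote to the wrong person ...", "seeded_error": "wrong_attribution"},
]

def assign_responses(participants, plan, seed=42):
    """Shuffle response order per participant and record who saw which
    seeded errors, so interview questions can reference specific errors."""
    rng = random.Random(seed)
    return {p: rng.sample(plan, len(plan)) for p in participants}

if __name__ == "__main__":
    assignments = assign_responses(["p1", "p2", "p3"], RESPONSE_PLAN)
    for participant, responses in assignments.items():
        seen = [r["seeded_error"] for r in responses if r["seeded_error"]]
        print(participant, "will see seeded errors:", seen)
```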
3. Competitive Usability Testing
Pick two competing AI products that solve the same problem (e.g., Jasper vs. Copy.ai for marketing copy, GitHub Copilot vs. Cursor for code completion). Ask a participant to complete the same task in both products, then interview them about the differences. Which output did they trust more? Why? What made one feel more reliable? How did they decide when to accept vs. edit? Competitive testing reveals user mental models more effectively than single-product testing because the contrast forces users to articulate preferences they would otherwise leave implicit.
4. Error Seeding Studies
Take a set of AI-generated outputs (summaries, recommendations, classifications) and deliberately introduce errors into some of them. Show these to a participant and ask them to identify which outputs are correct and which contain errors. This method measures something critical for AI products: error detection rate. If users cannot spot the errors you seeded, you have evidence that the AI's failure mode is invisible to users — which is a product-level problem requiring design intervention, not model improvement. Track which error types users catch and which they miss.
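The arithmetic behind the error detection rate is simple once sessions are logged. The sketch below assumes you recorded, for each participant and each seeded error, whether they caught it; the error types and records are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical session records: (participant, error_type, caught_by_user).
records = [
    ("p1", "wrong_number", True),
    ("p1", "omitted_caveat", False),
    ("p2", "wrong_number", False),
    ("p2", "omitted_caveat", False),
    ("p3", "wrong_number", True),
    ("p3", "omitted_caveat", False),
]

# Detection rate per error type: errors caught / errors shown.
shown = defaultdict(int)
caught = defaultdict(int)
for _, error_type, was_caught in records:
    shown[error_type] += 1
    caught[error_type] += int(was_caught)

for error_type in shown:
    rate = caught[error_type] / shown[error_type]
    print(f"{error_type}: {caught[error_type]}/{shown[error_type]} caught ({rate:.0%})")
```

A low rate for a specific error type (here, every participant missing the omitted caveat) is the evidence that the failure mode is invisible to users.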
5. Contextual Inquiry via Screen Share
Ask a participant to share their screen while doing their actual work (not a synthetic task) and incorporate an AI tool into their workflow. Watch without guiding. Where do they use the AI? Where do they avoid it? When they use it, what do they do with the output — accept, edit, or discard? Remote screen sharing makes this trivially easy to execute. Two 30-minute contextual inquiry sessions will teach you more about real AI usage patterns than any amount of product analytics, because you see the context surrounding every decision.
How to Synthesize Research Findings Into Product Decisions
Raw research data is not useful until you synthesize it into product decisions. The gap between "users struggle with X" and "therefore we should build Y" is where most junior PMs get stuck. This synthesis framework bridges that gap.
Pattern Identification
After 3–5 research sessions, cluster your observations into patterns. Do not force categories — let them emerge from the data. You are looking for behaviors that repeat across participants, not unique incidents. A pattern seen in 3 of 5 participants is a signal. A behavior seen in 1 of 5 is an anecdote. The discipline is resisting the urge to design solutions for anecdotes. Common AI-specific patterns: trust collapse after a single error, over-reliance during time pressure, and avoidance of AI features in high-stakes decisions.
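If each observation is logged as one row with the participant and a short pattern label, the signal-versus-anecdote check becomes a frequency count. A minimal sketch with hypothetical labels and the 3-of-5 threshold described above:

```python
from collections import defaultdict

# Hypothetical observation log: (participant, pattern_label).
observations = [
    ("p1", "trust_collapse_after_error"),
    ("p2", "trust_collapse_after_error"),
    ("p4", "trust_collapse_after_error"),
    ("p3", "avoids_ai_for_high_stakes"),
    ("p5", "over_reliance_under_time_pressure"),
]

TOTAL_PARTICIPANTS = 5
SIGNAL_THRESHOLD = 3  # a pattern must appear in at least 3 of 5 participants

participants_per_pattern = defaultdict(set)
for participant, pattern in observations:
    participants_per_pattern[pattern].add(participant)

for pattern, people in sorted(participants_per_pattern.items()):
    label = "signal" if len(people) >= SIGNAL_THRESHOLD else "anecdote"
    print(f"{pattern}: {len(people)}/{TOTAL_PARTICIPANTS} participants -> {label}")
```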
Insight Framing
Convert each pattern into an insight using this format: 'Users [behavior] because [underlying motivation], which means [product implication].' Example: 'Users accept AI-generated email drafts without editing because they perceive editing as negating the time savings, which means the product must build quality checks into the acceptance flow rather than relying on users to review.' The insight must contain a causal explanation and a product direction. Without both, it is just an observation.
Prioritization by Impact
Not every insight leads to an action. Prioritize insights by asking two questions: (1) How many users does this affect? (2) What is the severity of the impact — inconvenience, task failure, or trust loss? Trust loss is the most severe because it affects all future interactions, not just the current one. An insight about trust calibration almost always outprioritizes an insight about efficiency, because a user who does not trust the AI will never use it long enough to benefit from efficiency improvements.
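One way to make the two prioritization questions explicit is a rough reach-times-severity score, with trust loss weighted well above task failure and inconvenience. This is an illustrative heuristic under assumed weights, not a formula from the framework itself.

```python
# Illustrative severity weights: trust loss compounds across future sessions,
# so it is weighted far above one-off task failure or inconvenience.
SEVERITY_WEIGHT = {"inconvenience": 1, "task_failure": 3, "trust_loss": 8}

def priority_score(users_affected_pct, severity):
    """Rough priority = share of users affected x severity weight."""
    return users_affected_pct * SEVERITY_WEIGHT[severity]

insights = [
    ("Summaries accepted without review", 0.6, "trust_loss"),
    ("Editing flow adds extra clicks", 0.9, "inconvenience"),
    ("Export fails on long documents", 0.2, "task_failure"),
]

for name, reach, severity in sorted(insights, key=lambda i: -priority_score(i[1], i[2])):
    score = priority_score(reach, severity)
    print(f"{score:>5.2f}  {name} ({severity}, {reach:.0%} of users)")
```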
From Insight to Hypothesis — The Bridge Most PMs Skip
Every product decision based on research should be framed as a testable hypothesis: 'We believe that [design change] will [improve specific metric] because our research showed [specific finding].' Example: 'We believe that showing a confidence indicator on AI-generated summaries will reduce unverified acceptance by 30% because our think-aloud studies showed that users default to accepting summaries when they have no signal about output reliability.' This discipline prevents you from building features based on vibes and gives you a clear success criterion to evaluate against.
Communicating Research to Stakeholders
Engineers want specifics: 'Users attempted to edit AI output 4 out of 5 times, spending an average of 90 seconds per edit.' Designers want user quotes and behavioral patterns: 'Three of five users said they did not trust the output but could not explain why — they described a feeling of unease.' Executives want business implications: 'Users who encounter an AI error in their first session are 60% less likely to use the feature again, which directly impacts our AI-assisted conversion funnel.' Synthesize once, then translate for each audience. Same data, different framing.
Building a Research Portfolio That Impresses Hiring Managers
Most AI PM candidates have zero research artifacts in their portfolio. The ones who do stand out immediately — not because the research is perfect, but because it demonstrates a user-centered approach that hiring managers actively screen for. Here is how to turn your practice research into portfolio pieces.
The Research Brief
For each research exercise, write a one-page brief that covers: the research question, the method you chose and why, participant profiles (even if it was 3 friends), key findings with supporting evidence, and the product recommendation that follows. Format it cleanly — a Google Doc with clear headings is fine. The brief proves you can run a structured research process, not just talk about one in an interview.
The Insight Deck
After running 3–5 research sessions on the same AI product, create a 10-slide synthesis deck. Slide structure: research objective, methodology overview, 3–4 key insights with user quotes, prioritized recommendations, and proposed next steps. This is the artifact that a hiring manager can review in 5 minutes and immediately understand your research thinking. It mimics the deliverable you would produce in a real PM role.
The Case Study Write-Up
Combine your research, synthesis, and recommendations into a 2–3 page case study that tells the full story: 'I noticed users struggling with X in this AI product. I ran research to understand why. Here is what I found. Here is what I would build.' This is the most powerful portfolio piece because it demonstrates the end-to-end PM process — from observation to insight to action. Pair it with a PRD for the feature you recommended and you have a complete portfolio entry.
Research Practice Readiness Checklist
Before you run your first guerrilla research session, confirm these conditions are in place. Skipping any of them turns a learning exercise into wasted time.
- I have a specific research question written down — not just 'learn about users' but 'understand how users decide whether to trust AI-generated recommendations'
- I have chosen one of the five guerrilla methods and can explain why it is the right method for my research question
- I have recruited at least 3 participants who are not product managers — real users, not people who think like PMs
- I have a recording setup ready (screen recording for remote, phone audio for in-person) and have confirmed consent
- I have prepared a discussion guide with 5–7 open-ended questions that do not lead the participant toward a specific answer
- I have a note-taking template ready with columns for observations, quotes, and my interpretations — keeping them separate prevents bias (see the sketch after this checklist)
- I have blocked time within 24 hours of each session for synthesis — raw notes lose context faster than you expect
- I have identified the portfolio artifact I will create from this research — brief, deck, or case study — before I start
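For the note-taking template mentioned in the checklist, something as simple as a CSV with separate columns for observations, verbatim quotes, and your interpretations is enough. The layout and example row below are one hypothetical version, not a prescribed format.

```python
import csv

# One hypothetical layout: observations, verbatim quotes, and interpretations
# live in separate columns so facts and your readings of them stay separate.
FIELDS = ["timestamp", "observation", "quote", "interpretation"]

with open("session_notes_template.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerow({
        "timestamp": "00:04:30",
        "observation": "Accepted AI summary without opening the source doc",
        "quote": "Looks right to me, I'll just send it",
        "interpretation": "Possible over-trust; no reliability signal in the UI",
    })
```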
Develop real research skills in a structured program
IAIPM's cohort program includes hands-on research sprints, synthesis workshops, and mentorship from AI PMs who run user research in production environments every week.
Explore the Program