The Feynman Technique for AI Product Managers: Learn by Teaching

By Institute of AI PM · 11 min read · May 10, 2026

TL;DR

Reading the transformer paper is not learning. Explaining it to a smart 12-year-old, in writing, without jargon — that's learning. The Feynman Technique is four steps: pick a concept, teach it simply, find your gaps, refine. We apply it below to transformers, RAG, and RLHF, and give you a checklist for any new AI topic.

The Four Steps

Richard Feynman, Caltech physicist and Nobel laureate, claimed he could explain any concept to a freshman — and if he couldn't, he didn't actually understand it. The technique is the operationalized version of that test. It works because forced simplicity exposes vague reasoning that survives passive reading.

Step 1: Pick One Concept

Single, narrow, named. 'Attention' is good. 'How LLMs work' is not. Write it at the top of a blank page.

Step 2: Teach It in Plain Language

Write an explanation as if to a smart 12-year-old. No jargon unless you also define it. Use analogies. Aim for 200–500 words. Set a 15-minute timer.

Step 3: Mark Your Gaps

Highlight every sentence where you waved your hands, skipped a step, or used a term you couldn't define. Those are your real gaps. Read or ask someone to fill exactly those — not the whole topic again.

Step 4: Simplify and Repeat

Rewrite the explanation. Shorter, sharper, fewer caveats. If you can do it in three short paragraphs and still be honest, you own the concept. If not, return to step 3.

Worked Example 1: Transformers

Here's a Feynman-style explanation that passes the smart-12-year-old test. Notice: no math, one analogy, every term defined the moment it appears.

A transformer is a kind of machine learning model that reads text and predicts what comes next. Imagine reading a sentence: when you see "it," you instinctively look back to figure out what "it" refers to. The transformer does this for every word, all at once, in parallel.

It does this with a mechanism called attention. For each word, the model asks: which other words in this sentence matter most to me right now? It assigns a weight to every other word — high for words that matter, low for words that don't — and uses those weighted neighbors to build a richer understanding of the current word.

Stack this attention operation many times — dozens of layers — and the model can capture meaning at multiple levels: word, phrase, sentence, paragraph. Train it on a few trillion words from the internet and you get something that can write essays, answer questions, and code. That's an LLM.

Common gap when you try this yourself: the words "assigns a weight" hide a lot of math. That's fine, but only if you can also explain the query, key, and value vectors (Q, K, V) if pressed. If you can't, that's your next study target.
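
If you ever need to go that one level deeper, here is a minimal numpy sketch of what "assigns a weight" means. The shapes and random inputs are toy assumptions (real models learn the Wq, Wk, Wv projection matrices and use far larger dimensions); the point is that Q, K, and V are just three different projections of the same words, and the weights are a softmax over query-key matches.

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        # Each row of Q, K, V is one word's query, key, or value vector.
        d_k = Q.shape[-1]
        # Score: how strongly does each word's query match every word's key?
        scores = Q @ K.T / np.sqrt(d_k)
        # Softmax turns each row of scores into weights that sum to 1.
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights = weights / weights.sum(axis=-1, keepdims=True)
        # Each word's new representation is a weighted mix of all values.
        return weights @ V, weights

    # Toy example: a 4-word "sentence", 8-dimensional vectors (arbitrary sizes).
    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 8))                       # 4 word embeddings
    Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
    out, attn = scaled_dot_product_attention(x @ Wq, x @ Wk, x @ Wv)
    print(attn.round(2))  # row i: the weight word i assigns to every word

Each row of attn is exactly the "weight to every other word" from the explanation above.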

Worked Example 2: RAG

RAG is the most common architecture you'll encounter in enterprise AI products. PMs who can't explain it cleanly lose credibility fast.

An LLM by itself only knows what was in its training data, which is frozen at some point in the past. If you ask it about your company's policy doc, it has no idea — that data wasn't in training.

RAG, or retrieval-augmented generation, fixes this in two steps. First, it searches your documents for the chunks most relevant to the user's question. Then it pastes those chunks into the prompt, alongside the question, and asks the LLM to answer using them.

The search step usually uses embeddings — vectors that capture meaning — so "parental leave" and "maternity benefits" can match even though the words differ. RAG is what lets a chatbot accurately answer "how many vacation days do I get?" for your specific company without retraining the model.
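
To make the two steps concrete, here is a toy Python sketch of the retrieve-then-generate loop. The bag-of-words "embeddings" and the call_llm stub are deliberate simplifications assumed for illustration (a real system would use an embedding model and an LLM API); the structure, retrieve the top chunks and paste them into the prompt, is the part to internalize.

    import numpy as np

    DOCS = [
        "Full-time employees get 20 vacation days per year.",
        "Parental leave is 16 weeks, paid, for all new parents.",
        "Expense reports are due by the 5th of each month.",
    ]

    def embed(text):
        # Toy stand-in for a real embedding model: bag-of-words vector.
        vocab = sorted({w for d in DOCS for w in d.lower().split()})
        v = np.array([text.lower().split().count(w) for w in vocab], float)
        return v / (np.linalg.norm(v) + 1e-9)

    def call_llm(prompt):
        # Stub for a real model API call; just returns the prompt to inspect.
        return prompt

    def answer(question, k=2):
        # Step 1: rank chunks by similarity to the question, keep the top k.
        q = embed(question)
        ranked = sorted(DOCS, key=lambda d: float(q @ embed(d)), reverse=True)
        # Step 2: paste the top chunks into the prompt next to the question.
        context = "\n".join(ranked[:k])
        prompt = f"Answer using only this context:\n{context}\n\nQ: {question}"
        return call_llm(prompt)

    print(answer("How many vacation days do I get?"))

One honest limitation of the sketch: bag-of-words matching can't connect "parental leave" to "maternity benefits". That semantic matching is exactly what real embeddings add.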

Worked Example 3: RLHF

RLHF is where most PMs hand-wave. The phrase "trained with human feedback" covers a three-step process most people skip.

An LLM that only learned to predict the next word doesn't know what makes an answer "good" — just what's likely. RLHF is how we teach it preferences.

Step one: humans look at pairs of model outputs for the same question and pick the better one. Do this thousands of times. Step two: a separate model, the reward model, learns to assign any single answer a score, trained so that the answers humans preferred score higher. Now you have a scoring function.

Step three: take the original LLM and nudge its weights so it produces answers the reward model scores higher. Repeat. The result is a model that doesn't just predict text — it predicts text humans like. That's why ChatGPT feels different from a raw GPT-3 base model.
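
Here is a toy numpy sketch of step two, the part most people skip. The five-dimensional "answer features" and the linear scorer are illustrative assumptions (a real reward model is a full transformer reading text), but the training signal, push the preferred answer's score above the rejected one's, is the actual mechanism.

    import numpy as np

    # Toy setup (assumption): each answer is a 5-dim feature vector, and the
    # reward model is one linear layer w. Real reward models are transformers.
    rng = np.random.default_rng(0)
    chosen = rng.normal(loc=1.0, size=(100, 5))    # human-preferred answers
    rejected = rng.normal(loc=0.0, size=(100, 5))  # dispreferred answers
    w = np.zeros(5)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    for _ in range(200):
        # Score both answers in each preference pair...
        margin = (chosen - rejected) @ w  # r_chosen - r_rejected
        # ...and minimize the pairwise loss -log sigmoid(margin), which rises
        # whenever the rejected answer outscores the chosen one.
        p = sigmoid(margin)
        grad = -((1 - p)[:, None] * (chosen - rejected)).mean(axis=0)
        w -= 0.5 * grad

    print("preferred answer scores higher:", (chosen @ w > rejected @ w).mean())

Step three then plugs this scorer into a reinforcement-learning loop (PPO, in the original RLHF work) to nudge the LLM's own weights.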

Watch for the gap: most explanations stop at "humans rate outputs." The reward model is the part most PMs miss. If you skipped it in your draft, that's your study target, and it's also why you need DPO (direct preference optimization, which trains on the preference pairs directly and skips the separate reward model) and constitutional AI in your vocabulary.

A Checklist For Any New Topic

When a new concept hits the field — and one will, every month — run this checklist. If you can't check every box, you don't own it yet.

I can explain it without jargon

Or, if I need a term, I can define it inline. No 'as you know' hand-waves.

I can name the failure modes

If you can't say when the technique fails, you only understand the marketing version.

I can compare it to one alternative

RAG vs fine-tuning. RLHF vs DPO. Decoder-only vs encoder-decoder. The contrast is where understanding lives.

I can describe the cost or latency impact

PMs are paid to think about tradeoffs. If a technique has no cost in your explanation, you're missing the engineering reality.

I can give a real product example

Pick a shipped product that uses it. Concrete grounding kills hand-waving instantly.

A peer can repeat it back

The final test. Have a non-AI friend read your explanation and paraphrase. If they can't, simplify and retry.

Stop Faking Understanding. Start Owning It.

The AI PM Masterclass uses Feynman drills as the primary check on technical fluency. You leave able to explain — not just nod.