30 Most Important AI Concepts Every Aspiring AI Product Manager Must Know
TL;DR
You don't need to know everything in AI to be an AI product manager — but there are 30 concepts that come up so often, in interviews and on the job, that not knowing them disqualifies you. This guide groups them into five clusters: foundations, retrieval & memory, evaluation, deployment, and safety. Each entry includes a one-line definition and the product implication that actually matters.
Cluster 1 — Foundations (Concepts 1-7)
1. Tokens
The unit of text an LLM reads and bills for. Roughly 3/4 of a word. Token cost shapes pricing, latency, and context limits.
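A back-of-envelope sketch of how token counts turn into cost. The ~4-characters-per-token heuristic and the per-1K prices below are illustrative assumptions, not any provider's real rates; production code should use the provider's actual tokenizer and price sheet:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    # Real products should count with the provider's tokenizer.
    return max(1, len(text) // 4)

def estimate_cost_usd(prompt: str, completion_tokens: int,
                      price_in_per_1k: float = 0.0005,    # hypothetical rate
                      price_out_per_1k: float = 0.0015) -> float:
    # Input and output tokens are usually billed at different rates.
    tokens_in = estimate_tokens(prompt)
    return (tokens_in / 1000) * price_in_per_1k \
         + (completion_tokens / 1000) * price_out_per_1k
```

Even this crude model is enough to sanity-check a feature's unit economics before building it.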
2. Embeddings
High-dimensional vectors that represent meaning. The core primitive behind semantic search, recommendations, and RAG.
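The standard way to compare two embeddings is cosine similarity; a minimal pure-Python version:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # 1.0 = same direction (same meaning), 0.0 = unrelated,
    # -1.0 = opposite. Semantic search ranks documents by this score.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm
```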
3. Attention
The mechanism that lets a model decide which earlier tokens matter for the current prediction. Quadratic in context length — which is why long contexts cost more.
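A quick way to feel the quadratic cost: count the attention scores a causal model computes for a context of n tokens (a simplified sketch that ignores heads and hidden dimensions):

```python
def attention_score_count(context_len: int) -> int:
    # With a causal mask, token i attends to tokens 1..i,
    # so the score matrix has n*(n+1)/2 entries — O(n^2) overall.
    return context_len * (context_len + 1) // 2
```

Doubling the context roughly quadruples the attention work, which is why long-context requests are slower and pricier.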
4. Pre-training vs. fine-tuning
Pre-training builds general capability; fine-tuning specializes it. Fine-tuning is rarely the right first move for a PM.
5. Context window
How much text a model can see at once. Larger windows enable document analysis but degrade in the middle (the "lost in the middle" effect).
6. Temperature
A sampling knob that controls output randomness. Low = deterministic, high = creative. Production AI mostly runs at low temperature.
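Temperature is just a divisor on the logits before softmax; this sketch shows how lowering it sharpens the output distribution toward the top token:

```python
import math

def softmax_with_temperature(logits: list[float],
                             temperature: float = 1.0) -> list[float]:
    # Low temperature sharpens the distribution toward the top logit
    # (more deterministic); high temperature flattens it (more random).
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```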
7. System prompt
The instruction layer that sets model behavior. Often the highest-leverage change in an AI feature.
Cluster 2 — Retrieval & Memory (Concepts 8-13)
8. RAG (Retrieval-Augmented Generation)
Inject external documents into the prompt at runtime. The default architecture for grounding AI in your data.
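A minimal sketch of the pattern, assuming the retriever has already scored candidate chunks; the prompt wording is illustrative:

```python
def build_rag_prompt(question: str,
                     scored_docs: list[tuple[float, str]],
                     top_k: int = 3) -> str:
    # Keep only the top-k chunks by retrieval score, then inject them
    # into the prompt ahead of the user's question.
    top = [doc for _, doc in sorted(scored_docs, reverse=True)[:top_k]]
    context = "\n\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(top))
    return (f"Answer using only the sources below. Cite by number.\n\n"
            f"{context}\n\nQuestion: {question}")
```

The model never sees documents that don't survive retrieval, which is why retrieval quality caps answer quality.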
9. Vector database
Stores embeddings and finds nearest neighbors fast. Pinecone, Weaviate, and pgvector are common choices.
10. Chunking
How you split documents before embedding. Bad chunking is the #1 cause of bad RAG quality.
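A minimal fixed-size chunker with overlap, character-based for simplicity (production systems usually split on sentence or heading boundaries instead):

```python
def chunk_text(text: str, chunk_size: int = 500,
               overlap: int = 50) -> list[str]:
    # Overlapping windows so a sentence split at a chunk boundary
    # still appears whole in at least one chunk.
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```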
11. Reranking
A second-pass model that reorders retrieved chunks. Often a bigger quality lift than upgrading the LLM.
12. Hybrid search
Combine keyword (BM25) and vector search. Each catches what the other misses.
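One common way to merge the keyword and vector result lists is reciprocal rank fusion (RRF); a minimal sketch:

```python
def reciprocal_rank_fusion(rankings: list[list[str]],
                           k: int = 60) -> list[str]:
    # Each document scores 1/(k + rank) in every list that contains it;
    # documents ranked well by both retrievers float to the top.
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

RRF needs only ranks, not comparable scores, which is why it works across very different retrievers.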
13. Memory systems
Long-term context management for agents and chatbots. Includes summary memory, episodic memory, and entity memory.
Cluster 3 — Evaluation (Concepts 14-19)
14. Eval set (golden set)
A curated set of inputs with known good outputs. Without one, you can't measure regressions.
15. LLM-as-judge
Use a model to score model outputs. Cheap, fast, and noisy — best paired with a small human-graded set.
16. Pass@k
The probability that at least one of k samples is correct. Useful for code, planning, and any task with multiple valid answers.
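The standard unbiased estimator, popularized by code-generation benchmarks: given n samples of which c are correct, the chance a random draw of k contains at least one correct answer is 1 − C(n−c, k)/C(n, k):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    # Probability that a random sample of k of the n attempts
    # includes at least one of the c correct ones.
    if n - c < k:
        return 1.0  # not enough failures to fill a sample of size k
    return 1.0 - comb(n - c, k) / comb(n, k)
```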
17. Hallucination
Confidently generated false content. Mitigated by RAG, citations, and refusal training — never eliminated.
18. Drift
Quality degrading over time due to model updates, prompt changes, or distribution shift. Continuous eval catches it.
19. Red teaming
Adversarial testing for failure modes: jailbreaks, prompt injection, harmful outputs. Required for any production AI.
Cluster 4 — Deployment (Concepts 20-25)
20. Inference
Running the model to produce outputs. Distinct from training. Most of an AI product's ongoing cost lives here.
21. Latency vs. throughput
Latency = single-request speed. Throughput = total requests per second. They often trade off.
22. Streaming
Send tokens to the user as they generate. The single biggest perceived-latency win in chat UIs.
23. Caching
Reuse responses for repeated inputs. Prompt caching and semantic caching cut cost dramatically.
24. Quantization
Compress model weights to lower precision. Cuts cost and latency with modest quality loss.
25. Distillation
Train a smaller model to mimic a larger one. The classic way to ship a cheap model with most of the quality.
Cluster 5 — Safety & Trust (Concepts 26-30)
26. Guardrails
Programmatic filters on inputs and outputs. Distinct from model safety training — guardrails enforce rules at runtime.
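A toy runtime guardrail as a pattern blocklist; the patterns below are illustrative placeholders, and real systems layer ML classifiers on top of rules like these:

```python
import re

# Hypothetical policy rules — real deployments maintain these per product.
BLOCKED_PATTERNS = [
    re.compile(r"(?i)ignore (all|previous) instructions"),  # injection tell
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),                   # SSN-like PII
]

def passes_guardrails(text: str) -> bool:
    # Returns False if any blocked pattern appears in the text.
    return not any(p.search(text) for p in BLOCKED_PATTERNS)
```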
27. Prompt injection
User input that hijacks model behavior. Ranked #1 in the OWASP Top 10 for LLM Applications.
28. Content filters
Classifiers that block harmful, off-topic, or policy-violating outputs. Always layered, never the only defense.
29. Provenance / citations
Show the user where the answer came from. The single highest-leverage trust intervention in RAG products.
30. Human in the loop (HITL)
Route low-confidence or high-risk outputs to humans. The right design for high-stakes AI features.
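A sketch of that routing decision, with a hypothetical confidence threshold:

```python
def route(confidence: float, risk: str,
          confidence_threshold: float = 0.8) -> str:
    # High-risk outputs always go to a human; low-confidence ones too.
    # Everything else ships automatically.
    if risk == "high" or confidence < confidence_threshold:
        return "human_review"
    return "auto"
```

The threshold is a product decision, not a technical one: it trades review cost against error tolerance.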