Bayesian Statistics
The Doctor’s Dilemma (Revisited)
Remember from our probability module: A test for a rare disease (1 in 1000 people) comes back positive. The test is 99% accurate. What’s the probability you actually have the disease? Most people say “99%.” The real answer: About 9%. This counterintuitive result is the essence of Bayesian thinking. Let’s understand it deeply.
Estimated Time: 4-5 hours
Difficulty: Intermediate
Prerequisites: Probability and Distributions modules
What You’ll Build: Spam filter, A/B test analyzer, and diagnostic tool
Bayes’ Theorem: The Core Formula
P(A|B) = P(B|A) × P(A) / P(B)
In plain English:
- P(A|B): Probability of A given we observed B (posterior)
- P(B|A): Probability of observing B if A is true (likelihood)
- P(A): Prior probability of A before seeing evidence
- P(B): Total probability of observing B
The Medical Test Example
Let’s define:
- A: You have the disease
- B: Test is positive
- P(disease) = 1/1000 = 0.001 (prior)
- P(positive|disease) = 0.99 (sensitivity)
- P(positive|no disease) = 0.01 (false positive rate)
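Plugging these numbers into Bayes’ theorem gives the roughly 9% figure quoted at the start. The short sketch below spells out the arithmetic; the variable names are illustrative, and the only inputs are the three values defined above.

```python
# Values taken from the definitions above.
p_disease = 0.001            # prior: P(disease)
p_pos_given_disease = 0.99   # sensitivity: P(positive | disease)
p_pos_given_healthy = 0.01   # false positive rate: P(positive | no disease)

# Law of total probability: a positive test can come from a sick
# person (true positive) or a healthy person (false positive).
p_positive = (p_pos_given_disease * p_disease
              + p_pos_given_healthy * (1 - p_disease))

# Bayes' theorem: P(disease | positive).
p_disease_given_pos = p_pos_given_disease * p_disease / p_positive

print(f"P(positive)           = {p_positive:.5f}")
print(f"P(disease | positive) = {p_disease_given_pos:.3f}")
```

The denominator P(positive) is dominated by the false-positive term, which is why the posterior stays near 0.09 despite the 99% sensitivity.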
Why Is It So Low?
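Counting people instead of multiplying probabilities usually makes the result easier to accept. The sketch below assumes a hypothetical population of 100,000 (the population size is chosen only for round numbers and is not part of the original example) and applies the same rates as above.

```python
population = 100_000  # hypothetical population size, chosen for round numbers

sick = population * 0.001         # about 100 people have the disease
healthy = population - sick       # about 99,900 do not

true_positives = sick * 0.99      # sick people who test positive
false_positives = healthy * 0.01  # healthy people who test positive

# Among everyone who tests positive, what share is actually sick?
share_sick = true_positives / (true_positives + false_positives)

print(f"True positives:  {true_positives:.0f}")
print(f"False positives: {false_positives:.0f}")
print(f"Share of positives who are sick: {share_sick:.1%}")
```

Because the disease is rare, the healthy group is about a thousand times larger than the sick group, so even a 1% false positive rate produces roughly ten false positives for every true positive.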
Frequentist vs Bayesian: Two Philosophies
The Coin Flip Example
You flip a coin 10 times and get 7 heads. What’s the probability of heads?
Frequentist Answer:
- The “true” probability is a fixed (unknown) number
- Our best estimate is 7/10 = 0.70
- 95% confidence interval: roughly 0.35 to 0.93 (wide because the sample is small)
Bayesian Answer:
- We start with a prior belief (maybe 50-50)
- We update based on evidence
- We get a posterior distribution over possible values
- Strong priors resist change (you need lots of data to override them)
- Weak priors let the data speak
- With enough data, posteriors from different priors converge to the same answer (as long as the prior does not rule out the true value)
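The points about priors can be illustrated with a Beta prior on P(heads), which updates in closed form when the data are coin flips. This is a minimal sketch: the two priors, Beta(1, 1) and Beta(50, 50), and the larger 7000/3000 dataset are assumptions picked for illustration, not values from the example above.

```python
def update_beta(prior_a, prior_b, heads, tails):
    """Conjugate update: Beta(a, b) prior + coin flips -> Beta(a + heads, b + tails)."""
    return prior_a + heads, prior_b + tails


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical priors: one weak, one strongly centred on a fair coin.
priors = {
    "weak   Beta(1, 1)":   (1, 1),
    "strong Beta(50, 50)": (50, 50),
}

# The 10-flip example, then the same 70% heads rate with far more data.
for heads, tails in [(7, 3), (7000, 3000)]:
    print(f"{heads} heads, {tails} tails")
    for name, (a, b) in priors.items():
        post_a, post_b = update_beta(a, b, heads, tails)
        print(f"  {name}: posterior mean = {beta_mean(post_a, post_b):.3f}")
```

With 10 flips the strong prior keeps the posterior mean close to 0.5 while the weak prior moves most of the way to 0.7; with 10,000 flips both land near 0.7.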
Building a Bayesian Spam Filter
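A minimal sketch of a naive Bayes spam filter is shown below. It is an illustration under stated assumptions rather than a production design: the class name `NaiveBayesSpamFilter`, whitespace tokenisation, add-one smoothing, and the four training messages are all hypothetical choices, and the code assumes at least one training message per class.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Toy naive Bayes classifier over whitespace-separated words."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.message_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def spam_probability(self, text):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Log prior: P(label), estimated from training message counts.
            log_score = math.log(self.message_counts[label] / total_messages)
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                if word not in vocab:
                    continue  # words never seen in training carry no evidence
                # Log likelihood P(word | label) with add-one smoothing,
                # so a word seen in only one class never gives probability 0.
                count = self.word_counts[label][word] + 1
                log_score += math.log(count / (total_words + len(vocab)))
            log_scores[label] = log_score
        # Normalise the two scores to get the posterior P(spam | words).
        max_log = max(log_scores.values())
        spam = math.exp(log_scores["spam"] - max_log)
        ham = math.exp(log_scores["ham"] - max_log)
        return spam / (spam + ham)


# Hypothetical training data, far too small for real use.
spam_filter = NaiveBayesSpamFilter()
spam_filter.train("win free money now", "spam")
spam_filter.train("free prize claim now", "spam")
spam_filter.train("meeting agenda for monday", "ham")
spam_filter.train("lunch on monday", "ham")

print(f"'free money':     {spam_filter.spam_probability('free money'):.2f}")
print(f"'monday meeting': {spam_filter.spam_probability('monday meeting'):.2f}")
```

Working in log space avoids numerical underflow when a message contains many words, and the "naive" part is the assumption that words are independent given the label.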
Naive Bayes scoring, which combines per-word spam probabilities through Bayes’ theorem, is how the early statistical spam filters worked.
Bayesian A/B Testing
The Problem with Frequentist A/B Tests
Traditional A/B tests give you a yes/no answer: “statistically significant” or not. Bayesian A/B testing gives you richer information:
- Probability that B is better than A
- Expected improvement
- Distribution of possible outcomes
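All three quantities can be estimated by sampling from the posterior of each variant’s conversion rate. The sketch below is one way to do it, assuming a uniform Beta(1, 1) prior for both variants; the function name `ab_test` and the conversion counts are hypothetical.

```python
import random


def ab_test(conv_a, n_a, conv_b, n_b, draws=20_000, seed=0):
    """Monte Carlo comparison of two conversion rates with Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins_b = 0
    lifts = []
    for _ in range(draws):
        # Posterior for each rate: Beta(1 + conversions, 1 + non-conversions).
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins_b += rate_b > rate_a
        lifts.append(rate_b - rate_a)
    lifts.sort()
    return {
        "prob_b_better": wins_b / draws,        # P(B's rate > A's rate)
        "expected_lift": sum(lifts) / draws,    # mean of (rate_b - rate_a)
        "lift_95_interval": (lifts[int(0.025 * draws)],
                             lifts[int(0.975 * draws)]),
    }


# Hypothetical data: 100/1000 conversions for A, 120/1000 for B.
result = ab_test(conv_a=100, n_a=1000, conv_b=120, n_b=1000)
for key, value in result.items():
    print(key, value)
```

The sorted list of sampled lifts is itself the distribution of possible outcomes, so any other summary (for example, the probability that B is worse by more than one percentage point) can be read from it directly.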
Bayesian Linear Regression
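As a concrete reference point for this section, here is a minimal sketch of the simplest case: a single slope with no intercept, a known noise standard deviation, and a Normal prior on the slope. Those simplifications, the function name `posterior_slope`, and the data are assumptions made for illustration; real models usually also infer the intercept and the noise level.

```python
import math


def posterior_slope(xs, ys, noise_sd=1.0, prior_mean=0.0, prior_sd=10.0):
    """Posterior over w in y = w * x + noise.

    Assumes noise ~ Normal(0, noise_sd^2) and prior w ~ Normal(prior_mean, prior_sd^2).
    Returns the posterior mean and standard deviation of w.
    """
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = sum(x * x for x in xs) / noise_sd ** 2
    post_precision = prior_precision + data_precision
    # Precision-weighted combination of the prior mean and the data.
    weighted_sum = (prior_precision * prior_mean
                    + sum(x * y for x, y in zip(xs, ys)) / noise_sd ** 2)
    return weighted_sum / post_precision, math.sqrt(1.0 / post_precision)


# Hypothetical data, roughly y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

mean, sd = posterior_slope(xs, ys)
print(f"Posterior for slope: Normal(mean={mean:.3f}, sd={sd:.3f})")
print(f"95% credible interval: {mean - 1.96 * sd:.3f} to {mean + 1.96 * sd:.3f}")
```

The output is a distribution for the slope rather than a single number, and its width shrinks as more data points are added.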
Unlike regular linear regression (which gives point estimates), Bayesian regression gives distributions over parameters.
When to Use Bayesian vs Frequentist
| Situation | Recommended Approach |
|---|---|
| Large dataset, no prior info | Frequentist (simpler) |
| Small dataset, strong prior knowledge | Bayesian |
| Need probability statements about parameters | Bayesian |
| Need to update beliefs as data arrives | Bayesian |
| Regulatory/scientific publishing | Often Frequentist (tradition) |
| Business decisions with uncertainty | Bayesian |
Practice Exercises
Exercise 1: Updating Beliefs
Problem: You think a coin is fair (prior: Beta(10, 10)). After 30 flips, you see 22 heads. What’s your posterior belief about P(heads)? Calculate the posterior distribution and 95% credible interval.
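One possible way to work this exercise is sketched below. The posterior follows from the conjugate Beta update; the credible interval is approximated by sampling with the standard library, which is an implementation choice here (a statistics package with a Beta quantile function would give it directly).

```python
import random

prior_a, prior_b = 10, 10
heads, flips = 22, 30

# Conjugate update: add heads to a, tails to b.
post_a = prior_a + heads             # 32
post_b = prior_b + (flips - heads)   # 18
post_mean = post_a / (post_a + post_b)

# Approximate the central 95% credible interval from posterior samples.
rng = random.Random(0)
samples = sorted(rng.betavariate(post_a, post_b) for _ in range(50_000))
low = samples[int(0.025 * len(samples))]
high = samples[int(0.975 * len(samples))]

print(f"Posterior: Beta({post_a}, {post_b}), mean = {post_mean:.2f}")
print(f"Approximate 95% credible interval: {low:.2f} to {high:.2f}")
```

The posterior mean of 0.64 sits between the prior mean (0.50) and the observed frequency (22/30, about 0.73), which is the prior pulling the estimate toward fairness.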
Exercise 2: Multi-Armed Bandit
Problem: You’re testing 3 different ad creatives. After day 1:
- Ad A: 50 clicks, 5 conversions
- Ad B: 50 clicks, 8 conversions
- Ad C: 50 clicks, 3 conversions
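The question for this exercise is not stated above. Assuming the goal is to decide which ad to show next, Thompson sampling is one common approach: draw a plausible conversion rate for each ad from its posterior and show the ad with the highest draw. The sketch below assumes uniform Beta(1, 1) priors; the function name `thompson_pick` is hypothetical.

```python
import random

# (conversions, clicks) after day 1, from the exercise.
ads = {"A": (5, 50), "B": (8, 50), "C": (3, 50)}


def thompson_pick(ads, rng):
    """Sample one rate per ad from its Beta posterior and pick the largest."""
    draws = {
        name: rng.betavariate(1 + conv, 1 + clicks - conv)
        for name, (conv, clicks) in ads.items()
    }
    return max(draws, key=draws.get)


# Repeating the pick many times shows how traffic would be split.
rng = random.Random(0)
rounds = 10_000
share = {name: 0 for name in ads}
for _ in range(rounds):
    share[thompson_pick(ads, rng)] += 1

for name, count in share.items():
    print(f"Ad {name}: chosen {count / rounds:.1%} of the time")
```

Each ad’s share approximates the posterior probability that it is the best of the three, so the leading ad gets most of the traffic while the others still receive some exploration.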
Exercise 3: Hierarchical Model
Problem: You have conversion rates from 5 different stores. Build a hierarchical Bayesian model that shares information across stores to get better estimates for stores with little data.
Summary
| Concept | Frequentist | Bayesian |
|---|---|---|
| Parameters | Fixed but unknown | Random variables |
| Probability | Long-run frequency | Degree of belief |
| Prior knowledge | Not used | Explicitly encoded |
| Result | Point estimate + CI | Full distribution |
| Interpretation | “95% of intervals constructed this way contain the true value” | “95% probability the parameter is in the interval” |
Key Takeaway: Bayesian statistics lets you formally combine prior knowledge with data to get calibrated uncertainty estimates. It’s especially valuable when data is limited or when you need to make decisions under uncertainty.