The Prediction Game
Starting With Something You Already Know
Forget Python. Forget libraries. Forget math notation. Let’s play a game.
Round 1: The House Price Game
You’re a real estate agent. A client asks: “How much is this house worth?” They give you some info:
| Feature | Value |
|---|---|
| Bedrooms | 3 |
| Bathrooms | 2 |
| Square Feet | 1,500 |
| Age (years) | 10 |
| Has Pool | No |
Your Brain’s Algorithm
Without realizing it, you do this:
- Think of similar houses you’ve seen
- Remember what they sold for
- Adjust based on differences
- Make a guess
Round 2: Let’s Be More Systematic
What if I told you that house prices in your area follow a rough pattern:
- Base price: $200,000
- Each bedroom adds about $25,000
- Each bathroom adds about $15,000
- Each square foot adds about $150
The formula you just used:
price = base + (bedrooms × weight1) + (bathrooms × weight2) + (sqft × weight3)
Those “weights” ($25,000, $15,000, $150) are what machine learning learns automatically from data.
Let’s Code It (Still No Libraries!)
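Here is a minimal plain-Python sketch of that formula. The function name `predict_price` and the sample call are illustrative choices, and the weights are just the hand-picked guesses listed above.

```python
def predict_price(bedrooms, bathrooms, sqft):
    """Predict a house price from hand-picked weights - no libraries needed."""
    base = 200_000            # average starting price in the area
    bedroom_weight = 25_000   # each bedroom adds about $25,000
    bathroom_weight = 15_000  # each bathroom adds about $15,000
    sqft_weight = 150         # each square foot adds about $150
    return (base + bedrooms * bedroom_weight
            + bathrooms * bathroom_weight + sqft * sqft_weight)

# The house from Round 1: 3 bedrooms, 2 bathrooms, 1,500 sqft
print(predict_price(3, 2, 1500))  # 530000
```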
The Million Dollar Question
But wait… how did we know those weights?
- Why $25,000 per bedroom, and not $30,000?
- Why $150 per square foot, and not $200?
Real Data, Real Problem
Here’s actual data (simplified):
Step 1: How Wrong Are We?
If we use our guessed weights, let’s see how we do.
Step 2: Measure Total “Wrongness”
We need a single number that tells us how wrong we are overall. Simple approach: sum all the errors. But positive and negative errors would cancel out, so we sum the squared errors instead (a code sketch follows the list below).
Why squared?
- No negative numbers (errors can’t cancel out)
- Big errors get penalized more than small errors
- It has nice mathematical properties (smooth, differentiable)
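Here is a minimal plain-Python sketch of Steps 1 and 2 combined. The three houses and their sale prices are made-up placeholders (the original data table isn’t reproduced here), and the helper names `predict` and `total_squared_error` are illustrative.

```python
# Made-up example sales: (bedrooms, bathrooms, sqft, actual_price)
houses = [
    (3, 2, 1500, 510_000),
    (4, 3, 2200, 640_000),
    (2, 1, 900,  340_000),
]

def predict(bedrooms, bathrooms, sqft, base, w_bed, w_bath, w_sqft):
    """Step 1: predict a price from the current guess for the weights."""
    return base + bedrooms * w_bed + bathrooms * w_bath + sqft * w_sqft

def total_squared_error(base, w_bed, w_bath, w_sqft):
    """Step 2: one number that measures how wrong all predictions are combined."""
    total = 0
    for bedrooms, bathrooms, sqft, actual in houses:
        error = predict(bedrooms, bathrooms, sqft, base, w_bed, w_bath, w_sqft) - actual
        total += error ** 2   # squaring keeps errors positive and punishes big misses more
    return total

print(total_squared_error(200_000, 25_000, 15_000, 150))
```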
Step 3: Try Different Weights
What if we try different values?
The Insight: Systematic Search
What if we (a minimal sketch of this loop follows the list):
- Start with random weights
- Check how wrong we are
- Slightly adjust weights
- If error goes down, keep the change
- Repeat until error stops improving
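Below is a minimal sketch of that loop, assuming the made-up `houses` list and `total_squared_error` helper from the earlier sketch; `random.gauss` nudges are just one way to “slightly adjust” the weights.

```python
import random

def fit_by_guessing(steps=10_000):
    """Hill climbing: propose a small random change, keep it only if the error drops."""
    # 1. Start with random weights: [base, per-bedroom, per-bathroom, per-sqft]
    weights = [random.uniform(0, 300_000), random.uniform(0, 50_000),
               random.uniform(0, 50_000), random.uniform(0, 500)]
    best_error = total_squared_error(*weights)      # 2. check how wrong we are
    for _ in range(steps):
        # 3. slightly adjust the weights (one shared step size; crude but simple)
        candidate = [w + random.gauss(0, 500) for w in weights]
        error = total_squared_error(*candidate)
        if error < best_error:                      # 4. keep the change only if error drops
            weights, best_error = candidate, error
    return weights, best_error                      # 5. stop after a fixed number of tries

best_weights, best_error = fit_by_guessing()
print(best_weights, best_error)
```

This is exactly the recipe in the list above; Module 2 replaces the blind random nudges with gradient descent, which picks the adjustment direction deliberately.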
What You Just Learned
Let’s recap with proper ML terminology:
| What You Did | ML Term |
|---|---|
| Used past house sales | Training Data |
| Features like bedrooms, sqft | Input Features (X) |
| The actual price | Target/Label (y) |
| The weights ($25k, $15k, $150) | Model Parameters |
| The prediction formula | Model |
| How wrong our predictions were | Loss/Error |
| Sum of squared errors | Loss Function |
| Trying to minimize error | Training/Optimization |
The Mathematical Connection
When you calculated the total squared error, you were evaluating what formal ML calls a loss function: the sum of (predicted price − actual price)² over all the houses.
🚀 Mini Projects
Project 1
Build a house price estimator from scratch
Project 2
Create a used car valuation tool
Project 3
Visualize prediction errors and find patterns
Key Takeaways
ML is Pattern Matching
Find patterns in past data, apply to new data
Weights Capture Knowledge
The learned weights encode what matters
Loss Measures Wrongness
Lower loss = better predictions
Training = Minimizing Loss
Find weights that make predictions best match reality
Practice Challenge
Try this on your own: build a price formula for used cars, using age, mileage, and horsepower as features.
Solution Hints
Key insight: Unlike houses, where more is usually better, for cars (a small code sketch follows this list):
- Older cars are worth less (negative weight for age)
- Higher mileage is worth less (negative weight for mileage)
- More horsepower is worth more (positive weight)
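A minimal sketch of what such a formula could look like; every number below is an invented placeholder chosen only to illustrate the signs of the weights, not a real market value.

```python
def estimate_car_price(age_years, mileage, horsepower):
    """Toy used-car valuation: note the negative weights for age and mileage."""
    base = 30_000
    age_weight = -1_500      # each year of age subtracts value (placeholder number)
    mileage_weight = -0.10   # each mile driven subtracts value (placeholder number)
    horsepower_weight = 80   # more horsepower adds value (placeholder number)
    return (base + age_years * age_weight
            + mileage * mileage_weight + horsepower * horsepower_weight)

print(estimate_car_price(5, 60_000, 150))  # 30000 - 7500 - 6000 + 12000 = 28500
```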
Next Up
In the next module, we’ll learn:
- How to systematically find the best weights (not just random guessing)
- The key insight of gradient descent - following the slope downhill
- How this connects to calculus
Continue to Module 2: Learning From Mistakes
Discover gradient descent - the algorithm that powers all modern ML
🔗 Math → ML Connection
What you learned in this module connects to formal ML:
| Concept in This Module | Formal ML Term | Where It’s Used |
|---|---|---|
| Guessing weights | Model parameters | Every ML model has parameters to learn |
| Formula: price = base + weight × feature | Linear model | Neural network layers, linear regression |
| Measuring “wrongness” | Loss function | Training any model (MSE, cross-entropy, etc.) |
| Finding better weights | Optimization | Gradient descent, Adam, SGD |
| Past data with answers | Training data | Supervised learning |
Next module: We’ll replace “random guessing” with a systematic approach called gradient descent - the same algorithm that trains ChatGPT!
🚀 Going Deeper (Optional)
The Mathematics Behind Linear Models
For learners who want the formal treatment, the prediction formula can be written compactly in linear-algebra notation.
Matrix Formulation
What we wrote as:
price = base + (bedrooms × weight1) + (bathrooms × weight2) + (sqft × weight3)
can be written in matrix form as:
ŷ = Xw
Where:
- X is the feature matrix (each row is a house, each column is a feature; a leading column of 1s absorbs the base price)
- w is the weight vector
- ŷ is the prediction vector
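As a small illustration of ŷ = Xw, here is a sketch using NumPy (an assumption, since this module otherwise avoids libraries); the three example houses are placeholders.

```python
import numpy as np

# Feature matrix X: one row per house -> [1 (for the base), bedrooms, bathrooms, sqft]
X = np.array([
    [1, 3, 2, 1500],
    [1, 4, 3, 2200],
    [1, 2, 1, 900],
], dtype=float)

# Weight vector w: [base, per-bedroom, per-bathroom, per-sqft]
w = np.array([200_000, 25_000, 15_000, 150], dtype=float)

# Prediction vector y_hat = X @ w: one predicted price per house
y_hat = X @ w
print(y_hat)  # [530000. 675000. 400000.]
```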
Why Squared Error?
We use squared error (not absolute error) because:
- It’s differentiable - we can compute gradients (needed for Module 2)
- It penalizes large errors more - a $50K error counts 100× as much as a $5K error, not just 10× as much
- It leads to closed-form solutions in linear regression
Closed-Form Solution
For linear regression, there’s actually a formula (the normal equation) that gives the optimal weights directly: w = (XᵀX)⁻¹Xᵀy. We’ll derive this in the Linear Regression module.
Recommended Resources
- 3Blue1Brown: Essence of Linear Algebra - Visual intuition
- Our Linear Algebra Course - Full treatment