# The Prediction Game

> Your first machine learning model - no libraries, just logic

<Frame>
  <img src="https://mintcdn.com/devweeekends/1cs3K7TO-w20cKuc/images/courses/ml-mastery/prediction-concept.svg?fit=max&auto=format&n=1cs3K7TO-w20cKuc&q=85&s=71f2dedb4edd8d8fb75a652499954497" alt="ML Prediction Concept - Input to Output" width="1080" height="1080" data-path="images/courses/ml-mastery/prediction-concept.svg" />
</Frame>

## Starting With Something You Already Know

Forget Python. Forget libraries. Forget math notation.

Let's play a game.

***

## Round 1: The House Price Game

You're a real estate agent. A client asks: *"How much is this house worth?"*

They give you some info:

| Feature     | Value |
| ----------- | ----- |
| Bedrooms    | 3     |
| Bathrooms   | 2     |
| Square Feet | 1,500 |
| Age (years) | 10    |
| Has Pool    | No    |

What's your guess?

### Your Brain's Algorithm

Without realizing it, you do this:

1. Think of similar houses you've seen
2. Remember what they sold for
3. Adjust based on differences
4. Make a guess

**That's machine learning.** You learned patterns from past data and applied them to new data.

<Frame>
  <img src="https://mintcdn.com/devweeekends/1cs3K7TO-w20cKuc/images/courses/ml-mastery/prediction-real-world.svg?fit=max&auto=format&n=1cs3K7TO-w20cKuc&q=85&s=47ec5eaffee5ebf409beba56595b6643" alt="Real World ML - Email Spam Filtering" width="1080" height="1080" data-path="images/courses/ml-mastery/prediction-real-world.svg" />
</Frame>

***

## Round 2: Let's Be More Systematic

What if I told you that house prices in your area roughly follow this rule:

* **Base price**: \$200,000
* Each bedroom adds about **\$25,000**
* Each bathroom adds about **\$15,000**
* Each square foot adds about **\$150**

Now you can compute:

```
Base:                           $200,000
+ 3 bedrooms × $25,000:         + $75,000
+ 2 bathrooms × $15,000:        + $30,000
+ 1,500 sq ft × $150:          + $225,000
                               ----------
Predicted price:                $530,000
```

You just built your first **linear model**!

<Info>
  **The formula you just used:**

  `price = base + (bedrooms × weight1) + (bathrooms × weight2) + (sqft × weight3)`

  Those "weights" (\$25k, \$15k, \$150) are what machine learning learns automatically from data.
</Info>

***

## Let's Code It (Still No Libraries!)

```python theme={null}
# Your first "model" - just a function!
def predict_house_price(bedrooms, bathrooms, sqft):
    base = 200000
    bedroom_value = 25000
    bathroom_value = 15000
    sqft_value = 150
    
    predicted = (
        base + 
        bedrooms * bedroom_value + 
        bathrooms * bathroom_value + 
        sqft * sqft_value
    )
    return predicted

# Test it
house1 = predict_house_price(3, 2, 1500)
print(f"House 1 predicted: ${house1:,}")  # $530,000

house2 = predict_house_price(4, 3, 2200)
print(f"House 2 predicted: ${house2:,}")  # $675,000
```

***

## The Million Dollar Question

But wait... how did we know those weights?

* Why \$25,000 per bedroom and not \$30,000?
* Why \$150 per sq ft and not \$200?

**We guessed.** And our guesses might be wrong.

**Machine learning answers this**: Given a bunch of houses with known prices, can we *figure out* the best weights automatically?

***

## Real Data, Real Problem

Here's actual data (simplified):

```python theme={null}
# Past house sales (our "training data")
houses = [
    # [bedrooms, bathrooms, sqft] -> actual_price
    {"features": [2, 1, 1000], "price": 250000},
    {"features": [3, 2, 1500], "price": 380000},
    {"features": [4, 2, 1800], "price": 450000},
    {"features": [3, 3, 2000], "price": 520000},
    {"features": [5, 4, 3000], "price": 750000},
]
```

**Our goal**: Find weights that make our predictions match these actual prices as closely as possible.

***

## Step 1: How Wrong Are We?

Let's see how well our guessed weights do on this data:

```python theme={null}
def predict_house_price(features):
    bedrooms, bathrooms, sqft = features
    base = 200000
    return base + bedrooms * 25000 + bathrooms * 15000 + sqft * 150

# Check each house
for house in houses:
    predicted = predict_house_price(house["features"])
    actual = house["price"]
    error = predicted - actual
    print(f"Predicted: ${predicted:,}, Actual: ${actual:,}, Error: ${error:,}")
```

**Output:**

```
Predicted: $415,000, Actual: $250,000, Error: $165,000  (too high!)
Predicted: $530,000, Actual: $380,000, Error: $150,000  (too high!)
Predicted: $600,000, Actual: $450,000, Error: $150,000  (too high!)
Predicted: $620,000, Actual: $520,000, Error: $100,000  (too high!)
Predicted: $835,000, Actual: $750,000, Error: $85,000   (too high!)
```

We're consistently too high! Our weights are off.

***

## Step 2: Measure Total "Wrongness"

We need a single number that tells us how wrong we are overall.

**Simple approach**: Sum of all errors

```python theme={null}
total_error = 0
for house in houses:
    predicted = predict_house_price(house["features"])
    actual = house["price"]
    error = predicted - actual
    total_error += error

print(f"Total error: ${total_error:,}")  # $650,000 too high overall
```

**Problem**: What if some errors are positive and some negative? They cancel out!
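
To see the cancellation concretely, here is a tiny sketch with two made-up errors (illustrative values, not taken from the dataset above): one prediction \$50,000 too high and one \$50,000 too low.

```python
# Two hypothetical errors: one overshoot and one undershoot of equal size
errors = [50_000, -50_000]

plain_sum = sum(errors)                     # opposite signs cancel each other
squared_sum = sum(e ** 2 for e in errors)   # squaring removes the sign first

print(f"Sum of errors: {plain_sum:,}")              # Sum of errors: 0
print(f"Sum of squared errors: {squared_sum:,}")    # Sum of squared errors: 5,000,000,000
```

The plain sum reports a perfect score even though both predictions missed badly, while the squared sum does not.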

**Better approach**: Sum of squared errors

```python theme={null}
total_squared_error = 0
for house in houses:
    predicted = predict_house_price(house["features"])
    actual = house["price"]
    error = predicted - actual
    total_squared_error += error ** 2

print(f"Total squared error: {total_squared_error:,.0f}")
```

This is called the **Loss Function** or **Cost Function**. Lower is better!

<Note>
  **Why squared?** Three reasons:

  1. **No negative numbers** -- errors can't cancel out (a +\$50K overshoot shouldn't "forgive" a -\$50K undershoot)
  2. **Big errors get penalized more** -- being off by \$100K is more than twice as bad as being off by \$50K. Squaring enforces this: $100{,}000^2$ is 10 billion, while $50{,}000^2$ is 2.5 billion (a 4x penalty for a 2x error)
  3. **Smooth and differentiable** -- the curve has no sharp corners, so gradient descent can glide smoothly toward the minimum (we'll need this in Module 2)

  There are alternatives -- Mean Absolute Error (MAE) penalizes every dollar of error equally, which makes it more robust to outliers. But squared error, usually averaged into Mean Squared Error (MSE), is the default starting point because its math is cleaner and it punishes the predictions you're most embarrassingly wrong about.
</Note>
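
A quick sketch of how the two metrics react to an outlier. The error values are made up for illustration, and the two helper functions are hypothetical names, not part of any library.

```python
def mean_squared_error(errors):
    """Average of squared errors: large misses dominate the result."""
    return sum(e ** 2 for e in errors) / len(errors)

def mean_absolute_error(errors):
    """Average of absolute errors: every dollar of error counts the same."""
    return sum(abs(e) for e in errors) / len(errors)

# Hypothetical errors in dollars; the second list swaps one value for an outlier
typical = [10_000, -10_000, 10_000, -10_000]
with_outlier = [10_000, -10_000, 10_000, -100_000]

for name, errs in [("typical", typical), ("with outlier", with_outlier)]:
    print(f"{name}: MAE = {mean_absolute_error(errs):,.0f}, "
          f"MSE = {mean_squared_error(errs):,.0f}")
```

In this toy case the single outlier multiplies MAE by 3.25 but multiplies MSE by more than 25, which is the "big errors get penalized more" effect in action.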

***

## Step 3: Try Different Weights

What if we try different values?

```python theme={null}
def calculate_total_error(base, bed_weight, bath_weight, sqft_weight):
    total_squared_error = 0
    for house in houses:
        bedrooms, bathrooms, sqft = house["features"]
        predicted = base + bedrooms * bed_weight + bathrooms * bath_weight + sqft * sqft_weight
        actual = house["price"]
        error = predicted - actual
        total_squared_error += error ** 2
    return total_squared_error

# Our original guess
error1 = calculate_total_error(200000, 25000, 15000, 150)
print(f"Original weights error: {error1:,.0f}")

# Try lowering everything
error2 = calculate_total_error(100000, 20000, 10000, 100)
print(f"Lower weights error: {error2:,.0f}")

# Try something else
error3 = calculate_total_error(50000, 15000, 25000, 175)
print(f"Alternative weights error: {error3:,.0f}")
```
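
One more informed experiment, sketched under the assumption that we only touch the base: since every prediction in Step 1 came out too high, subtract the average overshoot from the base and leave the other weights alone. The `average_error` helper is a hypothetical name added for illustration; the data and `calculate_total_error` are repeated so the block runs on its own.

```python
houses = [
    {"features": [2, 1, 1000], "price": 250000},
    {"features": [3, 2, 1500], "price": 380000},
    {"features": [4, 2, 1800], "price": 450000},
    {"features": [3, 3, 2000], "price": 520000},
    {"features": [5, 4, 3000], "price": 750000},
]

def predict(features, base, bed_weight, bath_weight, sqft_weight):
    bedrooms, bathrooms, sqft = features
    return base + bedrooms * bed_weight + bathrooms * bath_weight + sqft * sqft_weight

def calculate_total_error(base, bed_weight, bath_weight, sqft_weight):
    """Sum of squared errors over all training houses."""
    return sum(
        (predict(h["features"], base, bed_weight, bath_weight, sqft_weight) - h["price"]) ** 2
        for h in houses
    )

def average_error(base, bed_weight, bath_weight, sqft_weight):
    """Mean of (predicted - actual); positive means we overshoot on average."""
    return sum(
        predict(h["features"], base, bed_weight, bath_weight, sqft_weight) - h["price"]
        for h in houses
    ) / len(houses)

guess = (200000, 25000, 15000, 150)
shift = average_error(*guess)               # average overshoot in dollars
shifted = (guess[0] - shift, *guess[1:])    # lower only the base by that amount

print(f"Average overshoot: {shift:,.0f}")
print(f"Error before shift: {calculate_total_error(*guess):,.0f}")
print(f"Error after shift:  {calculate_total_error(*shifted):,.0f}")
```

The squared error drops sharply from this single change, but it is still a hand-picked tweak to one weight; the other three remain guesses.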

**The challenge**: There are infinitely many combinations of weights. How do we find the best ones?

***

## The Insight: Systematic Search

What if we:

1. Start with random weights
2. Check how wrong we are
3. Slightly adjust weights
4. If error goes down, keep the change
5. Repeat until error stops improving

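This trial-and-error loop captures the core idea behind **Gradient Descent**, which we'll explore in the next module! Gradient descent goes one step further: instead of trying an adjustment and checking whether it helped, it calculates which direction lowers the error.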

Think of it like tuning a radio dial in the dark. Random search is spinning the dial blindly and hoping for a good station. Gradient descent is *listening to the static* -- when it gets quieter, you keep turning that direction.

```python theme={null}
# A simple (but slow) approach: try lots of combinations
# This is "random search" -- the brute force method.
# It works, but it's like trying every combination on a lock
# instead of listening for the click.
import random

best_error = float('inf')
best_weights = None

for _ in range(10000):  # Try 10,000 random combinations
    base = random.randint(0, 200000)
    bed = random.randint(5000, 50000)
    bath = random.randint(5000, 50000)
    sqft = random.randint(50, 300)
    
    error = calculate_total_error(base, bed, bath, sqft)
    
    if error < best_error:
        best_error = error
        best_weights = (base, bed, bath, sqft)

print(f"Best weights found: {best_weights}")
print(f"Best error: {best_error:,.0f}")
# Note: with 4 weights and infinite possible values,
# random search has astronomically low odds of finding
# the true optimum. We need something smarter.
```
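
For comparison, the five-step loop listed at the top of this section (adjust slightly, keep the change only if the error drops) can be sketched as a simple hill climber. This is a minimal illustration, not gradient descent itself. The names `total_squared_error`, `hill_climb`, and `step_sizes` are hypothetical, the nudge sizes are arbitrary choices, and the search starts from our original guess instead of random weights.

```python
import random

houses = [
    {"features": [2, 1, 1000], "price": 250000},
    {"features": [3, 2, 1500], "price": 380000},
    {"features": [4, 2, 1800], "price": 450000},
    {"features": [3, 3, 2000], "price": 520000},
    {"features": [5, 4, 3000], "price": 750000},
]

def total_squared_error(weights):
    """Sum of squared errors for weights = [base, bed, bath, sqft]."""
    base, bed_w, bath_w, sqft_w = weights
    total = 0
    for house in houses:
        bedrooms, bathrooms, sqft = house["features"]
        predicted = base + bedrooms * bed_w + bathrooms * bath_w + sqft * sqft_w
        total += (predicted - house["price"]) ** 2
    return total

def hill_climb(start, step_sizes, iterations=20000, seed=0):
    rng = random.Random(seed)                  # fixed seed so runs are repeatable
    weights = list(start)
    best_error = total_squared_error(weights)
    for _ in range(iterations):
        i = rng.randrange(len(weights))        # pick one weight to nudge
        candidate = list(weights)
        candidate[i] += rng.uniform(-step_sizes[i], step_sizes[i])
        error = total_squared_error(candidate)
        if error < best_error:                 # keep the change only if it helps
            weights, best_error = candidate, error
    return weights, best_error

start = [200000, 25000, 15000, 150]            # our original guess
step_sizes = [5000, 1000, 1000, 5]             # hypothetical nudge size per weight
start_error = total_squared_error(start)
weights, error = hill_climb(start, step_sizes)

print(f"Starting error: {start_error:,.0f}")
print(f"Final error:    {error:,.0f}")
print(f"Final weights:  {[round(w, 2) for w in weights]}")
```

Because a change is kept only when it lowers the error, the result can never be worse than the starting point. It still wastes many tries on nudges that don't help, which is the gap gradient descent closes.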

***

## What You Just Learned

Let's recap with proper ML terminology:

| What You Did                   | ML Term                   |
| ------------------------------ | ------------------------- |
| Used past house sales          | **Training Data**         |
| Features like bedrooms, sqft   | **Input Features** (X)    |
| The actual price               | **Target/Label** (y)      |
| The weights (\$25k, \$15k, etc.) | **Model Parameters**      |
| The prediction formula         | **Model**                 |
| How wrong our predictions were | **Loss/Error**            |
| Sum of squared errors          | **Loss Function**         |
| Trying to minimize error       | **Training/Optimization** |

***

## The Mathematical Connection

When you calculated:

```
price = base + (bedrooms × weight1) + (bathrooms × weight2) + (sqft × weight3)
```

In math notation, this is:

$$
\hat{y} = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3
$$

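Or, more compactly, in vector form (from our [Linear Algebra course](/courses/math-for-ml-linear-algebra/03-matrices)), with a constant $x_0 = 1$ added to the features so the base $w_0$ fits the same pattern: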

$$
\hat{y} = \mathbf{w} \cdot \mathbf{x}
$$

This is a **dot product** - the same operation you do when calculating weighted grades!
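
A minimal sketch of that dot product in plain Python, assuming a constant 1 is placed at the front of the feature list so the base is included in the sum. The `dot` helper is a hypothetical name; the weights are the guessed ones from Round 2.

```python
def dot(w, x):
    """Dot product: multiply matching entries, then add everything up."""
    return sum(w_i * x_i for w_i, x_i in zip(w, x))

# w = [base, per bedroom, per bathroom, per sqft]
w = [200000, 25000, 15000, 150]
# x starts with a constant 1 so the base is multiplied by 1 and kept as-is
x = [1, 3, 2, 1500]

print(f"${dot(w, x):,}")  # $530,000
```

This reproduces the Round 2 prediction, and the same function works unchanged for any number of features.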

***

## 🚀 Mini Projects

<CardGroup cols={2}>
  <Card title="Project 1" icon="house" color="#3B82F6">
    Build a house price estimator from scratch
  </Card>

  <Card title="Project 2" icon="car" color="#10B981">
    Create a used car valuation tool
  </Card>

  <Card title="Project 3" icon="chart-simple" color="#8B5CF6">
    Visualize prediction errors and find patterns
  </Card>
</CardGroup>

<details>
  <summary>**Project 1: House Price Estimator** - Build your first ML predictor</summary>

  **Objective**: Create a simple house price predictor and manually tune the weights.

  **Tasks**:

  1. Implement a prediction function with adjustable weights
  2. Calculate the total squared error
  3. Manually tune weights to minimize error
  4. Predict prices for new houses

  ```python theme={null}
  # Training data: houses with known prices
  houses = [
      {"bedrooms": 2, "bathrooms": 1, "sqft": 1000, "price": 250000},
      {"bedrooms": 3, "bathrooms": 2, "sqft": 1500, "price": 380000},
      {"bedrooms": 4, "bathrooms": 2, "sqft": 1800, "price": 450000},
      {"bedrooms": 3, "bathrooms": 3, "sqft": 2000, "price": 520000},
      {"bedrooms": 5, "bathrooms": 4, "sqft": 3000, "price": 750000},
      {"bedrooms": 2, "bathrooms": 1, "sqft": 900, "price": 220000},
      {"bedrooms": 4, "bathrooms": 3, "sqft": 2500, "price": 620000},
  ]

  def predict_price(house, weights):
      """Predict house price using weighted features."""
      base, w_bed, w_bath, w_sqft = weights
      return (base + 
              w_bed * house["bedrooms"] + 
              w_bath * house["bathrooms"] + 
              w_sqft * house["sqft"])

  def calculate_error(houses, weights):
      """Calculate total squared error."""
      total_error = 0
      for house in houses:
          predicted = predict_price(house, weights)
          actual = house["price"]
          total_error += (predicted - actual) ** 2
      return total_error

  # TODO: Try different weights and minimize the error
  # Start with these weights:
  initial_weights = [100000, 30000, 20000, 100]  # [base, per_bedroom, per_bathroom, per_sqft]

  # Your goal: Find weights that give low total error
  # Try adjusting each weight up and down to see the effect
  ```

  **Solution**:

  ```python theme={null}
  # Grid search for best weights (simplified brute force)
  best_error = float('inf')
  best_weights = None

  # Search over a range of weight combinations
  for base in range(50000, 150000, 25000):
      for w_bed in range(10000, 50000, 10000):
          for w_bath in range(10000, 40000, 10000):
              for w_sqft in range(50, 200, 25):
                  weights = [base, w_bed, w_bath, w_sqft]
                  error = calculate_error(houses, weights)
                  if error < best_error:
                      best_error = error
                      best_weights = weights

  print(f"Best weights: {best_weights}")
  print(f"Best total squared error: {best_error:,.0f}")

  # Verify predictions
  print("\n--- Predictions vs Actual ---")
  for house in houses:
      pred = predict_price(house, best_weights)
      actual = house["price"]
      print(f"Predicted: ${pred:,.0f}, Actual: ${actual:,}, Error: ${abs(pred-actual):,.0f}")

  # Predict new house
  new_house = {"bedrooms": 3, "bathrooms": 2, "sqft": 1700}
  prediction = predict_price(new_house, best_weights)
  print(f"\nNew house prediction: ${prediction:,.0f}")
  ```
</details>

<details>
  <summary>**Project 2: Used Car Valuation Tool** - Handle negative relationships</summary>

  **Objective**: Build a car price predictor where some features decrease value (age, mileage).

  **Key Learning**: Not all features have positive relationships with the target!

  ```python theme={null}
  # Car data: age and mileage should decrease price!
  cars = [
      {"age_years": 1, "mileage_k": 10, "horsepower": 200, "price": 35000},
      {"age_years": 3, "mileage_k": 35, "horsepower": 180, "price": 25000},
      {"age_years": 5, "mileage_k": 60, "horsepower": 220, "price": 22000},
      {"age_years": 2, "mileage_k": 25, "horsepower": 300, "price": 45000},
      {"age_years": 7, "mileage_k": 90, "horsepower": 160, "price": 12000},
      {"age_years": 4, "mileage_k": 45, "horsepower": 250, "price": 28000},
      {"age_years": 1, "mileage_k": 8, "horsepower": 350, "price": 55000},
      {"age_years": 10, "mileage_k": 150, "horsepower": 180, "price": 8000},
  ]

  def predict_car_price(car, weights):
      """
      Predict car price.
      Note: age and mileage should have NEGATIVE weights!
      """
      base, w_age, w_mileage, w_hp = weights
      return (base + 
              w_age * car["age_years"] + 
              w_mileage * car["mileage_k"] + 
              w_hp * car["horsepower"])

  # TODO: Find weights where age and mileage are negative
  # Hint: Age weight might be around -2000 to -4000 per year
  # Hint: Mileage weight might be around -100 to -300 per 1000 miles
  ```

  **Solution**:

  ```python theme={null}
  def calculate_car_error(cars, weights):
      """Calculate mean squared error."""
      total = 0
      for car in cars:
          pred = predict_car_price(car, weights)
          total += (pred - car["price"]) ** 2
      return total / len(cars)

  # Search with negative weights for age and mileage
  best_error = float('inf')
  best_weights = None

  for base in range(40000, 60000, 5000):
      for w_age in range(-5000, -1000, 500):  # Negative!
          for w_mileage in range(-300, -50, 50):  # Negative!
              for w_hp in range(50, 200, 25):  # Positive
                  weights = [base, w_age, w_mileage, w_hp]
                  error = calculate_car_error(cars, weights)
                  if error < best_error:
                      best_error = error
                      best_weights = weights

  print(f"Best weights: {best_weights}")
  print(f"  Base price: ${best_weights[0]:,}")
  print(f"  Per year age: ${best_weights[1]:,} (negative = older costs less)")
  print(f"  Per 1K miles: ${best_weights[2]:,} (negative = more miles costs less)")
  print(f"  Per horsepower: ${best_weights[3]:,}")

  # Interpret
  print("\n--- Interpretation ---")
  print(f"Each year of age reduces value by ${abs(best_weights[1]):,}")
  print(f"Each 10,000 miles reduces value by ${abs(best_weights[2] * 10):,}")
  print(f"Each horsepower adds ${best_weights[3]:,}")

  # Test predictions
  print("\n--- Validation ---")
  for car in cars[:3]:
      pred = predict_car_price(car, best_weights)
      print(f"{car['age_years']}yr old, {car['mileage_k']}k mi, {car['horsepower']}hp")
      print(f"  Predicted: ${pred:,.0f}, Actual: ${car['price']:,}\n")
  ```
</details>

<details>
  <summary>**Project 3: Error Analysis Dashboard** - Visualize and understand errors</summary>

  **Objective**: Analyze prediction errors to understand model behavior.

  **Key Learning**: Visualizing errors reveals patterns and helps improve models.

  ```python theme={null}
  import numpy as np
  import matplotlib.pyplot as plt

  # House data
  houses = [
      {"sqft": 1000, "price": 250000},
      {"sqft": 1200, "price": 290000},
      {"sqft": 1500, "price": 380000},
      {"sqft": 1800, "price": 420000},
      {"sqft": 2000, "price": 500000},
      {"sqft": 2200, "price": 550000},
      {"sqft": 2500, "price": 620000},
      {"sqft": 3000, "price": 780000},
  ]

  # Simple model: price = base + sqft * price_per_sqft
  def analyze_model(base, price_per_sqft):
      """Analyze a simple linear model."""
      sqfts = [h["sqft"] for h in houses]
      actuals = [h["price"] for h in houses]
      predictions = [base + price_per_sqft * sqft for sqft in sqfts]
      errors = [pred - actual for pred, actual in zip(predictions, actuals)]
      
      # Statistics
      mse = np.mean([e**2 for e in errors])
      mae = np.mean([abs(e) for e in errors])
      
      return sqfts, actuals, predictions, errors, mse, mae

  # TODO: Try different base and price_per_sqft values
  # Analyze which gives best results
  ```

  **Solution**:

  ```python theme={null}
  # Test multiple models
  models = [
      {"base": 50000, "price_per_sqft": 200, "name": "Model A: High per-sqft"},
      {"base": 100000, "price_per_sqft": 150, "name": "Model B: Balanced"},
      {"base": 0, "price_per_sqft": 250, "name": "Model C: No base"},
  ]

  fig, axes = plt.subplots(2, 3, figsize=(15, 10))

  for i, model in enumerate(models):
      sqfts, actuals, predictions, errors, mse, mae = analyze_model(
          model["base"], model["price_per_sqft"]
      )
      
      # Top row: Predictions vs Actuals
      ax1 = axes[0, i]
      ax1.scatter(sqfts, actuals, label='Actual', s=100)
      ax1.plot(sqfts, predictions, 'r-', label='Predicted', linewidth=2)
      ax1.set_xlabel('Square Feet')
      ax1.set_ylabel('Price ($)')
      ax1.set_title(f'{model["name"]}\nMSE: {mse:,.0f}')
      ax1.legend()
      
      # Bottom row: Error distribution
      ax2 = axes[1, i]
      colors = ['green' if e < 0 else 'red' for e in errors]
      ax2.bar(range(len(errors)), errors, color=colors, alpha=0.7)
      ax2.axhline(y=0, color='black', linestyle='-', linewidth=0.5)
      ax2.set_xlabel('House Index')
      ax2.set_ylabel('Error ($)')
      ax2.set_title(f'MAE: ${mae:,.0f}')

  plt.tight_layout()
  plt.savefig('error_analysis.png', dpi=100)
  print("Saved error_analysis.png")

  # Find best model
  best_model = min(models, key=lambda m: analyze_model(m["base"], m["price_per_sqft"])[4])
  print(f"\nBest model: {best_model['name']}")
  print(f"Formula: price = ${best_model['base']:,} + sqft × ${best_model['price_per_sqft']}")

  # Pattern analysis
  print("\n--- Error Pattern Analysis ---")
  _, _, _, errors, _, _ = analyze_model(best_model["base"], best_model["price_per_sqft"])
  if all(e < 0 for e in errors[:3]) and all(e > 0 for e in errors[-3:]):
      print("Pattern: Underpredicting small houses, overpredicting large houses")
      print("Suggestion: Reduce price_per_sqft, increase base")
  elif all(e > 0 for e in errors):
      print("Pattern: Overpredicting all houses")
      print("Suggestion: Reduce base or price_per_sqft")
  else:
      print("Errors are mixed - model is reasonably balanced")
  ```
</details>

***

## Key Takeaways

<CardGroup cols={2}>
  <Card title="ML is Pattern Matching" icon="search">
    Find patterns in past data, apply to new data
  </Card>

  <Card title="Weights Capture Knowledge" icon="brain">
    The learned weights encode what matters
  </Card>

  <Card title="Loss Measures Wrongness" icon="gauge">
    Lower loss = better predictions
  </Card>

  <Card title="Training = Minimizing Loss" icon="chart-line-down">
    Find weights that make predictions best match reality
  </Card>
</CardGroup>

***

## Practice Challenge

Try this on your own:

```python theme={null}
# New dataset: Car prices
cars = [
    # [age_years, mileage_k, horsepower] -> price
    {"features": [2, 15, 200], "price": 35000},
    {"features": [5, 50, 180], "price": 22000},
    {"features": [1, 8, 250], "price": 45000},
    {"features": [8, 100, 150], "price": 12000},
    {"features": [3, 30, 220], "price": 32000},
]

# Your task:
# 1. Create a predict_car_price function with guessed weights
# 2. Calculate total squared error
# 3. Try different weights and find better ones
# 4. What patterns do you notice? (age and mileage should be negative!)
```

<Accordion title="Solution Hints">
  **Key insight**: Unlike houses where more is usually better, for cars:

  * **Older** cars are worth **less** (negative weight for age)
  * **Higher mileage** is worth **less** (negative weight for mileage)
  * **More horsepower** is worth **more** (positive weight)

  Try something like:

  ```python theme={null}
  price = 50000 - (age * 3000) - (mileage * 200) + (horsepower * 100)
  ```
</Accordion>

***

## Next Up

In the next module, we'll learn:

* How to **systematically** find the best weights (not just random guessing)
* The key insight of **gradient descent** - following the slope downhill
* How this connects to [calculus](/courses/math-for-ml-calculus/01-derivatives)

<Card title="Continue to Module 2: Learning From Mistakes" icon="arrow-right" href="/courses/ml-mastery/02-learning-from-mistakes">
  Discover gradient descent - the algorithm behind most of modern machine learning
</Card>

***

## 🔗 Math → ML Connection

<Note>
  **What you learned in this module connects to formal ML:**

  | Concept in This Module                     | Formal ML Term       | Where It's Used                               |
  | ------------------------------------------ | -------------------- | --------------------------------------------- |
  | Guessing weights                           | **Model parameters** | Every ML model has parameters to learn        |
  | Formula: `price = base + weight × feature` | **Linear model**     | Neural network layers, linear regression      |
  | Measuring "wrongness"                      | **Loss function**    | Training any model (MSE, cross-entropy, etc.) |
  | Finding better weights                     | **Optimization**     | Gradient descent, Adam, SGD                   |
  | Past data with answers                     | **Training data**    | Supervised learning                           |

  **Next module**: We'll replace "random guessing" with a systematic approach called **gradient descent** - the same family of algorithms used to train models like ChatGPT!
</Note>

***

## 🚀 Going Deeper (Optional)

<Accordion title="The Mathematics Behind Linear Models" icon="graduation-cap">
  **For learners who want the formal treatment:**

  ### Matrix Formulation

  What we wrote as:

  ```
  price = base + w1×bedrooms + w2×bathrooms + w3×sqft
  ```

  Can be written in matrix form as:
  $\hat{y} = X \mathbf{w}$

  Where:

  * $X$ is the **feature matrix** (each row is a house, each column is a feature, plus a leading column of ones so the base is learned as a weight)
  * $\mathbf{w}$ is the **weight vector**
  * $\hat{y}$ is the **prediction vector**

  ### Why Squared Error?

  We use squared error (not absolute error) because:

  1. It's **differentiable everywhere** - absolute error has a sharp corner at zero, which complicates computing gradients (needed for Module 2)
  2. It **penalizes large errors more** - a \$100K error is worse than two \$50K errors
  3. It leads to **closed-form solutions** in linear regression

  ### Closed-Form Solution

  For linear regression, there's actually a formula that gives optimal weights directly (as long as $X^T X$ is invertible):
  $\mathbf{w}^* = (X^T X)^{-1} X^T y$

  We'll derive this in the [Linear Regression module](/courses/ml-mastery/03-linear-regression).

  ### Recommended Resources

  * [3Blue1Brown: Essence of Linear Algebra](https://www.3blue1brown.com/topics/linear-algebra) - Visual intuition
  * [Our Linear Algebra Course](/courses/math-for-ml-linear-algebra/02-vectors) - Full treatment
</Accordion>
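
The closed-form formula from the accordion above can be tried on the five training houses from this module. The sketch below is one possible way to do it without libraries, assuming a leading column of ones in $X$ for the base. The helpers `transpose`, `matmul`, and `solve` are hypothetical names written for this example, and `fractions.Fraction` is used so the arithmetic is exact. In practice you would reach for a linear algebra library instead.

```python
from fractions import Fraction

# Training data from earlier: ([bedrooms, bathrooms, sqft], price)
houses = [
    ([2, 1, 1000], 250000),
    ([3, 2, 1500], 380000),
    ([4, 2, 1800], 450000),
    ([3, 3, 2000], 520000),
    ([5, 4, 3000], 750000),
]

# Feature matrix X: one row per house, with a leading 1 so the base is a weight
X = [[Fraction(1)] + [Fraction(v) for v in features] for features, _ in houses]
y = [Fraction(price) for _, price in houses]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]

def solve(a, b):
    """Solve a w = b by Gauss-Jordan elimination (assumes a is invertible)."""
    n = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]         # make the pivot equal 1
        for r in range(n):
            if r != col:                                 # clear the column elsewhere
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def total_squared_error(weights):
    return sum(
        (sum(w_i * x_i for w_i, x_i in zip(weights, row)) - y_i) ** 2
        for row, y_i in zip(X, y)
    )

Xt = transpose(X)
XtX = matmul(Xt, X)                                       # X^T X (4 x 4)
Xty = [sum(p * q for p, q in zip(row, y)) for row in Xt]  # X^T y (length 4)
w = solve(XtX, Xty)                                       # solves (X^T X) w = X^T y

print("Closed-form weights:", [round(float(v), 2) for v in w])
print(f"Closed-form error:     {float(total_squared_error(w)):,.0f}")
print(f"Guessed-weights error: {float(total_squared_error([200000, 25000, 15000, 150])):,.0f}")
```

With only five houses and four weights, the fitted values may not look intuitive; the point is only that the formula lands on the minimum of the squared error in one step, with no searching.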
