# Machine Learning Mastery

> Learn machine learning the right way - starting with problems you already understand

<Frame>
  <img src="https://mintlify.s3.us-west-1.amazonaws.com/devweeekends/images/courses/ml-mastery/ml-mastery-hero.svg" alt="Machine Learning Mastery Course" />
</Frame>

## The Course That Makes ML Click

**This isn't just another ML course.** It's designed to take you from "I've heard of machine learning" to "I build production ML systems" through a carefully crafted journey that prioritizes understanding over memorization.

<CardGroup cols={3}>
  <Card title="50+ Hours of Content" icon="clock">
    26 comprehensive modules with projects, exercises, and real-world applications
  </Card>

  <Card title="10 Portfolio Projects" icon="folder-open">
    Build real ML systems you can showcase to employers
  </Card>

  <Card title="Industry-Ready Skills" icon="briefcase">
    Learn the same tools and techniques used at top tech companies
  </Card>
</CardGroup>

***

## You Already Think Like a Machine Learning Engineer

Before we write a single line of code, let me prove something to you.

### The House Price Game

Imagine you're helping a friend buy a house. They show you a listing:

**House A**: 3 bedrooms, 2 bathrooms, 1,800 sq ft, good school district, 15 years old

Your brain immediately does something remarkable. Based on houses you've seen before, you estimate: *"Probably around \$450,000?"*

Now they show you another:

**House B**: 5 bedrooms, 4 bathrooms, 3,500 sq ft, excellent school district, brand new

You think: *"Maybe \$850,000?"*

**Congratulations. You just did machine learning.**

You:

1. **Learned from examples** (houses you've seen before with their prices)
2. **Identified patterns** (more bedrooms = higher price, newer = higher price)
3. **Made predictions** on new, unseen data

That's literally all machine learning is. Think of it like learning to cook -- you don't memorize every recipe, you learn patterns (high heat = crispy, low heat = tender) and apply them to new ingredients. ML does the same thing, but with numbers instead of taste buds.
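
As a rough sketch of those three steps in code, a single learned number (price per square foot) is already enough to make a prediction. The sales figures below are made up for illustration, and `predict_price` is a hypothetical helper, not something defined elsewhere in the course.

```python
# Hypothetical past sales as (square feet, sale price). The numbers are made up.
past_sales = [(1500, 375_000), (2000, 500_000), (2400, 600_000)]

# 1. Learn from examples: add up everything we have seen so far.
total_price = sum(price for _, price in past_sales)
total_sqft = sum(sqft for sqft, _ in past_sales)

# 2. Identify a pattern: bigger houses cost more, at roughly this rate.
price_per_sqft = total_price / total_sqft


# 3. Make a prediction on a house we have never seen.
def predict_price(sqft):
    return price_per_sqft * sqft


print(predict_price(1800))  # 450000.0
```

Real models learn many such numbers at once instead of one, but the loop of examples, pattern, and prediction stays the same.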

<Info>
  **Estimated Time**: 50-60 hours total\
  **Difficulty**: Beginner-friendly (we assume no ML background)\
  **Prerequisites**: Basic Python (variables, loops, functions)\
  **What You'll Build**: Real predictive models on real data\
  **Modules**: 26 comprehensive chapters from basics to production\
  **Math Required**: We'll teach you as we go, with links to our [Linear Algebra](/courses/math-for-ml-linear-algebra/01-introduction) and [Calculus](/courses/math-for-ml-calculus/00-introduction) courses
</Info>

***

## The Core Question of ML

Every machine learning problem boils down to one question:

**"Given things I know, can I predict something I don't know?"**

| What You Know     | What You Want to Predict | ML Name        |
| ----------------- | ------------------------ | -------------- |
| House features    | House price              | Regression     |
| Email text        | Spam or not spam         | Classification |
| Customer history  | Will they buy again?     | Classification |
| Movie preferences | Movie rating (1-5)       | Regression     |
| Photo pixels      | Is it a cat or dog?      | Classification |
| Purchase patterns | What else they might buy | Recommendation |
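
The difference between the two most common names in this table comes down to the type of answer. The toy sketch below uses a nearest-neighbor lookup on made-up data; `nearest`, the house list, and the exclamation-mark feature are all hypothetical. The lookup of similar examples is identical in both cases: averaging numeric answers gives regression, while taking a majority vote gives classification.

```python
from collections import Counter


def nearest(examples, x, k=3):
    """Return the answers of the k examples whose feature is closest to x."""
    ranked = sorted(examples, key=lambda example: abs(example[0] - x))
    return [answer for _, answer in ranked[:k]]


# Regression: the answer is a number, so we average the neighbors' answers.
houses = [(1000, 250_000), (1500, 375_000), (2000, 500_000), (3000, 750_000)]
neighbor_prices = nearest(houses, 1600)
predicted_price = sum(neighbor_prices) / len(neighbor_prices)

# Classification: the answer is a category, so the neighbors vote.
# The feature here is a made-up "number of exclamation marks" per email.
emails = [(0, "not spam"), (1, "not spam"), (7, "spam"), (9, "spam"), (2, "not spam")]
neighbor_labels = nearest(emails, 8)
predicted_label = Counter(neighbor_labels).most_common(1)[0][0]

print(predicted_price)  # 375000.0
print(predicted_label)  # spam
```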

***

## Why This Course Is Different

Most ML courses start with math formulas, confusing Greek symbols, and abstract theory.

We start with **problems you already understand**:

* How would you predict house prices?
* How would you decide if an email is spam?
* How would you recommend movies to someone?

Then we show you that the math is just **formalizing what you already do naturally**.

<Warning>
  **Real Talk**: You don't need a PhD to do ML. You need:

  1. Curiosity about patterns
  2. Willingness to experiment
  3. Patience to iterate

  If you can estimate house prices in your head, you can learn ML.
</Warning>

***

## 🎯 What You'll Be Able to Do After This Course

<Steps>
  <Step title="Build ML Models from Scratch">
    Understand how algorithms work at a fundamental level - not just calling library functions
  </Step>

  <Step title="Select the Right Algorithm">
    Know when to use linear regression vs. random forest vs. neural networks
  </Step>

  <Step title="Handle Real-World Data">
    Clean messy data, engineer features, handle missing values and outliers
  </Step>

  <Step title="Evaluate Models Properly">
    Go beyond accuracy to precision, recall, AUC, and business metrics
  </Step>

  <Step title="Deploy to Production">
    Build APIs, monitor models, and handle the full ML lifecycle
  </Step>

  <Step title="Communicate Results">
    Explain model decisions to non-technical stakeholders
  </Step>
</Steps>

***

## 💼 Career Impact: What ML Engineers Earn

<Accordion title="Industry Salary Data (2024-2025)" icon="money-bill">
  | Role                   | Experience | US Salary Range | Key Skills From This Course  |
  | ---------------------- | ---------- | --------------- | ---------------------------- |
  | **Junior ML Engineer** | 0-2 years  | \$90K - \$130K  | Modules 1-10 (Fundamentals)  |
  | **ML Engineer**        | 2-5 years  | \$130K - \$180K | Modules 11-19 (Advanced)     |
  | **Senior ML Engineer** | 5+ years   | \$180K - \$250K | Full course + specialization |
  | **ML Lead/Manager**    | 7+ years   | \$200K - \$300K | Course + leadership skills   |
  | **Research Scientist** | PhD + exp  | \$180K - \$350K | Deep math + research skills  |

  **Top Companies Hiring ML Engineers:**

  * **FAANG**: Google, Meta, Amazon, Apple, Netflix
  * **AI-First**: OpenAI, Anthropic, DeepMind, Cohere
  * **Finance**: Citadel, Two Sigma, Jane Street, Goldman
  * **Startups**: Thousands of well-funded AI startups

  **This course prepares you for roles like:**

  * Machine Learning Engineer
  * Data Scientist
  * Applied Scientist
  * ML Platform Engineer
  * AI/ML Product Manager (technical)
</Accordion>

***

## 🏆 Success Stories: What Learners Build

<CardGroup cols={2}>
  <Card title="Customer Churn Predictor" icon="users">
    A model that identifies at-risk customers 2 weeks before they leave, saving a SaaS company \$2M/year in retention costs.
  </Card>

  <Card title="Fraud Detection System" icon="shield">
    Real-time fraud detection catching 94% of fraudulent transactions with a false positive rate of only 0.1%.
  </Card>

  <Card title="Demand Forecasting" icon="chart-line">
    Inventory prediction reducing overstock by 30% for an e-commerce company.
  </Card>

  <Card title="Content Recommendation" icon="thumbs-up">
    A recommendation engine increasing user engagement by 40% for a media platform.
  </Card>
</CardGroup>

***

## Your Learning Path

### Part 1: The Foundation (This Is Not Scary)

<CardGroup cols={2}>
  <Card title="Module 1: The Prediction Game" icon="crosshairs" href="/courses/ml-mastery/01-prediction-game">
    Start with a simple question: can we predict house prices? Build your first model with just arithmetic.
  </Card>

  <Card title="Module 2: Learning From Mistakes" icon="graduation-cap" href="/courses/ml-mastery/02-learning-from-mistakes">
    How do we measure "wrong"? How do we get "less wrong"? The core ideas that power all of ML.
  </Card>

  <Card title="Module 3: Linear Regression" icon="chart-line" href="/courses/ml-mastery/03-linear-regression">
    Your first "real" ML algorithm. Spoiler: it's just fitting a line through points.
  </Card>

  <Card title="Module 4: Classification" icon="tags" href="/courses/ml-mastery/04-classification">
    What if the answer isn't a number but a category? Spam or not spam? Cat or dog?
  </Card>
</CardGroup>

### Part 2: Core Algorithms

<CardGroup cols={2}>
  <Card title="Module 4a: K-Nearest Neighbors" icon="people-arrows" href="/courses/ml-mastery/04a-knn">
    The simplest idea: find similar examples and use their answers. Intuitive yet powerful.
  </Card>

  <Card title="Module 5: Decision Trees" icon="tree" href="/courses/ml-mastery/05-decision-trees">
    How would YOU make decisions? ML trees do the same thing, just faster.
  </Card>

  <Card title="Module 5a: Support Vector Machines" icon="border-none" href="/courses/ml-mastery/05a-svm">
    Find the perfect boundary between classes with maximum margin.
  </Card>

  <Card title="Module 5b: Naive Bayes" icon="percent" href="/courses/ml-mastery/05b-naive-bayes">
    Probabilistic classification - surprisingly powerful for text data.
  </Card>

  <Card title="Module 6: Ensemble Methods" icon="layer-group" href="/courses/ml-mastery/06-ensemble-methods">
    What if we asked 100 models and took a vote? Random Forests and Gradient Boosting.
  </Card>

  <Card title="Module 7: Model Evaluation" icon="chart-column" href="/courses/ml-mastery/07-model-evaluation">
    How do you know if your model is actually good? Metrics beyond accuracy.
  </Card>
</CardGroup>

### Part 3: Professional Skills

<CardGroup cols={2}>
  <Card title="Module 8: Feature Engineering" icon="wand-magic-sparkles" href="/courses/ml-mastery/08-feature-engineering">
    The secret weapon. 80% of the magic is in data preparation.
  </Card>

  <Card title="Module 9: Hyperparameter Tuning" icon="sliders" href="/courses/ml-mastery/09-hyperparameter-tuning">
    Find the best settings for any model automatically.
  </Card>

  <Card title="Module 10: End-to-End Project" icon="rocket" href="/courses/ml-mastery/10-end-to-end-project">
    Build a complete ML project from start to finish.
  </Card>

  <Card title="Module 11: Clustering" icon="object-group" href="/courses/ml-mastery/11-clustering">
    Unsupervised learning: find groups when you don't have labels.
  </Card>
</CardGroup>

### Part 4: Advanced Topics

<CardGroup cols={2}>
  <Card title="Module 12: Neural Networks" icon="network-wired" href="/courses/ml-mastery/12-neural-networks">
    From biology to code: understand how deep learning works.
  </Card>

  <Card title="Module 13: Regularization" icon="shield-halved" href="/courses/ml-mastery/13-regularization">
    Fight overfitting with L1, L2, dropout, and more.
  </Card>

  <Card title="Module 14: Model Deployment" icon="cloud-arrow-up" href="/courses/ml-mastery/14-model-deployment">
    Take your model from notebook to production API.
  </Card>

  <Card title="Module 15: Time Series" icon="clock" href="/courses/ml-mastery/15-time-series">
    Predict the future from sequential data - trends, seasonality, forecasting.
  </Card>
</CardGroup>

### Part 5: Theory & Best Practices

<CardGroup cols={2}>
  <Card title="Module 16: Bias-Variance Tradeoff" icon="scale-balanced" href="/courses/ml-mastery/16-bias-variance">
    The fundamental tradeoff that governs all machine learning.
  </Card>

  <Card title="Module 17: Data Leakage" icon="droplet" href="/courses/ml-mastery/17-data-leakage">
    The silent killer of ML models in production - learn to avoid it.
  </Card>

  <Card title="Module 18: Dimensionality Reduction" icon="compress" href="/courses/ml-mastery/18-dimensionality-reduction">
    PCA, t-SNE, UMAP - handle high-dimensional data effectively.
  </Card>

  <Card title="Module 19: Capstone Project" icon="graduation-cap" href="/courses/ml-mastery/19-capstone-project">
    Build a complete ML system from problem definition to production.
  </Card>
</CardGroup>

### Part 6: Real-World Challenges

<CardGroup cols={2}>
  <Card title="Module 20: Imbalanced Data" icon="scale-unbalanced" href="/courses/ml-mastery/20-imbalanced-data">
    When 99% of data is one class - SMOTE, class weights, and resampling.
  </Card>

  <Card title="Module 21: Model Explainability" icon="lightbulb" href="/courses/ml-mastery/21-explainability">
    SHAP, LIME, feature importance - understand why models decide.
  </Card>

  <Card title="Module 22: ML Pipelines" icon="diagram-project" href="/courses/ml-mastery/22-ml-pipelines">
    Build reproducible, production-ready workflows with sklearn pipelines.
  </Card>

  <Card title="Module 23: Common Mistakes" icon="triangle-exclamation" href="/courses/ml-mastery/23-common-mistakes">
    Avoid the pitfalls that trip up even experienced practitioners.
  </Card>
</CardGroup>

***

## Math Prerequisites: We've Got You Covered

This course links to our math courses when needed. Don't worry - we explain the intuition first, then link to the math if you want to go deeper.

<CardGroup cols={3}>
  <Card title="Linear Algebra" icon="square-root-variable" href="/courses/math-for-ml-linear-algebra/01-introduction">
    Vectors, matrices, similarity measures - the language of data.
  </Card>

  <Card title="Calculus" icon="function" href="/courses/math-for-ml-calculus/00-introduction">
    Derivatives and gradients - how models learn.
  </Card>

  <Card title="Statistics" icon="chart-bar" href="/courses/statistics-for-ml/01-introduction">
    Probability and inference - understanding uncertainty.
  </Card>
</CardGroup>

***

## 🎯 Model Selection: When to Use What

**One of the biggest challenges in ML is choosing the right model.** Here's your decision framework:

<Accordion title="Quick Model Selection Guide" icon="sitemap">
  ### By Problem Type:

  | Your Problem                                  | First Try           | If It's Not Enough         | Advanced Option                 |
  | --------------------------------------------- | ------------------- | -------------------------- | ------------------------------- |
  | **Predict a number** (house prices)           | Linear Regression   | Random Forest Regressor    | Gradient Boosting (XGBoost)     |
  | **Predict a category** (spam/not spam)        | Logistic Regression | Random Forest Classifier   | Gradient Boosting or Neural Net |
  | **Group similar items** (customer segments)   | K-Means             | Hierarchical Clustering    | DBSCAN for weird shapes         |
  | **Find patterns in sequences** (stock prices) | ARIMA               | Prophet                    | LSTM Neural Network             |
  | **Images** (cat vs dog)                       | CNN (pretrained)    | Fine-tune ResNet           | Custom architecture             |
  | **Text** (sentiment analysis)                 | Naive Bayes         | BERT embeddings + Logistic | Fine-tune transformer           |

  ### By Dataset Size:

  | Dataset Size           | Best Approaches                         | Why                                                                                   |
  | ---------------------- | --------------------------------------- | ------------------------------------------------------------------------------------- |
  | **\< 1,000 rows**      | Simple models (Linear, Naive Bayes)     | Not enough data for complex models to generalize -- they'll memorize instead of learn |
  | **1,000-100,000 rows** | Tree ensembles (Random Forest, XGBoost) | Sweet spot for most algorithms; enough signal without needing GPU infrastructure      |
  | **> 100,000 rows**     | Deep learning becomes viable            | Enough data to learn complex patterns; XGBoost still often wins on tabular data       |
  | **Millions of rows**   | Neural networks, XGBoost with sampling  | Can exploit complex patterns, but watch for training time and diminishing returns     |

  ### By Interpretability Need:

  | Need to Explain Predictions?      | Use These                                 | Avoid These                        |
  | --------------------------------- | ----------------------------------------- | ---------------------------------- |
  | **Yes (healthcare, finance)**     | Linear models, Decision Trees, Rule-based | Deep neural nets, Ensemble methods |
  | **Somewhat (business reporting)** | Tree ensembles + SHAP                     | Black-box deep learning            |
  | **No (internal optimization)**    | Anything that works!                      | N/A                                |
</Accordion>

<Accordion title="Model Tradeoffs Cheat Sheet" icon="scale-balanced">
  ### Understanding the Tradeoffs:

  | Model                   | Accuracy | Speed | Interpretability | Handles Missing Data | Needs Feature Scaling |
  | ----------------------- | -------- | ----- | ---------------- | -------------------- | --------------------- |
  | **Linear Regression**   | ★★☆      | ★★★   | ★★★              | No                   | Yes                   |
  | **Logistic Regression** | ★★☆      | ★★★   | ★★★              | No                   | Yes                   |
  | **Decision Tree**       | ★★☆      | ★★★   | ★★★              | Yes                  | No                    |
  | **Random Forest**       | ★★★      | ★★☆   | ★☆☆              | Yes                  | No                    |
  | **XGBoost**             | ★★★      | ★★☆   | ★☆☆              | Yes                  | No                    |
  | **SVM**                 | ★★★      | ★☆☆   | ★☆☆              | No                   | Yes                   |
  | **KNN**                 | ★★☆      | ★☆☆   | ★★☆              | No                   | Yes                   |
  | **Neural Network**      | ★★★      | ★☆☆   | ★☆☆              | No                   | Yes                   |
  | **Naive Bayes**         | ★★☆      | ★★★   | ★★★              | Yes                  | No                    |

  ### Common Mistakes to Avoid:

  | Mistake                            | Why It's Bad                                | What to Do Instead                                                                                 |
  | ---------------------------------- | ------------------------------------------- | -------------------------------------------------------------------------------------------------- |
  | Starting with neural nets          | Overkill for tabular data, hard to debug    | Start with Random Forest/XGBoost -- they're the workhorses of Kaggle competitions for a reason     |
  | Ignoring baselines                 | Can't tell if your model is actually good   | Always compare to simple models; a "predict the mean" baseline catches embarrassing surprises      |
  | Tuning before feature engineering  | Features matter more than hyperparameters   | Get features right first -- a great feature beats a perfectly tuned model every time               |
  | Using accuracy for imbalanced data | 99% accuracy if you always predict majority | Use precision, recall, F1, AUC -- see Module 7 for the full breakdown                              |
  | Not looking at your data first     | You'll build models on garbage              | Always do EDA -- plot distributions, check for nulls, look at correlations before touching sklearn |
</Accordion>
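
The accuracy trap in the cheat sheet is easy to reproduce by hand. This is a minimal sketch on made-up labels (990 legitimate transactions and 10 fraudulent ones, which are assumptions for illustration): a "model" that always predicts the majority class scores 99% accuracy while catching none of the cases you care about.

```python
# Hypothetical labels: 990 legitimate transactions (0) and 10 fraudulent ones (1).
y_true = [0] * 990 + [1] * 10

# A "model" that always predicts the majority class.
y_pred = [0] * len(y_true)

# Accuracy: fraction of predictions that match the true label.
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

# Recall: fraction of the actual positives that the model caught.
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
actual_positives = sum(y_true)
recall = true_positives / actual_positives

print(f"accuracy = {accuracy:.2f}")  # accuracy = 0.99
print(f"recall = {recall:.2f}")      # recall = 0.00
```

The same always-predict-the-majority model also makes a useful baseline: any real classifier should beat it on recall, not just on accuracy.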

***

## The Philosophy: Math As Needed

We don't front-load math. Instead:

1. **You encounter a problem** (Why isn't my prediction getting better?)
2. **We show the intuition** (You need to find the "slope" that minimizes error)
3. **We link to the math** (That's what [derivatives](/courses/math-for-ml-calculus/01-derivatives) do!)
4. **You understand why it matters**

This way, you never wonder "why am I learning this?" — you know exactly why.
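
To make step 2 concrete, here is a minimal sketch of finding the slope that minimizes error by following the derivative downhill. The data, the learning rate, and the function names are assumptions chosen for illustration; the data follows `price = 2 * size` exactly, so the best slope is known in advance.

```python
# Made-up data that follows price = 2 * size exactly, so the best slope is 2.
sizes = [1.0, 2.0, 3.0]
prices = [2.0, 4.0, 6.0]


def mean_squared_error(slope):
    """Average squared gap between predictions (slope * size) and prices."""
    return sum((slope * x - y) ** 2 for x, y in zip(sizes, prices)) / len(sizes)


def error_derivative(slope):
    """Derivative of the mean squared error with respect to the slope."""
    return sum(2 * x * (slope * x - y) for x, y in zip(sizes, prices)) / len(sizes)


slope = 0.0           # start with a deliberately bad guess
learning_rate = 0.05  # how big a step to take each time

for step in range(100):
    # The derivative points uphill, so step in the opposite direction.
    slope -= learning_rate * error_derivative(slope)

print(round(slope, 4))                      # 2.0
print(round(mean_squared_error(slope), 4))  # 0.0
```

The derivative tells you which way to nudge the slope and roughly how hard, which is the question the calculus course answers in detail.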

***

## 🧹 Real-World Data: It's Never Clean

Textbook ML examples use clean, perfect datasets. Reality is different:

<Accordion title="Messy Data Problems We'll Tackle" icon="broom">
  | Real-World Problem      | Where We Cover It | What You'll Learn                              |
  | ----------------------- | ----------------- | ---------------------------------------------- |
  | **Missing values**      | Module 8, 10      | Imputation strategies, when to drop vs fill    |
  | **Outliers**            | Module 7, 8       | Detection methods, robust models               |
  | **Imbalanced classes**  | Module 20         | SMOTE, class weights, threshold tuning         |
  | **Feature types mixed** | Module 8          | Encoding categoricals, handling text + numbers |
  | **Data leakage**        | Module 17         | The silent killer of models in production      |
  | **Distribution shift**  | Module 14, 23     | When training ≠ production data                |
  | **Noisy labels**        | Module 7, 23      | Dealing with human labeling errors             |

  **Our approach**: Every end-to-end project uses *real* messy datasets. You'll learn to:

  1. **Diagnose** data quality issues before modeling
  2. **Clean** appropriately without destroying information
  3. **Validate** that your cleaning didn't introduce bias
  4. **Document** your decisions for reproducibility
</Accordion>
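
The four steps above can be sketched on a single column. This is a simplified illustration with made-up square-footage values and median imputation as the assumed cleaning strategy; the course projects may use different techniques.

```python
from statistics import median

# Hypothetical listings; None marks a missing square footage.
sqft = [1800, None, 2400, 1500, None, 3500]

# 1. Diagnose: how much is missing?
missing_count = sum(value is None for value in sqft)

# 2. Clean: fill the gaps with the median of the observed values.
observed = [value for value in sqft if value is not None]
fill_value = median(observed)
cleaned = [fill_value if value is None else value for value in sqft]

# 3. Validate: filling with the median should leave the median unchanged.
median_unchanged = median(cleaned) == fill_value

# 4. Document: record what was done so the step can be reproduced.
cleaning_log = {
    "column": "sqft",
    "strategy": "median",
    "fill_value": fill_value,
    "rows_filled": missing_count,
}

print(cleaned)       # [1800, 2100.0, 2400, 1500, 2100.0, 3500]
print(cleaning_log)
```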

<Note>
  **🔗 Math-to-ML Connection**: Throughout this course, you'll see explicit callouts like this showing how math concepts power ML algorithms:

  | Math Concept                  | ML Application                               |
  | ----------------------------- | -------------------------------------------- |
  | **Dot product**               | Similarity in KNN, attention in transformers |
  | **Matrix multiplication**     | Every neural network layer                   |
  | **Gradient**                  | How gradient-based models learn (backprop)   |
  | **Probability distributions** | Loss functions, Naive Bayes, uncertainty     |
  | **Eigenvalues**               | PCA for dimensionality reduction             |

  Look for the 🔗 symbol to see these connections!
</Note>
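
As a small taste of the first row, the sketch below uses a dot product to measure how similar two people's tastes are (cosine similarity). The rating vectors and names are made up for illustration.

```python
import math


def dot(a, b):
    """Dot product: multiply matching entries and add the results."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product of a and b, scaled by their lengths (1.0 = same direction)."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Hypothetical ratings for (action, comedy, drama).
alice = [5, 1, 0]
bob = [10, 2, 0]   # same taste as Alice, just rated on a bigger scale
carol = [0, 1, 5]  # mostly likes drama

print(round(cosine_similarity(alice, bob), 3))    # 1.0
print(round(cosine_similarity(alice, carol), 3))  # 0.038
```

Scaling by the vector lengths means Bob counts as identical to Alice even though his raw numbers are twice as large, which is often what you want when comparing preferences.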

***

## What You'll Build

By the end of this course, you'll have built:

| Project                      | What It Does                  | Skills Practiced                        |
| ---------------------------- | ----------------------------- | --------------------------------------- |
| **House Price Predictor**    | Estimate prices for any house | Linear regression, feature engineering  |
| **Email Spam Detector**      | Filter spam automatically     | Classification, Naive Bayes, thresholds |
| **Movie Recommender**        | Suggest similar movies        | KNN, distance metrics, similarity       |
| **Customer Churn Predictor** | Identify who might leave      | End-to-end pipeline, business impact    |
| **Customer Segments**        | Group similar customers       | Clustering, unsupervised learning       |
| **Stock Forecaster**         | Predict time series trends    | ARIMA, Prophet, feature engineering     |
| **Digit Recognizer**         | Classify handwritten digits   | Neural networks, deep learning intro    |
| **Production API**           | Deploy and monitor a model    | FastAPI, Docker, monitoring             |
| **Full Capstone**            | Complete churn system         | Problem to production pipeline          |

***

## 🎮 Interactive Learning Tools

<CardGroup cols={2}>
  <Card title="Scikit-Learn Playground" icon="flask" href="https://scikit-learn.org/stable/auto_examples/index.html">
    Interactive examples for every algorithm we cover. Run code directly in your browser.
  </Card>

  <Card title="TensorFlow Playground" icon="brain" href="https://playground.tensorflow.org/">
    Visualize neural networks learning in real-time. Adjust layers, neurons, and watch decision boundaries form.
  </Card>

  <Card title="Kaggle Notebooks" icon="notebook" href="https://www.kaggle.com/code">
    Free GPU-enabled notebooks with datasets. Perfect for practicing after each module.
  </Card>

  <Card title="MLflow Tracking" icon="chart-line" href="https://mlflow.org/">
    Track experiments like a pro. We'll use this in Modules 14+.
  </Card>
</CardGroup>

***

## 📚 Course Roadmap: Your 8-Week Journey

<Accordion title="Recommended Learning Schedule" icon="calendar">
  ### Week 1-2: Foundation (Modules 1-4)

  **Goal**: Understand what ML is and build your first models

  | Day | Module                           | Time | Outcome                                     |
  | --- | -------------------------------- | ---- | ------------------------------------------- |
  | 1-2 | Module 1: Prediction Game        | 3h   | Build model from scratch, no libraries      |
  | 3-4 | Module 2: Learning From Mistakes | 3h   | Understand loss functions, gradient descent |
  | 5-6 | Module 3: Linear Regression      | 4h   | Complete regression with scikit-learn       |
  | 7-8 | Module 4: Classification         | 4h   | Logistic regression, spam detector          |

  ### Week 3-4: Core Algorithms (Modules 4a-7)

  **Goal**: Master the fundamental ML algorithms

  | Day   | Module                          | Time | Outcome                               |
  | ----- | ------------------------------- | ---- | ------------------------------------- |
  | 9-10  | Module 4a-5: KNN & Trees        | 4h   | Two intuitive classifiers             |
  | 11-12 | Module 5a-5b: SVM & Naive Bayes | 4h   | Two more powerful classifiers         |
  | 13-14 | Module 6: Ensemble Methods      | 4h   | Random Forest, Gradient Boosting      |
  | 15-16 | Module 7: Model Evaluation      | 4h   | Metrics, cross-validation, comparison |

  ### Week 5-6: Professional Skills (Modules 8-14)

  **Goal**: Learn real-world ML practices

  | Day   | Module                                    | Time | Outcome                            |
  | ----- | ----------------------------------------- | ---- | ---------------------------------- |
  | 17-18 | Module 8: Feature Engineering             | 4h   | Transform raw data to features     |
  | 19-20 | Module 9-10: Tuning & End-to-End          | 6h   | Complete ML project                |
  | 21-22 | Module 11-12: Clustering & NNs            | 5h   | Unsupervised + deep learning intro |
  | 23-24 | Module 13-14: Regularization & Deployment | 5h   | Production-ready models            |

  ### Week 7-8: Advanced & Capstone (Modules 15-23)

  **Goal**: Handle real-world challenges, build portfolio project

  | Day   | Module                                             | Time | Outcome                  |
  | ----- | -------------------------------------------------- | ---- | ------------------------ |
  | 25-26 | Modules 15-17: Time Series, Bias-Variance, Leakage | 5h   | Advanced concepts        |
  | 27-28 | Modules 18, 20-21: PCA, Imbalanced, Explainability | 5h   | Real-world challenges    |
  | 29-30 | Modules 22-23: Pipelines, Common Mistakes          | 4h   | Best practices           |
  | 31-32 | Module 19: Capstone Project                        | 8h   | Complete portfolio piece |

  **Total: \~60 hours over 8 weeks (7-8 hours/week)**
</Accordion>

***

## ⚡ Quick Start: Environment Setup

```bash theme={null}
# Create a virtual environment
python -m venv ml-mastery-env

# Activate it (Windows)
ml-mastery-env\Scripts\activate

# Activate it (Mac/Linux)
source ml-mastery-env/bin/activate

# Install dependencies
pip install numpy pandas matplotlib seaborn scikit-learn jupyter
pip install xgboost lightgbm catboost  # Gradient boosting
pip install plotly ipywidgets          # Interactive visualizations
pip install mlflow                     # Experiment tracking

# Start Jupyter
jupyter notebook
```
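
Once the installs finish, a quick check like the following can confirm which packages Python can actually find. It is a minimal sketch using only the standard library; the names in the list are the usual import names for the packages installed above (for example, scikit-learn is imported as `sklearn`), so adjust it if your setup differs.

```python
from importlib.util import find_spec

# Import names for the packages installed above (scikit-learn imports as sklearn).
packages = [
    "numpy", "pandas", "matplotlib", "seaborn", "sklearn",
    "xgboost", "lightgbm", "catboost", "plotly", "ipywidgets", "mlflow",
]

# find_spec returns None when a top-level package cannot be found.
missing = [name for name in packages if find_spec(name) is None]

if missing:
    print("Not found:", ", ".join(missing))
else:
    print("All packages found.")
```

Run it from the activated virtual environment; anything listed as not found can be installed again with `pip install`.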

<Tip>
  **Pro Tip**: Use Google Colab if you don't want to set up locally. It's free, has GPU support, and comes with most of these libraries pre-installed!
</Tip>

***

## Prerequisites Check

You're ready if you can:

```python theme={null}
# 1. Write a function
def calculate_average(numbers):
    total = 0
    for num in numbers:
        total += num
    return total / len(numbers)

# 2. Work with lists
prices = [250000, 300000, 450000]
print(calculate_average(prices))  # 333333.3333333333

# 3. Use basic conditionals
price = prices[-1]  # the last price in the list: 450000
if price > 400000:
    print("Expensive!")
```

If that looks familiar, you're good to go.

<Accordion title="🧪 Diagnostic Quiz: Test Your Readiness" icon="flask">
  **Answer these questions to gauge your preparation:**

  **1. Python Basics**

  ```python theme={null}
  data = [3, 1, 4, 1, 5, 9, 2, 6]
  result = [x * 2 for x in data if x > 3]
  print(result)
  ```

  <details>
    <summary>What does this print?</summary>
    `[8, 10, 18, 12]` - It doubles numbers greater than 3. If you got this, your Python is ready!
  </details>

  **2. Math Intuition**

  If a house with 2000 sq ft costs \$400,000, and a house with 3000 sq ft costs \$500,000, what might a 2500 sq ft house cost?

  <details>
    <summary>Answer</summary>
    Around \$450,000 (linear interpolation). If you reasoned this way, you already have ML intuition!
  </details>

  **3. Data Thinking**

  You have 1000 emails labeled spam/not-spam. 950 are not spam, 50 are spam. A model that always predicts "not spam" gets 95% accuracy. Is this model good?

  <details>
    <summary>Answer</summary>
    No! It catches 0% of actual spam. You need to look at precision/recall for imbalanced data. We cover this in Module 7.
  </details>

  **Remediation Paths:**

  | If you struggled with... | Do this first                                       |
  | ------------------------ | --------------------------------------------------- |
  | Python syntax            | [Python Crash Course](/courses/python-crash-course) |
  | List operations          | Python Crash Course - Lists section                 |
  | Math intuition           | Proceed! We'll teach what you need                  |
</Accordion>

***

## Ready?

<CardGroup cols={1}>
  <Card title="Start Module 1: The Prediction Game" icon="play" href="/courses/ml-mastery/01-prediction-game">
    Let's predict some house prices. No libraries, no frameworks, just logic and arithmetic.
  </Card>
</CardGroup>

***

## 📖 Additional Resources

<Accordion title="Books, Courses, and Communities" icon="book">
  **Books**

  * *Hands-On Machine Learning with Scikit-Learn and TensorFlow* by Aurélien Géron - The practical bible
  * *The Hundred-Page Machine Learning Book* by Andriy Burkov - Concise theory
  * *Pattern Recognition and Machine Learning* by Christopher Bishop - Deep theory (advanced)

  **Practice Platforms**

  * **Kaggle**: Competitions, datasets, notebooks (kaggle.com)
  * **HuggingFace**: Models, datasets, demos (huggingface.co)
  * **Papers With Code**: Research with implementation (paperswithcode.com)

  **Communities**

  * **r/MachineLearning**: Research and news
  * **r/learnmachinelearning**: Beginner-friendly
  * **ML Discord servers**: Real-time help
  * **Local ML Meetups**: Networking

  **YouTube Channels**

  * **StatQuest**: Best visual explanations
  * **3Blue1Brown**: Math intuition
  * **Yannic Kilcher**: Paper reviews
  * **Two Minute Papers**: Latest research
</Accordion>
