Hyperparameter Tuning

Parameters vs Hyperparameters

Parameters: Learned from data during training

Weights in linear regression
Split points in decision trees

Hyperparameters: Set before training

Learning rate
Number of trees in Random Forest
Maximum depth of trees

You choose hyperparameters. The model learns parameters.
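A minimal sketch of the distinction, assuming scikit-learn's `LogisticRegression` on the breast cancer dataset (the model choice and the variable names `features`, `labels`, and `clf` are illustrative, not from the original lesson). The hyperparameter `C` is passed in before training; the weights in `coef_` only exist after `fit()`.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

features, labels = load_breast_cancer(return_X_y=True)
features = StandardScaler().fit_transform(features)

# Hyperparameters: chosen by you, passed to the constructor before training
clf = LogisticRegression(C=0.5, max_iter=1000)
print(clf.get_params()["C"])   # 0.5

# Parameters: learned by fit() and stored in attributes ending in "_"
clf.fit(features, labels)
print(clf.coef_.shape)         # (1, 30): one weight per input feature
```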

The Tuning Problem

A Random Forest has many hyperparameters:

from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier(
    n_estimators=100,       # How many trees?
    max_depth=10,           # How deep?
    min_samples_split=2,    # Min samples to split?
    min_samples_leaf=1,     # Min samples in leaf?
    max_features='sqrt',    # Features per split?
    bootstrap=True,         # Sample with replacement?
    random_state=42
)

How do you find the best combination?

Grid Search: Try Everything

Define a grid of values and try every combination:

from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Load data
cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, test_size=0.2, random_state=42
)

# Define parameter grid
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [5, 10, 15, None],
    'min_samples_split': [2, 5, 10]
}

# Total combinations: 3 × 4 × 3 = 36

# Grid search
grid_search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    cv=5,               # 5-fold cross-validation
    scoring='accuracy', # Metric to optimize
    n_jobs=-1,          # Use all CPU cores
    verbose=1           # Show progress
)

grid_search.fit(X_train, y_train)

# Results
print(f"Best parameters: {grid_search.best_params_}")
print(f"Best CV score: {grid_search.best_score_:.4f}")
print(f"Test score: {grid_search.score(X_test, y_test):.4f}")
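The combination count noted in the grid can be checked with scikit-learn's `ParameterGrid`, which enumerates the same candidates `GridSearchCV` evaluates. This is a small sketch that repeats the grid so it runs on its own; `n_candidates` and `n_folds` are illustrative names.

```python
from sklearn.model_selection import ParameterGrid

param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [5, 10, 15, None],
    'min_samples_split': [2, 5, 10]
}

n_candidates = len(ParameterGrid(param_grid))
n_folds = 5

print(n_candidates)            # 36
print(n_candidates * n_folds)  # 180 cross-validation fits, plus one final refit
```

Multiplying by the number of folds gives a quick cost estimate before launching a long search.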

Visualizing Grid Search Results

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

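# Get results as DataFrame
results = pd.DataFrame(grid_search.cv_results_)
results = results[['param_n_estimators', 'param_max_depth', 'mean_test_score', 'std_test_score']].copy()
print(results.sort_values('mean_test_score', ascending=False).head(10))

# max_depth=None would be dropped by pivot_table as a missing key, so label it
results['param_max_depth'] = results['param_max_depth'].apply(
    lambda d: 'None' if pd.isna(d) else str(int(d))
)

# Heatmap for 2 parameters (scores are averaged over min_samples_split)
pivot = results.pivot_table(
    values='mean_test_score',
    index='param_max_depth',
    columns='param_n_estimators'
).reindex(['5', '10', '15', 'None'])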

plt.figure(figsize=(10, 6))
plt.imshow(pivot, cmap='viridis', aspect='auto')
plt.colorbar(label='Mean CV Score')
plt.xlabel('n_estimators')
plt.ylabel('max_depth')
plt.xticks(range(len(pivot.columns)), pivot.columns)
plt.yticks(range(len(pivot.index)), pivot.index)
plt.title('Grid Search Results')

# Annotate
for i in range(len(pivot.index)):
    for j in range(len(pivot.columns)):
        plt.text(j, i, f'{pivot.iloc[i, j]:.3f}', ha='center', va='center', color='white')

plt.tight_layout()
plt.show()

Random Search: Smart Sampling

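Grid search has a problem: the number of combinations grows exponentially with the number of hyperparameters.

With 5 hyperparameters and 5 values each, the grid has 5^5 = 3,125 combinations, and 5-fold cross-validation turns that into 15,625 model fits.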

Random Search samples randomly from parameter distributions:

from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import randint

# Define parameter distributions
# Note: randint(low, high) samples integers from low up to high - 1
param_distributions = {
    'n_estimators': randint(50, 300),           # Integer from 50 to 299
    'max_depth': randint(3, 20),                # Integer from 3 to 19
    'min_samples_split': randint(2, 15),        # Integer from 2 to 14
    'min_samples_leaf': randint(1, 10),         # Integer from 1 to 9
    'max_features': ['sqrt', 'log2', None]      # Categorical
}

# Random search
random_search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions,
    n_iter=50,          # Try 50 random combinations
    cv=5,
    scoring='accuracy',
    n_jobs=-1,
    random_state=42,
    verbose=1
)

random_search.fit(X_train, y_train)

print(f"Best parameters: {random_search.best_params_}")
print(f"Best CV score: {random_search.best_score_:.4f}")

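Bergstra and Bengio (2012, "Random Search for Hyper-Parameter Optimization") found that random search often finds good hyperparameters with fewer trials than grid search, especially when some hyperparameters matter more than others. Each random trial tests a new value of every hyperparameter, while a grid keeps repeating the same few values.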
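A toy illustration of why this happens, assuming two hyperparameters scaled to the range 0 to 1 and a budget of 9 trials (both numbers are made up for the example). If only the first hyperparameter affects the score, what counts is how many distinct values of it get tested.

```python
import numpy as np

rng = np.random.default_rng(0)
budget = 9  # same number of trials for both strategies

# Grid: 3 values per hyperparameter -> 3 x 3 = 9 trials
grid_values = np.linspace(0.0, 1.0, 3)
grid_trials = np.array([(a, b) for a in grid_values for b in grid_values])

# Random: 9 independent draws from the same unit square
random_trials = rng.uniform(0.0, 1.0, size=(budget, 2))

# Count the distinct settings of the first hyperparameter each strategy tested
distinct_grid = len(np.unique(grid_trials[:, 0]))
distinct_random = len(np.unique(random_trials[:, 0]))

print(distinct_grid, distinct_random)  # 3 9
```

The grid spends two thirds of its budget re-testing values of the important hyperparameter it has already seen.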

Bayesian Optimization: Learn from History

Instead of random sampling, use past results to guide the search:

# pip install scikit-optimize
from skopt import BayesSearchCV
from skopt.space import Integer, Real, Categorical

# Define search space
search_space = {
    'n_estimators': Integer(50, 300),
    'max_depth': Integer(3, 20),
    'min_samples_split': Integer(2, 15),
    'min_samples_leaf': Integer(1, 10),
    'max_features': Categorical(['sqrt', 'log2', None])
}

# Bayesian search
bayes_search = BayesSearchCV(
    RandomForestClassifier(random_state=42),
    search_space,
    n_iter=50,
    cv=5,
    scoring='accuracy',
    n_jobs=-1,
    random_state=42,
    verbose=1
)

bayes_search.fit(X_train, y_train)

print(f"Best parameters: {bayes_search.best_params_}")
print(f"Best CV score: {bayes_search.best_score_:.4f}")

How Bayesian Optimization Works

1. Try some random points
2. Build a model of: parameter values → score
3. Use model to find promising regions
4. Evaluate and update model
5. Repeat

Balances exploration (try new areas) and exploitation (focus on promising areas).
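The loop described above can be sketched in a few lines. This is a simplified illustration, not how scikit-optimize or Optuna are implemented: it assumes a single hyperparameter `x`, a made-up `toy_score` function standing in for an expensive cross-validation score, a Gaussian process as the model, and expected improvement as the rule for picking the next point.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def toy_score(x):
    # Hypothetical stand-in for an expensive CV score; its peak is at x = 2
    return -(x - 2.0) ** 2

def expected_improvement(candidates, gp, best_score, xi=0.01):
    # High where the predicted mean is good (exploitation)
    # or the prediction is uncertain (exploration)
    mean, std = gp.predict(candidates.reshape(-1, 1), return_std=True)
    std = np.maximum(std, 1e-9)  # avoid division by zero
    improvement = mean - best_score - xi
    z = improvement / std
    return improvement * norm.cdf(z) + std * norm.pdf(z)

rng = np.random.default_rng(0)
candidates = np.linspace(0.0, 5.0, 501)

# 1. Try some random points
tried_x = list(rng.uniform(0.0, 5.0, size=3))
tried_y = [toy_score(x) for x in tried_x]

for _ in range(12):
    # 2. Build a model of: parameter value -> score
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True, alpha=1e-6)
    gp.fit(np.array(tried_x).reshape(-1, 1), np.array(tried_y))

    # 3. Use the model to find the most promising candidate
    ei = expected_improvement(candidates, gp, max(tried_y))
    next_x = candidates[np.argmax(ei)]

    # 4. Evaluate it and update the history; 5. repeat
    tried_x.append(next_x)
    tried_y.append(toy_score(next_x))

best_x = tried_x[int(np.argmax(tried_y))]
print(f"Best x: {best_x:.2f}, score: {max(tried_y):.4f}")
```

The `xi` term is a small, arbitrary margin that nudges the search toward exploration; larger values make it try unexplored regions more often.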

Optuna: Modern Hyperparameter Tuning

# pip install optuna
import optuna
from sklearn.model_selection import cross_val_score

def objective(trial):
    # Suggest hyperparameters
    params = {
        'n_estimators': trial.suggest_int('n_estimators', 50, 300),
        'max_depth': trial.suggest_int('max_depth', 3, 20),
        'min_samples_split': trial.suggest_int('min_samples_split', 2, 15),
        'min_samples_leaf': trial.suggest_int('min_samples_leaf', 1, 10),
        'max_features': trial.suggest_categorical('max_features', ['sqrt', 'log2', None])
    }
    
    # Create and evaluate model
    model = RandomForestClassifier(**params, random_state=42)
    scores = cross_val_score(model, X_train, y_train, cv=5, scoring='accuracy')
    
    return scores.mean()

# Run optimization
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=50, show_progress_bar=True)

print(f"Best trial: {study.best_trial.params}")
print(f"Best value: {study.best_value:.4f}")

# Visualization (each call returns a Plotly figure; requires plotly)
optuna.visualization.plot_optimization_history(study).show()
optuna.visualization.plot_param_importances(study).show()

Practical Tips

1. Start Coarse, Then Refine

# Step 1: Coarse search
param_grid_coarse = {
    'n_estimators': [50, 100, 200],
    'max_depth': [5, 10, 20, None]
}

grid_coarse = GridSearchCV(model, param_grid_coarse, cv=3)
grid_coarse.fit(X_train, y_train)
print(grid_coarse.best_params_)  # Suppose this gives n_estimators=100, max_depth=10

# Step 2: Fine search around best
param_grid_fine = {
    'n_estimators': [80, 100, 120],
    'max_depth': [8, 10, 12]
}

grid_fine = GridSearchCV(model, param_grid_fine, cv=5)
grid_fine.fit(X_train, y_train)

2. Use Early Stopping for Speed

from sklearn.ensemble import GradientBoostingClassifier

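# Only tune what matters most. A separate name keeps the Random Forest
# param_grid defined earlier intact for the snippets that follow.
gb_param_grid = {
    'n_estimators': [100, 200, 500],
    'learning_rate': [0.01, 0.1, 0.3],
    'max_depth': [3, 5, 7]
}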
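The grid above lists what to tune but does not itself stop training early. One way to get early stopping in scikit-learn, sketched here with `GradientBoostingClassifier`'s `n_iter_no_change` and `validation_fraction` options, is to set a generous cap on `n_estimators` and let the model stop adding trees once an internal validation score stops improving. The specific values and the variable names (`X_tr`, `gb`, and so on) are illustrative.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X_all, y_all = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X_all, y_all, test_size=0.2, random_state=42)

gb = GradientBoostingClassifier(
    n_estimators=500,         # Upper limit on boosting rounds
    learning_rate=0.1,
    max_depth=3,
    validation_fraction=0.1,  # Held out from the training data internally
    n_iter_no_change=10,      # Stop after 10 rounds without improvement
    random_state=42
)
gb.fit(X_tr, y_tr)

print(f"Trees actually fitted: {gb.n_estimators_} of {gb.n_estimators}")
print(f"Test accuracy: {gb.score(X_te, y_te):.4f}")
```

With early stopping in place, `n_estimators` can usually be left out of the grid, which shrinks the search by a factor of three in the grid above.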

3. Different Metrics for Different Problems

from sklearn.model_selection import GridSearchCV

# Classification
scoring_classification = ['accuracy', 'f1', 'roc_auc', 'precision', 'recall']

# Use refit to choose final model
grid = GridSearchCV(
    model, 
    param_grid,
    cv=5,
    scoring=scoring_classification,
    refit='f1'  # Final model is refit with the parameters that score best on F1
)
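With several metrics, `cv_results_` holds one `mean_test_<metric>` and one `rank_test_<metric>` entry per scorer. The sketch below shows how to read them; it assumes a small decision tree grid purely to keep the run fast, and `multi_grid` and `columns` are illustrative names.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X_all, y_all = load_breast_cancer(return_X_y=True)

multi_grid = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    {'max_depth': [2, 4, 8]},
    cv=5,
    scoring=['accuracy', 'f1', 'roc_auc'],
    refit='f1'
)
multi_grid.fit(X_all, y_all)

# One column per metric, so the trade-offs between candidates are visible
columns = ['param_max_depth', 'mean_test_accuracy', 'mean_test_f1',
           'mean_test_roc_auc', 'rank_test_f1']
print(pd.DataFrame(multi_grid.cv_results_)[columns])

# best_params_ and best_score_ follow the metric named in refit
print(multi_grid.best_params_, f"{multi_grid.best_score_:.4f}")
```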

4. Nested Cross-Validation

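The best CV score from a search is optimistically biased, because the same folds were used to pick the hyperparameters. Nested cross-validation wraps the search in an outer loop to get a less biased evaluation of the tuning process: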

from sklearn.model_selection import cross_val_score, GridSearchCV

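# Inner loop: tune hyperparameters (a Random Forest grid, defined here so the
# snippet does not depend on which param_grid was assigned last)
rf_param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [5, 10, None]
}
inner_cv = GridSearchCV(
    RandomForestClassifier(random_state=42),
    rf_param_grid,
    cv=5
)

# Outer loop: evaluate the whole tuning process on the full dataset
X, y = cancer.data, cancer.target
outer_scores = cross_val_score(inner_cv, X, y, cv=5)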
print(f"Nested CV Score: {outer_scores.mean():.4f} (+/- {outer_scores.std():.4f})")

Common Hyperparameters by Model

Random Forest

{
    'n_estimators': [100, 200, 500],
    'max_depth': [5, 10, 15, None],
    'min_samples_split': [2, 5, 10],
    'max_features': ['sqrt', 'log2']
}

Gradient Boosting / XGBoost

{
    'n_estimators': [100, 200, 500],
    'learning_rate': [0.01, 0.1, 0.3],
    'max_depth': [3, 5, 7],
    'subsample': [0.8, 1.0],
    'colsample_bytree': [0.8, 1.0]  # XGBoost
}

SVM

{
    'C': [0.1, 1, 10, 100],
    'gamma': ['scale', 'auto', 0.1, 1],
    'kernel': ['rbf', 'poly']
}
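SVMs are sensitive to feature scale, so in practice a grid like this is usually searched over a pipeline that scales the data first. In that case the parameter names need a step prefix. The sketch below assumes `make_pipeline`, which names the step `svc` after the class, and uses a trimmed RBF-only version of the grid to keep the run short; `svm_pipeline`, `svm_grid`, and `svm_search` are illustrative names.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_all, y_all = load_breast_cancer(return_X_y=True)

# Scaling happens inside each CV fold, so validation data never leaks into it
svm_pipeline = make_pipeline(StandardScaler(), SVC(kernel='rbf'))

# Pipeline parameters are addressed as <step name>__<parameter name>
svm_grid = {
    'svc__C': [0.1, 1, 10, 100],
    'svc__gamma': ['scale', 0.01, 0.1]
}

svm_search = GridSearchCV(svm_pipeline, svm_grid, cv=5)
svm_search.fit(X_all, y_all)

print(svm_search.best_params_)
print(f"Best CV score: {svm_search.best_score_:.4f}")
```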

Neural Networks

{
    'hidden_layer_sizes': [(50,), (100,), (50, 50), (100, 50)],
    'alpha': [0.0001, 0.001, 0.01],
    'learning_rate_init': [0.001, 0.01]
}

🚀 Mini Projects

Project 1: Search Strategy Comparison

Compare Grid, Random, and Bayesian search on the same problem.

Project 2: Learning Curve Analyzer

Diagnose underfitting vs overfitting: use learning curves to determine if more data or different hyperparameters would help.

Project 3: Custom Hyperparameter Optimizer

Build your own optimization algorithm: a simple Bayesian-style optimizer from scratch.

Project 4: Auto-ML Mini Framework

Create an automated model tuning system that handles multiple models.

Key Takeaways

Grid Search: Exhaustive but slow. Good for small search spaces.

Random Search: Often matches or beats grid search with fewer trials. Use for larger spaces.

Bayesian Optimization: Uses past trials to choose the next one. Best when each evaluation is expensive.

Nested CV: Gives a less biased estimate of how well the whole tuning procedure generalizes.

What’s Next?

You’ve learned individual algorithms. Now let’s see how to tackle real-world ML projects end-to-end!

Continue to Module 10: End-to-End ML Project

Apply everything in a complete machine learning project
