# Classification

> Predict categories - spam or not spam, cat or dog, buy or don't buy

<Frame>
  <img src="https://mintcdn.com/devweeekends/1cs3K7TO-w20cKuc/images/courses/ml-mastery/classification-concept.svg?fit=max&auto=format&n=1cs3K7TO-w20cKuc&q=85&s=ce526be49eb5a773732392195d0d92df" alt="Classification - Decision Boundary" width="1080" height="1080" data-path="images/courses/ml-mastery/classification-concept.svg" />
</Frame>

## A Different Kind of Prediction

In regression, we predict numbers: *"This house costs \$450,000"*

In classification, we predict categories: *"This email is SPAM"*

**Real-world classification problems**:

* Is this transaction fraudulent? (Yes/No)
* What digit is in this image? (0-9)
* Will this customer buy? (Yes/No)
* What disease does this patient have? (A, B, C, D)
* Is this review positive or negative? (Positive/Negative)

<Frame>
  <img src="https://mintcdn.com/devweeekends/1cs3K7TO-w20cKuc/images/courses/ml-mastery/classification-real-world.svg?fit=max&auto=format&n=1cs3K7TO-w20cKuc&q=85&s=3a1323a978a7ce5a65bb9502d89425d7" alt="Medical Diagnosis Classification" width="1080" height="1080" data-path="images/courses/ml-mastery/classification-real-world.svg" />
</Frame>

***

## The Email Spam Problem

Let's build a spam detector from scratch.

### The Data

Imagine each email is represented by features:

* Number of exclamation marks
* Contains word "FREE"
* Contains word "WINNER"
* Sender in contacts
* Length of email

```python theme={null}
import numpy as np

# Email features: [exclamation_count, has_free, has_winner, in_contacts, length_bucket]
# Labels: 0 = not spam, 1 = spam

emails = np.array([
    [5, 1, 1, 0, 1],   # Short, has FREE and WINNER, lots of !!! -> likely spam
    [0, 0, 0, 1, 3],   # Long, from contact, no sketchy words -> not spam
    [3, 1, 0, 0, 1],   # Has FREE, some !!! -> maybe spam
    [0, 0, 0, 1, 2],   # From contact -> not spam
    [10, 1, 1, 0, 1],  # Very spammy
    [1, 0, 0, 1, 3],   # Normal email from contact
    [8, 1, 1, 0, 1],   # Spammy
    [0, 0, 0, 0, 2],   # Normal email
])

labels = np.array([1, 0, 1, 0, 1, 0, 1, 0])  # 1=spam, 0=not spam
```

### Why Not Just Use Linear Regression?

Let's try:

```python theme={null}
from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(emails, labels)

# Predict
predictions = model.predict(emails)
print("Predictions:", predictions)
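# Illustrative output for noisier data. On this toy set `has_free` matches
# the label exactly, so your own fit will land almost exactly on 0 and 1.
# [0.89, 0.12, 0.67, 0.15, 1.12, 0.18, 0.95, 0.22]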
```

**Problems**:

1. Predictions can be > 1 or \< 0 (what does 1.12 "spam" mean?)
2. We want probabilities (0 to 1), not arbitrary numbers
3. We want a clear decision: spam or not spam

***

## The Sigmoid Function: Squashing to Probabilities

We need a function that:

* Takes any number (from -∞ to +∞)
* Outputs a value between 0 and 1
* Acts like a probability

Enter the **sigmoid function** -- nature's favorite dimmer switch:

$$
\sigma(z) = \frac{1}{1 + e^{-z}}
$$

Think of it like a confidence meter. The linear model produces a raw score (could be -47 or +312), and sigmoid translates it into "how confident are we?" on a 0-to-1 scale. Very negative scores become "almost certainly not spam" (near 0), and very positive scores become "almost certainly spam" (near 1). Zero is the tipping point -- 50/50.

```python theme={null}
def sigmoid(z):
    """Squash any number to range (0, 1)"""
    return 1 / (1 + np.exp(-z))

# Test it
for z in [-10, -2, 0, 2, 10]:
    print(f"sigmoid({z:3d}) = {sigmoid(z):.4f}")
```

**Output**:

```
sigmoid(-10) = 0.0000  # Very negative -> close to 0
sigmoid( -2) = 0.1192  # Negative -> small
sigmoid(  0) = 0.5000  # Zero -> 0.5 (uncertain)
sigmoid(  2) = 0.8808  # Positive -> close to 1
sigmoid( 10) = 1.0000  # Very positive -> close to 1
```
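
One practical caveat: for very negative inputs, `np.exp(-z)` overflows and NumPy emits a warning. The following is a minimal sketch of a numerically safer variant; the name `stable_sigmoid` is ours, not part of any library. It splits the input by sign so that `np.exp` only ever sees non-positive arguments.

```python
import numpy as np

def stable_sigmoid(z):
    """Sigmoid that avoids overflow in np.exp for large |z|."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    out = np.empty_like(z)
    pos = z >= 0
    # For z >= 0, exp(-z) is at most 1, so the usual formula is safe.
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # For z < 0, use the equivalent form exp(z) / (1 + exp(z)),
    # so exp is never called on a large positive number.
    exp_z = np.exp(z[~pos])
    out[~pos] = exp_z / (1.0 + exp_z)
    return out

values = stable_sigmoid([-1000.0, 0.0, 1000.0])
print(values)
```

The extreme inputs come back as 0 and 1 without warnings, and 0 still maps to 0.5. If SciPy is available, `scipy.special.expit` offers the same function ready-made.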

***

## Logistic Regression

Combine linear regression with sigmoid:

$$
P(\text{spam}) = \sigma(w_0 + w_1 x_1 + w_2 x_2 + \dots + w_n x_n)
$$

1. Compute a weighted sum (like linear regression)
2. Pass through sigmoid to get a probability
3. If the probability is at least 0.5, predict "spam"; otherwise predict "not spam"

```python theme={null}
def logistic_regression_predict_proba(X, w):
    """
    Predict probability of class 1.

    X must already include a leading column of ones so that w[0]
    acts as the bias term w_0.
    """
    z = X @ w  # Linear combination
    return sigmoid(z)  # Squash to probability

def logistic_regression_predict(X, w, threshold=0.5):
    """
    Predict class labels (0 or 1).
    """
    probabilities = logistic_regression_predict_proba(X, w)
    return (probabilities >= threshold).astype(int)
```
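
To make the three steps concrete, here is a small worked sketch. The weights in `w_demo` are hypothetical values picked by hand for illustration; they are not learned from data. The emails use the same feature layout as the toy dataset.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Hypothetical weights, hand-picked for illustration (NOT learned).
# Order: [bias, exclamation_count, has_free, has_winner, in_contacts, length_bucket]
w_demo = np.array([-1.0, 0.3, 1.5, 1.0, -2.0, -0.2])

# Two emails in the same feature layout as the toy dataset.
X_demo = np.array([
    [5, 1, 1, 0, 1],   # FREE, WINNER, lots of exclamation marks
    [0, 0, 0, 1, 3],   # long email from a contact
])

# Step 0: prepend a column of ones so w_demo[0] acts as the bias w_0.
X_demo_bias = np.column_stack([np.ones(len(X_demo)), X_demo])

z_demo = X_demo_bias @ w_demo                 # Step 1: weighted sum
p_spam = sigmoid(z_demo)                      # Step 2: squash to a probability
labels_pred = (p_spam >= 0.5).astype(int)     # Step 3: apply the threshold

for score, p, label in zip(z_demo, p_spam, labels_pred):
    print(f"z = {score:+.2f} -> P(spam) = {p:.3f} -> predicted class {label}")
```

With these made-up weights the first email gets a raw score of 2.8 (probability roughly 0.94, so spam) and the second gets -3.6 (probability roughly 0.03, so not spam).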

***

## Training Logistic Regression

### The Loss Function

For classification, we use **Binary Cross-Entropy** (log loss):

$$
L = -\frac{1}{n} \sum_{i=1}^{n} [y_i \log(\hat{y}_i) + (1-y_i) \log(1-\hat{y}_i)]
$$

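**Why not use MSE like in regression?** Combined with sigmoid, MSE gives a non-convex loss surface, and its gradient shrinks toward zero whenever the sigmoid saturates, even when the prediction is confidently wrong, so gradient descent can stall. Cross-entropy is convex for logistic regression, and its gradient stays proportional to the prediction error, which pushes the model to fix its confident-but-wrong predictions aggressively.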

**Intuition** -- think of it as a "surprise" score:

* If actual is 1 and we predict 0.9 -- small loss (not surprised, good prediction!)
* If actual is 1 and we predict 0.1 -- large loss (very surprised, terrible prediction!)
* If actual is 1 and we predict 0.001 -- *enormous* loss (the log function explodes as predictions approach 0, heavily penalizing confident wrong answers)

```python theme={null}
def binary_cross_entropy(y_true, y_pred):
    """
    Compute binary cross-entropy loss.
    """
    # Clip predictions to avoid log(0)
    epsilon = 1e-15
    y_pred = np.clip(y_pred, epsilon, 1 - epsilon)
    
    loss = -np.mean(
        y_true * np.log(y_pred) + 
        (1 - y_true) * np.log(1 - y_pred)
    )
    return loss
```
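
The "surprise" intuition is easy to check numerically. This short sketch computes the per-example loss for the three predictions discussed above, assuming the actual label is 1 in each case, so the loss reduces to `-log(prediction)`.

```python
import numpy as np

# Actual label is 1 in every case, so the loss for one example is -log(p).
predicted_probs = [0.9, 0.1, 0.001]
losses = [-np.log(p) for p in predicted_probs]

for p, loss in zip(predicted_probs, losses):
    print(f"predicted {p:>5}: loss = {loss:.3f}")
```

The losses come out at about 0.105, 2.303, and 6.908: going from a good prediction to a confidently wrong one raises the penalty by a factor of more than sixty.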

### Gradient Descent for Logistic Regression

```python theme={null}
def train_logistic_regression(X, y, learning_rate=0.1, num_epochs=1000):
    """
    Train logistic regression using gradient descent.
    """
    # Add bias column
    X_bias = np.column_stack([np.ones(len(X)), X])
    
    # Initialize weights
    w = np.zeros(X_bias.shape[1])
    
    for epoch in range(num_epochs):
        # Forward pass
        z = X_bias @ w
        predictions = sigmoid(z)
        
        # Compute loss
        loss = binary_cross_entropy(y, predictions)
        
        # Compute gradient
        errors = predictions - y
        gradient = X_bias.T @ errors / len(y)
        
        # Update weights
        w = w - learning_rate * gradient
        
        if epoch % 100 == 0:
            print(f"Epoch {epoch}: Loss = {loss:.4f}")
    
    return w

# Train on our email data
weights = train_logistic_regression(emails, labels)

# Make predictions
X_bias = np.column_stack([np.ones(len(emails)), emails])
probs = sigmoid(X_bias @ weights)
preds = (probs >= 0.5).astype(int)

print("\nPredictions vs Actual:")
for i in range(len(emails)):
    print(f"Email {i}: P(spam)={probs[i]:.2f}, Predicted={preds[i]}, Actual={labels[i]}")
```
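
The gradient line in the training loop, `X_bias.T @ errors / len(y)`, follows from the chain rule. The derivation below is a sketch using the loss and sigmoid defined earlier; the per-example loss symbol and the raw score symbol are introduced here only for the derivation, and the bias is assumed to be folded into the weight vector.

$$
\ell_i = -\left[y_i \log(\hat{y}_i) + (1-y_i)\log(1-\hat{y}_i)\right], \qquad \hat{y}_i = \sigma(z_i), \qquad z_i = w^\top x_i
$$

$$
\frac{\partial \ell_i}{\partial \hat{y}_i} = \frac{\hat{y}_i - y_i}{\hat{y}_i (1-\hat{y}_i)}, \qquad \frac{\partial \hat{y}_i}{\partial z_i} = \hat{y}_i (1-\hat{y}_i)
$$

$$
\frac{\partial L}{\partial w} = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)\, x_i = \frac{1}{n} X^\top (\hat{y} - y)
$$

The sigmoid's derivative cancels the denominator of the loss derivative, which is why the update needs nothing more than the prediction errors multiplied by the inputs.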

***

## Using scikit-learn

```python theme={null}
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Create and train model.
# Despite its name, LogisticRegression is a CLASSIFIER, not a regressor.
# The "regression" in the name refers to the mathematical technique
# (fitting a logistic function), not the type of problem.
model = LogisticRegression()
model.fit(emails, labels)

# Predict hard labels (0 or 1)
predictions = model.predict(emails)

# Predict probabilities -- often more useful than hard labels.
# [:, 1] selects the probability of class 1 (spam).
# Use these for ranking, threshold tuning, or when downstream
# decisions depend on confidence level.
probabilities = model.predict_proba(emails)[:, 1]  # P(spam)

print("Predictions:", predictions)
print("Probabilities:", probabilities)
print(f"Accuracy: {accuracy_score(labels, predictions):.2%}")
```
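
Because `predict_proba` returns probabilities, you can apply your own threshold instead of the default 0.5. The sketch below uses synthetic stand-in data from `make_classification` (an assumption for this example, not the email dataset) to show how the number of flagged samples changes as the threshold moves.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data, used only to illustrate threshold tuning.
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_demo, y_demo)
p_pos = clf.predict_proba(X_demo)[:, 1]  # probability of class 1

flagged = {}
for threshold in [0.3, 0.5, 0.8]:
    # A sample is flagged as positive when its probability reaches the threshold.
    flagged[threshold] = int((p_pos >= threshold).sum())
    print(f"threshold={threshold}: {flagged[threshold]} samples flagged as positive")
```

Raising the threshold can only flag the same number of samples or fewer, which trades recall for precision.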

***

## Real Example: Breast Cancer Detection

```python theme={null}
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix

# Load data
cancer = load_breast_cancer()
X, y = cancer.data, cancer.target
print("Features:", cancer.feature_names[:5], "...")
print("Classes:", cancer.target_names)  # ['malignant' 'benign']

# Split and scale
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train -- max_iter=5000 gives the optimizer enough iterations to converge.
# Logistic regression uses an iterative solver internally, and the default
# 100 iterations isn't always enough for high-dimensional data.
model = LogisticRegression(max_iter=5000)
model.fit(X_train_scaled, y_train)

# Evaluate on data the model has never seen
y_pred = model.predict(X_test_scaled)
print("\nClassification Report:")
print(classification_report(y_test, y_pred, target_names=cancer.target_names))

print("\nConfusion Matrix:")
print(confusion_matrix(y_test, y_pred))
# In medical contexts, pay special attention to missed cancers: a malignant
# tumor classified as benign is more dangerous than a benign one flagged for
# further testing. In this dataset class 0 is malignant and class 1 is benign,
# so missed cancers are counted in row 0, column 1 of the matrix above.
```

***

## Understanding the Confusion Matrix

```
                  Predicted
                  Neg   Pos
Actual  Neg  [  TN    FP  ]
        Pos  [  FN    TP  ]
```

* **True Positive (TP)**: Predicted spam, was spam
* **True Negative (TN)**: Predicted not spam, was not spam
* **False Positive (FP)**: Predicted spam, was not spam (a real email lands in the spam folder)
* **False Negative (FN)**: Predicted not spam, was spam (spam reaches the inbox)
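
The four counts can be read straight off scikit-learn's `confusion_matrix`. This is a minimal sketch with hand-made labels (`actual` and `predicted` are made-up values for illustration), where 1 is the positive class.

```python
from sklearn.metrics import confusion_matrix

# Hand-made labels for illustration: 1 = positive, 0 = negative.
actual    = [0, 0, 0, 0, 1, 1, 1, 1]
predicted = [0, 0, 0, 1, 1, 1, 0, 0]

# For binary labels, ravel() flattens the 2x2 matrix row by row:
# [[TN, FP], [FN, TP]] -> TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(actual, predicted).ravel()
print(f"TN={tn}, FP={fp}, FN={fn}, TP={tp}")
```

This prints `TN=3, FP=1, FN=2, TP=2`: one negative was wrongly flagged, and two positives were missed.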

***

## Key Metrics

```python theme={null}
from sklearn.metrics import precision_score, recall_score, f1_score

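# y_test and y_pred come from the breast cancer example above.
# By default scikit-learn treats label 1 as the positive class, which is
# "benign" in that dataset; pass pos_label=0 to score "malignant" instead.

# Precision: Of all positive predictions, how many were correct?
# "When we say positive, how often are we right?"
precision = precision_score(y_test, y_pred)

# Recall: Of all actual positives, how many did we catch?
# "What % of the positives did we catch?"
recall = recall_score(y_test, y_pred)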

# F1: Harmonic mean of precision and recall
f1 = f1_score(y_test, y_pred)

print(f"Precision: {precision:.2%}")
print(f"Recall:    {recall:.2%}")
print(f"F1 Score:  {f1:.2%}")
```

<Note>
  **When to prioritize which metric?**

  Think of it as a cost-of-mistakes analysis:

  * **High Precision needed**: Spam filter -- if you mark a real email as spam, your user misses an important message. The cost of a false positive is high.
  * **High Recall needed**: Disease detection -- if you miss a sick patient and send them home, the consequences could be fatal. The cost of a false negative is high.
  * **F1 Score**: When you need balance between both, or when you're not sure which type of mistake is worse. F1 is the harmonic mean, which means it punishes you if *either* precision or recall is low.

  **A senior engineer's shortcut**: Ask the business stakeholder "What's worse -- a false alarm or a missed catch?" Their answer tells you which metric to optimize.
</Note>
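
To see how the harmonic mean punishes an imbalance, compare it with a plain average. The helper `f1_from` below is a hypothetical name for this sketch; it simply applies the F1 formula to a given precision and recall.

```python
def f1_from(precision, recall):
    """F1 score: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

balanced = f1_from(0.5, 0.5)      # both metrics mediocre
lopsided = f1_from(0.9, 0.1)      # great precision, terrible recall
arithmetic_mean = (0.9 + 0.1) / 2

print(f"F1 (0.5, 0.5): {balanced:.2f}")
print(f"F1 (0.9, 0.1): {lopsided:.2f}")
print(f"Plain average of 0.9 and 0.1: {arithmetic_mean:.2f}")
```

Both pairs average to 0.50, but the lopsided pair scores only 0.18 on F1, so a model cannot hide a weak recall behind a strong precision.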

***

## Multi-Class Classification

What if there are more than 2 classes?

```python theme={null}
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Load iris data (3 classes of flowers)
iris = load_iris()
X, y = iris.data, iris.target
print("Classes:", iris.target_names)  # ['setosa' 'versicolor' 'virginica']

# Split (fixed random_state so the results are reproducible)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train (scikit-learn handles multi-class automatically!)
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Evaluate
y_pred = model.predict(X_test)
print(classification_report(y_test, y_pred, target_names=iris.target_names))

# Get probabilities for each class
probs = model.predict_proba(X_test[:3])
print("\nProbabilities for first 3 samples:")
for i, p in enumerate(probs):
    print(f"Sample {i}: {dict(zip(iris.target_names, p.round(3)))}")
```
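
How do three classes get probabilities that sum to 1? One common approach, and the one recent scikit-learn versions use by default for this solver as far as we know, is the softmax function: each class gets its own linear score, and softmax turns the scores into probabilities. The sketch below uses made-up scores in `class_scores`; it is not scikit-learn's internal code.

```python
import numpy as np

def softmax(scores):
    """Turn each row of raw class scores into probabilities that sum to 1."""
    scores = np.asarray(scores, dtype=float)
    # Subtracting the row maximum leaves the result unchanged but avoids overflow.
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

# Made-up raw scores for two samples and three classes.
class_scores = np.array([
    [2.0, 1.0, -1.0],   # class 0 has the highest score
    [0.0, 0.0, 0.0],    # all classes tied
])
class_probs = softmax(class_scores)

print(class_probs.round(3))
print("Row sums:", class_probs.sum(axis=1))
```

The class with the highest score gets the highest probability, and tied scores split the probability evenly. With two classes, softmax reduces to the sigmoid used earlier.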

***

## The Decision Boundary

Logistic regression creates linear decision boundaries. With two features, each boundary between classes is a straight line:

```python theme={null}
import matplotlib.pyplot as plt

# Use just 2 features for visualization
X_2d = iris.data[:, :2]  # sepal length and width
y = iris.target

# Train
model = LogisticRegression(max_iter=1000)
model.fit(X_2d, y)

# Create a mesh grid for decision boundary
x_min, x_max = X_2d[:, 0].min() - 0.5, X_2d[:, 0].max() + 0.5
y_min, y_max = X_2d[:, 1].min() - 0.5, X_2d[:, 1].max() + 0.5
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.02),
                     np.arange(y_min, y_max, 0.02))

Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

# Plot
plt.figure(figsize=(10, 6))
plt.contourf(xx, yy, Z, alpha=0.3, cmap='viridis')
plt.scatter(X_2d[:, 0], X_2d[:, 1], c=y, cmap='viridis', edgecolors='black')
plt.xlabel('Sepal Length')
plt.ylabel('Sepal Width')
plt.title('Logistic Regression Decision Boundary')
plt.show()
```
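
For a two-class problem the boundary is exactly the set of points where the linear score equals zero, which is where sigmoid returns 0.5. The sketch below checks this on a binary subset of iris (setosa vs. versicolor, two features); the variable names are ours.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

iris = load_iris()
mask = iris.target < 2                # keep only classes 0 and 1
X_bin = iris.data[mask][:, :2]        # sepal length and width
y_bin = iris.target[mask]

clf = LogisticRegression(max_iter=1000).fit(X_bin, y_bin)

w1, w2 = clf.coef_[0]
b = clf.intercept_[0]

# The raw linear score; the decision boundary is where this equals zero.
scores = X_bin @ clf.coef_[0] + b
manual_pred = (scores > 0).astype(int)

print(f"Boundary: {w1:.2f} * sepal_length + {w2:.2f} * sepal_width + {b:.2f} = 0")
print("Matches clf.predict:", np.array_equal(manual_pred, clf.predict(X_bin)))
```

Thresholding the raw score at zero reproduces the model's predictions, confirming that the boundary is a straight line in feature space.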

***

## Key Takeaways

<CardGroup cols={2}>
  <Card title="Classification = Categories" icon="tags">
    Predict discrete labels, not numbers
  </Card>

  <Card title="Sigmoid = Probability" icon="percent">
    Squash outputs to 0-1 range
  </Card>

  <Card title="Threshold = Decision" icon="scale-balanced">
    P > 0.5 means positive class
  </Card>

  <Card title="Metrics Matter" icon="chart-pie">
    Accuracy isn't always enough
  </Card>
</CardGroup>

***

## 🚀 Mini Projects

<CardGroup cols={2}>
  <Card title="Project 1" icon="envelope" color="#3B82F6">
    Build a spam detector from scratch
  </Card>

  <Card title="Project 2" icon="heart" color="#10B981">
    Medical diagnosis classifier with metrics analysis
  </Card>

  <Card title="Project 3" icon="user-xmark" color="#8B5CF6">
    Customer churn prediction system
  </Card>
</CardGroup>

<details>
  <summary>**Project 1: Spam Email Detector** - Text classification basics</summary>

  **Objective**: Build a simple spam classifier using word features.

  ```python theme={null}
  import numpy as np
  from sklearn.linear_model import LogisticRegression
  from sklearn.model_selection import train_test_split
  from sklearn.metrics import classification_report, confusion_matrix

  # Hand-written example emails showing the feature layout.
  # They are for illustration only; training below uses the generated data.
  # Features: [contains_free, contains_winner, contains_meeting, contains_urgent, word_count]
  emails = [
      # [free, winner, meeting, urgent, word_count], is_spam
      ([1, 1, 0, 1, 50], 1),    # "You're a FREE WINNER! URGENT!"
      ([1, 0, 0, 1, 30], 1),    # "FREE offer URGENT!"
      ([0, 1, 0, 0, 45], 1),    # "You are the WINNER!"
      ([0, 0, 1, 0, 120], 0),   # "Meeting scheduled for Monday"
      ([0, 0, 1, 0, 85], 0),    # "Team meeting notes"
      ([0, 0, 0, 1, 200], 0),   # "Urgent: Project deadline"
      ([1, 1, 0, 1, 25], 1),    # "FREE WINNER URGENT!"
      ([0, 0, 1, 0, 150], 0),   # "Meeting agenda attached"
      ([1, 0, 0, 0, 100], 0),   # "Free trial of software"
      ([0, 0, 0, 0, 300], 0),   # Normal work email
  ]

  # More data for training
  np.random.seed(42)
  n_spam = 100
  n_ham = 150

  # Generate spam emails (high free, winner, urgent, low word count)
  spam_data = np.column_stack([
      np.random.binomial(1, 0.7, n_spam),  # free
      np.random.binomial(1, 0.5, n_spam),  # winner
      np.random.binomial(1, 0.1, n_spam),  # meeting
      np.random.binomial(1, 0.6, n_spam),  # urgent
      np.random.normal(40, 15, n_spam),    # word count (short)
  ])
  spam_labels = np.ones(n_spam)

  # Generate ham emails (low free, winner, high meeting, longer)
  ham_data = np.column_stack([
      np.random.binomial(1, 0.1, n_ham),   # free
      np.random.binomial(1, 0.05, n_ham),  # winner
      np.random.binomial(1, 0.4, n_ham),   # meeting
      np.random.binomial(1, 0.2, n_ham),   # urgent
      np.random.normal(150, 50, n_ham),    # word count (longer)
  ])
  ham_labels = np.zeros(n_ham)

  # Combine
  X = np.vstack([spam_data, ham_data])
  y = np.concatenate([spam_labels, ham_labels])

  # Split and train
  X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

  model = LogisticRegression()
  model.fit(X_train, y_train)

  # Evaluate
  y_pred = model.predict(X_test)
  print("=== Spam Classifier Results ===")
  print(classification_report(y_test, y_pred, target_names=['Ham', 'Spam']))

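  # Feature importance (the features are unscaled, so word_count's per-word
  # coefficient is not directly comparable to the binary flags)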
  feature_names = ['contains_free', 'contains_winner', 'contains_meeting', 'contains_urgent', 'word_count']
  print("=== Feature Importance ===")
  for name, coef in zip(feature_names, model.coef_[0]):
      indicator = "→ SPAM" if coef > 0.5 else "→ HAM" if coef < -0.5 else ""
      print(f"{name}: {coef:+.3f} {indicator}")

  # Test new emails
  new_emails = [
      [1, 1, 0, 1, 30],  # Looks spammy
      [0, 0, 1, 0, 200], # Looks legitimate
      [1, 0, 1, 0, 100], # Mixed signals
  ]
  print("\n=== New Email Predictions ===")
  for email in new_emails:
      prob = model.predict_proba([email])[0]
      pred = "SPAM" if prob[1] > 0.5 else "HAM"
      print(f"Features {email}: {pred} (P(spam)={prob[1]:.2f})")
  ```
</details>

<details>
  <summary>**Project 2: Medical Diagnosis Classifier** - Handle class imbalance</summary>

  **Objective**: Classify patients as healthy or having a disease with proper metrics.

  ```python theme={null}
  import numpy as np
  from sklearn.linear_model import LogisticRegression
  from sklearn.model_selection import train_test_split
  from sklearn.metrics import (classification_report, confusion_matrix, 
                               roc_auc_score, precision_recall_curve)
  from sklearn.preprocessing import StandardScaler

  # Simulate medical data (imbalanced: 5% positive)
  np.random.seed(42)
  n_healthy = 950
  n_disease = 50

  # Features: [age, blood_pressure, cholesterol, glucose, bmi]
  healthy = np.column_stack([
      np.random.normal(40, 15, n_healthy),   # age
      np.random.normal(120, 10, n_healthy),  # bp
      np.random.normal(180, 30, n_healthy),  # cholesterol
      np.random.normal(90, 10, n_healthy),   # glucose
      np.random.normal(24, 3, n_healthy),    # bmi
  ])

  disease = np.column_stack([
      np.random.normal(55, 12, n_disease),   # older
      np.random.normal(145, 15, n_disease),  # higher bp
      np.random.normal(240, 40, n_disease),  # higher cholesterol
      np.random.normal(130, 25, n_disease),  # higher glucose
      np.random.normal(30, 4, n_disease),    # higher bmi
  ])

  X = np.vstack([healthy, disease])
  y = np.array([0]*n_healthy + [1]*n_disease)

  # Split
  X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)

  # Scale
  scaler = StandardScaler()
  X_train_scaled = scaler.fit_transform(X_train)
  X_test_scaled = scaler.transform(X_test)

  # Train with class weights (handle imbalance)
  model = LogisticRegression(class_weight='balanced')
  model.fit(X_train_scaled, y_train)

  # Predictions
  y_pred = model.predict(X_test_scaled)
  y_prob = model.predict_proba(X_test_scaled)[:, 1]

  print("=== Medical Diagnosis Results ===")
  print(f"Disease prevalence: {y.mean():.1%}")
  print(classification_report(y_test, y_pred, target_names=['Healthy', 'Disease']))

  # Confusion matrix
  cm = confusion_matrix(y_test, y_pred)
  print("Confusion Matrix:")
  print(f"              Pred Healthy  Pred Disease")
  print(f"True Healthy       {cm[0,0]:4d}          {cm[0,1]:4d}")
  print(f"True Disease       {cm[1,0]:4d}          {cm[1,1]:4d}")

  # Calculate key metrics for medical context
  tn, fp, fn, tp = cm.ravel()
  sensitivity = tp / (tp + fn)  # Recall for disease
  specificity = tn / (tn + fp)  # Recall for healthy
  ppv = tp / (tp + fp)          # Precision for disease
  npv = tn / (tn + fn)          # Precision for healthy

  print(f"\n=== Medical Metrics ===")
  print(f"Sensitivity (True Positive Rate): {sensitivity:.1%}")
  print(f"Specificity (True Negative Rate): {specificity:.1%}")
  print(f"PPV (Positive Predictive Value): {ppv:.1%}")
  print(f"NPV (Negative Predictive Value): {npv:.1%}")
  print(f"ROC-AUC: {roc_auc_score(y_test, y_prob):.3f}")

  # Medical interpretation
  print("\n=== Interpretation ===")
  print(f"Of patients WE predict have disease, {ppv:.0%} actually do (PPV)")
  print(f"Of patients WHO HAVE disease, we catch {sensitivity:.0%} (Sensitivity)")
  print(f"Missed {fn} out of {tp+fn} disease cases (every miss is a false negative)")
  ```
</details>

<details>
  <summary>**Project 3: Customer Churn Prediction** - Business ML application</summary>

  **Objective**: Predict which customers will cancel their subscription.

  ```python theme={null}
  import numpy as np
  from sklearn.linear_model import LogisticRegression
  from sklearn.model_selection import train_test_split
  from sklearn.preprocessing import StandardScaler
  from sklearn.metrics import classification_report, roc_auc_score

  # Simulate customer data
  np.random.seed(42)
  n_customers = 1000

  # Features
  tenure_months = np.random.exponential(24, n_customers)
  monthly_charges = np.random.normal(70, 20, n_customers)
  support_tickets = np.random.poisson(2, n_customers)
  contract_type = np.random.choice([0, 1, 2], n_customers, p=[0.4, 0.35, 0.25])  # monthly, 1yr, 2yr
  num_services = np.random.randint(1, 6, n_customers)

  # Churn probability model
  def sigmoid(x):
      return 1 / (1 + np.exp(-np.clip(x, -500, 500)))

  churn_prob = sigmoid(
      -1 
      - 0.03 * tenure_months
      + 0.01 * monthly_charges
      + 0.4 * support_tickets
      - 0.8 * contract_type
      - 0.1 * num_services
  )
  churned = (np.random.random(n_customers) < churn_prob).astype(int)

  print(f"Overall churn rate: {churned.mean():.1%}")

  # Create features
  X = np.column_stack([tenure_months, monthly_charges, support_tickets, contract_type, num_services])
  y = churned

  # Split and train
  X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
  scaler = StandardScaler()
  X_train_scaled = scaler.fit_transform(X_train)
  X_test_scaled = scaler.transform(X_test)

  model = LogisticRegression()
  model.fit(X_train_scaled, y_train)

  # Evaluate
  y_pred = model.predict(X_test_scaled)
  y_prob = model.predict_proba(X_test_scaled)[:, 1]

  print("\n=== Churn Prediction Model ===")
  print(classification_report(y_test, y_pred, target_names=['Stay', 'Churn']))
  print(f"ROC-AUC: {roc_auc_score(y_test, y_prob):.3f}")

  # Feature importance
  features = ['tenure', 'monthly_charges', 'support_tickets', 'contract_type', 'num_services']
  print("\n=== Churn Risk Factors ===")
  for name, coef in zip(features, model.coef_[0]):
      effect = "↑ churn risk" if coef > 0 else "↓ churn risk"
      print(f"{name}: {coef:+.3f} ({effect})")

  # Business simulation
  print("\n=== Business Impact Simulation ===")
  avg_customer_value = 1000  # Annual value
  intervention_cost = 50
  intervention_success_rate = 0.3

  # Identify high-risk customers
  high_risk = y_prob > 0.5
  n_high_risk = high_risk.sum()

  print(f"High-risk customers identified: {n_high_risk}")
  print(f"Intervention cost: ${n_high_risk * intervention_cost:,}")

  # Expected saves
  expected_churners_in_high_risk = (y_test[high_risk] == 1).sum()
  expected_saves = expected_churners_in_high_risk * intervention_success_rate
  expected_value_saved = expected_saves * avg_customer_value

  print(f"Expected saves: {expected_saves:.0f} customers")
  print(f"Expected value saved: ${expected_value_saved:,.0f}")
  print(f"Net benefit: ${expected_value_saved - n_high_risk * intervention_cost:,.0f}")
  ```
</details>

***

## What's Next?

Before moving to more complex algorithms, let's learn K-Nearest Neighbors - an even more intuitive approach to classification!

<Card title="Continue to Module 4a: K-Nearest Neighbors" icon="arrow-right" href="/courses/ml-mastery/04a-knn">
  Classify by finding similar examples - the simplest ML algorithm
</Card>
