Singular Value Decomposition (SVD)
The Magic Behind “People Like You Also Bought…”
An Everyday Mystery
You buy running shoes on Amazon. Suddenly Amazon knows you might want:
- Protein powder
- A fitness tracker
- Compression socks
- A foam roller
The same "mind reading" shows up everywhere:
- Spotify knows your music taste after 10 songs
- Netflix predicts you'll rate a movie 4.2 stars
- YouTube knows which videos you'll watch next
Before We Dive In: Why SVD Matters
| Your Data | Hidden Patterns Found | Business Value |
|---|---|---|
| Purchase history | "Customer types" | Personalized recommendations |
| Song listening | "Music taste dimensions" | "Discover Weekly" playlist |
| Movie ratings | "Genre preferences" | "Because you watched…" |
| Job applications | "Candidate profiles" | Better matching |
| Photos | "Visual features" | Face recognition, search |
Estimated Time: 4-5 hours
Difficulty: Intermediate to Advanced
Prerequisites: Eigenvalues and PCA modules
Key Insight: Any table of data can be decomposed into hidden factors
A Non-Math Example: Restaurant Preferences
The Problem
Five friends rate six restaurants. Can we predict what Alice thinks of restaurants she hasn't tried?
The Human Insight
Looking at the ratings, you notice:
- Alice, Bob, and Eve like hearty food (pizza, steak, tacos)
- Carol and Dave like lighter options (sushi, salad, ramen)
In other words, two hidden factors explain the table:
- A "Hearty eater" factor
- A "Light eater" factor
SVD Discovers This Automatically!
Predict Missing Ratings
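The original rating table isn't reproduced here, so the sketch below uses a made-up 5-friends-by-6-restaurants matrix (0 = not yet rated) and a rank-2 truncated SVD to fill in the gaps; the names, numbers, and the choice of k = 2 are illustrative assumptions, not the original data.

```python
import numpy as np

# Hypothetical ratings: 5 friends x 6 restaurants, 0 = not rated yet
# Columns: pizza, steak, tacos, sushi, salad, ramen
R = np.array([
    [5, 4, 0, 1, 2, 1],   # Alice (hearty eater, tacos unknown)
    [4, 5, 5, 2, 1, 1],   # Bob
    [1, 1, 2, 5, 4, 0],   # Carol (light eater, ramen unknown)
    [2, 1, 1, 4, 5, 5],   # Dave
    [5, 4, 5, 1, 1, 2],   # Eve
], dtype=float)

# Simple starting point: replace missing entries with each friend's mean rating
filled = R.copy()
for i in range(R.shape[0]):
    rated = R[i] > 0
    filled[i, ~rated] = R[i, rated].mean()

# Keep only the top-2 "taste factors" (hearty vs. light)
U, s, Vt = np.linalg.svd(filled, full_matrices=False)
k = 2
approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

print("Predicted rating of Alice for tacos:", round(approx[0, 2], 2))
print("Predicted rating of Carol for ramen:", round(approx[2, 5], 2))
```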
The Math: Breaking Any Matrix Into Patterns
The Core Formula
Any matrix A can be broken into 3 simpler matrices:

A = U Σ Vᵀ

Where:
- U = left singular vectors (patterns in rows — users/people)
- Σ = singular values (how important each pattern is)
- Vᵀ = right singular vectors (patterns in columns — items/restaurants)
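As a quick sanity check, here is a minimal NumPy sketch (the matrix is arbitrary) showing that np.linalg.svd returns exactly these three pieces and that multiplying them back together reproduces A:

```python
import numpy as np

A = np.array([[3., 1., 1.],
              [-1., 3., 1.]])          # any 2x3 matrix works

U, s, Vt = np.linalg.svd(A, full_matrices=False)

print("U  (patterns in rows):   ", U.shape)    # (2, 2)
print("s  (pattern importances):", s)          # non-negative, sorted descending
print("Vt (patterns in columns):", Vt.shape)   # (2, 3)

# Reconstruct: A = U Σ Vᵀ
A_rebuilt = U @ np.diag(s) @ Vt
print("Reconstruction error:", np.max(np.abs(A - A_rebuilt)))  # ~1e-15
```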
SVD vs Eigendecomposition
| Aspect | Eigendecomposition | SVD |
|---|---|---|
| Works on | Square matrices only | Any matrix (m × n) |
| Formula | A = QΛQ⁻¹ | A = UΣVᵀ |
| Values | Eigenvalues (can be negative) | Singular values (always ≥ 0) |
| Requirement | Matrix must be diagonalizable | Always works! |
The Key Relationship
For a matrix A:
- Singular values of A = square roots of the eigenvalues of AᵀA (equivalently, of AAᵀ)
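A short numerical check of this relationship, using an arbitrary random matrix:

```python
import numpy as np

A = np.random.default_rng(0).normal(size=(5, 3))

singular_values = np.linalg.svd(A, compute_uv=False)   # sorted descending
eigvals_AtA = np.linalg.eigvalsh(A.T @ A)               # sorted ascending

print(np.sort(singular_values))      # ascending, for comparison
print(np.sqrt(eigvals_AtA))          # identical up to rounding error
```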
Low-Rank Approximation
The magic of SVD: keep only the top-k singular values for the best rank-k approximation! This is optimal in the Frobenius norm: Aₖ = UₖΣₖVₖᵀ is the best rank-k approximation to A (the Eckart–Young theorem).
Example 1: Movie Recommendation System (Netflix-Style)
The Setup
Apply SVD
Discover Hidden Factors
One of the discovered factors → romance movies (Movies 3, 4, 5)
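The original rating matrix for this example isn't shown, so the sketch below uses an illustrative 4-users-by-5-movies table. It keeps k = 2 latent factors, forms user and movie factor vectors, and predicts ratings as their dot products; the data, the value of k, and the action/romance split are assumptions made for illustration.

```python
import numpy as np

# Illustrative user x movie ratings (rows: users, cols: Movies 1-5)
# Movies 1-2 lean "action", Movies 3-5 lean "romance" in this toy data
R = np.array([
    [5, 4, 1, 1, 2],
    [4, 5, 2, 1, 1],
    [1, 2, 5, 4, 5],
    [2, 1, 4, 5, 4],
], dtype=float)

U, s, Vt = np.linalg.svd(R, full_matrices=False)

k = 2                                             # keep two hidden "taste" factors
user_factors = U[:, :k] * np.sqrt(s[:k])          # each row: a user's taste vector
movie_factors = Vt[:k, :].T * np.sqrt(s[:k])      # each row: a movie's profile

# A predicted rating is just the dot product of the two latent vectors
pred = user_factors @ movie_factors.T
print(np.round(pred, 1))

# In this toy data, factor 2's sign pattern separates the action movies (1-2)
# from the romance movies (3-5)
print(np.round(Vt[1], 2))
```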
Example 2: House Price Patterns
The Problem
Apply SVD
Interpret Patterns
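The house data for this example isn't included, so here is a minimal sketch with hypothetical houses and features. The columns are standardized first (so square footage doesn't dominate purely because of its units), and the first right singular vector is read as an overall "size" pattern; all numbers are made up.

```python
import numpy as np

# Hypothetical houses x features: [sqft, bedrooms, bathrooms, lot acres, age]
X = np.array([
    [2400, 4, 3, 0.30, 10],
    [1200, 2, 1, 0.15, 40],
    [3100, 5, 4, 0.50,  5],
    [1600, 3, 2, 0.20, 30],
    [2000, 3, 2, 0.25, 20],
], dtype=float)

# Center and scale each column so no single unit (sqft!) dominates
Z = (X - X.mean(axis=0)) / X.std(axis=0)

U, s, Vt = np.linalg.svd(Z, full_matrices=False)

print("Pattern strengths:", np.round(s, 2))
# First right singular vector: an overall "size" pattern (sqft, bedrooms,
# bathrooms, lot size move together; age moves the opposite way in this toy data)
print("Pattern 1 loadings:", np.round(Vt[0], 2))
```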
Example 3: Student Performance Prediction
The Problem
Apply SVD
Predict Missing Grades
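Again, the original grade table isn't shown; the sketch below assumes a small students-by-subjects matrix with NaN for missing grades and fills them by alternating between a rank-1 SVD fit and re-imposing the known grades.

```python
import numpy as np

# Hypothetical students x subjects grades (0-100), NaN = grade not yet known
G = np.array([
    [92, 88, np.nan, 81],
    [75, 70, 68,     np.nan],
    [60, np.nan, 55, 58],
    [85, 80, 78,     74],
], dtype=float)

missing = np.isnan(G)
filled = np.where(missing, np.nanmean(G), G)   # start from the overall mean

# Alternate between a low-rank SVD fit and re-imposing the known grades
k = 1
for _ in range(50):
    U, s, Vt = np.linalg.svd(filled, full_matrices=False)
    approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
    filled = np.where(missing, approx, G)      # keep observed grades fixed

print(np.round(filled, 1))   # missing grades now estimated from the pattern
```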
SVD vs PCA vs Eigenvalues
Comparison
| Method | Input | Output | Use Case |
|---|---|---|---|
| Eigenvalues | Square matrix | Eigenvalues + eigenvectors | Feature importance |
| PCA | Data matrix | Principal components | Dimensionality reduction |
| SVD | Any matrix | 3 matrices (U, Σ, V) | Recommendation, compression |
When to Use Each
Use Eigenvalues when:
- You have a square matrix (covariance, adjacency)
- You want to find important directions
- Example: PageRank, stability analysis

Use PCA when:
- You want to reduce features
- You want to visualize high-D data
- Example: Compress 10 features → 3

Use SVD when:
- You have a rectangular matrix (users × items)
- You want to fill missing values
- You want to discover hidden patterns
- Example: Recommendations, collaborative filtering
SVD Applications
1. Image Compression: keep only the top-k singular values of the pixel matrix to store far fewer numbers (see Exercise 2).
2. Noise Reduction: large singular values carry the signal, small ones mostly carry noise, so truncating the SVD filters the data (see Exercise 4).
3. Latent Semantic Analysis (LSA): SVD of the term-document matrix uncovers "topics", letting search match meaning rather than exact words (see Exercise 3).
🎯 Practice Exercises & Real-World Applications
Challenge yourself! These exercises demonstrate SVD applications powering billion-dollar companies.
Exercise 1: Build a Movie Recommendation Engine 🎬
Create a Netflix-style recommendation system that predicts ratings for movies a user hasn't seen yet.
💡 Solution
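The worked solution isn't included in this export, so here is one possible sketch with an illustrative ratings matrix and movie titles: mean-fill the unseen entries, keep the top two latent factors, and recommend each user's highest-scoring unseen movie.

```python
import numpy as np

movies = ["Inception", "Titanic", "Matrix", "Notebook", "Avengers"]
# Illustrative ratings (rows = users), 0 = not seen yet
R = np.array([
    [5, 1, 4, 0, 5],
    [0, 5, 1, 4, 2],
    [4, 2, 5, 1, 0],
    [1, 5, 0, 5, 1],
], dtype=float)

# Mean-fill unseen entries, then keep the top-2 latent factors
filled = R.copy()
for i in range(R.shape[0]):
    seen = R[i] > 0
    filled[i, ~seen] = R[i, seen].mean()

U, s, Vt = np.linalg.svd(filled, full_matrices=False)
k = 2
scores = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Recommend each user's best-scoring unseen movie
for i in range(R.shape[0]):
    unseen = np.where(R[i] == 0)[0]
    if unseen.size:
        best = unseen[np.argmax(scores[i, unseen])]
        print(f"User {i}: recommend {movies[best]} "
              f"(predicted {scores[i, best]:.1f})")
```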
Exercise 2: Image Compression with SVD 📷
Compress an image using low-rank approximation.
💡 Solution
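One possible solution sketch. To stay self-contained it builds a synthetic grayscale "image" instead of loading a file, then compares storage and reconstruction error for a few ranks k; the image and the chosen ranks are assumptions.

```python
import numpy as np

# Synthetic 128x128 grayscale "image": a smooth gradient plus a bright square
h, w = 128, 128
img = np.linspace(0, 1, w)[None, :] * np.linspace(0, 1, h)[:, None]
img[40:80, 40:80] += 0.5

U, s, Vt = np.linalg.svd(img, full_matrices=False)

for k in (5, 20, 50):
    approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
    stored = k * (h + w + 1)              # numbers kept vs. h*w originally
    err = np.linalg.norm(img - approx) / np.linalg.norm(img)
    print(f"rank {k:3d}: store {stored:6d} numbers "
          f"({stored / (h * w):.1%} of original), relative error {err:.3f}")
```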
Exercise 3: Latent Semantic Analysis (LSA) for Search 🔍
Build a semantic search engine that understands meaning.
💡 Solution
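One possible solution sketch using only NumPy (no NLP library is assumed): build a tiny term-document count matrix by hand, take a rank-2 truncated SVD, and answer queries by cosine similarity in the 2-D "topic" space. The documents, queries, and k = 2 are illustrative.

```python
import numpy as np

docs = [
    "the cat sat on the mat",
    "dogs and cats make great pets",
    "stock prices rose on strong earnings",
    "investors sold shares as markets fell",
]

# Build a simple term-document count matrix
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}
A = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        A[index[w], j] += 1

# LSA: truncated SVD of the term-document matrix
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
doc_topics = (np.diag(s[:k]) @ Vt[:k, :]).T     # each row: a doc in topic space

def search(query):
    q = np.zeros(len(vocab))
    for w in query.split():
        if w in index:
            q[index[w]] += 1
    q_topics = q @ U[:, :k]                     # project query into topic space
    sims = doc_topics @ q_topics / (
        np.linalg.norm(doc_topics, axis=1) * np.linalg.norm(q_topics) + 1e-12)
    return docs[int(np.argmax(sims))]

print(search("market earnings"))    # returns a finance document
print(search("pets"))               # returns the cats-and-dogs document
```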
Exercise 4: Noise Reduction in Signals 📡
Use SVD to clean noisy sensor data.
💡 Solution
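One possible solution sketch with synthetic data: eight sensors measure the same 5 Hz sine wave with different gains plus independent noise, and keeping only the dominant singular component recovers the shared signal. The signal, noise level, and k = 1 are assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# 8 sensors all watching the same underlying signal, each with its own noise
t = np.linspace(0, 1, 500)
signal = np.sin(2 * np.pi * 5 * t)
gains = rng.uniform(0.5, 1.5, size=8)
X = np.outer(gains, signal) + 0.4 * rng.normal(size=(8, t.size))

U, s, Vt = np.linalg.svd(X, full_matrices=False)
print("Singular values:", np.round(s[:4], 1))   # one large value, then noise

# Keep only the dominant component: the shared signal survives, noise drops
k = 1
denoised = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

noise_before = np.linalg.norm(X - np.outer(gains, signal))
noise_after = np.linalg.norm(denoised - np.outer(gains, signal))
print(f"Residual noise: {noise_before:.1f} -> {noise_after:.1f}")
```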
Key Takeaways
SVD Core Concepts:
- ✅ Universal Decomposition - Any matrix A = UΣVᵀ (works for non-square!)
- ✅ Low-Rank Approximation - Keep top k singular values for compression
- ✅ Collaborative Filtering - Predict missing ratings via latent factors
- ✅ Latent Factors - Discover hidden patterns (movie genres, user preferences)
- ✅ Noise Reduction - Large singular values = signal, small = noise
Interview Prep: SVD Questions
Common SVD Interview Questions
Q: What’s the difference between PCA and SVD?
PCA uses eigendecomposition of the covariance matrix (square, symmetric). SVD works directly on any matrix (even non-square). For centered data, SVD of X gives the same principal components as PCA. SVD is more numerically stable.
Q: How does Netflix use SVD for recommendations?
The user-movie rating matrix is decomposed into user factors and movie factors. Each user/movie is represented by a latent vector capturing hidden preferences (action lover, comedy hater). Predictions = dot product of user and movie vectors.
Q: How do you choose the rank k for truncated SVD?
Methods: (1) energy/variance retained (e.g., 95%), (2) cross-validation for prediction tasks, (3) visualization of the singular value decay, (4) domain knowledge about the expected rank.
Q: What are singular values vs eigenvalues?
Singular values are always non-negative real numbers and exist for any matrix. Eigenvalues can be complex and only exist for square matrices. For symmetric A: singular values = |eigenvalues|.
Common Pitfalls
- Forgetting to center (and usually scale) the data before using SVD for PCA
- Ignoring sign ambiguity: columns of U and V can flip sign together, so interpret loadings only up to sign
- Computing a full SVD on huge matrices when a truncated or randomized SVD is enough
- Choosing k too large (you keep the noise) or too small (you lose the signal)
Course Complete!
🎉 Congratulations! You've completed Linear Algebra for Machine Learning!
You now have the mathematical foundation to understand neural networks, dimensionality reduction, recommendation systems, and much more. These concepts appear everywhere in ML—from word embeddings to attention mechanisms to image processing.
Your Linear Algebra Toolkit:
- ✅ Vectors - Data representation, similarity, embeddings
- ✅ Matrices - Transformations, neural network layers, data batches
- ✅ Eigenvalues - Feature importance, stability analysis
- ✅ PCA - Dimensionality reduction, visualization, noise filtering
- ✅ SVD - Recommendations, compression, matrix approximation