Orthogonality & Projections
Why Orthogonality Matters
In machine learning and data science, orthogonality is everywhere:

| Application | How Orthogonality Is Used |
|---|---|
| PCA | Find orthogonal directions of maximum variance |
| Least Squares | Project data onto the column space of features |
| Signal Processing | Decompose signals into orthogonal frequency components |
| QR Decomposition | Numerically stable way to solve linear systems |
| Gram-Schmidt | Create orthonormal bases for any subspace |
Estimated Time: 3-4 hours
Difficulty: Intermediate
Prerequisites: Vectors and Matrices modules
What You’ll Build: Image compression, signal denoising, and robust regression
The Intuition: Perpendicular = Independent
A Simple Example
Two vectors are orthogonal (perpendicular) if they point in completely independent directions.
The Mathematical Test
Two vectors $\mathbf{u}$ and $\mathbf{v}$ are orthogonal if and only if:

$$\mathbf{u} \cdot \mathbf{v} = 0$$

Geometric Interpretation: When the dot product is zero, the vectors form a 90° angle. No component of one vector points in the direction of the other.
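A quick NumPy check of this test (the vectors here are chosen purely for illustration):

```python
import numpy as np

u = np.array([1.0, 2.0, 0.0])
v = np.array([-2.0, 1.0, 5.0])

# Orthogonality test: the dot product is zero
print(np.dot(u, v))  # 0.0 -> u and v are orthogonal

# Equivalently, the angle between them is 90 degrees
cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
print(np.degrees(np.arccos(cos_theta)))  # 90.0
```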
Orthonormal Bases: The Gold Standard
An orthonormal basis is a set of vectors that are:
- Orthogonal: Every pair has dot product zero
- Normalized: Each vector has length 1
Why Orthonormal Is Amazing
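With an orthonormal basis, the coordinates of any vector are just dot products with the basis vectors: there is no linear system to solve, and for a square orthonormal basis matrix the inverse is simply the transpose. A minimal NumPy sketch (the basis and vector here are chosen for illustration):

```python
import numpy as np

# An orthonormal basis for R^2 (a 45-degree rotation of the standard basis)
q1 = np.array([1.0, 1.0]) / np.sqrt(2)
q2 = np.array([-1.0, 1.0]) / np.sqrt(2)

x = np.array([3.0, 5.0])

# Coordinates in the orthonormal basis are just dot products -- no solving required
c1, c2 = np.dot(x, q1), np.dot(x, q2)

# Reconstruction: x = c1*q1 + c2*q2
print(np.allclose(x, c1 * q1 + c2 * q2))  # True
```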
Gram-Schmidt: Making Any Basis Orthonormal
Given any set of linearly independent vectors, we can create an orthonormal basis.
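A minimal sketch of the classical Gram-Schmidt process (in practice the modified variant or a QR routine is more numerically robust, but this shows the idea):

```python
import numpy as np

def gram_schmidt(vectors):
    """Classical Gram-Schmidt: orthonormalize a list of linearly independent vectors."""
    basis = []
    for v in vectors:
        w = v.astype(float).copy()
        # Subtract the projection onto every basis vector found so far
        for q in basis:
            w -= np.dot(w, q) * q
        # Normalize the remaining orthogonal component
        basis.append(w / np.linalg.norm(w))
    return np.array(basis)

Q = gram_schmidt([np.array([1.0, 1.0, 0.0]),
                  np.array([1.0, 0.0, 1.0])])
print(np.round(Q @ Q.T, 10))  # identity matrix: the rows are orthonormal
```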
Projections: The Heart of Least Squares
Projecting onto a Line
The projection of vector $\mathbf{b}$ onto vector $\mathbf{a}$ gives the component of $\mathbf{b}$ in the direction of $\mathbf{a}$:

$$\text{proj}_{\mathbf{a}} \mathbf{b} = \frac{\mathbf{a} \cdot \mathbf{b}}{\mathbf{a} \cdot \mathbf{a}}\, \mathbf{a}$$
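A small sketch of this formula (vectors chosen for illustration); note that the residual $\mathbf{b} - \text{proj}_{\mathbf{a}}\mathbf{b}$ is orthogonal to $\mathbf{a}$:

```python
import numpy as np

def project_onto_line(b, a):
    """Projection of b onto the line spanned by a: (a.b / a.a) * a."""
    return (np.dot(a, b) / np.dot(a, a)) * a

b = np.array([3.0, 4.0])
a = np.array([1.0, 0.0])
p = project_onto_line(b, a)
print(p)                  # [3. 0.] -- the component of b along a
print(np.dot(b - p, a))   # 0.0 -- the error b - p is orthogonal to a
```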
Projecting onto a Subspace (Least Squares!)
When we have a matrix $A$ whose columns span a subspace, the projection of $\mathbf{b}$ onto this subspace is $A\hat{\mathbf{x}}$, where $\hat{\mathbf{x}}$ solves

$$A^T A \hat{\mathbf{x}} = A^T \mathbf{b}$$

These are exactly the normal equations for least squares!
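A small NumPy sketch of least squares as a projection (the data here is simulated purely for illustration):

```python
import numpy as np

# Project b onto the column space of A
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 2))                      # tall matrix: 50 observations, 2 features
b = A @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=50)

# Normal equations: A^T A x_hat = A^T b
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
b_hat = A @ x_hat                                 # the projection of b onto col(A)

# The residual is orthogonal to every column of A
print(np.round(A.T @ (b - b_hat), 10))            # ~[0. 0.]
```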
QR Decomposition: The Practical Tool
QR decomposition factors a matrix as:

$$A = QR$$

where:
- $Q$ is orthogonal (columns are orthonormal)
- $R$ is upper triangular
Why QR Is Better Than Normal Equations
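Forming $A^T A$ squares the condition number of $A$, so solving the normal equations can lose roughly twice as many digits of accuracy as a method that works with $A$ directly; QR is such a method. A hedged sketch (random data for illustration; `np.linalg.qr` returns the reduced factorization by default):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(100, 3))
b = rng.normal(size=100)

# Least squares via QR: A = QR, so solve R x = Q^T b
Q, R = np.linalg.qr(A)                 # reduced QR: Q is 100x3, R is 3x3
x_qr = np.linalg.solve(R, Q.T @ b)     # in practice, use a dedicated triangular solver

# Same answer as the normal equations on this well-conditioned example,
# but QR never forms A^T A, whose condition number is cond(A)**2
x_ne = np.linalg.solve(A.T @ A, A.T @ b)
print(np.allclose(x_qr, x_ne))         # True
```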
Application: Signal Denoising with Orthogonal Bases
The Discrete Cosine Transform (DCT)
Signals can be decomposed into orthogonal frequency components.
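Because the DCT is an orthogonal transform, we can decompose a noisy signal, zero out the small coefficients (which are mostly noise), and transform back. A sketch using SciPy's DCT on a simulated signal (the threshold value is chosen only for illustration):

```python
import numpy as np
from scipy.fft import dct, idct

# A clean low-frequency signal plus noise (simulated purely for illustration)
t = np.linspace(0, 1, 256)
clean = np.sin(2 * np.pi * 3 * t)
noisy = clean + 0.3 * np.random.default_rng(0).normal(size=t.size)

# Decompose in the orthogonal DCT basis, threshold, reconstruct
coeffs = dct(noisy, norm='ortho')
coeffs[np.abs(coeffs) < 1.0] = 0.0          # drop small (mostly noise) coefficients
denoised = idct(coeffs, norm='ortho')

# Mean squared error against the clean signal drops after thresholding
print(np.mean((noisy - clean) ** 2), np.mean((denoised - clean) ** 2))
```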
Application: Image Compression
Images can be compressed by keeping only the most important orthogonal components.
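One way to do this is the SVD, which expresses the image as a sum of rank-1 pieces built from orthonormal vectors; keeping the top few gives a compressed approximation. A sketch on a synthetic stand-in image (in real use you would load a grayscale photo instead):

```python
import numpy as np

# A stand-in "image": any 2-D array works
rng = np.random.default_rng(0)
image = rng.normal(size=(64, 64)).cumsum(axis=0).cumsum(axis=1)  # smooth-ish field

# SVD: orthonormal left/right singular vectors, singular values on the diagonal
U, s, Vt = np.linalg.svd(image, full_matrices=False)

k = 10                                          # keep only the top-k components
compressed = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

rel_err = np.linalg.norm(image - compressed) / np.linalg.norm(image)
print(f"kept {k}/{len(s)} components, relative error {rel_err:.3f}")
```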
Practice Exercises
Exercise 1: Orthogonal Projection
Problem: Given orthogonal vectors $\mathbf{u}$ and $\mathbf{v}$, find the projection of $\mathbf{b}$ onto the plane spanned by $\mathbf{u}$ and $\mathbf{v}$.
Hint: For orthogonal bases, projections can be computed independently and summed.
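A sketch of the hint, with placeholder vectors (substitute the values given in your version of the problem):

```python
import numpy as np

# Placeholder orthogonal vectors u, v and a target b -- swap in the exercise's values
u = np.array([1.0, 0.0, 1.0])
v = np.array([1.0, 0.0, -1.0])   # note: u . v = 0
b = np.array([2.0, 3.0, 4.0])

# Because u and v are orthogonal, project onto each separately and add the results
proj = (np.dot(b, u) / np.dot(u, u)) * u + (np.dot(b, v) / np.dot(v, v)) * v
print(proj)   # the projection of b onto the plane span{u, v}
```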
Exercise 2: Gram-Schmidt by Hand
Problem: Apply Gram-Schmidt to vectors $\mathbf{v}_1$ and $\mathbf{v}_2$.
Steps:
1. Normalize $\mathbf{v}_1$ to get $\mathbf{q}_1$.
2. Subtract from $\mathbf{v}_2$ its projection onto $\mathbf{q}_1$.
3. Normalize the result to get $\mathbf{q}_2$.
Exercise 3: Orthogonal Regression
Problem: In standard linear regression, we minimize vertical errors. What if we minimize perpendicular (orthogonal) distances to the line? This is called orthogonal regression or total least squares.
Implement orthogonal regression using PCA: the first principal component gives the best-fit line that minimizes orthogonal distances.
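A rough sketch of one way to approach this, on simulated data (not a full solution; the data and numbers are purely for illustration):

```python
import numpy as np

# Simulated 2-D data for illustration
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=200)
data = np.column_stack([x, y])

# PCA by hand: center the data, then take the top eigenvector of the covariance
centered = data - data.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
direction = eigvecs[:, -1]                 # first principal component

# The total-least-squares line passes through the mean along `direction`
slope = direction[1] / direction[0]
intercept = data.mean(axis=0)[1] - slope * data.mean(axis=0)[0]
print(slope, intercept)                    # close to 2.0 and 1.0
```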
Summary
| Concept | Definition | Use Case |
|---|---|---|
| Orthogonal | Dot product = 0 | Independent directions |
| Orthonormal | Orthogonal + unit length | Convenient bases |
| Gram-Schmidt | Algorithm to orthonormalize | QR decomposition |
| Projection | Component in a direction | Least squares, dimensionality reduction |
| QR Decomposition | A = QR | Numerically stable solving |
Key Takeaway: Orthogonality simplifies everything. Decomposing into orthogonal components makes calculations independent and numerically stable. This is why PCA (finding orthogonal directions of variance) is so powerful for dimensionality reduction.