Linear Algebra for Machine Learning

Have You Ever Wondered…

  • How does Spotify know that if you like Coldplay, you might also like Imagine Dragons?
  • How does Instagram apply those fancy filters to your photos in milliseconds?
  • How does Netflix predict you’ll rate a movie 4.2 stars before you’ve even watched it?
  • How does Google Photos find all pictures of your dog without you tagging them?
The answer to ALL of these is Linear Algebra. Not calculus. Not statistics. Linear Algebra. The math of lists, tables, and transformations.
Real Talk: You probably took linear algebra in college, got confused by abstract proofs about “vector spaces” and “linear independence,” passed the exam, and forgot everything.
This time is different. We’re going to make you see linear algebra, use it, and actually enjoy it.
Estimated Time: 16-20 hours
Difficulty: Beginner-friendly (we assume you forgot everything)
Prerequisites: Basic Python, willingness to experiment
What You’ll Build: Spotify-style song recommender, Instagram-style filters, Netflix-style rating predictor
Before starting, make sure you can:
Python Basics
  • Create and manipulate lists: my_list = [1, 2, 3]
  • Write simple loops: for i in range(10)
  • Define and call functions: def my_func(x): return x * 2
  • Use basic NumPy: import numpy as np; arr = np.array([1, 2, 3])
Math Comfort Level
  • Basic arithmetic (you can use a calculator!)
  • Understand coordinates on a graph (x, y)
  • Comfortable with the idea that letters can represent numbers
You DON’T need:
  • Previous linear algebra (we start from zero)
  • Calculus knowledge
  • Matrix manipulation experience
  • Any ML/AI background
If you’re missing Python basics, check out our Python Crash Course first (4-6 hours).
Try these quick checks to gauge your readiness:
Python Check (can you read this code?):
def find_max(numbers):
    max_val = numbers[0]
    for n in numbers:
        if n > max_val:
            max_val = n
    return max_val

print(find_max([3, 1, 4, 1, 5, 9]))  # What prints?
Math Check (can you solve this?): If point A is at (2, 3) and point B is at (5, 7), what’s the distance between them?
Remediation Paths:
Gap Identified | Recommended Action
Python syntax | Python Crash Course - 4-6 hours
NumPy basics | NumPy section of Python course - 1-2 hours
Coordinate geometry | We cover it in Module 1! Just proceed.
Graph reading | YouTube: “Reading graphs basics” - 30 min
Career Impact: Linear algebra is the most practical math you’ll ever learn for tech. It’s used in AI, graphics, data science, finance, and more. Engineers who truly understand it command $150K+ salaries because they can optimize, debug, and innovate where others can’t.

The “Aha!” Moment: Everything is a List of Numbers

Here’s the secret that unlocks all of machine learning: Anything can be turned into a list of numbers. And once it’s numbers, math can work magic.

Your Favorite Song → Numbers

# Spotify represents every song as ~12 numbers
billie_eilish_bad_guy = [
    0.70,   # danceability (0-1)
    0.43,   # energy (0-1)  
    0.56,   # speechiness (0-1)
    0.32,   # acousticness (0-1)
    0.00,   # instrumentalness (0-1)
    0.36,   # liveness (0-1)
    0.68,   # valence/happiness (0-1)
    135.0,  # tempo (BPM)
    # ... more features
]

# This list IS a vector. That's it. A vector is just a list of numbers.

Your Face → Numbers

# A 100x100 pixel selfie = 10,000 numbers (brightness of each pixel)
# A neural network can compress this to just 128 numbers that capture "you-ness"

your_face_embedding = [0.23, -0.45, 0.89, ..., 0.12]  # 128 numbers

# Similar faces have similar numbers!

A Netflix Movie → Numbers

# Every movie can be described by hidden factors
inception = [
    0.95,   # "mind-bending" factor
    0.80,   # "action" factor  
    0.20,   # "romance" factor
    0.60,   # "visual spectacle" factor
    # ...
]
This is the core insight: Once everything is numbers, we can:
  • Compare things (how similar are two songs?)
  • Transform things (apply a filter to a photo)
  • Find patterns (what do users who liked X also like?)
  • Compress things (store a 10MB image in 100KB)
Everything is Numbers
🔗 ML Connection: This “everything is numbers” insight is the foundation of ALL machine learning:
ML Concept | Linear Algebra Foundation
Word Embeddings (GPT, BERT) | Words → vectors of 768+ numbers
Neural Network Layers | Matrix multiplication transforms
Attention Mechanism | Dot products measure relevance
Image Recognition | Pixels → feature vectors → classification
Recommendation Systems | Users & items as vectors in shared space
Every module in this course connects directly to these ML applications!

Who Uses This (Companies & Salaries)

OpenAI

Models like GPT-4 run an enormous number of matrix operations for every prompt. Every AI breakthrough is linear algebra at scale.

Pixar/Disney

Every frame of Toy Story involves millions of matrix transformations for 3D rendering.

Google Search

PageRank ranks websites using the dominant eigenvector of the web’s link matrix. It’s a big part of why Google won the search wars.
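To make the PageRank idea concrete, here is a minimal sketch of power iteration on a made-up four-page web. The link structure, the damping value of 0.85, and the iteration count are all assumptions chosen for illustration; this is not Google's production algorithm.

```python
import numpy as np

# Hypothetical 4-page web: links[i] lists the pages that page i links to.
links = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}
n = len(links)

# Column-stochastic link matrix: M[j, i] is the chance of clicking from page i to page j.
M = np.zeros((n, n))
for src, targets in links.items():
    for dst in targets:
        M[dst, src] = 1.0 / len(targets)

damping = 0.85  # assumed value, used here only for illustration
G = damping * M + (1.0 - damping) / n * np.ones((n, n))

# Power iteration: multiplying by G over and over settles on the dominant eigenvector.
rank = np.full(n, 1.0 / n)
for _ in range(100):
    rank = G @ rank

print(np.round(rank, 3))  # one score per page; the scores sum to 1
```

The final `rank` barely changes when you multiply it by `G` again, which is exactly what it means to be an eigenvector with eigenvalue 1.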
Role | How They Use Linear Algebra | Median Salary
ML Engineer | Neural network weights, transformations, embeddings | $175K
Data Scientist | PCA, clustering, recommendation systems | $150K
Graphics Engineer | 3D transformations, shaders, physics | $180K
Quantitative Analyst | Portfolio optimization, risk modeling | $250K+
Robotics Engineer | Kinematics, sensor fusion, SLAM | $165K

Mathematical Notation Quick Reference

Before we dive in, here’s a cheat sheet of the notation you’ll encounter. Don’t memorize it — just come back here when you see something unfamiliar.
Symbol | Meaning | Example
$\mathbf{v}$ or $\vec{v}$ | A vector (bold or arrow) | $\mathbf{v} = [3, 4, 5]$
$v_i$ | The $i$-th element of vector $\mathbf{v}$ | $v_2 = 4$
$\lVert\mathbf{v}\rVert$ | Length (magnitude) of vector | $\lVert\mathbf{v}\rVert = \sqrt{3^2 + 4^2 + 5^2} = \sqrt{50}$
$\mathbf{a} \cdot \mathbf{b}$ | Dot product | $[1,2] \cdot [3,4] = 1(3) + 2(4) = 11$
$\mathbf{a}^T$ | Transpose (row ↔ column) | $[1, 2, 3]^T = \begin{bmatrix}1\\2\\3\end{bmatrix}$

Symbol | Meaning | Example
$A$, $B$, $M$ | Matrices (capital letters) | $A = \begin{bmatrix}1 & 2\\3 & 4\end{bmatrix}$
$A_{ij}$ or $a_{ij}$ | Element at row $i$, column $j$ | $A_{12} = 2$
$A^T$ | Transpose (flip rows/columns) | $\begin{bmatrix}1 & 2\\3 & 4\end{bmatrix}^T = \begin{bmatrix}1 & 3\\2 & 4\end{bmatrix}$
$A^{-1}$ | Inverse of matrix $A$ | $AA^{-1} = I$
$I$ | Identity matrix | $\begin{bmatrix}1 & 0\\0 & 1\end{bmatrix}$
$\det(A)$ or $\lvert A \rvert$ | Determinant | $\det\begin{bmatrix}a & b\\c & d\end{bmatrix} = ad - bc$

Symbol | Meaning | Example
$\sum_{i=1}^{n}$ | Sum from $i=1$ to $n$ | $\sum_{i=1}^{3} i = 1 + 2 + 3 = 6$
$\prod_{i=1}^{n}$ | Product from $i=1$ to $n$ | $\prod_{i=1}^{3} i = 1 \times 2 \times 3 = 6$
$\mathbb{R}$ | Real numbers | $x \in \mathbb{R}$ means $x$ is a real number
$\mathbb{R}^n$ | $n$-dimensional real space | $\mathbf{v} \in \mathbb{R}^3$ is a 3D vector

Symbol | Meaning | ML Context
$\lambda$ (lambda) | Eigenvalue | How much a direction stretches
$\sigma$ (sigma) | Singular value | Importance of a pattern in SVD
$\nabla$ (nabla/del) | Gradient operator | Direction of steepest change
$\theta$ (theta) | Model parameters | Weights in neural networks
$\approx$ | Approximately equal | $\pi \approx 3.14$
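If you think better in code than in symbols, here is a rough translation of the cheat sheet into NumPy. The variable names are just illustrative choices, and the one thing to watch is that math notation counts from 1 while NumPy counts from 0.

```python
import numpy as np

v = np.array([3, 4, 5])
A = np.array([[1, 2],
              [3, 4]])

v_2 = v[1]                           # v_2 in math notation is index 1 in NumPy
length = np.linalg.norm(v)           # ||v|| = sqrt(3^2 + 4^2 + 5^2) = sqrt(50)
dot = np.dot([1, 2], [3, 4])         # a . b = 1(3) + 2(4)
A_T = A.T                            # A^T: rows become columns
A_12 = A[0, 1]                       # A_12 (row 1, column 2) is A[0, 1] in NumPy
det_A = np.linalg.det(A)             # ad - bc = (1)(4) - (2)(3), up to rounding
A_inv = np.linalg.inv(A)             # A^{-1}
identity_check = A @ A_inv           # should be (numerically) the identity matrix I
total = np.sum([1, 2, 3])            # sum for i = 1..3 of i
product = np.prod([1, 2, 3])         # product for i = 1..3 of i

print(v_2, length, dot, A_12, det_A, total, product)
print(A_T)
print(np.round(identity_check, 10))
```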

Quick Math Examples

Vector Addition: add component by component.
$$\begin{bmatrix}1\\2\\3\end{bmatrix} + \begin{bmatrix}4\\5\\6\end{bmatrix} = \begin{bmatrix}1+4\\2+5\\3+6\end{bmatrix} = \begin{bmatrix}5\\7\\9\end{bmatrix}$$
Scalar Multiplication: multiply each component.
$$3 \times \begin{bmatrix}1\\2\\3\end{bmatrix} = \begin{bmatrix}3\\6\\9\end{bmatrix}$$
Dot Product: multiply corresponding elements and sum.
$$\begin{bmatrix}1\\2\\3\end{bmatrix} \cdot \begin{bmatrix}4\\5\\6\end{bmatrix} = (1)(4) + (2)(5) + (3)(6) = 4 + 10 + 18 = 32$$
Matrix × Vector: each output is a dot product.
$$\begin{bmatrix}1 & 2\\3 & 4\end{bmatrix} \begin{bmatrix}5\\6\end{bmatrix} = \begin{bmatrix}(1)(5)+(2)(6)\\(3)(5)+(4)(6)\end{bmatrix} = \begin{bmatrix}17\\39\end{bmatrix}$$
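You can check all four of these by hand, or let NumPy do it. The snippet below is a small sketch that reuses the same numbers; the variable names are ours.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

vec_sum = a + b        # vector addition, component by component
scaled = 3 * a         # scalar multiplication
dot = a @ b            # dot product: multiply matching entries, then sum

M = np.array([[1, 2],
              [3, 4]])
x = np.array([5, 6])
Mx = M @ x             # matrix x vector: one dot product per row of M

print(vec_sum)  # [5 7 9]
print(scaled)   # [3 6 9]
print(dot)      # 32
print(Mx)       # [17 39]
```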
Pro Tip: When you see scary-looking equations in ML papers, break them down into these simple operations. Most “complex” formulas are just combinations of addition, multiplication, and dot products!
Want more mathematical rigor? Each module includes optional “Going Deeper” sections that cover:
Module | Advanced Topic | Why It Matters
Vectors | Vector spaces, basis, span | Understand why neural network layers work
Matrices | Linear transformations, rank | Debug dimensionality issues in ML models
Eigenvalues | Spectral theorem, diagonalization | Optimize PCA computation, understand graph neural networks
SVD | Matrix approximation theory, Eckart-Young | Why truncated SVD is optimal for compression
These sections are OPTIONAL. You can build all the projects and understand ML applications without them. They’re for learners who:
  • Have a math/physics background and want the formal treatment
  • Plan to pursue ML research or read academic papers
  • Are simply curious about the “why” behind the formulas
Recommended Resources for Deep Dives:
  • Linear Algebra Done Right by Sheldon Axler (rigorous but readable)
  • Gilbert Strang’s MIT lectures on YouTube (free, excellent)
  • Mathematics for Machine Learning textbook (free PDF at mml-book.github.io)

What You’ll Actually Learn (And Why You’ll Care)

Module 1: Vectors
Real-World Examples You Already Know:
  • GPS Navigation: Your location is a vector [latitude, longitude]. Distance between two places? Vector math.
  • Fitness Trackers: Your daily stats [steps, calories, heart_rate, sleep_hours] — that’s a vector describing your day.
  • Job Matching: LinkedIn represents you as [skills, experience, education, location] and finds similar candidates.
  • Dating Apps: Tinder/Hinge match you based on preference vectors. Similar vectors = potential match.
What You’ll Build: A similarity search engine (works for songs, jobs, or anything).
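Here is a small sketch of the fitness-tracker example: three days as vectors, compared by distance. All the numbers, including the "typical" values used to put the features on a comparable scale, are made up for illustration.

```python
import numpy as np

# Hypothetical daily stats: [steps, calories, heart_rate, sleep_hours]
monday = np.array([10000, 2200, 68, 7.5])
tuesday = np.array([9500, 2150, 70, 7.0])
sunday = np.array([2000, 1800, 62, 9.5])

# Steps are in the thousands and sleep is in single digits, so divide each
# feature by a rough "typical" value (assumed) before measuring distance.
typical = np.array([10000, 2000, 70, 8])

def distance(day_a, day_b):
    """Euclidean distance between two scaled day vectors."""
    return np.linalg.norm(day_a / typical - day_b / typical)

print(f"Monday vs Tuesday: {distance(monday, tuesday):.2f}")
print(f"Monday vs Sunday:  {distance(monday, sunday):.2f}")
```

A smaller distance means more similar days. Without the scaling step, the step count would drown out every other feature.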
Module 2: Matrices
Real-World Examples You Already Know:
  • Photo Editing: Every Instagram filter is a matrix multiplication. Brightness, contrast, blur — all matrix operations.
  • Video Games: When you rotate your character, move the camera, or zoom in — that’s matrix math happening 60 times per second.
  • Spreadsheets: Excel pivot tables, VLOOKUP across sheets — you’re doing matrix operations without knowing it.
  • Maps/GPS: Transforming GPS coordinates to screen pixels involves matrix multiplication.
What You’ll Build: Your own photo filter app and a 2D game transformation engine.
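The "rotate your character" example fits in a few lines. This is a minimal sketch using the standard 2D rotation matrix; the helper name `rotation_matrix` and the unit square are our own illustrative choices.

```python
import numpy as np

def rotation_matrix(degrees):
    """2x2 matrix that rotates points counterclockwise about the origin."""
    theta = np.radians(degrees)
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

# Corners of a unit square, one point per column: (0,0), (1,0), (1,1), (0,1)
square = np.array([[0, 1, 1, 0],
                   [0, 0, 1, 1]], dtype=float)

# One matrix multiplication rotates every corner at once.
rotated = rotation_matrix(90) @ square
print(np.round(rotated, 2))
```

A game engine does the same thing with bigger matrices and many more points, every frame.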
Module 3: Eigenvalues & PCA
Real-World Examples You Already Know:
  • Surveys: 50 questions reduce to 3-4 “personality types” — that’s PCA finding the key dimensions.
  • Stock Market: Hundreds of stocks move together because of 5-10 hidden factors (economy, interest rates, oil prices).
  • Customer Segments: Millions of customers cluster into 5-6 types based on purchasing patterns.
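  • Compression: An image can keep most of its visual quality at a fraction of the file size by keeping only its most important components. (JPEG does this with a cosine transform; in this course you’ll do it with eigenvectors and SVD.)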
What You’ll Build: Image compressor and customer segmentation system.
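To preview the survey example, here is a rough PCA sketch built on an eigendecomposition of the covariance matrix. The data is synthetic: we assume 200 people answer 6 questions that are secretly driven by 2 hidden traits plus a little noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic survey (assumed setup): 2 hidden traits drive 6 question scores.
traits = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 6))
answers = traits @ loadings + 0.1 * rng.normal(size=(200, 6))

# Center the data, then build the 6x6 covariance matrix.
centered = answers - answers.mean(axis=0)
cov = np.cov(centered, rowvar=False)

# eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]          # re-sort, largest first
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

explained = eigenvalues / eigenvalues.sum()    # share of variance per direction
print(np.round(explained, 3))

# Project onto the top 2 eigenvectors: each person is now described by 2 numbers.
reduced = centered @ eigenvectors[:, :2]
print(reduced.shape)
```

Because the data was generated from two traits, the first two entries of `explained` should carry nearly all the variance.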
Module 4: SVD & Recommendations
Real-World Examples You Already Know:
  • “Customers who bought X also bought Y”: Amazon uses matrix factorization to find these patterns.
  • YouTube Recommendations: “Because you watched X” — they decomposed your viewing history.
  • Spell Check: “Did you mean…?” often uses SVD to find similar words.
  • Fraud Detection: Normal transactions form patterns; fraud breaks those patterns.
What You’ll Build: A working recommendation engine using real MovieLens data.
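Here is a taste of the decomposition idea on a tiny, made-up ratings table. It is a simplified sketch: every rating is filled in, whereas a real recommender has to cope with mostly missing ratings, which plain `np.linalg.svd` does not handle.

```python
import numpy as np

# Hypothetical ratings: rows are users, columns are movies.
# Users 0-1 like the first two movies; users 2-3 like the last two.
ratings = np.array([
    [5, 4, 1, 1],
    [4, 5, 1, 2],
    [1, 1, 5, 4],
    [2, 1, 4, 5],
], dtype=float)

U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
print(np.round(s, 2))   # singular values, largest first

# Keep only the k strongest patterns and rebuild the table from them.
k = 2
approx = (U[:, :k] * s[:k]) @ Vt[:k, :]
print(np.round(approx, 1))

error = np.linalg.norm(ratings - approx)
print(f"Reconstruction error with k={k}: {error:.2f}")
```

Two patterns ("likes the first group" and "likes the second group") are enough to rebuild the table closely, which is the intuition behind matrix factorization.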

Your Learning Journey

1. Week 1-2: Vectors

Learn to see everything as vectors. Build a song/image similarity search engine.

2. Week 2-3: Matrices

Master transformations. Build Instagram-style photo filters from scratch.

3. Week 3-4: Eigenvalues & PCA

Find hidden patterns. Compress images, reduce dimensions, and understand what your data really contains.

4. Week 4-5: SVD & Recommendations

The crown jewel. Build a Netflix-style recommendation engine that predicts ratings.

Why Most Math Courses Fail (And How This One’s Different)

The traditional approach:
  1. Definition of a vector space
  2. Axioms of vector addition
  3. Proof of linear independence
  4. Abstract theorem
  5. “Exercise left to the reader”
  6. Student falls asleep
Our Promise: Every concept will be:
  • Explained with a real-world app you use daily
  • Visualized with clear diagrams
  • Coded in Python you can run
  • Practiced with projects you’ll want to show off

Prerequisites (Honestly, Not Much)

What You Need:
  • Basic Python: Variables, lists, loops, functions
  • Willingness to experiment: Run code, break things, learn
  • Curiosity: Wonder how apps work under the hood
What You DON’T Need:
  • Previous linear algebra knowledge (we start from scratch)
  • Mathematical proofs (we focus on intuition and code)
  • Perfect grades in math (many engineers struggle with math — that’s okay!)

Setup (5 Minutes)

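# Optional but recommended: create and activate a virtual environment first
#   python -m venv .venv && source .venv/bin/activate   (Windows: .venv\Scripts\activate)
# Install what we need
pip install numpy matplotlib jupyter scikit-learn pillow plotly ipywidgets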

# Start Jupyter to follow along
jupyter notebook
That’s it. No complex setup. Let’s go.
🎮 Interactive Learning: Throughout this course, you’ll find interactive visualizations marked with 🎮. These let you:
  • Drag vectors and see transformations in real-time
  • Adjust parameters with sliders to build intuition
  • Experiment without breaking anything
We’ve added plotly and ipywidgets for these interactive elements. They’re optional but highly recommended!

🎮 Interactive Visualization Tools

We’ve designed this course to be highly visual. Use these tools alongside the course:
🔗 When to Use These Tools:
  • Module 1 (Vectors): GeoGebra to visualize addition, dot products
  • Module 2 (Matrices): Desmos to see transformations on 2D shapes
  • Module 3 (Eigenvalues): 3Blue1Brown video + Desmos visualization
  • Module 4 (PCA/SVD): Our built-in interactive widgets

The Projects You’ll Build

By the end of this course, you’ll have a portfolio of real, working projects:

Song Recommender

Find similar songs using vector similarity. Input: a song you like. Output: 10 songs you’ll probably love.

Photo Filter App

Apply blur, sharpen, edge detection, and custom effects using matrix operations.

Image Compressor

Compress images to 10% of their size while keeping them recognizable. Understand the core idea behind lossy image compression.

Movie Recommender

Predict user ratings for movies they haven’t seen, using the matrix factorization approach popularized by the Netflix Prize.

Quick Taste: Vector Similarity in Action

Before we dive deep, let’s see the magic in action. This is what you’ll fully understand by the end of Module 1:
import numpy as np

# Three songs represented as vectors [energy, danceability, acousticness]
blinding_lights = np.array([0.73, 0.51, 0.00])  # The Weeknd
levitating = np.array([0.69, 0.70, 0.03])        # Dua Lipa
someone_like_you = np.array([0.34, 0.50, 0.75])  # Adele

def similarity(song_a, song_b):
    """Cosine similarity - how similar are two vectors?"""
    return np.dot(song_a, song_b) / (np.linalg.norm(song_a) * np.linalg.norm(song_b))

print(f"Blinding Lights vs Levitating: {similarity(blinding_lights, levitating):.2f}")
print(f"Blinding Lights vs Someone Like You: {similarity(blinding_lights, someone_like_you):.2f}")

# Output:
# Blinding Lights vs Levitating: 0.98  (very similar! both are upbeat pop)
# Blinding Lights vs Someone Like You: 0.59  (less similar - different vibes)
That’s it. That’s the core of how Spotify recommendations work. Vectors + similarity. Now imagine doing this with 100 dimensions instead of 3, and millions of songs. That’s what you’ll build.
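To get a feel for the bigger version, here is a hedged sketch that scores a whole catalog in one shot. The catalog is random numbers standing in for real audio features, and the sizes (1,000 songs, 100 features) are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical catalog: 1,000 songs x 100 features (random stand-ins, not real data).
catalog = rng.random((1000, 100))
query = catalog[0]   # pretend the listener likes song 0

# Scale every vector to length 1; after that, a dot product IS the cosine similarity.
unit_catalog = catalog / np.linalg.norm(catalog, axis=1, keepdims=True)
unit_query = query / np.linalg.norm(query)

# One matrix-vector product gives all 1,000 similarity scores at once.
scores = unit_catalog @ unit_query

# Indices of the 5 most similar songs, best first (song 0 matches itself).
top5 = np.argsort(scores)[::-1][:5]
print(top5)
```

The same cosine similarity is at work here, just computed for every song with a single matrix multiplication instead of a Python loop.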

By the End of This Course

You will:
  • See vectors and matrices everywhere (in apps, in data, in neural networks)
  • Build 4 portfolio-worthy ML projects from scratch
  • Read ML papers and actually understand the notation
  • Debug ML models because you understand what’s happening inside
  • Explain to others why linear algebra powers AI
Most importantly: You’ll stop being scared of math in ML papers. When you see:
$$\mathbf{y} = W\mathbf{x} + \mathbf{b}$$
You’ll think: “Oh, that’s just transforming a vector with a matrix. Like applying a filter to an image.”
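In code, that equation is one line. Here is a minimal sketch with assumed shapes (3 inputs, 2 outputs) and random weights, which is roughly what a single untrained neural network layer looks like before any activation function.

```python
import numpy as np

rng = np.random.default_rng(0)

x = np.array([0.70, 0.43, 0.56])   # input vector: 3 features
W = rng.normal(size=(2, 3))        # weight matrix: maps 3 inputs to 2 outputs
b = np.zeros(2)                    # bias vector: one entry per output

y = W @ x + b                      # transform the vector with a matrix, then shift it
print(y.shape)  # (2,)
```

Each entry of `y` is a dot product between one row of `W` and `x`, plus the matching bias.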

Ready?

Let’s stop talking and start building. The next module introduces vectors by asking a simple question: “How does Spotify know what song to play next?”

Next: Vectors — The Language of Similarity

Learn what vectors really are, why everything is a vector, and how to measure similarity between any two things.