Linear Systems & Applications
The Problem Behind Every ML Model
The Classic Problem: Three Equations, Three Unknowns
A Real Scenario: Pricing Mystery
Setting Up the System
Method 1: Gaussian Elimination (The Classic)
The Idea: Simplify Step by Step
Method 2: LU Decomposition (The Efficient Way)
The Insight: Factor Once, Solve Many
The Least Squares Problem: When There’s No Exact Solution
Real-World Data is Messy
The Normal Equations
Application 1: Network Flow Analysis
The Problem
Application 2: Electrical Circuit Analysis
Kirchhoff’s Laws as Linear Systems
When Systems Fail: Singular Matrices
Case 1: No Solution (Inconsistent System)
Case 2: Infinite Solutions (Underdetermined)
Case 3: Numerical Instability (Ill-Conditioned)
The Connection to ML
Linear Regression IS Solving a Linear System
Practice Exercises
Summary

$Linear Systems & Applications$

Linear Systems & Applications

The Problem Behind Every ML Model

Every machine learning model eventually boils down to solving a system of linear equations. When you train a linear regression, you’re solving:

w₁ * x₁₁ + w₂ * x₁₂ + ... + wₙ * x₁ₙ = y₁
w₁ * x₂₁ + w₂ * x₂₂ + ... + wₙ * x₂ₙ = y₂
...
w₁ * xₘ₁ + w₂ * xₘ₂ + ... + wₙ * xₘₙ = yₘ

Finding the best weights

w

means solving this system!

Why This Matters: Understanding how to solve linear systems is crucial for:

Implementing ML algorithms from scratch
Debugging numerical issues in training
Optimizing performance of your models
Understanding when models will fail

Estimated Time: 3-4 hours
Difficulty: Intermediate
Prerequisites: Matrices module
What You’ll Build: Equation solver, network flow optimizer, least squares regression

The Classic Problem: Three Equations, Three Unknowns

A Real Scenario: Pricing Mystery

You’re a detective investigating suspicious pricing at three stores. Each store sells the same three products (apples, bananas, oranges) but only shows the total bill:

Store	Apples Bought	Bananas Bought	Oranges Bought	Total Bill
Store A	2	3	1	$8.50
Store B	1	2	3	$9.00
Store C	3	1	2	$8.00

Question: What is the price of each fruit?

Setting Up the System

Let’s call:

$a$ = price of an apple
$b$ = price of a banana
$o$ = price of an orange

We get three equations:

\begin{align} 2a + 3b + 1o &= 8.50 \\ 1a + 2b + 3o &= 9.00 \\ 3a + 1b + 2o &= 8.00 \end{align}

In matrix form:

A\mathbf{x} = \mathbf{b}

\begin{bmatrix}2 & 3 & 1\\1 & 2 & 3\\3 & 1 & 2\end{bmatrix} \begin{bmatrix}a\\b\\o\end{bmatrix} = \begin{bmatrix}8.50\\9.00\\8.00\end{bmatrix}

Method 1: Gaussian Elimination (The Classic)

This is the method you (probably) learned in high school, but let’s see it with fresh eyes.

The Idea: Simplify Step by Step

Transform the system into a simpler form where the solution becomes obvious.

import numpy as np

def gaussian_elimination(A, b):
    """
    Solve Ax = b using Gaussian elimination with partial pivoting.
    
    This is the 'textbook' method - great for understanding!
    """
    n = len(b)
    
    # Create augmented matrix [A | b]
    Ab = np.column_stack([A.astype(float), b.astype(float)])
    
    # Forward elimination
    for col in range(n):
        # Find the row with largest absolute value in current column (pivoting)
        max_row = col + np.argmax(np.abs(Ab[col:, col]))
        
        # Swap rows if needed
        Ab[[col, max_row]] = Ab[[max_row, col]]
        
        # Check for zero pivot (singular matrix)
        if np.abs(Ab[col, col]) < 1e-10:
            raise ValueError("Matrix is singular or nearly singular!")
        
        # Eliminate entries below the pivot
        for row in range(col + 1, n):
            factor = Ab[row, col] / Ab[col, col]
            Ab[row, col:] -= factor * Ab[col, col:]
        
        print(f"\nAfter eliminating column {col + 1}:")
        print(Ab.round(3))
    
    # Back substitution
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (Ab[i, -1] - np.dot(Ab[i, i+1:n], x[i+1:])) / Ab[i, i]
    
    return x

# Our pricing mystery
A = np.array([
    [2, 3, 1],
    [1, 2, 3],
    [3, 1, 2]
])
b = np.array([8.50, 9.00, 8.00])

print("Solving the pricing mystery...")
print("Original system:")
print(f"2a + 3b + 1o = 8.50")
print(f"1a + 2b + 3o = 9.00")
print(f"3a + 1b + 2o = 8.00")

prices = gaussian_elimination(A, b)

print(f"\nApple price: ${prices[0]:.2f}")
print(f"Banana price: ${prices[1]:.2f}")
print(f"Orange price: ${prices[2]:.2f}")

# Verify
print(f"\nVerification:")
for i, (coeffs, total) in enumerate(zip(A, b)):
    calculated = np.dot(coeffs, prices)
    print(f"Store {chr(65+i)}: {calculated:.2f} == {total} (verified)")

Output:

Apple price: $1.00
Banana price: $1.50
Orange price: $2.00

Method 2: LU Decomposition (The Efficient Way)

The Insight: Factor Once, Solve Many

What if you need to solve

A\mathbf{x} = \mathbf{b}

for many different

\mathbf{b}

values? Gaussian elimination from scratch each time is wasteful. Instead, we factor

A

once:

A = LU

Where:

$L$ = Lower triangular matrix (all zeros above diagonal)
$U$ = Upper triangular matrix (all zeros below diagonal)

Then solving

A\mathbf{x} = \mathbf{b}

becomes:

Solve $L\mathbf{y} = \mathbf{b}$ for $\mathbf{y}$ (forward substitution - easy!)
Solve $U\mathbf{x} = \mathbf{y}$ for $\mathbf{x}$ (back substitution - easy!)

from scipy.linalg import lu, lu_factor, lu_solve

# Factor A once
A = np.array([
    [2, 3, 1],
    [1, 2, 3],
    [3, 1, 2]
], dtype=float)

# Get L, U decomposition
P, L, U = lu(A)

print("Original matrix A:")
print(A)
print("\nL (lower triangular):")
print(L.round(3))
print("\nU (upper triangular):")
print(U.round(3))
print("\nVerify: P @ L @ U = A")
print((P @ L @ U).round(3))

# Now solve for multiple b vectors efficiently
lu_factored = lu_factor(A)

# Different scenarios (different total bills)
scenarios = [
    [8.50, 9.00, 8.00],   # Original
    [10.00, 11.00, 9.50], # Higher prices?
    [6.00, 7.00, 6.50],   # Discount day
]

for i, b in enumerate(scenarios):
    x = lu_solve(lu_factored, b)
    print(f"\nScenario {i+1}: Bills = {b}")
    print(f"  Prices: Apple=${x[0]:.2f}, Banana=${x[1]:.2f}, Orange=${x[2]:.2f}")

Performance: LU factorization is

O(n^3)

once, then each solve is only

O(n^2)

. For large systems solved repeatedly, this is a huge win!

The Least Squares Problem: When There’s No Exact Solution

Real-World Data is Messy

In ML, we rarely have an exact solution. We have:

More equations than unknowns (overdetermined system)
Noisy measurements
Contradictory data points

Example: Fitting a line through 100 points

import matplotlib.pyplot as plt

# Generate noisy data
np.random.seed(42)
n_points = 100

# True relationship: y = 2x + 1 + noise
x_data = np.linspace(0, 10, n_points)
y_true = 2 * x_data + 1
y_noisy = y_true + np.random.normal(0, 2, n_points)

# We want to find: y = w₀ + w₁*x
# This gives us 100 equations, 2 unknowns!

# Set up the system: X @ w = y
X = np.column_stack([np.ones(n_points), x_data])  # [1, x] for each point

print(f"X shape: {X.shape}")  # (100, 2)
print(f"y shape: {y_noisy.shape}")  # (100,)
print(f"\nWe have 100 equations but only 2 unknowns!")
print("There's no exact solution - we need LEAST SQUARES")

The Normal Equations

The least squares solution minimizes the squared error:

\min_{\mathbf{w}} \|A\mathbf{w} - \mathbf{b}\|^2

The solution is:

\mathbf{w} = (A^T A)^{-1} A^T \mathbf{b}

# Method 1: Normal equations (direct formula)
def least_squares_normal(X, y):
    """Solve using the normal equations."""
    return np.linalg.inv(X.T @ X) @ X.T @ y

# Method 2: Using NumPy's lstsq (more numerically stable)
def least_squares_numpy(X, y):
    """Solve using NumPy's least squares."""
    w, residuals, rank, s = np.linalg.lstsq(X, y, rcond=None)
    return w

# Solve both ways
w_normal = least_squares_normal(X, y_noisy)
w_numpy = least_squares_numpy(X, y_noisy)

print(f"Normal equations: w₀ = {w_normal[0]:.3f}, w₁ = {w_normal[1]:.3f}")
print(f"NumPy lstsq:      w₀ = {w_numpy[0]:.3f}, w₁ = {w_numpy[1]:.3f}")
print(f"True values:      w₀ = 1.000, w₁ = 2.000")

# Visualize
plt.figure(figsize=(10, 6))
plt.scatter(x_data, y_noisy, alpha=0.5, label='Noisy data')
plt.plot(x_data, y_true, 'g--', linewidth=2, label='True line (y = 2x + 1)')
plt.plot(x_data, X @ w_numpy, 'r-', linewidth=2, 
         label=f'Fitted line (y = {w_numpy[0]:.2f} + {w_numpy[1]:.2f}x)')
plt.xlabel('x')
plt.ylabel('y')
plt.title('Least Squares: Finding the Best Fit Line')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

Application 1: Network Flow Analysis

The Problem

You’re managing a water distribution network. Water flows from sources through pipes to destinations. You need to find the flow in each pipe.

# Network structure:
#
#   Source A ----[p1]----> Node 1 ----[p3]----> Destination X
#                            |
#   Source B ----[p2]--------+-----[p4]----> Destination Y
#
# Conservation law: Flow in = Flow out at each node

# Node 1 receives from p1 and p2, sends to p3 and p4
# Source A supplies 10 units
# Source B supplies 15 units  
# Destination X receives 12 units
# Destination Y receives 13 units

# Variables: flow in pipes p1, p2, p3, p4

# Equations (conservation at each node):
# At Node 1:  p1 + p2 = p3 + p4  =>  p1 + p2 - p3 - p4 = 0
# At Source A: p1 = 10
# At Source B: p2 = 15
# At Dest X:   p3 = 12
# At Dest Y:   p4 = 13

A_flow = np.array([
    [1, 1, -1, -1],  # Conservation at Node 1
    [1, 0, 0, 0],     # Source A
    [0, 1, 0, 0],     # Source B
    [0, 0, 1, 0],     # Dest X
])

b_flow = np.array([0, 10, 15, 12])

flows = np.linalg.solve(A_flow, b_flow)

print("Network Flow Solution:")
print(f"  Pipe 1 (A → Node 1): {flows[0]:.1f} units")
print(f"  Pipe 2 (B → Node 1): {flows[1]:.1f} units")
print(f"  Pipe 3 (Node 1 → X): {flows[2]:.1f} units")
print(f"  Pipe 4 (Node 1 → Y): {flows[3]:.1f} units")

# Check conservation at Node 1
print(f"\nConservation check at Node 1:")
print(f"  In:  {flows[0] + flows[1]:.1f}")
print(f"  Out: {flows[2] + flows[3]:.1f}")

Application 2: Electrical Circuit Analysis

Kirchhoff’s Laws as Linear Systems

# Circuit with 3 resistors and 2 voltage sources
#
#    +--[R1=2Ω]--+--[R2=3Ω]--+
#    |           |           |
#   [V1=5V]      |         [V2=3V]
#    |           |           |
#    +---[R3=4Ω]-+-----------+
#
# Using Kirchhoff's laws, we can set up a system for currents

# Let I1 = current through R1, I2 = through R2, I3 = through R3
# Voltage law (loop 1): V1 = R1*I1 + R3*I3  =>  2*I1 + 4*I3 = 5
# Voltage law (loop 2): V2 = R2*I2 + R3*I3  =>  3*I2 + 4*I3 = 3
# Current law (node):   I1 = I2 + I3         =>  I1 - I2 - I3 = 0

R1, R2, R3 = 2, 3, 4
V1, V2 = 5, 3

A_circuit = np.array([
    [R1, 0, R3],      # Loop 1: V1 = R1*I1 + R3*I3
    [0, R2, R3],      # Loop 2: V2 = R2*I2 + R3*I3
    [1, -1, -1],      # Node: I1 = I2 + I3
])

b_circuit = np.array([V1, V2, 0])

currents = np.linalg.solve(A_circuit, b_circuit)

print("Circuit Analysis:")
print(f"  I₁ (through R1): {currents[0]:.3f} A")
print(f"  I₂ (through R2): {currents[1]:.3f} A")
print(f"  I₃ (through R3): {currents[2]:.3f} A")

# Power dissipation
power = [R1 * currents[0]**2, R2 * currents[1]**2, R3 * currents[2]**2]
print(f"\nPower dissipation:")
print(f"  R1: {power[0]:.3f} W")
print(f"  R2: {power[1]:.3f} W")
print(f"  R3: {power[2]:.3f} W")
print(f"  Total: {sum(power):.3f} W")

When Systems Fail: Singular Matrices

Not all systems have solutions. Understanding when and why is crucial.

Case 1: No Solution (Inconsistent System)

# Three parallel lines - no intersection!
A_nosol = np.array([
    [1, 1],
    [1, 1],
    [1, 1]
])
b_nosol = np.array([2, 3, 4])

print("Trying to solve an inconsistent system...")
try:
    x = np.linalg.lstsq(A_nosol, b_nosol, rcond=None)[0]
    print(f"Least squares 'solution': {x}")
    print("(This minimizes error but doesn't actually solve the system)")
except:
    print("No solution exists!")

Case 2: Infinite Solutions (Underdetermined)

# One equation, two unknowns
A_inf = np.array([[1, 2]])
b_inf = np.array([5])

print("Underdetermined system: x + 2y = 5")
print("Infinitely many solutions: x = 5 - 2y for any y")

# NumPy gives the minimum norm solution
x_min = np.linalg.lstsq(A_inf, b_inf, rcond=None)[0]
print(f"Minimum norm solution: x = {x_min[0]:.2f}, y = {x_min[1]:.2f}")

Case 3: Numerical Instability (Ill-Conditioned)

# The Hilbert matrix is notoriously ill-conditioned
def hilbert(n):
    """Generate n×n Hilbert matrix."""
    return np.array([[1/(i+j+1) for j in range(n)] for i in range(n)])

for n in [5, 10, 15]:
    H = hilbert(n)
    cond = np.linalg.cond(H)
    print(f"Hilbert matrix {n}×{n}: condition number = {cond:.2e}")
    
print("\nHigh condition number = numerically unstable!")
print("Small errors in input cause huge errors in output.")

Condition Number: A measure of how sensitive the solution is to small changes in input.

Condition number ≈ 1: Well-conditioned, stable
Condition number > 10⁶: Ill-conditioned, results may be unreliable
Condition number = ∞: Singular, no unique solution

The Connection to ML

Linear Regression IS Solving a Linear System

from sklearn.linear_model import LinearRegression
from sklearn.datasets import make_regression

# Generate a regression problem
X, y = make_regression(n_samples=100, n_features=5, noise=10, random_state=42)

# Method 1: sklearn
model = LinearRegression()
model.fit(X, y)

# Method 2: Normal equations (what sklearn does internally)
X_aug = np.column_stack([np.ones(len(X)), X])  # Add bias column
w_direct = np.linalg.lstsq(X_aug, y, rcond=None)[0]

print("Coefficients comparison:")
print(f"sklearn:   intercept = {model.intercept_:.3f}")
print(f"           weights = {model.coef_}")
print(f"Direct:    intercept = {w_direct[0]:.3f}")
print(f"           weights = {w_direct[1:]}")

Practice Exercises

Exercise 1: Traffic Flow

Problem: Three intersections have the following traffic counts (cars/hour):

Into intersection A: 100 from north, 80 from east
Out of A to B: x
Out of A to C: y
Into B from A: x, out to east: 90
Into C from A: y, out to south: 90

Find x and y (flows between intersections).

Exercise 2: Chemical Balancing

Problem: Balance this chemical equation by finding a, b, c, d:

a·H₂ + b·O₂ → c·H₂O

Set up as a linear system where atoms are conserved.

Exercise 3: Portfolio Optimization

Problem: You have $10,000 to invest in three stocks. Based on historical data:

Stock A: Expected return 8%, risk factor 2
Stock B: Expected return 12%, risk factor 5
Stock C: Expected return 6%, risk factor 1

Find the investment in each to achieve:

Total investment: $10,000
Expected return: 9%
Total risk factor: 2.5

Summary

Concept	What It Is	When to Use
Gaussian Elimination	Step-by-step row operations	Understanding, small systems
LU Decomposition	Factor A = LU once	Multiple solves with same A
Least Squares	Minimize ‖Ax - b‖²	Overdetermined systems
Condition Number	Sensitivity measure	Checking numerical stability

Next Steps: Now that you understand how to solve linear systems, you’re ready for the capstone project where you’ll build a complete recommendation system using SVD!

SVD Project 2

Overview

Testing & Code Quality

Crash Courses

AI Engineering

Math for ML - Understanding Linear Algebra

Probability & Statistics for ML

Math for ML - Understanding Calculus

ML Mastery

Deep Learning Mastery

NestJS Mastery

Microservices Mastery

Low Level Design

OOP Concepts

SOLID Principles

Design Patterns

LLD Case Studies

System Design (HLD)

Senior Level (L5+/Staff)

HLD Case Studies

Engineering Fundamentals

DevOps & Operations

Azure Cloud Engineering

AWS Cloud

AWS Monitoring & Observability

AWS Security Services

AWS Serverless

AWS Operations

AWS Advanced

AWS Case Studies

GCP Cloud Engineering

DevOps Tools

Database Engineering

HIPAA Compliance Mastery

Operating Systems

Linux Internals

Distributed Systems

Networking Mastery

Build Your Own X

Go Lang Mastery

C Programming

Classic Research Papers

Distributed System Tools

​Linear Systems & Applications

​The Problem Behind Every ML Model

​The Classic Problem: Three Equations, Three Unknowns

​A Real Scenario: Pricing Mystery

​Setting Up the System

​Method 1: Gaussian Elimination (The Classic)

​The Idea: Simplify Step by Step

​Method 2: LU Decomposition (The Efficient Way)

​The Insight: Factor Once, Solve Many

​The Least Squares Problem: When There’s No Exact Solution

​Real-World Data is Messy

​The Normal Equations

​Application 1: Network Flow Analysis

​The Problem

​Application 2: Electrical Circuit Analysis

​Kirchhoff’s Laws as Linear Systems

​When Systems Fail: Singular Matrices

​Case 1: No Solution (Inconsistent System)

​Case 2: Infinite Solutions (Underdetermined)

​Case 3: Numerical Instability (Ill-Conditioned)

​The Connection to ML

​Linear Regression IS Solving a Linear System

​Practice Exercises

​Summary

Linear Systems & Applications

The Problem Behind Every ML Model

The Classic Problem: Three Equations, Three Unknowns

A Real Scenario: Pricing Mystery

Setting Up the System

Method 1: Gaussian Elimination (The Classic)

The Idea: Simplify Step by Step

Method 2: LU Decomposition (The Efficient Way)

The Insight: Factor Once, Solve Many

The Least Squares Problem: When There’s No Exact Solution

Real-World Data is Messy

The Normal Equations

Application 1: Network Flow Analysis

The Problem

Application 2: Electrical Circuit Analysis

Kirchhoff’s Laws as Linear Systems

When Systems Fail: Singular Matrices

Case 1: No Solution (Inconsistent System)

Case 2: Infinite Solutions (Underdetermined)

Case 3: Numerical Instability (Ill-Conditioned)

The Connection to ML

Linear Regression IS Solving a Linear System

Practice Exercises

Summary