You just launched your online store selling wireless headphones. Exciting! But now you face a critical decision: What price should you charge?

You experiment with different prices over several weeks.
By the end of this module, you’ll answer questions like:

✅ Your Business: What price maximizes YOUR profit?
✅ Your Learning: How many hours should YOU study for maximum score?
✅ Your ML Models: How should YOU adjust weights to reduce errors?
✅ Your Life: What’s YOUR optimal speed to minimize fuel consumption?

Your tool: Derivatives - the mathematical way to find optimal solutions.
Estimated Time: 3-4 hours
Difficulty: Beginner
Prerequisites: Basic algebra
You’ll Build: Your own pricing optimizer, learning rate finder, and simple neural network
At any price, you need to answer: “If I increase my price by $1, does my profit go up or down?”

This is EXACTLY what a derivative tells you!

Derivative = Rate of Change
```python
def profit(price):
    customers = 1300 - 10 * price
    return (price - 20) * customers

# Your current price
your_price = 50

# "If I increase my price by $1, how much does my profit change?"
small_increase = 1
profit_now = profit(your_price)
profit_after = profit(your_price + small_increase)
change_in_profit = profit_after - profit_now

print(f"At your current price of ${your_price}:")
print(f"   Your profit now: ${profit_now:,.0f}")
print(f"   Your profit at ${your_price + small_increase}: ${profit_after:,.0f}")
print(f"   Change: ${change_in_profit:,.0f}")
print(f"   → Derivative ≈ {change_in_profit}")
print(f"   (your profit changes by ${change_in_profit} per $1 price increase)")

if change_in_profit > 0:
    print("\n   ✅ Your profit is INCREASING → You should raise your price!")
elif change_in_profit < 0:
    print("\n   ❌ Your profit is DECREASING → You should lower your price!")
else:
    print("\n   ⭐ Your profit is at MAXIMUM → You found the perfect price!")
```
Output:
```
At your current price of $50:
   Your profit now: $24,000
   Your profit at $51: $24,490
   Change: $490
   → Derivative ≈ 490
   (your profit changes by $490 per $1 price increase)

   ✅ Your profit is INCREASING → You should raise your price!
```
Your Reaction: “Wow! At $50, I should increase my price. Each dollar increase adds $490 to my profit!”
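The same question can be asked at every price, not just $50. The sketch below reuses the profit model from the example above; the helper name `marginal_profit`, the step size `h`, and the $5 scan interval are illustrative assumptions rather than part of the original lesson.

```python
def profit(price):
    """Profit model from the example above: $20 unit cost, linear demand."""
    customers = 1300 - 10 * price
    return (price - 20) * customers


def marginal_profit(price, h=1e-4):
    """Central-difference estimate of d(profit)/d(price)."""
    return (profit(price + h) - profit(price - h)) / (2 * h)


# Expanding profit(price) gives -10*price**2 + 1500*price - 26000,
# so the exact derivative is 1500 - 20*price, which is zero at price = 75.
best_price = 1500 / 20

for price in range(50, 101, 5):
    slope = marginal_profit(price)
    if slope > 1e-3:
        advice = "raise the price"
    elif slope < -1e-3:
        advice = "lower the price"
    else:
        advice = "at the peak"
    print(f"${price}: slope ≈ {slope:7.1f} → {advice}")

print(f"Best price: ${best_price:.0f}, profit: ${profit(best_price):,.0f}")
```

At $50 the exact slope is 500, slightly more than the $490 measured with a one-dollar step, because a $1 step averages the slope between $50 and $51.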
Think about driving a car:

🚗 Position = where you are (e.g., mile marker 50)
📊 Speed = how fast your position is changing (e.g., 60 mph)
⚡ Acceleration = how fast your speed is changing (e.g., +5 mph/second)

The speedometer shows your derivative! It tells you: “Right now, at this exact moment, you’re going 60 mph.”

Mathematically, the derivative is the limit of the slope between two nearby points as the gap $h$ shrinks to zero:

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$
```python
def f(x):
    """Our function: f(x) = x²"""
    return x**2

# We want the derivative at x=3
x = 3

# Method 1: Numerical approximation
print("=== Numerical Approximation ===")
for h in [0.1, 0.01, 0.001, 0.0001]:
    # Compute slope of secant line
    df = f(x + h) - f(x)  # Change in f
    dx = h                # Change in x
    derivative_approx = df / dx
    print(f"h = {h:7.4f} → f'(3) ≈ {derivative_approx:.6f}")

print("\n=== Exact Answer ===")
# For f(x) = x², the derivative is f'(x) = 2x
exact_derivative = 2 * x
print(f"f'(3) = 2×3 = {exact_derivative}")

print("\n=== Interpretation ===")
print(f"At x=3, if we increase x by 1, f(x) increases by approximately {exact_derivative}")
print(f"At x=3, the function is rising with a slope of {exact_derivative}")
```
Output:
```
=== Numerical Approximation ===
h =  0.1000 → f'(3) ≈ 6.100000
h =  0.0100 → f'(3) ≈ 6.010000
h =  0.0010 → f'(3) ≈ 6.001000
h =  0.0001 → f'(3) ≈ 6.000100

=== Exact Answer ===
f'(3) = 2×3 = 6

=== Interpretation ===
At x=3, if we increase x by 1, f(x) increases by approximately 6
At x=3, the function is rising with a slope of 6
```
Key Insights:
✅ As h gets smaller, our approximation gets better
✅ The derivative is the instantaneous rate of change
✅ At x=3, the function $x^2$ is rising steeply (slope = 6)
✅ This tells us: near x=3, a small change in x changes f(x) by about 6 times as much
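The slope gives a straight-line estimate of the function near a point, and that estimate is only accurate for small steps. The sketch below illustrates this for f(x) = x² at x = 3; the helper name `tangent_estimate` is an assumption made for illustration.

```python
def f(x):
    return x ** 2


def tangent_estimate(x0, dx, slope):
    """Linear approximation: f(x0 + dx) ≈ f(x0) + slope * dx."""
    return f(x0) + slope * dx


x0 = 3
slope = 2 * x0  # exact derivative of x² at x0

for dx in [1.0, 0.1, 0.01]:
    actual = f(x0 + dx)
    estimate = tangent_estimate(x0, dx, slope)
    print(f"dx = {dx:<5} actual = {actual:.4f}  "
          f"estimate = {estimate:.4f}  error = {actual - estimate:.4f}")
```

For this function the error is exactly dx², so a full step of 1 misses by 1 (the true increase is 7, not 6), while a step of 0.01 misses by only 0.0001.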
You’re optimizing ad spending. Your cost function is:

$$C(x) = x^2 - 10x + 100$$

Where $x$ is ad spend in thousands of dollars.

Goal: Find the spending level that minimizes cost.
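One way to work this out is to set the derivative to zero and then check the second derivative. The sketch below follows that recipe; the function and variable names are illustrative assumptions.

```python
def cost(x):
    """C(x) = x² - 10x + 100, with x in thousands of dollars."""
    return x**2 - 10 * x + 100


def cost_derivative(x):
    """C'(x) = 2x - 10 (power rule, term by term)."""
    return 2 * x - 10


# Solve C'(x) = 0  →  2x = 10  →  x = 5
optimal_spend = 10 / 2

# C''(x) = 2 is positive everywhere, so the critical point is a minimum.
second_derivative = 2

print(f"Optimal ad spend: {optimal_spend} (thousand dollars)")
print(f"Derivative there: {cost_derivative(optimal_spend)}")
print(f"Minimum cost: {cost(optimal_spend)}")
print(f"Cost just below / above: {cost(4.9):.2f} / {cost(5.1):.2f}")
```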
A student’s test score depends on study hours:

$$S(h) = -h^2 + 12h + 20$$

Where $h$ is hours studied per day.

Question: How many hours should they study to maximize their score?
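A worked derivation for this question, using the same "set the derivative to zero" recipe (a sketch of one possible solution path):

```latex
\begin{aligned}
S(h)   &= -h^2 + 12h + 20 \\
S'(h)  &= -2h + 12 \\
S'(h)  &= 0 \;\Rightarrow\; h = 6 \\
S''(h) &= -2 < 0 \quad \text{(so } h = 6 \text{ is a maximum)} \\
S(6)   &= -36 + 72 + 20 = 56
\end{aligned}
```

Under this model, 6 hours per day gives the highest score, 56.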
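Example 1: Polynomial

$$f(x) = 3x^4 - 2x^3 + 5x - 7$$

Using power rule and sum rule:

$$f'(x) = 3(4x^3) - 2(3x^2) + 5(1) - 0 = 12x^3 - 6x^2 + 5$$

Example 2: Product Rule

$$h(x) = x^2 \cdot e^x$$

Let $f = x^2$ and $g = e^x$:

$$h'(x) = (2x)(e^x) + (x^2)(e^x) = e^x(2x + x^2) = e^x \cdot x(x + 2)$$

Example 3: Quotient Rule

$$q(x) = \frac{x^2}{x + 1}$$

Let $f = x^2$ and $g = x + 1$:

$$q'(x) = \frac{(2x)(x + 1) - (x^2)(1)}{(x + 1)^2} = \frac{2x^2 + 2x - x^2}{(x + 1)^2} = \frac{x^2 + 2x}{(x + 1)^2}$$

Example 4: Chain Rule

$$y = (3x + 1)^5$$

Let outer $f(u) = u^5$ and inner $g(x) = 3x + 1$:

$$y' = 5(3x + 1)^4 \cdot 3 = 15(3x + 1)^4$$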
```python
# Verify chain rule example numerically
def y(x):
    return (3*x + 1)**5

def y_prime(x):
    return 15 * (3*x + 1)**4

x = 2
h = 0.0001
numerical = (y(x + h) - y(x)) / h
analytical = y_prime(x)

print(f"Numerical: {numerical:.2f}")
print(f"Analytical: {analytical}")
# Analytical is exactly 15 × 7⁴ = 36015; the forward difference lands slightly above it
```
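The product rule and quotient rule examples can be checked the same way. The sketch below compares each hand-derived formula against a central-difference estimate; the helper `central_difference` and the test point x = 2 are illustrative assumptions.

```python
import math


def central_difference(func, x, h=1e-5):
    """Numerical derivative using points on both sides of x."""
    return (func(x + h) - func(x - h)) / (2 * h)


# Example 2 (product rule): h(x) = x² · eˣ
def h_func(x):
    return x**2 * math.exp(x)


def h_prime(x):
    return math.exp(x) * x * (x + 2)


# Example 3 (quotient rule): q(x) = x² / (x + 1)
def q_func(x):
    return x**2 / (x + 1)


def q_prime(x):
    return (x**2 + 2 * x) / (x + 1) ** 2


x = 2.0
for name, func, deriv in [("product", h_func, h_prime),
                          ("quotient", q_func, q_prime)]:
    numerical = central_difference(func, x)
    analytical = deriv(x)
    print(f"{name:8} numerical = {numerical:.6f}  analytical = {analytical:.6f}")
```

If a hand-derived formula disagrees with the numerical estimate by more than a tiny amount, the algebra is the first place to look.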
```python
# A company's profit function is:
# P(x) = -2x² + 40x - 100
# where x is production quantity in thousands

# TODO:
# 1. Find the derivative P'(x)
# 2. Find the production quantity that maximizes profit
# 3. What is the maximum profit?
# 4. Verify it's a maximum using the second derivative
```
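One possible solution sketch for the four TODO items follows. The function names are illustrative assumptions, and the exercise does not state the units of profit, so the result is reported as a bare number.

```python
def profit(x):
    """P(x) = -2x² + 40x - 100, with x in thousands of units."""
    return -2 * x**2 + 40 * x - 100


def profit_derivative(x):
    """1. P'(x) = -4x + 40."""
    return -4 * x + 40


# 2. Solve P'(x) = 0  →  4x = 40  →  x = 10 (thousand units)
optimal_quantity = 40 / 4

# 3. Maximum profit: P(10) = -200 + 400 - 100
max_profit = profit(optimal_quantity)

# 4. P''(x) = -4 is negative, so the critical point is a maximum.
profit_second_derivative = -4

print(f"Optimal quantity: {optimal_quantity} thousand units")
print(f"Maximum profit: {max_profit}")
print(f"Second derivative: {profit_second_derivative} (negative → maximum)")
```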
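Real-World Insight: This is the basic idea behind dynamic pricing at companies like Uber: estimate demand curves and adjust prices to improve profit while balancing rider satisfaction. Production systems are far more complex than this one-variable model, but the derivative plays the same role.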
You’re studying for an exam. More study time = higher score, but with diminishing returns:
```python
# Score model (realistic diminishing returns):
# score(hours) = 100 × (1 - e^(-0.3 × hours))
#
# But studying has a cost: fatigue reduces retention
# effective_score(hours) = score(hours) - 2 × hours

# TODO:
# 1. Find the derivative of effective_score
# 2. Find optimal study hours
# 3. What's your expected score?
# 4. Plot the curve to visualize
```
💡 Solution
```python
import numpy as np

def score(hours):
    """Base score: 100 × (1 - e^(-0.3h))"""
    return 100 * (1 - np.exp(-0.3 * hours))

def fatigue_cost(hours):
    """Fatigue penalty: 2 points per hour"""
    return 2 * hours

def effective_score(hours):
    """Net score after fatigue"""
    return score(hours) - fatigue_cost(hours)

def score_derivative(hours):
    """d(score)/dh = 100 × 0.3 × e^(-0.3h) = 30 × e^(-0.3h)"""
    return 30 * np.exp(-0.3 * hours)

def effective_derivative(hours):
    """d(effective_score)/dh = 30 × e^(-0.3h) - 2"""
    return score_derivative(hours) - 2

print("📚 Optimal Study Time Analysis")
print("=" * 50)

# Find optimal: 30 × e^(-0.3h) - 2 = 0
# e^(-0.3h) = 2/30 = 1/15
# -0.3h = ln(1/15)
# h = -ln(1/15) / 0.3
optimal_hours = -np.log(1/15) / 0.3

print(f"\n🎯 Optimal study time: {optimal_hours:.1f} hours")
print(f"   Base score: {score(optimal_hours):.1f}")
print(f"   Fatigue cost: -{fatigue_cost(optimal_hours):.1f}")
print(f"   Effective score: {effective_score(optimal_hours):.1f}")

# Compare with over-studying
over_study = 15
print(f"\n⚠️ Comparison: Studying {over_study} hours:")
print(f"   Base score: {score(over_study):.1f}")
print(f"   Fatigue cost: -{fatigue_cost(over_study):.1f}")
print(f"   Effective score: {effective_score(over_study):.1f}")
print(f"   You lost {effective_score(optimal_hours) - effective_score(over_study):.1f} points!")

# Diminishing returns table
print("\n📊 Diminishing Returns:")
print("   Hours | Score Gain | Marginal Gain")
print("   ------|------------|-------------")
for h in [0, 2, 4, 6, 8, 10]:
    gain = score(h)
    marginal = score_derivative(h)  # equals 30 at h = 0
    print(f"   {h:5} | {gain:10.1f} | {marginal:13.2f} pts/hr")
```
Real-World Insight: This “diminishing returns + cost” model applies everywhere: exercise (muscle gains vs. injury risk), marketing (ad spend vs. saturation), even eating (enjoyment vs. fullness)!
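When a derivative cannot be solved by hand, the zero can be found numerically instead. The sketch below uses bisection on the study-time derivative from the solution above and compares it with the closed-form answer; `bisect_root` and the 0 to 24 hour search interval are assumptions chosen for illustration.

```python
import math


def effective_derivative(hours):
    """d(effective_score)/dh = 30 × e^(-0.3h) - 2, as in the solution above."""
    return 30 * math.exp(-0.3 * hours) - 2


def bisect_root(func, lo, hi, tol=1e-9):
    """Find a root of func in [lo, hi]; func(lo) and func(hi) must differ in sign."""
    if func(lo) * func(hi) > 0:
        raise ValueError("func must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if func(lo) * func(mid) <= 0:
            hi = mid  # sign change is in the left half
        else:
            lo = mid  # sign change is in the right half
    return (lo + hi) / 2


numeric_optimum = bisect_root(effective_derivative, 0, 24)
closed_form = math.log(15) / 0.3

print(f"Bisection:   {numeric_optimum:.4f} hours")
print(f"Closed form: {closed_form:.4f} hours")
```

Bisection only needs the sign of the derivative, which is the same "is it going up or down?" question asked throughout this module.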
```python
# Fuel consumption (gallons/hour) = 0.001 × speed² + 2
# Distance traveled (miles/hour) = speed
#
# Fuel efficiency = miles per gallon = distance / fuel
# efficiency(speed) = speed / (0.001 × speed² + 2)

# TODO:
# 1. Find the derivative of efficiency
# 2. Find the speed that maximizes MPG
# 3. What's the maximum MPG?
# 4. Compare efficiency at 55 mph vs 75 mph
```
💡 Solution
```python
import numpy as np

def fuel_consumption(speed):
    """Gallons per hour at given speed"""
    return 0.001 * speed**2 + 2

def efficiency(speed):
    """Miles per gallon = speed / fuel_per_hour"""
    return speed / fuel_consumption(speed)

def efficiency_derivative(speed):
    """Using quotient rule: d/dx [f/g] = (f'g - fg') / g²"""
    # f = speed, f' = 1
    # g = 0.001*speed² + 2, g' = 0.002*speed
    f = speed
    g = 0.001 * speed**2 + 2
    f_prime = 1
    g_prime = 0.002 * speed
    return (f_prime * g - f * g_prime) / g**2

print("🚗 Fuel Efficiency Optimization")
print("=" * 50)

# Find optimal: set derivative = 0
# (1)(0.001*s² + 2) - (s)(0.002*s) = 0
# 0.001*s² + 2 - 0.002*s² = 0
# 2 - 0.001*s² = 0
# s² = 2000
# s = sqrt(2000) ≈ 44.7 mph
optimal_speed = np.sqrt(2000)

print(f"\n🎯 Optimal speed: {optimal_speed:.1f} mph")
print(f"   Maximum efficiency: {efficiency(optimal_speed):.1f} MPG")

# Compare different speeds
print("\n📊 Speed vs Efficiency:")
print("   Speed (mph) |  MPG   | Fuel/100mi")
print("   ------------|--------|----------")
for speed in [35, 45, 55, 65, 75, 85]:
    mpg = efficiency(speed)
    fuel_per_100 = 100 / mpg
    marker = " ← optimal" if abs(speed - optimal_speed) < 5 else ""
    print(f"   {speed:11} | {mpg:6.1f} | {fuel_per_100:10.2f} gal{marker}")

# Cost analysis for a 300-mile trip
print("\n💰 Cost Analysis (300-mile trip, $3.50/gal):")
for speed in [45, 55, 75]:
    gallons = 300 / efficiency(speed)
    cost = gallons * 3.50
    hours = 300 / speed
    print(f"   {speed} mph: ${cost:.2f} ({hours:.1f} hours)")

# Trade-off (computed rather than hard-coded: 300/55 - 300/75 ≈ 1.5 hours)
time_saved = 300 / 55 - 300 / 75
extra_cost = (300 / efficiency(75) - 300 / efficiency(55)) * 3.50
print("\n⚡ Time vs Money Trade-off:")
print(f"   Going 75 vs 55 mph saves {time_saved:.1f} hours")
print(f"   But costs ${extra_cost:.2f} extra in fuel")
```
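Real-World Insight: In this simplified model, efficiency peaks at about 45 mph. Real vehicles behave similarly: fuel economy tends to peak at moderate speeds and fall off as highway speed rises, largely because aerodynamic drag grows roughly with the square of speed. That is one reason eco-driving advice favors moderate highway speeds, and electric vehicles show the same pattern in their range-versus-speed curves.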
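A quotient-rule derivative is easy to get wrong by a sign, so it is worth checking against a numerical estimate. The sketch below does this for the efficiency model above and confirms that the slope changes from positive to negative around √2000; the helper `central_difference` and the sample speeds are illustrative assumptions.

```python
import math


def efficiency(speed):
    """Miles per gallon under the toy model above."""
    return speed / (0.001 * speed**2 + 2)


def efficiency_derivative(speed):
    """Quotient rule: (1·g - speed·g') / g², with g = 0.001·speed² + 2."""
    g = 0.001 * speed**2 + 2
    return (g - speed * 0.002 * speed) / g**2


def central_difference(func, x, h=1e-4):
    return (func(x + h) - func(x - h)) / (2 * h)


for speed in [30, math.sqrt(2000), 60]:
    analytical = efficiency_derivative(speed)
    numerical = central_difference(efficiency, speed)
    print(f"speed = {speed:5.1f}  analytical = {analytical:+.6f}  "
          f"numerical = {numerical:+.6f}")
```

A positive slope below the optimum and a negative slope above it is what identifies the critical point as a maximum rather than a minimum.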
You’re analyzing compound growth with continuous compounding:
```python
# Investment value: V(t) = P × e^(r×t)
# P = initial principal ($10,000)
# r = annual rate (5% = 0.05)
# t = years

# You want to know:
# 1. How fast is your money growing at year 10?
# 2. How long until your money doubles?
# 3. At what rate does money double in 10 years?
```
💡 Solution
```python
import numpy as np

def value(t, P=10000, r=0.05):
    """Investment value at time t"""
    return P * np.exp(r * t)

def growth_rate(t, P=10000, r=0.05):
    """d(V)/dt = r × P × e^(r×t) = r × V(t)"""
    return r * value(t, P, r)

print("💹 Investment Growth Analysis")
print("=" * 50)

P = 10000  # Initial investment
r = 0.05   # 5% annual rate

# 1. Growth rate at year 10
t = 10
V_10 = value(t)
rate_10 = growth_rate(t)

print(f"\n📈 After {t} years:")
print(f"   Value: ${V_10:,.2f}")
print(f"   Growing at: ${rate_10:,.2f}/year")
print(f"   Daily growth: ${rate_10/365:,.2f}/day")

# 2. Time to double (doubling time)
# 2P = P × e^(r×t)
# 2 = e^(r×t)
# ln(2) = r×t
# t = ln(2) / r
doubling_time = np.log(2) / r
print(f"\n⏱️ Doubling time at {r*100}%: {doubling_time:.2f} years")
print(f"   (Rule of 72 estimate: {72 / (r * 100):.1f} years)")

# 3. Rate needed to double in 10 years
# 2 = e^(r×10)
# ln(2) = 10r
# r = ln(2) / 10
target_years = 10
required_rate = np.log(2) / target_years
print(f"\n🎯 To double in {target_years} years:")
print(f"   Required rate: {required_rate*100:.2f}%")

# Comparison table
print("\n📊 Compound Growth Power:")
print("   Years |  5% Rate  |  7% Rate  | 10% Rate")
print("   ------|-----------|-----------|----------")
for years in [5, 10, 20, 30]:
    v5 = value(years, P, 0.05)
    v7 = value(years, P, 0.07)
    v10 = value(years, P, 0.10)
    print(f"   {years:5} | ${v5:9,.0f} | ${v7:9,.0f} | ${v10:9,.0f}")

# Instantaneous vs average growth
print("\n💡 Key Insight:")
print(f"   At year 10, growth rate = r × V(t) = {r} × ${V_10:,.2f}")
print("   The derivative tells us: 'Right now, money is growing")
print(f"   at ${rate_10:,.2f}/year' - not the average, but THIS MOMENT!")
```
Real-World Insight: This is the “magic” of compound interest that Einstein allegedly called the 8th wonder of the world. The derivative shows that growth rate is proportional to current value - the rich get richer mathematically!
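The claim that the growth rate equals r × V(t) can be checked numerically. The sketch below compares a central-difference estimate of dV/dt at year 10 with r × V(10), and confirms the doubling-time formula; parameter names and the step `h` are illustrative assumptions.

```python
import math


def value(t, principal=10_000, rate=0.05):
    """V(t) = P × e^(r×t), continuous compounding."""
    return principal * math.exp(rate * t)


rate = 0.05
t = 10
h = 1e-4

# Numerical dV/dt at year 10 versus the analytical result r × V(t)
numeric_rate = (value(t + h) - value(t - h)) / (2 * h)
analytic_rate = rate * value(t)

# Doubling time from 2 = e^(r×t)  →  t = ln(2) / r
doubling_time = math.log(2) / rate

print(f"Numerical dV/dt at year {t}: ${numeric_rate:,.2f}/year")
print(f"r × V(t):                    ${analytic_rate:,.2f}/year")
print(f"Value after {doubling_time:.2f} years: ${value(doubling_time):,.2f}")
```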
✅ Derivative = rate of change - How output changes with input
✅ Geometric view - Slope of tangent line
✅ Optimization - Set derivative = 0 to find min/max
✅ Second derivative - Tells you if it’s min or max
✅ ML connection - Gradient descent uses derivatives to learn
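To make the ML connection concrete, here is a minimal gradient descent sketch. It reuses the ad-spend cost function C(x) = x² − 10x + 100 from earlier in this module; the starting point, learning rate, and step count are arbitrary illustrative choices, not recommended settings.

```python
def cost(x):
    """C(x) = x² - 10x + 100 (minimum at x = 5)."""
    return x**2 - 10 * x + 100


def cost_derivative(x):
    """C'(x) = 2x - 10."""
    return 2 * x - 10


def gradient_descent(start, learning_rate=0.1, steps=50):
    """Repeatedly step against the slope to walk downhill."""
    x = start
    for _ in range(steps):
        x -= learning_rate * cost_derivative(x)
    return x


x_min = gradient_descent(start=0.0)
print(f"Found x ≈ {x_min:.4f}, cost ≈ {cost(x_min):.4f}")
```

Each step moves x in the direction that lowers the cost, and the steps shrink as the slope approaches zero. Training a model applies the same update to every weight at once.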
Mistakes that trip up beginners and even experienced practitioners:
❌ Confusing Derivative with Function Value
Wrong thinking: “The derivative of $x^2$ at $x=3$ is $x^2 = 9$”

Correct: The derivative of $x^2$ is $2x$. At $x=3$, the derivative is $2(3) = 6$.

The derivative tells you the slope, not the height!
```python
# Wrong
def wrong_approach(x):
    return x**2  # This is f(x), not f'(x)!

# Correct
def derivative(x):
    return 2*x  # This is f'(x)

print(f"Value at x=3: {3**2}")  # 9
print(f"Derivative at x=3: {2*3}")  # 6 (the slope!)
```
❌ Forgetting the Chain Rule
Wrong: $\frac{d}{dx}(x^2+1)^3 = 3(x^2+1)^2$

Correct: $\frac{d}{dx}(x^2+1)^3 = 3(x^2+1)^2 \cdot 2x = 6x(x^2+1)^2$

Rule: When there’s a function inside another function, multiply by the derivative of the inner function!
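A numerical estimate makes the missing factor easy to spot. The sketch below evaluates both versions at x = 2 and compares them with a central-difference estimate; the function names and test point are illustrative assumptions.

```python
def g(x):
    return (x**2 + 1) ** 3


def wrong_derivative(x):
    """Forgets the inner derivative 2x."""
    return 3 * (x**2 + 1) ** 2


def correct_derivative(x):
    """Chain rule: outer derivative times inner derivative."""
    return 6 * x * (x**2 + 1) ** 2


x = 2.0
h = 1e-5
numerical = (g(x + h) - g(x - h)) / (2 * h)

print(f"Numerical: {numerical:.4f}")
print(f"Wrong:     {wrong_derivative(x):.4f}")
print(f"Correct:   {correct_derivative(x):.4f}")
```

At x = 2 the wrong formula gives 75 while the correct one gives 300, a factor of 2x = 4 apart, and only the correct one agrees with the numerical estimate.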
❌ Numerical Instability with Small h
Trap: Using extremely small h values for numerical derivatives.
```python
def f(x):
    return x**2

x = 3.0

# Too small h causes numerical errors!
h = 1e-15
numerical_deriv = (f(x + h) - f(x)) / h  # Can give wrong answer!

# Safe range: h between 1e-5 and 1e-8
h = 1e-7
numerical_deriv = (f(x + h) - f(x - h)) / (2 * h)  # Central difference is better
```
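The trade-off is easier to see in a table of errors. The sketch below differentiates sin(x) at x = 1, where the exact derivative cos(1) is known, across a range of step sizes; the choice of sin and of these particular h values is an assumption made for illustration.

```python
import math


def forward_difference(func, x, h):
    return (func(x + h) - func(x)) / h


def central_difference(func, x, h):
    return (func(x + h) - func(x - h)) / (2 * h)


x = 1.0
exact = math.cos(x)  # exact derivative of sin(x)

print("    h     | forward error | central error")
for h in [1e-1, 1e-4, 1e-7, 1e-10, 1e-13, 1e-16]:
    fwd_err = abs(forward_difference(math.sin, x, h) - exact)
    ctr_err = abs(central_difference(math.sin, x, h) - exact)
    print(f"{h:9.0e} | {fwd_err:13.2e} | {ctr_err:13.2e}")
```

Large steps suffer from approximation error and tiny steps from floating-point rounding. At h = 1e-16, `x + h` is indistinguishable from `x` in double precision, so the forward difference collapses to zero.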
You now understand derivatives for single-variable functions. But ML models have MANY variables (thousands or millions!).

How do we handle that? Gradients - the multi-variable version of derivatives!