Skip to main content

Documentation Index

Fetch the complete documentation index at: https://resources.devweekends.com/llms.txt

Use this file to discover all available pages before exploring further.

Time Series Fundamentals

Time Series Fundamentals

What Makes Time Series Special?

In regular ML, data points are assumed to be independent. In time series, order matters and past values predict future values. Analogy: Regular ML is like looking at a bag of random photographs — each one is independent, and you can shuffle them freely. Time series is like watching a movie — each frame only makes sense in sequence, and what happened 5 seconds ago tells you a lot about what is happening now. Shuffling the frames destroys the information.
Regular MLTime Series
Each sample is independentSamples are correlated in time
Shuffle data for cross-validationMust respect time order
Features predict targetPast values predict future values
Random split for train/testTime-based split required
Estimated Time: 4-5 hours
Difficulty: Intermediate
Prerequisites: Descriptive Statistics, Probability, Regression
What You’ll Build: Stock price forecaster, seasonal decomposition tool

Time Series Components

Every time series can be decomposed into:
  1. Trend: Long-term increase or decrease
  2. Seasonality: Regular patterns that repeat (daily, weekly, yearly)
  3. Cyclical: Irregular fluctuations (business cycles)
  4. Residual: Random noise
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import signal

np.random.seed(42)

# Create a synthetic time series with all components
n = 365 * 3  # 3 years of daily data
t = np.arange(n)

# Components
trend = 0.05 * t  # Upward trend
seasonal = 10 * np.sin(2 * np.pi * t / 365)  # Yearly seasonality
weekly = 3 * np.sin(2 * np.pi * t / 7)  # Weekly pattern
noise = np.random.randn(n) * 2  # Random noise

# Combine
y = 100 + trend + seasonal + weekly + noise

# Create time index
dates = pd.date_range('2021-01-01', periods=n, freq='D')
ts = pd.Series(y, index=dates)

# Plot
fig, axes = plt.subplots(4, 1, figsize=(14, 12))

axes[0].plot(dates, y)
axes[0].set_title('Complete Time Series')
axes[0].set_ylabel('Value')
axes[0].grid(True, alpha=0.3)

axes[1].plot(dates, trend + 100)
axes[1].set_title('Trend Component')
axes[1].set_ylabel('Value')
axes[1].grid(True, alpha=0.3)

axes[2].plot(dates, seasonal)
axes[2].set_title('Seasonal Component (Yearly)')
axes[2].set_ylabel('Value')
axes[2].grid(True, alpha=0.3)

axes[3].plot(dates, noise)
axes[3].set_title('Residual (Noise)')
axes[3].set_ylabel('Value')
axes[3].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print(f"Time Series Statistics:")
print(f"  Mean: {ts.mean():.2f}")
print(f"  Std: {ts.std():.2f}")
print(f"  Min: {ts.min():.2f}")
print(f"  Max: {ts.max():.2f}")

Stationarity: The Key Assumption

Most time series methods assume stationarity - statistical properties don’t change over time. Analogy: Stationarity means the “rules of the game” stay constant. A stationary time series is like a casino where the odds never change — the house edge is always the same, no matter when you play. A non-stationary series is like a game where the rules keep shifting — the average changes, the volatility changes, and any strategy you learned yesterday might not work tomorrow. Most forecasting models need stationary data because they assume the patterns they learned in the past will continue into the future.
ML Application — Distribution Shift Detection: Stationarity is directly related to one of the biggest problems in production ML: distribution shift (also called dataset shift or concept drift). When the statistical properties of incoming data change over time, your model’s training assumptions break. Monitoring stationarity of input features using rolling statistics and the Augmented Dickey-Fuller test — the same tools from time series analysis — is one of the most effective ways to detect when your production model needs retraining. Companies like Netflix and Uber run these checks continuously on their feature pipelines.

What Stationarity Means

PropertyStationaryNon-Stationary
MeanConstant over timeChanges (trend)
VarianceConstant over timeChanges (heteroscedasticity)
AutocorrelationDepends only on lagDepends on time
from scipy import stats
from statsmodels.tsa.stattools import adfuller

def check_stationarity(series, window=30):
    """
    Check if a series is stationary using:
    1. Visual inspection (rolling stats)
    2. Augmented Dickey-Fuller test
    """
    # Rolling statistics
    rolling_mean = series.rolling(window=window).mean()
    rolling_std = series.rolling(window=window).std()
    
    # Plot
    fig, axes = plt.subplots(2, 1, figsize=(14, 8))
    
    axes[0].plot(series, label='Original')
    axes[0].plot(rolling_mean, color='red', linewidth=2, label='Rolling Mean')
    axes[0].fill_between(series.index, rolling_mean - 2*rolling_std, 
                         rolling_mean + 2*rolling_std, alpha=0.2, color='red')
    axes[0].legend()
    axes[0].set_title('Rolling Mean and Std')
    axes[0].grid(True, alpha=0.3)
    
    # ADF test
    adf_result = adfuller(series.dropna())
    
    axes[1].text(0.1, 0.8, f"ADF Statistic: {adf_result[0]:.4f}", transform=axes[1].transAxes, fontsize=12)
    axes[1].text(0.1, 0.6, f"p-value: {adf_result[1]:.4f}", transform=axes[1].transAxes, fontsize=12)
    axes[1].text(0.1, 0.4, f"Critical values:", transform=axes[1].transAxes, fontsize=12)
    for key, value in adf_result[4].items():
        axes[1].text(0.15, 0.3 - float(key[0])*0.05, f"  {key}: {value:.4f}", 
                    transform=axes[1].transAxes, fontsize=10)
    
    is_stationary = adf_result[1] < 0.05
    axes[1].text(0.5, 0.5, "STATIONARY ✓" if is_stationary else "NON-STATIONARY ✗",
                transform=axes[1].transAxes, fontsize=20, 
                color='green' if is_stationary else 'red',
                fontweight='bold')
    axes[1].axis('off')
    
    plt.tight_layout()
    plt.show()
    
    return is_stationary

# Check our trend series (non-stationary)
print("Checking series with trend:")
check_stationarity(ts)

# Make it stationary by differencing
ts_diff = ts.diff().dropna()
print("\nChecking differenced series:")
check_stationarity(ts_diff)

Making Series Stationary

ProblemSolution
TrendDifferencing: yt=ytyt1y'_t = y_t - y_{t-1}
SeasonalitySeasonal differencing: yt=ytytsy'_t = y_t - y_{t-s}
Changing varianceLog transform: yt=log(yt)y'_t = \log(y_t)
# Example: Stock price (multiplicative growth) → Log returns
np.random.seed(42)

# Simulate stock price with geometric random walk
returns = np.random.randn(500) * 0.02 + 0.001  # Daily returns
price = 100 * np.cumprod(1 + returns)  # Price = cumulative product

dates = pd.date_range('2022-01-01', periods=500, freq='D')
stock = pd.Series(price, index=dates)

# Transform to stationarity
log_stock = np.log(stock)
log_returns = log_stock.diff().dropna()

fig, axes = plt.subplots(3, 1, figsize=(14, 10))

axes[0].plot(stock)
axes[0].set_title('Stock Price (Non-Stationary)')
axes[0].set_ylabel('Price ($)')
axes[0].grid(True, alpha=0.3)

axes[1].plot(log_stock)
axes[1].set_title('Log Price (Still Non-Stationary)')
axes[1].set_ylabel('Log Price')
axes[1].grid(True, alpha=0.3)

axes[2].plot(log_returns)
axes[2].set_title('Log Returns (Stationary!)')
axes[2].set_ylabel('Log Return')
axes[2].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("Log returns statistics:")
print(f"  Mean: {log_returns.mean():.6f}")
print(f"  Std: {log_returns.std():.6f}")

Autocorrelation: Memory in Time Series

Autocorrelation measures how correlated a time series is with lagged versions of itself. Analogy: Autocorrelation measures “memory” in a time series. Think of it like asking: “If today was a hot day, how much does that tell me about tomorrow?” If there is high autocorrelation at lag 1, hot days tend to be followed by hot days (weather has memory). If autocorrelation is zero, each day is independent of the last (like coin flips). The ACF plot shows you exactly how many days of “memory” your data has — and that directly tells you how many past values to include as features in your forecasting model.
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.stattools import acf, pacf

# Generate a time series with different memory characteristics
np.random.seed(42)
n = 500

# AR(1) process: y_t = 0.8 * y_{t-1} + noise
ar1 = np.zeros(n)
for i in range(1, n):
    ar1[i] = 0.8 * ar1[i-1] + np.random.randn()

# Random walk: y_t = y_{t-1} + noise (AR with coefficient 1)
random_walk = np.cumsum(np.random.randn(n))

# White noise (no memory)
white_noise = np.random.randn(n)

# Plot ACF for each
fig, axes = plt.subplots(3, 2, figsize=(14, 12))

series_list = [
    ('White Noise (No Memory)', white_noise),
    ('AR(1) with φ=0.8 (Short Memory)', ar1),
    ('Random Walk (Infinite Memory)', random_walk),
]

for i, (name, series) in enumerate(series_list):
    axes[i, 0].plot(series[:100])
    axes[i, 0].set_title(name)
    axes[i, 0].grid(True, alpha=0.3)
    
    # ACF
    acf_values = acf(series, nlags=40)
    axes[i, 1].bar(range(len(acf_values)), acf_values)
    axes[i, 1].axhline(y=0, color='black', linewidth=0.5)
    axes[i, 1].axhline(y=1.96/np.sqrt(n), color='red', linestyle='--', label='95% CI')
    axes[i, 1].axhline(y=-1.96/np.sqrt(n), color='red', linestyle='--')
    axes[i, 1].set_title(f'Autocorrelation Function')
    axes[i, 1].set_xlabel('Lag')

plt.tight_layout()
plt.show()

ACF vs PACF

FunctionMeasuresUse
ACF (Autocorrelation)Total correlation at lag k (including indirect)Identify MA order
PACF (Partial ACF)Direct correlation at lag k onlyIdentify AR order
# Determine model order from ACF/PACF
fig, axes = plt.subplots(2, 2, figsize=(14, 8))

# AR(2) process
ar2 = np.zeros(n)
for i in range(2, n):
    ar2[i] = 0.5 * ar2[i-1] - 0.3 * ar2[i-2] + np.random.randn()

# MA(2) process
ma2 = np.random.randn(n)
for i in range(2, n):
    ma2[i] = ma2[i] + 0.5 * np.random.randn() - 0.3 * np.random.randn()

plot_acf(ar2, ax=axes[0, 0], lags=20, title='AR(2): ACF - Gradual Decay')
plot_pacf(ar2, ax=axes[0, 1], lags=20, title='AR(2): PACF - Cuts off at lag 2')

plot_acf(ma2, ax=axes[1, 0], lags=20, title='MA(2): ACF - Cuts off at lag 2')
plot_pacf(ma2, ax=axes[1, 1], lags=20, title='MA(2): PACF - Gradual Decay')

plt.tight_layout()
plt.show()

print("Identification Rules:")
print("  AR(p): ACF decays gradually, PACF cuts off at lag p")
print("  MA(q): ACF cuts off at lag q, PACF decays gradually")
print("  ARMA(p,q): Both decay gradually")

Simple Forecasting Methods

Moving Average

def moving_average_forecast(series, window=7, steps=30):
    """Simple moving average forecast."""
    last_values = series.values[-window:]
    forecast = np.mean(last_values)
    return np.full(steps, forecast)

# Test on our stock data
train = stock[:-30]
test = stock[-30:]

ma_forecast = moving_average_forecast(train, window=7, steps=30)

plt.figure(figsize=(14, 6))
plt.plot(train[-100:], label='Training Data')
plt.plot(test.index, test, label='Actual', color='green')
plt.plot(test.index, ma_forecast, label='MA Forecast', color='red', linestyle='--')
plt.axvline(x=train.index[-1], color='black', linestyle=':', label='Forecast Start')
plt.legend()
plt.title('Moving Average Forecast')
plt.grid(True, alpha=0.3)
plt.show()

Exponential Smoothing

def exponential_smoothing(series, alpha=0.3, steps=30):
    """
    Simple exponential smoothing.
    alpha: smoothing factor (0-1). Higher = more weight on recent values.
    """
    values = series.values
    smoothed = [values[0]]
    
    for i in range(1, len(values)):
        smoothed.append(alpha * values[i] + (1 - alpha) * smoothed[-1])
    
    # Forecast is last smoothed value
    forecast = np.full(steps, smoothed[-1])
    return np.array(smoothed), forecast

# Compare different alpha values
fig, axes = plt.subplots(2, 1, figsize=(14, 10))

for alpha in [0.1, 0.3, 0.5, 0.9]:
    smoothed, forecast = exponential_smoothing(train, alpha=alpha, steps=30)
    axes[0].plot(train.index, smoothed, label=f'α={alpha}', linewidth=1.5)

axes[0].plot(train, 'k-', alpha=0.3, label='Original')
axes[0].set_title('Exponential Smoothing with Different α')
axes[0].legend()
axes[0].grid(True, alpha=0.3)

# Best forecast
_, best_forecast = exponential_smoothing(train, alpha=0.3, steps=30)
axes[1].plot(train[-100:], label='Training')
axes[1].plot(test.index, test, label='Actual', color='green')
axes[1].plot(test.index, best_forecast, label='ES Forecast (α=0.3)', color='red', linestyle='--')
axes[1].axvline(x=train.index[-1], color='black', linestyle=':')
axes[1].legend()
axes[1].set_title('Exponential Smoothing Forecast')
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

ARIMA Models

ARIMA(p, d, q) combines:
  • AR(p): Autoregressive - regression on past values
  • I(d): Integrated - differencing for stationarity
  • MA(q): Moving Average - regression on past errors
yt=c+ϕ1yt1+...+ϕpytp+θ1ϵt1+...+θqϵtq+ϵty'_t = c + \phi_1 y'_{t-1} + ... + \phi_p y'_{t-p} + \theta_1 \epsilon_{t-1} + ... + \theta_q \epsilon_{t-q} + \epsilon_t Where yty'_t is the differenced series.
from statsmodels.tsa.arima.model import ARIMA
from sklearn.metrics import mean_squared_error, mean_absolute_error

# Fit ARIMA model
model = ARIMA(train, order=(2, 1, 2))  # ARIMA(2,1,2)
fitted = model.fit()

print(fitted.summary())

# Forecast
forecast = fitted.forecast(steps=30)

# Plot
plt.figure(figsize=(14, 6))
plt.plot(train[-100:], label='Training Data')
plt.plot(test.index, test, label='Actual', color='green', linewidth=2)
plt.plot(test.index, forecast, label='ARIMA Forecast', color='red', linewidth=2, linestyle='--')

# Confidence interval
forecast_ci = fitted.get_forecast(steps=30).conf_int()
plt.fill_between(test.index, forecast_ci.iloc[:, 0], forecast_ci.iloc[:, 1], 
                 alpha=0.2, color='red')

plt.axvline(x=train.index[-1], color='black', linestyle=':')
plt.legend()
plt.title('ARIMA(2,1,2) Forecast with 95% Confidence Interval')
plt.grid(True, alpha=0.3)
plt.show()

# Metrics
rmse = np.sqrt(mean_squared_error(test, forecast))
mae = mean_absolute_error(test, forecast)
mape = np.mean(np.abs((test - forecast) / test)) * 100

print(f"\nForecast Metrics:")
print(f"  RMSE: {rmse:.4f}")
print(f"  MAE: {mae:.4f}")
print(f"  MAPE: {mape:.2f}%")

Seasonal Decomposition

from statsmodels.tsa.seasonal import seasonal_decompose

# Create seasonal data
np.random.seed(42)
n = 365 * 2  # 2 years
t = np.arange(n)

# Components
trend = 50 + 0.1 * t
seasonal = 20 * np.sin(2 * np.pi * t / 365)  # Yearly
noise = np.random.randn(n) * 5

y = trend + seasonal + noise

dates = pd.date_range('2022-01-01', periods=n, freq='D')
ts_seasonal = pd.Series(y, index=dates)

# Decompose
decomposition = seasonal_decompose(ts_seasonal, period=365)

# Plot
fig, axes = plt.subplots(4, 1, figsize=(14, 12))

axes[0].plot(ts_seasonal)
axes[0].set_title('Original Time Series')
axes[0].set_ylabel('Value')
axes[0].grid(True, alpha=0.3)

axes[1].plot(decomposition.trend)
axes[1].set_title('Trend')
axes[1].set_ylabel('Value')
axes[1].grid(True, alpha=0.3)

axes[2].plot(decomposition.seasonal)
axes[2].set_title('Seasonal')
axes[2].set_ylabel('Value')
axes[2].grid(True, alpha=0.3)

axes[3].plot(decomposition.resid)
axes[3].set_title('Residual')
axes[3].set_ylabel('Value')
axes[3].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Strength of components
total_variance = np.var(ts_seasonal)
trend_strength = 1 - np.var(decomposition.resid.dropna()) / np.var(decomposition.trend.dropna() + decomposition.resid.dropna())
seasonal_strength = 1 - np.var(decomposition.resid.dropna()) / np.var(decomposition.seasonal + decomposition.resid.dropna())

print(f"Component Strengths:")
print(f"  Trend strength: {trend_strength:.3f}")
print(f"  Seasonal strength: {seasonal_strength:.3f}")

Time Series Cross-Validation

Regular cross-validation doesn’t work for time series (can’t use future data to predict past).
Statistical Mistake in ML — Random Splitting Time Series Data: One of the most common and damaging mistakes in applied ML is using random train/test splits on time series data. If your training set includes data from March and your test set includes data from February, you are literally training on the future to predict the past. This creates a subtle but severe data leakage that inflates your metrics. Your model appears to perform brilliantly in evaluation but fails catastrophically in production. Always use time-based splits: train on the past, test on the future. This applies not just to pure time series forecasting but to any problem where data has a temporal ordering — user behavior, financial transactions, sensor readings, and more.
from sklearn.model_selection import TimeSeriesSplit

def time_series_cv(series, n_splits=5, test_size=30):
    """
    Time series cross-validation.
    Always train on past, test on future.
    """
    tscv = TimeSeriesSplit(n_splits=n_splits, test_size=test_size)
    
    fig, axes = plt.subplots(n_splits, 1, figsize=(14, 2*n_splits))
    
    rmses = []
    
    for i, (train_idx, test_idx) in enumerate(tscv.split(series)):
        train = series.iloc[train_idx]
        test = series.iloc[test_idx]
        
        # Fit simple model
        model = ARIMA(train, order=(1, 1, 1))
        fitted = model.fit()
        forecast = fitted.forecast(steps=len(test))
        
        rmse = np.sqrt(mean_squared_error(test, forecast))
        rmses.append(rmse)
        
        # Plot
        axes[i].plot(train.index, train, 'b-', label='Train')
        axes[i].plot(test.index, test, 'g-', label='Test')
        axes[i].plot(test.index, forecast, 'r--', label=f'Forecast (RMSE={rmse:.2f})')
        axes[i].set_title(f'Fold {i+1}')
        axes[i].legend(loc='upper left')
        axes[i].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    print(f"\nCross-Validation Results:")
    print(f"  Mean RMSE: {np.mean(rmses):.4f}")
    print(f"  Std RMSE: {np.std(rmses):.4f}")
    print(f"  Individual folds: {[f'{r:.4f}' for r in rmses]}")

time_series_cv(stock, n_splits=5, test_size=30)

Practice Exercises

Problem: Download real stock data (e.g., using yfinance) and:
  1. Check for stationarity
  2. Transform to stationary (log returns)
  3. Fit ARIMA model
  4. Forecast next 5 days
Problem: Given monthly airline passenger data:
  1. Decompose into trend, seasonality, residual
  2. Fit SARIMA (Seasonal ARIMA)
  3. Forecast next 12 months
Problem: Compare different forecasting methods (MA, ES, ARIMA) using rolling window validation. Which performs best at different forecast horizons (1-day, 7-day, 30-day)?

Summary

ConceptWhat It IsWhy It Matters
StationarityStatistical properties constant over timeRequired for most methods
AutocorrelationCorrelation with lagged selfReveals memory structure
ACF/PACFCorrelation at different lagsModel identification
Differencingy’t = y_t - yRemove trend
ARIMAAR + I + MA combinedFlexible forecasting
Seasonal DecompositionSeparate trend/seasonal/residualUnderstand components
Key Takeaway: Time series analysis is about understanding temporal dependencies. Before applying any model, always check for stationarity and understand the autocorrelation structure. The goal is to capture the patterns (trend, seasonality) while forecasting with quantified uncertainty.