Time Series Fundamentals

Estimated Time: 4-5 hours
Difficulty: Intermediate
Prerequisites: Descriptive Statistics, Probability, Regression
What You'll Build: Stock price forecaster, seasonal decomposition tool

What Makes Time Series Special?

In regular ML, data points are assumed to be independent. In time series, order matters and past values predict future values.
| Regular ML | Time Series |
| --- | --- |
| Each sample is independent | Samples are correlated in time |
| Shuffle data for cross-validation | Must respect time order |
| Features predict target | Past values predict future values |
| Random split for train/test | Time-based split required |
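
To make the last row concrete, here is a minimal sketch of a time-based split (the series and the 80/20 ratio are illustrative choices, not fixed conventions):

import numpy as np
import pandas as pd

# Illustrative daily series; any time-indexed data splits the same way
dates = pd.date_range('2023-01-01', periods=100, freq='D')
series = pd.Series(np.random.randn(100).cumsum(), index=dates)

# Train on the past, test on the future -- never shuffle
split = int(len(series) * 0.8)
train, test = series.iloc[:split], series.iloc[split:]
print(f"Train: {train.index[0].date()} -> {train.index[-1].date()}")
print(f"Test:  {test.index[0].date()} -> {test.index[-1].date()}")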

Time Series Components

Every time series can be decomposed into:
  1. Trend: Long-term increase or decrease
  2. Seasonality: Regular patterns that repeat (daily, weekly, yearly)
  3. Cyclical: Irregular fluctuations (business cycles)
  4. Residual: Random noise
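How the components combine matters: the additive model is $y_t = T_t + S_t + C_t + R_t$, while the multiplicative model $y_t = T_t \times S_t \times C_t \times R_t$ suits series whose seasonal swings grow with the level. The synthetic example below uses the additive form.
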
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

np.random.seed(42)

# Create a synthetic time series with all components
n = 365 * 3  # 3 years of daily data
t = np.arange(n)

# Components
trend = 0.05 * t  # Upward trend
seasonal = 10 * np.sin(2 * np.pi * t / 365)  # Yearly seasonality
weekly = 3 * np.sin(2 * np.pi * t / 7)  # Weekly pattern
noise = np.random.randn(n) * 2  # Random noise

# Combine
y = 100 + trend + seasonal + weekly + noise

# Create time index
dates = pd.date_range('2021-01-01', periods=n, freq='D')
ts = pd.Series(y, index=dates)

# Plot
fig, axes = plt.subplots(4, 1, figsize=(14, 12))

axes[0].plot(dates, y)
axes[0].set_title('Complete Time Series')
axes[0].set_ylabel('Value')
axes[0].grid(True, alpha=0.3)

axes[1].plot(dates, trend + 100)
axes[1].set_title('Trend Component')
axes[1].set_ylabel('Value')
axes[1].grid(True, alpha=0.3)

axes[2].plot(dates, seasonal)
axes[2].set_title('Seasonal Component (Yearly)')
axes[2].set_ylabel('Value')
axes[2].grid(True, alpha=0.3)

axes[3].plot(dates, noise)
axes[3].set_title('Residual (Noise)')
axes[3].set_ylabel('Value')
axes[3].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print(f"Time Series Statistics:")
print(f"  Mean: {ts.mean():.2f}")
print(f"  Std: {ts.std():.2f}")
print(f"  Min: {ts.min():.2f}")
print(f"  Max: {ts.max():.2f}")

Stationarity: The Key Assumption

Most time series methods assume stationarity: the statistical properties of the series do not change over time.

What Stationarity Means

| Property | Stationary | Non-Stationary |
| --- | --- | --- |
| Mean | Constant over time | Changes (trend) |
| Variance | Constant over time | Changes (heteroscedasticity) |
| Autocorrelation | Depends only on lag | Depends on time |
from statsmodels.tsa.stattools import adfuller

def check_stationarity(series, window=30):
    """
    Check if a series is stationary using:
    1. Visual inspection (rolling stats)
    2. Augmented Dickey-Fuller test
    """
    # Rolling statistics
    rolling_mean = series.rolling(window=window).mean()
    rolling_std = series.rolling(window=window).std()
    
    # Plot
    fig, axes = plt.subplots(2, 1, figsize=(14, 8))
    
    axes[0].plot(series, label='Original')
    axes[0].plot(rolling_mean, color='red', linewidth=2, label='Rolling Mean')
    axes[0].fill_between(series.index, rolling_mean - 2*rolling_std, 
                         rolling_mean + 2*rolling_std, alpha=0.2, color='red')
    axes[0].legend()
    axes[0].set_title('Rolling Mean and Std')
    axes[0].grid(True, alpha=0.3)
    
    # ADF test
    adf_result = adfuller(series.dropna())
    
    axes[1].text(0.1, 0.8, f"ADF Statistic: {adf_result[0]:.4f}", transform=axes[1].transAxes, fontsize=12)
    axes[1].text(0.1, 0.6, f"p-value: {adf_result[1]:.4f}", transform=axes[1].transAxes, fontsize=12)
    axes[1].text(0.1, 0.4, "Critical values:", transform=axes[1].transAxes, fontsize=12)
    for j, (key, value) in enumerate(adf_result[4].items()):
        # Stack the 1%, 5%, and 10% critical values on separate rows
        axes[1].text(0.15, 0.32 - j * 0.08, f"{key}: {value:.4f}",
                     transform=axes[1].transAxes, fontsize=10)
    
    is_stationary = adf_result[1] < 0.05
    axes[1].text(0.5, 0.5, "STATIONARY ✓" if is_stationary else "NON-STATIONARY ✗",
                transform=axes[1].transAxes, fontsize=20, 
                color='green' if is_stationary else 'red',
                fontweight='bold')
    axes[1].axis('off')
    
    plt.tight_layout()
    plt.show()
    
    return is_stationary

# Check our trend series (non-stationary)
print("Checking series with trend:")
check_stationarity(ts)

# Make it stationary by differencing
ts_diff = ts.diff().dropna()
print("\nChecking differenced series:")
check_stationarity(ts_diff)

Making Series Stationary

| Problem | Solution |
| --- | --- |
| Trend | Differencing: $y'_t = y_t - y_{t-1}$ |
| Seasonality | Seasonal differencing: $y'_t = y_t - y_{t-s}$ |
| Changing variance | Log transform: $y'_t = \log(y_t)$ |
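
The first two rows of the table are one-liners in pandas; a minimal sketch using the `ts` series built above (its seasonal period is 365 days):

# Regular and seasonal differencing with pandas
ts_detrended = ts.diff().dropna()        # y_t - y_{t-1}: removes the trend
ts_deseasoned = ts.diff(365).dropna()    # y_t - y_{t-365}: removes the yearly pattern
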
# Example: Stock price (multiplicative growth) → Log returns
np.random.seed(42)

# Simulate stock price with geometric random walk
returns = np.random.randn(500) * 0.02 + 0.001  # Daily returns
price = 100 * np.cumprod(1 + returns)  # Price = cumulative product

dates = pd.date_range('2022-01-01', periods=500, freq='D')
stock = pd.Series(price, index=dates)

# Transform to stationarity
log_stock = np.log(stock)
log_returns = log_stock.diff().dropna()

fig, axes = plt.subplots(3, 1, figsize=(14, 10))

axes[0].plot(stock)
axes[0].set_title('Stock Price (Non-Stationary)')
axes[0].set_ylabel('Price ($)')
axes[0].grid(True, alpha=0.3)

axes[1].plot(log_stock)
axes[1].set_title('Log Price (Still Non-Stationary)')
axes[1].set_ylabel('Log Price')
axes[1].grid(True, alpha=0.3)

axes[2].plot(log_returns)
axes[2].set_title('Log Returns (Stationary!)')
axes[2].set_ylabel('Log Return')
axes[2].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("Log returns statistics:")
print(f"  Mean: {log_returns.mean():.6f}")
print(f"  Std: {log_returns.std():.6f}")

Autocorrelation: Memory in Time Series

Autocorrelation measures how correlated a time series is with lagged versions of itself.
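Formally, the sample autocorrelation at lag $k$ is

$$\rho_k = \frac{\sum_{t=k+1}^{n} (y_t - \bar{y})(y_{t-k} - \bar{y})}{\sum_{t=1}^{n} (y_t - \bar{y})^2}$$

so $\rho_0 = 1$, and values inside the confidence band at every lag indicate no detectable linear memory.
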
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.stattools import acf

# Generate a time series with different memory characteristics
np.random.seed(42)
n = 500

# AR(1) process: y_t = 0.8 * y_{t-1} + noise
ar1 = np.zeros(n)
for i in range(1, n):
    ar1[i] = 0.8 * ar1[i-1] + np.random.randn()

# Random walk: y_t = y_{t-1} + noise (AR with coefficient 1)
random_walk = np.cumsum(np.random.randn(n))

# White noise (no memory)
white_noise = np.random.randn(n)

# Plot ACF for each
fig, axes = plt.subplots(3, 2, figsize=(14, 12))

series_list = [
    ('White Noise (No Memory)', white_noise),
    ('AR(1) with φ=0.8 (Short Memory)', ar1),
    ('Random Walk (Infinite Memory)', random_walk),
]

for i, (name, series) in enumerate(series_list):
    axes[i, 0].plot(series[:100])
    axes[i, 0].set_title(name)
    axes[i, 0].grid(True, alpha=0.3)
    
    # ACF
    acf_values = acf(series, nlags=40)
    axes[i, 1].bar(range(len(acf_values)), acf_values)
    axes[i, 1].axhline(y=0, color='black', linewidth=0.5)
    axes[i, 1].axhline(y=1.96/np.sqrt(n), color='red', linestyle='--', label='95% CI')
    axes[i, 1].axhline(y=-1.96/np.sqrt(n), color='red', linestyle='--')
    axes[i, 1].set_title('Autocorrelation Function')
    axes[i, 1].set_xlabel('Lag')
    axes[i, 1].legend()

plt.tight_layout()
plt.show()

ACF vs PACF

| Function | Measures | Use |
| --- | --- | --- |
| ACF (Autocorrelation) | Total correlation at lag k, including indirect effects | Identify MA order |
| PACF (Partial ACF) | Direct correlation at lag k only | Identify AR order |
# Determine model order from ACF/PACF
fig, axes = plt.subplots(2, 2, figsize=(14, 8))

# AR(2) process
ar2 = np.zeros(n)
for i in range(2, n):
    ar2[i] = 0.5 * ar2[i-1] - 0.3 * ar2[i-2] + np.random.randn()

# MA(2) process: y_t = eps_t + 0.5*eps_{t-1} - 0.3*eps_{t-2}
eps = np.random.randn(n)
ma2 = eps.copy()
for i in range(2, n):
    ma2[i] = eps[i] + 0.5 * eps[i-1] - 0.3 * eps[i-2]

plot_acf(ar2, ax=axes[0, 0], lags=20, title='AR(2): ACF - Gradual Decay')
plot_pacf(ar2, ax=axes[0, 1], lags=20, title='AR(2): PACF - Cuts off at lag 2')

plot_acf(ma2, ax=axes[1, 0], lags=20, title='MA(2): ACF - Cuts off at lag 2')
plot_pacf(ma2, ax=axes[1, 1], lags=20, title='MA(2): PACF - Gradual Decay')

plt.tight_layout()
plt.show()

print("Identification Rules:")
print("  AR(p): ACF decays gradually, PACF cuts off at lag p")
print("  MA(q): ACF cuts off at lag q, PACF decays gradually")
print("  ARMA(p,q): Both decay gradually")

Simple Forecasting Methods

Moving Average
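
The window mean is projected flat over the whole horizon: $\hat{y}_{t+h} = \frac{1}{w} \sum_{i=0}^{w-1} y_{t-i}$ for every step $h$, as the function below implements.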

def moving_average_forecast(series, window=7, steps=30):
    """Simple moving average forecast."""
    last_values = series.values[-window:]
    forecast = np.mean(last_values)
    return np.full(steps, forecast)

# Test on our stock data
train = stock[:-30]
test = stock[-30:]

ma_forecast = moving_average_forecast(train, window=7, steps=30)

plt.figure(figsize=(14, 6))
plt.plot(train[-100:], label='Training Data')
plt.plot(test.index, test, label='Actual', color='green')
plt.plot(test.index, ma_forecast, label='MA Forecast', color='red', linestyle='--')
plt.axvline(x=train.index[-1], color='black', linestyle=':', label='Forecast Start')
plt.legend()
plt.title('Moving Average Forecast')
plt.grid(True, alpha=0.3)
plt.show()

Exponential Smoothing
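
Simple exponential smoothing replaces the flat window with geometrically decaying weights via the recurrence $s_t = \alpha y_t + (1 - \alpha) s_{t-1}$, forecasting $\hat{y}_{t+h} = s_t$; this is exactly the update in the function below.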

def exponential_smoothing(series, alpha=0.3, steps=30):
    """
    Simple exponential smoothing.
    alpha: smoothing factor (0-1). Higher = more weight on recent values.
    """
    values = series.values
    smoothed = [values[0]]
    
    for i in range(1, len(values)):
        smoothed.append(alpha * values[i] + (1 - alpha) * smoothed[-1])
    
    # Forecast is last smoothed value
    forecast = np.full(steps, smoothed[-1])
    return np.array(smoothed), forecast

# Compare different alpha values
fig, axes = plt.subplots(2, 1, figsize=(14, 10))

for alpha in [0.1, 0.3, 0.5, 0.9]:
    smoothed, forecast = exponential_smoothing(train, alpha=alpha, steps=30)
    axes[0].plot(train.index, smoothed, label=f'α={alpha}', linewidth=1.5)

axes[0].plot(train, 'k-', alpha=0.3, label='Original')
axes[0].set_title('Exponential Smoothing with Different α')
axes[0].legend()
axes[0].grid(True, alpha=0.3)

# Best forecast
_, best_forecast = exponential_smoothing(train, alpha=0.3, steps=30)
axes[1].plot(train[-100:], label='Training')
axes[1].plot(test.index, test, label='Actual', color='green')
axes[1].plot(test.index, best_forecast, label='ES Forecast (α=0.3)', color='red', linestyle='--')
axes[1].axvline(x=train.index[-1], color='black', linestyle=':')
axes[1].legend()
axes[1].set_title('Exponential Smoothing Forecast')
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

ARIMA Models

ARIMA(p, d, q) combines:
  • AR(p): Autoregressive - regression on past values
  • I(d): Integrated - differencing for stationarity
  • MA(q): Moving Average - regression on past errors
$$y'_t = c + \phi_1 y'_{t-1} + \dots + \phi_p y'_{t-p} + \theta_1 \epsilon_{t-1} + \dots + \theta_q \epsilon_{t-q} + \epsilon_t$$

where $y'_t$ is the series after $d$ rounds of differencing.
from statsmodels.tsa.arima.model import ARIMA
from sklearn.metrics import mean_squared_error, mean_absolute_error

# Fit ARIMA model
model = ARIMA(train, order=(2, 1, 2))  # ARIMA(2,1,2)
fitted = model.fit()

print(fitted.summary())

# Forecast
forecast = fitted.forecast(steps=30)

# Plot
plt.figure(figsize=(14, 6))
plt.plot(train[-100:], label='Training Data')
plt.plot(test.index, test, label='Actual', color='green', linewidth=2)
plt.plot(test.index, forecast, label='ARIMA Forecast', color='red', linewidth=2, linestyle='--')

# Confidence interval
forecast_ci = fitted.get_forecast(steps=30).conf_int()
plt.fill_between(test.index, forecast_ci.iloc[:, 0], forecast_ci.iloc[:, 1], 
                 alpha=0.2, color='red')

plt.axvline(x=train.index[-1], color='black', linestyle=':')
plt.legend()
plt.title('ARIMA(2,1,2) Forecast with 95% Confidence Interval')
plt.grid(True, alpha=0.3)
plt.show()

# Metrics
rmse = np.sqrt(mean_squared_error(test, forecast))
mae = mean_absolute_error(test, forecast)
mape = np.mean(np.abs((test - forecast) / test)) * 100

print(f"\nForecast Metrics:")
print(f"  RMSE: {rmse:.4f}")
print(f"  MAE: {mae:.4f}")
print(f"  MAPE: {mape:.2f}%")
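
For reference, the metrics printed above are defined as

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}, \qquad \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}|y_i - \hat{y}_i|, \qquad \mathrm{MAPE} = \frac{100}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$$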

Seasonal Decomposition

from statsmodels.tsa.seasonal import seasonal_decompose

# Create seasonal data
np.random.seed(42)
n = 365 * 2  # 2 years
t = np.arange(n)

# Components
trend = 50 + 0.1 * t
seasonal = 20 * np.sin(2 * np.pi * t / 365)  # Yearly
noise = np.random.randn(n) * 5

y = trend + seasonal + noise

dates = pd.date_range('2022-01-01', periods=n, freq='D')
ts_seasonal = pd.Series(y, index=dates)

# Decompose
decomposition = seasonal_decompose(ts_seasonal, period=365)

# Plot
fig, axes = plt.subplots(4, 1, figsize=(14, 12))

axes[0].plot(ts_seasonal)
axes[0].set_title('Original Time Series')
axes[0].set_ylabel('Value')
axes[0].grid(True, alpha=0.3)

axes[1].plot(decomposition.trend)
axes[1].set_title('Trend')
axes[1].set_ylabel('Value')
axes[1].grid(True, alpha=0.3)

axes[2].plot(decomposition.seasonal)
axes[2].set_title('Seasonal')
axes[2].set_ylabel('Value')
axes[2].grid(True, alpha=0.3)

axes[3].plot(decomposition.resid)
axes[3].set_title('Residual')
axes[3].set_ylabel('Value')
axes[3].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Strength of components (residual variance relative to each component)
resid = decomposition.resid.dropna()
trend_part = decomposition.trend.dropna()
seasonal_part = decomposition.seasonal[resid.index]
trend_strength = max(0, 1 - resid.var() / (trend_part + resid).var())
seasonal_strength = max(0, 1 - resid.var() / (seasonal_part + resid).var())

print(f"Component Strengths:")
print(f"  Trend strength: {trend_strength:.3f}")
print(f"  Seasonal strength: {seasonal_strength:.3f}")

Time Series Cross-Validation

Standard k-fold cross-validation leaks information in time series: random folds let the model train on the future and predict the past. Folds must instead expand forward in time, always training on the past and testing on the future.
from sklearn.model_selection import TimeSeriesSplit

def time_series_cv(series, n_splits=5, test_size=30):
    """
    Time series cross-validation.
    Always train on past, test on future.
    """
    tscv = TimeSeriesSplit(n_splits=n_splits, test_size=test_size)
    
    fig, axes = plt.subplots(n_splits, 1, figsize=(14, 2*n_splits))
    
    rmses = []
    
    for i, (train_idx, test_idx) in enumerate(tscv.split(series)):
        train = series.iloc[train_idx]
        test = series.iloc[test_idx]
        
        # Fit simple model
        model = ARIMA(train, order=(1, 1, 1))
        fitted = model.fit()
        forecast = fitted.forecast(steps=len(test))
        
        rmse = np.sqrt(mean_squared_error(test, forecast))
        rmses.append(rmse)
        
        # Plot
        axes[i].plot(train.index, train, 'b-', label='Train')
        axes[i].plot(test.index, test, 'g-', label='Test')
        axes[i].plot(test.index, forecast, 'r--', label=f'Forecast (RMSE={rmse:.2f})')
        axes[i].set_title(f'Fold {i+1}')
        axes[i].legend(loc='upper left')
        axes[i].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    print(f"\nCross-Validation Results:")
    print(f"  Mean RMSE: {np.mean(rmses):.4f}")
    print(f"  Std RMSE: {np.std(rmses):.4f}")
    print(f"  Individual folds: {[f'{r:.4f}' for r in rmses]}")

time_series_cv(stock, n_splits=5, test_size=30)

Practice Exercises

Problem 1: Download real stock data (e.g., using yfinance) and work through the following steps (a starter sketch appears after this list):
  1. Check for stationarity
  2. Transform to stationary (log returns)
  3. Fit ARIMA model
  4. Forecast next 5 days
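
A hedged starter for Problem 1; the ticker, period, and ARIMA order below are illustrative choices, and yfinance's column layout varies across versions, so treat this as a sketch rather than a guaranteed recipe:

# pip install yfinance
import numpy as np
import yfinance as yf
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.stattools import adfuller

prices = yf.download('AAPL', period='2y')['Close'].squeeze()  # illustrative ticker
log_returns = np.log(prices).diff().dropna()                  # step 2: stationary transform

print(f"ADF p-value: {adfuller(log_returns)[1]:.4f}")         # step 1: stationarity check
model = ARIMA(log_returns, order=(1, 0, 1)).fit()             # step 3: order is a starting guess
print(model.forecast(steps=5))                                # step 4: next 5 days of log returns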
Problem 2: Given monthly airline passenger data:
  1. Decompose into trend, seasonality, residual
  2. Fit SARIMA (Seasonal ARIMA)
  3. Forecast next 12 months
Problem 3: Compare different forecasting methods (MA, ES, ARIMA) using rolling-window validation. Which performs best at different forecast horizons (1-day, 7-day, 30-day)?

Summary

| Concept | What It Is | Why It Matters |
| --- | --- | --- |
| Stationarity | Statistical properties constant over time | Required for most methods |
| Autocorrelation | Correlation with lagged self | Reveals memory structure |
| ACF/PACF | Correlation at different lags | Model identification |
| Differencing | $y'_t = y_t - y_{t-1}$ | Removes trend |
| ARIMA | AR + I + MA combined | Flexible forecasting |
| Seasonal Decomposition | Separate trend/seasonal/residual | Understand components |
Key Takeaway: Time series analysis is about understanding temporal dependencies. Before applying any model, always check for stationarity and understand the autocorrelation structure. The goal is to capture the patterns (trend, seasonality) while forecasting with quantified uncertainty.