In regular ML, data points are assumed to be independent, so they can be shuffled freely. In a time series, order matters: past values carry information about future values.

Analogy: Regular ML is like looking at a bag of random photographs: each one is independent, and you can shuffle them freely. Time series is like watching a movie: each frame only makes sense in sequence, and what happened 5 seconds ago tells you a lot about what is happening now. Shuffling the frames destroys the information.
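The "shuffling destroys the information" point can be checked numerically. The sketch below uses a synthetic AR(1) series (an assumption made purely for illustration) and compares the correlation between consecutive values before and after shuffling; `lag1_corr` is a hypothetical helper, not a library function.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic AR(1) series: each value depends on the previous one
n = 2000
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.8 * y[t - 1] + rng.standard_normal()

def lag1_corr(x):
    """Correlation between consecutive values x_{t-1} and x_t."""
    return np.corrcoef(x[:-1], x[1:])[0, 1]

# Same values, random order: the temporal structure is gone
shuffled = rng.permutation(y)

print(f"lag-1 correlation, original order: {lag1_corr(y):.2f}")
print(f"lag-1 correlation, shuffled:       {lag1_corr(shuffled):.2f}")
```

The values themselves are identical in both arrays; only the ordering differs, and that ordering is where the predictive signal lives.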
Most time series methods assume stationarity: the statistical properties of the series (mean, variance, autocorrelation) do not change over time.

Analogy: Stationarity means the “rules of the game” stay constant. A stationary time series is like a casino where the odds never change: the house edge is always the same, no matter when you play. A non-stationary series is like a game where the rules keep shifting: the average changes, the volatility changes, and any strategy you learned yesterday might not work tomorrow. Most forecasting models need stationary data because they assume the patterns they learned in the past will continue into the future.
ML Application — Distribution Shift Detection: Stationarity is directly related to one of the biggest problems in production ML: distribution shift (closely related to dataset shift and concept drift). When the statistical properties of incoming data change over time, your model’s training assumptions break. Monitoring the stationarity of input features with rolling statistics and the Augmented Dickey-Fuller test, the same tools used in time series analysis, is a practical way to detect when your production model needs retraining. Companies like Netflix and Uber run these checks continuously on their feature pipelines.
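A rolling-statistics drift check can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the feature is synthetic with a mean shift injected at index 600, the first 300 points stand in for the training period, and the window size and alert threshold are arbitrary illustrative choices rather than recommended settings.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical production feature: mean shifts from 0 to 1.5 at index 600
feature = pd.Series(np.concatenate([
    rng.normal(0.0, 1.0, 600),
    rng.normal(1.5, 1.0, 400),
]))

# Reference statistics from the "training" period (first 300 points)
ref = feature.iloc[:300]
ref_mean, ref_std = ref.mean(), ref.std()

# Rolling mean over the most recent `window` observations
window = 50
rolling_mean = feature.rolling(window).mean()

# z-score of each rolling mean against the reference mean;
# the threshold of 4 is an illustrative choice only
z = (rolling_mean - ref_mean) / (ref_std / np.sqrt(window))
alerts = z.abs() > 4

first_alert = int(alerts.idxmax()) if alerts.any() else None
print(f"first drift alert at index: {first_alert}")
```

The alert fires some observations after the true shift because the rolling window needs to fill with post-shift data. A shorter window reacts faster but produces noisier estimates.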
Autocorrelation measures how correlated a time series is with lagged versions of itself.

Analogy: Autocorrelation measures “memory” in a time series. Think of it like asking: “If today was a hot day, how much does that tell me about tomorrow?” If there is high autocorrelation at lag 1, hot days tend to be followed by hot days (weather has memory). If autocorrelation is zero, each day is uncorrelated with the last (like coin flips). The ACF plot shows how far back that “memory” extends, and, read together with the partial autocorrelation (PACF), it guides how many past values to include as features in your forecasting model.
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.stattools import acf, pacf

# Generate time series with different memory characteristics
np.random.seed(42)
n = 500

# AR(1) process: y_t = 0.8 * y_{t-1} + noise
ar1 = np.zeros(n)
for i in range(1, n):
    ar1[i] = 0.8 * ar1[i-1] + np.random.randn()

# Random walk: y_t = y_{t-1} + noise (AR with coefficient 1)
random_walk = np.cumsum(np.random.randn(n))

# White noise (no memory)
white_noise = np.random.randn(n)

# Plot each series (left) next to its ACF (right)
fig, axes = plt.subplots(3, 2, figsize=(14, 12))
series_list = [
    ('White Noise (No Memory)', white_noise),
    ('AR(1) with φ=0.8 (Short Memory)', ar1),
    ('Random Walk (Infinite Memory)', random_walk),
]

for i, (name, series) in enumerate(series_list):
    # First 100 observations of the series
    axes[i, 0].plot(series[:100])
    axes[i, 0].set_title(name)
    axes[i, 0].grid(True, alpha=0.3)

    # ACF with approximate 95% bounds for white noise
    acf_values = acf(series, nlags=40)
    axes[i, 1].bar(range(len(acf_values)), acf_values)
    axes[i, 1].axhline(y=0, color='black', linewidth=0.5)
    axes[i, 1].axhline(y=1.96/np.sqrt(n), color='red', linestyle='--', label='95% CI')
    axes[i, 1].axhline(y=-1.96/np.sqrt(n), color='red', linestyle='--')
    axes[i, 1].set_title('Autocorrelation Function')
    axes[i, 1].set_xlabel('Lag')
    axes[i, 1].legend(loc='upper right')

plt.tight_layout()
plt.show()
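For a stationary AR(1) process with coefficient φ, the theoretical autocorrelation at lag k is φ^k, which is why the AR(1) bars decay geometrically. The sketch below compares that formula with a sample estimate on a longer simulated series; `sample_acf` is a small hypothetical helper written here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
phi, n = 0.8, 5000

# Simulate an AR(1) process: y_t = phi * y_{t-1} + noise
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + rng.standard_normal()

def sample_acf(x, lag):
    """Sample autocorrelation of x at the given positive lag."""
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

# Compare the sample estimate with the theoretical value phi**lag
for lag in (1, 2, 3, 5):
    print(f"lag {lag}: sample={sample_acf(y, lag):.3f}  theory={phi**lag:.3f}")
```

The sample values fluctuate around the theoretical ones, and the gap shrinks as the series gets longer.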
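Standard k-fold cross-validation is not valid for time series: shuffled folds let the model train on observations that come after the ones it is tested on, and you can’t use future data to predict the past.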
Statistical Mistake in ML — Random Splitting Time Series Data: One of the most common and damaging mistakes in applied ML is using random train/test splits on time series data. If your training set includes data from March and your test set includes data from February, you are literally training on the future to predict the past. This creates a subtle but severe data leakage that inflates your metrics. Your model appears to perform brilliantly in evaluation but fails catastrophically in production. Always use time-based splits: train on the past, test on the future. This applies not just to pure time series forecasting but to any problem where data has a temporal ordering — user behavior, financial transactions, sensor readings, and more.
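The inflation caused by random splitting can be reproduced on synthetic data. In the sketch below, the series is a random walk and the "model" simply predicts each test point with the training value closest in time; both are assumptions chosen to make the leakage visible, and `nearest_in_time_rmse` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
y = np.cumsum(rng.standard_normal(n))  # synthetic random-walk series

def nearest_in_time_rmse(train_idx, test_idx):
    """RMSE when each test value is predicted by the training value closest in time."""
    train_idx = np.sort(train_idx)
    pos = np.clip(np.searchsorted(train_idx, test_idx), 1, len(train_idx) - 1)
    left, right = train_idx[pos - 1], train_idx[pos]
    nearest = np.where(test_idx - left <= right - test_idx, left, right)
    return float(np.sqrt(np.mean((y[test_idx] - y[nearest]) ** 2)))

# Random split: test points are surrounded by training points from the future
perm = rng.permutation(n)
rmse_random = nearest_in_time_rmse(perm[200:], perm[:200])

# Time-based split: train on the first 800 points, test on the last 200
rmse_time = nearest_in_time_rmse(np.arange(800), np.arange(800, n))

print(f"random split RMSE:     {rmse_random:.2f}")
print(f"time-based split RMSE: {rmse_time:.2f}")
```

With the random split, almost every test point has a training neighbour one step away, so the error looks small. The time-based split reflects what the same model would face in production, where no future neighbours exist.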
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import TimeSeriesSplit
from statsmodels.tsa.arima.model import ARIMA

def time_series_cv(series, n_splits=5, test_size=30):
    """
    Time series cross-validation.
    Always train on past, test on future.
    """
    tscv = TimeSeriesSplit(n_splits=n_splits, test_size=test_size)
    fig, axes = plt.subplots(n_splits, 1, figsize=(14, 2*n_splits))
    rmses = []

    for i, (train_idx, test_idx) in enumerate(tscv.split(series)):
        train = series.iloc[train_idx]
        test = series.iloc[test_idx]

        # Fit simple model
        model = ARIMA(train, order=(1, 1, 1))
        fitted = model.fit()
        forecast = fitted.forecast(steps=len(test))
        rmse = np.sqrt(mean_squared_error(test, forecast))
        rmses.append(rmse)

        # Plot
        axes[i].plot(train.index, train, 'b-', label='Train')
        axes[i].plot(test.index, test, 'g-', label='Test')
        axes[i].plot(test.index, forecast, 'r--', label=f'Forecast (RMSE={rmse:.2f})')
        axes[i].set_title(f'Fold {i+1}')
        axes[i].legend(loc='upper left')
        axes[i].grid(True, alpha=0.3)

    plt.tight_layout()
    plt.show()

    print("\nCross-Validation Results:")
    print(f"  Mean RMSE: {np.mean(rmses):.4f}")
    print(f"  Std RMSE: {np.std(rmses):.4f}")
    print(f"  Individual folds: {[f'{r:.4f}' for r in rmses]}")

# `stock` is assumed to be a pandas Series of prices defined earlier
time_series_cv(stock, n_splits=5, test_size=30)
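To see which observations end up in each fold, it helps to print the index ranges. The sketch below uses a hand-written helper, `expanding_window_splits` (a hypothetical name), that is intended to mirror the fold layout of an expanding-window split with a fixed test size. It is an illustration of the idea, not the scikit-learn implementation.

```python
import numpy as np

def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_idx, test_idx) pairs where training always precedes testing."""
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_splits):
        test_start = first_test_start + k * test_size
        train_idx = np.arange(test_start)                       # everything before the test block
        test_idx = np.arange(test_start, test_start + test_size)
        yield train_idx, test_idx

for fold, (train_idx, test_idx) in enumerate(expanding_window_splits(20, 3, 4), start=1):
    print(f"fold {fold}: train 0..{train_idx[-1]}, test {test_idx[0]}..{test_idx[-1]}")
# fold 1: train 0..7, test 8..11
# fold 2: train 0..11, test 12..15
# fold 3: train 0..15, test 16..19
```

Each fold's training window grows to include the previous test block, so no fold ever sees data from after its own test period.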
Problem: Download real stock data (e.g., using yfinance) and:
Check for stationarity
Transform to stationary (log returns)
Fit ARIMA model
Forecast next 5 days
Exercise 2: Seasonal Model
Problem: Given monthly airline passenger data:
Decompose into trend, seasonality, residual
Fit SARIMA (Seasonal ARIMA)
Forecast next 12 months
Exercise 3: Multi-Step Forecast
Problem: Compare different forecasting methods (MA, ES, ARIMA) using rolling window validation. Which performs best at different forecast horizons (1-day, 7-day, 30-day)?
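One way to set up the comparison is a rolling-origin evaluation, sketched below under several assumptions: the series is a synthetic random walk, only a moving average and simple exponential smoothing are implemented (by hand, as hypothetical helpers), and ARIMA is left out to keep the example short. Any function that maps a history array to a forecast could be added to the `methods` dictionary.

```python
import numpy as np

rng = np.random.default_rng(7)
y = np.cumsum(rng.standard_normal(600))  # synthetic stand-in for a daily series

def moving_average_forecast(history, window=7):
    """Flat forecast: mean of the last `window` observations."""
    return history[-window:].mean()

def exp_smoothing_forecast(history, alpha=0.3):
    """Flat forecast: final level of simple exponential smoothing."""
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return level

def rolling_origin_rmse(series, forecaster, horizon, start=300, step=10):
    """RMSE of horizon-step-ahead forecasts over a sequence of forecast origins."""
    errors = []
    for origin in range(start, len(series) - horizon, step):
        history = series[:origin]               # only data before the origin
        target = series[origin + horizon - 1]   # value `horizon` steps ahead
        errors.append(target - forecaster(history))
    return float(np.sqrt(np.mean(np.square(errors))))

methods = {
    "moving_average": moving_average_forecast,
    "exp_smoothing": exp_smoothing_forecast,
}
results = {
    name: {h: rolling_origin_rmse(y, fn, h) for h in (1, 7, 30)}
    for name, fn in methods.items()
}

for name, by_horizon in results.items():
    print(name, {h: round(rmse, 2) for h, rmse in by_horizon.items()})
```

At every origin the forecaster sees only earlier data, so the comparison respects temporal order. On a random walk the error grows with the horizon for both methods, which is the kind of pattern the exercise asks you to look for on real data.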
Key Takeaway: Time series analysis is about understanding temporal dependencies. Before applying any model, always check for stationarity and understand the autocorrelation structure. The goal is to capture the patterns (trend, seasonality) while forecasting with quantified uncertainty.