What is Time Series Analysis?

Time series analysis involves studying data points collected over time to identify patterns, trends, and seasonality. It's essential for forecasting future values in domains like finance, sales, weather, and resource planning.

Unlike standard machine learning, where observations are assumed to be independent, time series data has temporal dependencies: what happens today depends on what happened yesterday.
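This dependence can be measured with autocorrelation: the correlation of a series with a lagged copy of itself. A minimal sketch on simulated data (the AR(1) coefficient 0.8 is an arbitrary choice for illustration):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Simulate a series where each value depends on the previous one
values = [0.0]
for _ in range(499):
    values.append(0.8 * values[-1] + rng.normal())
series = pd.Series(values)

lag1 = series.autocorr(lag=1)    # correlation with "yesterday"
lag10 = series.autocorr(lag=10)  # dependence fades with distance
print(f"lag-1 autocorrelation:  {lag1:.2f}")
print(f"lag-10 autocorrelation: {lag10:.2f}")
```

A high lag-1 autocorrelation is exactly what violates the independence assumption behind most standard ML models.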

Components of Time Series

  • Trend: Long-term increase or decrease in the data
  • Seasonality: Regular patterns that repeat (daily, weekly, yearly)
  • Cyclical: Longer-term fluctuations without a fixed period
  • Noise: Random variation that can't be explained
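To make these concrete, here is a hypothetical monthly sales series built directly from the components (trend × seasonality × noise); a decomposition should be able to recover each piece:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
idx = pd.date_range("2015-01-01", periods=96, freq="MS")  # 8 years of monthly data

trend = np.linspace(100, 200, 96)                          # steady long-term growth
seasonal = 1 + 0.2 * np.sin(2 * np.pi * idx.month / 12)    # yearly cycle
noise = rng.normal(1.0, 0.02, 96)                          # small multiplicative noise

df = pd.DataFrame({"sales": trend * seasonal * noise}, index=idx)
print(df["sales"].head(3))
```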

A series can be split into these components with statsmodels:

import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

# Decompose the series (needs a DatetimeIndex, or pass period explicitly)
result = seasonal_decompose(df['sales'], model='multiplicative', period=12)

# Plot components
result.plot()
plt.show()

# Access individual components
trend = result.trend
seasonal = result.seasonal
residual = result.resid

Stationarity

Many time series methods require stationary data (constant mean and variance over time):

import numpy as np
from statsmodels.tsa.stattools import adfuller

# Augmented Dickey-Fuller test
def check_stationarity(series):
    result = adfuller(series.dropna())
    print(f'ADF Statistic: {result[0]:.4f}')
    print(f'p-value: {result[1]:.4f}')
    if result[1] <= 0.05:
        print("Series is stationary")
    else:
        print("Series is non-stationary")

# Make stationary with differencing (the first value becomes NaN)
df['sales_diff'] = df['sales'].diff()

# Or log transformation
df['sales_log'] = np.log(df['sales'])
df['sales_log_diff'] = df['sales_log'].diff()
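Besides the ADF test, a quick sanity check is to compare summary statistics over different time windows; a stationary series should look roughly the same in both halves. A sketch on simulated data:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# A trending series: clearly non-stationary
trending = pd.Series(np.arange(200, dtype=float) + rng.normal(0, 5, 200))
differenced = trending.diff().dropna()

# A stationary series should have a similar mean in each half
def half_means(s):
    mid = len(s) // 2
    return s.iloc[:mid].mean(), s.iloc[mid:].mean()

print(half_means(trending))     # halves differ: the mean drifts upward
print(half_means(differenced))  # halves agree after differencing
```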

ARIMA Models

ARIMA (AutoRegressive Integrated Moving Average) is a classical forecasting method:

from statsmodels.tsa.arima.model import ARIMA
from pmdarima import auto_arima

# Auto ARIMA to find best parameters
auto_model = auto_arima(
    df['sales'],
    seasonal=True,
    m=12,  # Monthly seasonality
    trace=True,
    suppress_warnings=True
)
print(auto_model.summary())

# Fit a seasonal ARIMA using the order suggested by auto_arima
model = ARIMA(df['sales'], order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
fitted = model.fit()

# Forecast
forecast = fitted.forecast(steps=12)

# Confidence intervals
forecast_df = fitted.get_forecast(steps=12)
conf_int = forecast_df.conf_int()
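Under the hood, the AR part of ARIMA fits coefficients relating each value to its predecessors. A minimal NumPy sketch of an AR(1) fit on simulated data (the true coefficient 0.7 is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate an AR(1) process: y_t = 0.7 * y_{t-1} + noise
y = [0.0]
for _ in range(999):
    y.append(0.7 * y[-1] + rng.normal())
y = np.array(y)

# Estimate the coefficient by least squares on (previous, current) pairs
prev, curr = y[:-1], y[1:]
phi = (prev @ curr) / (prev @ prev)
print(f"estimated coefficient: {phi:.2f}")

# One-step-ahead forecast from the last observation
next_value = phi * y[-1]
```

The I and MA parts add differencing and a moving average over past forecast errors, respectively; in practice the statsmodels fit above handles all three together.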

Prophet by Facebook

Prophet handles seasonality, holidays, and missing data automatically:

from prophet import Prophet

# Prepare data (must have 'ds' and 'y' columns)
df_prophet = df.rename(columns={'date': 'ds', 'sales': 'y'})

# Create the model (fit after adding seasonality and holidays)
model = Prophet(
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=False,
    seasonality_mode='multiplicative'
)

# Add custom seasonality
model.add_seasonality(name='monthly', period=30.5, fourier_order=5)

# Add holidays
model.add_country_holidays(country_name='US')

model.fit(df_prophet)

# Create future dates
future = model.make_future_dataframe(periods=365)

# Predict
forecast = model.predict(future)

# Plot
model.plot(forecast)
model.plot_components(forecast)

LSTM for Time Series

LSTMs (Long Short-Term Memory networks) can learn non-linear temporal patterns that classical models may miss:

import numpy as np
import torch
import torch.nn as nn
from sklearn.preprocessing import MinMaxScaler

# Prepare sequences
def create_sequences(data, seq_length):
    xs, ys = [], []
    for i in range(len(data) - seq_length):
        x = data[i:(i + seq_length)]
        y = data[i + seq_length]
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)

# Scale data
scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(df[['sales']])

# Create sequences
seq_length = 30
X, y = create_sequences(scaled_data, seq_length)

# LSTM Model
class LSTMModel(nn.Module):
    def __init__(self, input_size=1, hidden_size=64, num_layers=2):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, 1)

    def forward(self, x):
        lstm_out, _ = self.lstm(x)
        return self.fc(lstm_out[:, -1, :])

model = LSTMModel()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
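The model above still needs a training loop. A minimal self-contained sketch, substituting a sine wave for the (hypothetical) scaled sales data and using full-batch updates for brevity:

```python
import numpy as np
import torch
import torch.nn as nn

# Synthetic stand-in for scaled sales data
data = np.sin(np.linspace(0, 20 * np.pi, 500)).reshape(-1, 1).astype(np.float32)

def create_sequences(data, seq_length):
    xs = [data[i:i + seq_length] for i in range(len(data) - seq_length)]
    ys = [data[i + seq_length] for i in range(len(data) - seq_length)]
    return np.array(xs), np.array(ys)

X, y = create_sequences(data, seq_length=30)
X_t = torch.from_numpy(X)  # shape: (n_samples, 30, 1)
y_t = torch.from_numpy(y)  # shape: (n_samples, 1)

class LSTMModel(nn.Module):
    def __init__(self, input_size=1, hidden_size=32, num_layers=1):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.lstm(x)
        return self.fc(out[:, -1, :])  # predict from the last time step

torch.manual_seed(0)
model = LSTMModel()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

losses = []
for epoch in range(20):
    optimizer.zero_grad()
    loss = criterion(model(X_t), y_t)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

For real data you would also invert the scaling with scaler.inverse_transform before comparing forecasts to actual values.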

Evaluation Metrics

Evaluate forecasts on a held-out time period using several complementary metrics:

from sklearn.metrics import mean_absolute_error, mean_squared_error
import numpy as np

def evaluate_forecast(actual, predicted):
    mae = mean_absolute_error(actual, predicted)
    mse = mean_squared_error(actual, predicted)
    rmse = np.sqrt(mse)
    mape = np.mean(np.abs((actual - predicted) / actual)) * 100  # undefined when actual contains zeros

    print(f"MAE: {mae:.2f}")
    print(f"RMSE: {rmse:.2f}")
    print(f"MAPE: {mape:.2f}%")

    return {'mae': mae, 'rmse': rmse, 'mape': mape}
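As a worked example with made-up numbers, the metrics reduce to simple arithmetic:

```python
import numpy as np

actual = np.array([100.0, 120.0, 130.0])     # hypothetical true values
predicted = np.array([110.0, 115.0, 128.0])  # hypothetical forecasts

mae = np.mean(np.abs(actual - predicted))                    # (10 + 5 + 2) / 3
rmse = np.sqrt(np.mean((actual - predicted) ** 2))           # sqrt((100 + 25 + 4) / 3)
mape = np.mean(np.abs((actual - predicted) / actual)) * 100  # average % error

print(f"MAE: {mae:.2f}, RMSE: {rmse:.2f}, MAPE: {mape:.2f}%")
```

Note that RMSE penalizes large errors more heavily than MAE, while MAPE is scale-free but breaks down when actual values are zero or near zero.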

Best Practices

  • Understand your data: Visualize trends, seasonality, anomalies
  • Use proper train/test split: Time-based split, not random
  • Handle missing values: Interpolation, forward fill
  • Feature engineering: Lag features, rolling statistics
  • Cross-validation: Use time series cross-validation
  • Ensemble methods: Combine multiple models
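For the time-based split and cross-validation points above, scikit-learn's TimeSeriesSplit keeps every test window strictly after its training window:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(20).reshape(-1, 1)  # 20 time-ordered observations

tscv = TimeSeriesSplit(n_splits=3)
splits = list(tscv.split(X))
for train_idx, test_idx in splits:
    # Training windows grow; test data always comes later in time
    print(f"train {train_idx[0]}-{train_idx[-1]}  test {test_idx[0]}-{test_idx[-1]}")
```

Unlike a random split, this never lets the model "see the future" during training, which would inflate evaluation scores.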

Master Time Series with Expert Mentorship

Our Data Science program covers time series analysis from basics to advanced forecasting. Build real prediction systems with guidance from industry experts.
