What is Time Series Analysis?

Time series analysis involves studying data points collected over time to identify patterns, trends, and seasonality. It's essential for forecasting future values in domains like finance, sales, weather, and resource planning.

Unlike standard machine learning, where observations are assumed to be independent, time series data has temporal dependencies: what happens today depends on what happened yesterday.
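This dependence can be measured with autocorrelation: the correlation of a series with a lagged copy of itself. A minimal sketch on simulated data (the AR(1) coefficient 0.8 is an arbitrary choice for illustration):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Simulate a series where each value depends on the previous one
values = [0.0]
for _ in range(499):
    values.append(0.8 * values[-1] + rng.normal())
series = pd.Series(values)

lag1 = series.autocorr(lag=1)    # correlation with "yesterday"
lag10 = series.autocorr(lag=10)  # dependence fades with distance
print(f"lag-1 autocorrelation:  {lag1:.2f}")
print(f"lag-10 autocorrelation: {lag10:.2f}")
```

A high lag-1 autocorrelation is exactly what violates the independence assumption behind most standard ML models.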

Components of Time Series

  • Trend: Long-term increase or decrease in the data
  • Seasonality: Regular patterns that repeat (daily, weekly, yearly)
  • Cyclical: Longer-term fluctuations without a fixed period
  • Noise: Random variation that can't be explained
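To make these concrete, here is a hypothetical monthly sales series built directly from the components (trend × seasonality × noise); a decomposition should be able to recover each piece:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
idx = pd.date_range("2015-01-01", periods=96, freq="MS")  # 8 years of monthly data

trend = np.linspace(100, 200, 96)                          # steady long-term growth
seasonal = 1 + 0.2 * np.sin(2 * np.pi * idx.month / 12)    # yearly cycle
noise = rng.normal(1.0, 0.02, 96)                          # small multiplicative noise

df = pd.DataFrame({"sales": trend * seasonal * noise}, index=idx)
print(df["sales"].head(3))
```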

A series can be split into these components with statsmodels:

import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

# Decompose the series (needs a DatetimeIndex, or pass period explicitly)
result = seasonal_decompose(df['sales'], model='multiplicative', period=12)

# Plot components
result.plot()
plt.show()

# Access individual components
trend = result.trend
seasonal = result.seasonal
residual = result.resid

Stationarity

Many time series methods require stationary data (constant mean and variance over time):

import numpy as np
from statsmodels.tsa.stattools import adfuller

# Augmented Dickey-Fuller test
def check_stationarity(series):
    result = adfuller(series.dropna())
    print(f'ADF Statistic: {result[0]:.4f}')
    print(f'p-value: {result[1]:.4f}')
    if result[1] <= 0.05:
        print("Series is stationary")
    else:
        print("Series is non-stationary")

# Make stationary with differencing (the first value becomes NaN)
df['sales_diff'] = df['sales'].diff()

# Or log transformation
df['sales_log'] = np.log(df['sales'])
df['sales_log_diff'] = df['sales_log'].diff()
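Besides the ADF test, a quick sanity check is to compare summary statistics over different time windows; a stationary series should look roughly the same in both halves. A sketch on simulated data:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# A trending series: clearly non-stationary
trending = pd.Series(np.arange(200, dtype=float) + rng.normal(0, 5, 200))
differenced = trending.diff().dropna()

# A stationary series should have a similar mean in each half
def half_means(s):
    mid = len(s) // 2
    return s.iloc[:mid].mean(), s.iloc[mid:].mean()

print(half_means(trending))     # halves differ: the mean drifts upward
print(half_means(differenced))  # halves agree after differencing
```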

ARIMA Models

ARIMA (AutoRegressive Integrated Moving Average) is a classical forecasting method:

from statsmodels.tsa.arima.model import ARIMA
from pmdarima import auto_arima

# Auto ARIMA to find best parameters
auto_model = auto_arima(
    df['sales'],
    seasonal=True,
    m=12,  # Monthly seasonality
    trace=True,
    suppress_warnings=True
)
print(auto_model.summary())

# Fit a seasonal ARIMA using the order suggested by auto_arima
model = ARIMA(df['sales'], order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
fitted = model.fit()

# Forecast
forecast = fitted.forecast(steps=12)

# Confidence intervals
forecast_df = fitted.get_forecast(steps=12)
conf_int = forecast_df.conf_int()
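Under the hood, the AR part of ARIMA fits coefficients relating each value to its predecessors. A minimal NumPy sketch of an AR(1) fit on simulated data (the true coefficient 0.7 is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate an AR(1) process: y_t = 0.7 * y_{t-1} + noise
y = [0.0]
for _ in range(999):
    y.append(0.7 * y[-1] + rng.normal())
y = np.array(y)

# Estimate the coefficient by least squares on (previous, current) pairs
prev, curr = y[:-1], y[1:]
phi = (prev @ curr) / (prev @ prev)
print(f"estimated coefficient: {phi:.2f}")

# One-step-ahead forecast from the last observation
next_value = phi * y[-1]
```

The I and MA parts add differencing and a moving average over past forecast errors, respectively; in practice the statsmodels fit above handles all three together.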

Prophet by Facebook

Prophet handles seasonality, holidays, and missing data automatically:

from prophet import Prophet

# Prepare data (must have 'ds' and 'y' columns)
df_prophet = df.rename(columns={'date': 'ds', 'sales': 'y'})

# Create the model (fit after adding seasonality and holidays)
model = Prophet(
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=False,
    seasonality_mode='multiplicative'
)

# Add custom seasonality
model.add_seasonality(name='monthly', period=30.5, fourier_order=5)

# Add holidays
model.add_country_holidays(country_name='US')

model.fit(df_prophet)

# Create future dates
future = model.make_future_dataframe(periods=365)

# Predict
forecast = model.predict(future)

# Plot
model.plot(forecast)
model.plot_components(forecast)

LSTM for Time Series

LSTMs (Long Short-Term Memory networks) can learn non-linear temporal patterns that classical models may miss:

import numpy as np
import torch
import torch.nn as nn
from sklearn.preprocessing import MinMaxScaler

# Prepare sequences
def create_sequences(data, seq_length):
    xs, ys = [], []
    for i in range(len(data) - seq_length):
        x = data[i:(i + seq_length)]
        y = data[i + seq_length]
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)

# Scale data
scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(df[['sales']])

# Create sequences
seq_length = 30
X, y = create_sequences(scaled_data, seq_length)

# LSTM Model
class LSTMModel(nn.Module):
    def __init__(self, input_size=1, hidden_size=64, num_layers=2):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, 1)

    def forward(self, x):
        lstm_out, _ = self.lstm(x)
        return self.fc(lstm_out[:, -1, :])

model = LSTMModel()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
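The model above still needs a training loop. A minimal self-contained sketch, substituting a sine wave for the (hypothetical) scaled sales data and using full-batch updates for brevity:

```python
import numpy as np
import torch
import torch.nn as nn

# Synthetic stand-in for scaled sales data
data = np.sin(np.linspace(0, 20 * np.pi, 500)).reshape(-1, 1).astype(np.float32)

def create_sequences(data, seq_length):
    xs = [data[i:i + seq_length] for i in range(len(data) - seq_length)]
    ys = [data[i + seq_length] for i in range(len(data) - seq_length)]
    return np.array(xs), np.array(ys)

X, y = create_sequences(data, seq_length=30)
X_t = torch.from_numpy(X)  # shape: (n_samples, 30, 1)
y_t = torch.from_numpy(y)  # shape: (n_samples, 1)

class LSTMModel(nn.Module):
    def __init__(self, input_size=1, hidden_size=32, num_layers=1):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.lstm(x)
        return self.fc(out[:, -1, :])  # predict from the last time step

torch.manual_seed(0)
model = LSTMModel()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

losses = []
for epoch in range(20):
    optimizer.zero_grad()
    loss = criterion(model(X_t), y_t)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

For real data you would also invert the scaling with scaler.inverse_transform before comparing forecasts to actual values.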

Evaluation Metrics

Evaluate forecasts on a held-out time period using several complementary metrics:

from sklearn.metrics import mean_absolute_error, mean_squared_error
import numpy as np

def evaluate_forecast(actual, predicted):
    mae = mean_absolute_error(actual, predicted)
    mse = mean_squared_error(actual, predicted)
    rmse = np.sqrt(mse)
    mape = np.mean(np.abs((actual - predicted) / actual)) * 100  # undefined when actual contains zeros

    print(f"MAE: {mae:.2f}")
    print(f"RMSE: {rmse:.2f}")
    print(f"MAPE: {mape:.2f}%")

    return {'mae': mae, 'rmse': rmse, 'mape': mape}
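As a worked example with made-up numbers, the metrics reduce to simple arithmetic:

```python
import numpy as np

actual = np.array([100.0, 120.0, 130.0])     # hypothetical true values
predicted = np.array([110.0, 115.0, 128.0])  # hypothetical forecasts

mae = np.mean(np.abs(actual - predicted))                    # (10 + 5 + 2) / 3
rmse = np.sqrt(np.mean((actual - predicted) ** 2))           # sqrt((100 + 25 + 4) / 3)
mape = np.mean(np.abs((actual - predicted) / actual)) * 100  # average % error

print(f"MAE: {mae:.2f}, RMSE: {rmse:.2f}, MAPE: {mape:.2f}%")
```

Note that RMSE penalizes large errors more heavily than MAE, while MAPE is scale-free but breaks down when actual values are zero or near zero.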

Best Practices

  • Understand your data: Visualize trends, seasonality, anomalies
  • Use proper train/test split: Time-based split, not random
  • Handle missing values: Interpolation, forward fill
  • Feature engineering: Lag features, rolling statistics
  • Cross-validation: Use time series cross-validation
  • Ensemble methods: Combine multiple models
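For the time-based split and cross-validation points above, scikit-learn's TimeSeriesSplit keeps every test window strictly after its training window:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(20).reshape(-1, 1)  # 20 time-ordered observations

tscv = TimeSeriesSplit(n_splits=3)
splits = list(tscv.split(X))
for train_idx, test_idx in splits:
    # Training windows grow; test data always comes later in time
    print(f"train {train_idx[0]}-{train_idx[-1]}  test {test_idx[0]}-{test_idx[-1]}")
```

Unlike a random split, this never lets the model "see the future" during training, which would inflate evaluation scores.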

Master Time Series with Expert Mentorship

Our Data Science program covers time series analysis from basics to advanced forecasting. Build real prediction systems with guidance from industry experts.
