SARIMA (Seasonal Autoregressive Integrated Moving Average)

Last Updated : 7 Apr, 2026

SARIMA, or Seasonal Autoregressive Integrated Moving Average, is an extension of the traditional ARIMA model designed for time series data with seasonal patterns. While ARIMA works well for non-seasonal data, SARIMA adds seasonal terms that model periodic fluctuations, which improves forecasts for series with a repeating seasonal cycle.

Understanding the Components of SARIMA

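SARIMA combines non-seasonal autoregressive (AR), differencing (I) and moving-average (MA) terms, which capture short-term dependencies, with a seasonal counterpart of each, which captures dependencies that recur from one seasonal cycle to the next.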

SARIMA Notation

The SARIMA model is represented as:

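**SARIMA(p, d, q)(P, D, Q, s)**

**Parameters:**

- p: order of the non-seasonal autoregressive (AR) part
- d: degree of non-seasonal differencing
- q: order of the non-seasonal moving-average (MA) part
- P: order of the seasonal AR part
- D: degree of seasonal differencing
- Q: order of the seasonal MA part
- s: number of observations in one seasonal cycle (for example, 12 for monthly data with a yearly pattern)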

Before applying SARIMA, seasonal differencing is often required to make the data stationary. It subtracts from each observation the value from the same season in the previous cycle, that is, y_t - y_{t-s}. This removes a stable seasonal pattern from the data, which makes the remaining series easier to model and forecast. In the SARIMA notation, the number of seasonal differences is the parameter D.
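Seasonal differencing can be illustrated with a small sketch. The series below is synthetic and purely illustrative (a fixed monthly pattern that rises by 2 units each year); the names `pattern`, `y` and `seasonal_diff` are assumptions made for this example, not part of the dataset used later.

```python
import numpy as np
import pandas as pd

s = 12  # seasonal period: monthly data with a yearly cycle

# Hypothetical series: the same 12-month pattern repeated for 3 years,
# shifted up by 2 units each year
index = pd.date_range("2019-01-01", periods=36, freq="MS")
pattern = np.array([5, 3, 4, 6, 8, 10, 12, 11, 9, 7, 6, 15], dtype=float)
y = pd.Series(np.tile(pattern, 3) + 2.0 * np.repeat(np.arange(3), 12), index=index)

# Seasonal difference: y_t - y_{t-s}; the first s values have no predecessor
seasonal_diff = y.diff(s).dropna()

print(seasonal_diff.unique())  # every remaining value is 2.0
```

The seasonal pattern cancels out and only the constant year-over-year change remains, which is the effect the seasonal differencing term is meant to achieve.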

Understanding Mathematical Representation of SARIMA

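For example, a SARIMA(1, 1, 1)(1, 1, 1, s) model can be expressed mathematically as:

$$(1 - \phi_1 B)(1 - \Phi_1 B^s)(1 - B)(1 - B^s)\, y_t = (1 + \theta_1 B)(1 + \Theta_1 B^s)\, \epsilon_t$$

**Parameters:**

- $y_t$: observed value at time $t$
- $B$: backshift operator, so that $B y_t = y_{t-1}$ and $B^s y_t = y_{t-s}$
- $\phi_1$, $\Phi_1$: non-seasonal and seasonal AR coefficients
- $\theta_1$, $\Theta_1$: non-seasonal and seasonal MA coefficients
- $(1 - B)$, $(1 - B^s)$: non-seasonal and seasonal differencing (here d = 1 and D = 1)
- $\epsilon_t$: white-noise error term
- $s$: seasonal period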
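To see what the two differencing factors on the left-hand side do, they can be multiplied out. This is a worked expansion in the same notation, with $s = 12$ used only as an illustrative monthly case:

$$
\begin{aligned}
(1 - B)(1 - B^s)\, y_t &= (1 - B - B^s + B^{s+1})\, y_t \\
&= y_t - y_{t-1} - y_{t-s} + y_{t-s-1}
\end{aligned}
$$

For $s = 12$ this becomes $y_t - y_{t-1} - y_{t-12} + y_{t-13}$, which is the month-to-month change this year minus the same month-to-month change one year earlier.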

Implementing SARIMA in Time Series Forecasting

1. Importing Libraries

To begin working with SARIMA, we import the necessary libraries: NumPy, pandas, Matplotlib, statsmodels and scikit-learn.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.statespace.sarimax import SARIMAX
from statsmodels.tsa.stattools import adfuller
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from sklearn.metrics import mean_absolute_error, mean_squared_error
```

2. Loading the Dataset

We will load a retail dataset to predict monthly sales for a global superstore.


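```python
df = pd.read_csv("/content/Dataset- Superstore (2015-2018).csv")

# Copy the two columns so the date conversion below does not write to a view of df
sales_data = df[['Order Date', 'Sales']].copy()

# Parse order dates so they can be used as a datetime index later
sales_data['Order Date'] = pd.to_datetime(sales_data['Order Date'])
sales_data.head()
```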

**Output:**

[Image: Dataset]

3. Aggregating Sales by Month

We aggregate the data by month to focus on trends rather than daily fluctuations.

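```python
# Index by order date, then sum sales within each calendar month
# ('ME' is the month-end alias, available in pandas 2.2 and later)
df1 = sales_data.set_index('Order Date')
monthly_sales = df1.resample('ME').sum()
monthly_sales.head()
```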

**Output:**

[Image: Monthly Sales]
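On pandas versions older than 2.2 the `'ME'` alias is not available. One possible alternative, sketched here on hypothetical daily data rather than the superstore file, is to group by monthly period and then convert the labels back to month-end timestamps. The names `daily` and `monthly` are assumptions for this example.

```python
import pandas as pd

# Hypothetical daily sales: 1.0 per day across January and February 2021
dates = pd.date_range("2021-01-01", "2021-02-28", freq="D")
daily = pd.Series(1.0, index=dates, name="Sales")

# Group by calendar month; the result is indexed by monthly periods
monthly = daily.groupby(daily.index.to_period("M")).sum()

# Convert the PeriodIndex to month-end dates (midnight), the same labels
# that a month-end resample produces
monthly.index = monthly.index.to_timestamp(how="end").normalize()

print(monthly)
```

Each monthly total equals the number of days in that month, which makes it easy to confirm that the grouping boundaries fall where expected.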

4. Plotting the Monthly Sales

Visualizing the sales data helps us identify seasonal patterns.

```python
plt.figure(figsize=(10, 6))
plt.plot(monthly_sales['Sales'], linewidth=3, c='deeppink')
plt.title("Monthly Sales")
plt.xlabel("Date")
plt.ylabel("Sales")
plt.show()
```

**Output:**

[Image: Plotting Monthly Sales]

5. Stationarity Check

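Before applying SARIMA, we need to check whether the data is stationary, meaning its mean and variance stay constant over time. SARIMA assumes the series is stationary once the differencing specified by d and D has been applied, so this check informs the choice of those orders. We use the Augmented Dickey-Fuller (ADF) test, whose null hypothesis is that the series has a unit root (is non-stationary); a p-value below 0.05 rejects that null hypothesis.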

```python
def check_stationarity(timeseries):
    # ADF test: the null hypothesis is that the series has a unit root
    result = adfuller(timeseries, autolag='AIC')
    p_value = result[1]
    print(f'ADF Statistic: {result[0]}')
    print(f'p-value: {p_value}')
    print('Stationary' if p_value < 0.05 else 'Non-Stationary')

check_stationarity(monthly_sales['Sales'])
```

**Output:**

ADF Statistic: -4.493767844002665
p-value: 0.00020180198458237758
Stationary

6. Identifying Model Parameters

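We can choose starting values for the SARIMA orders (p, d, q)(P, D, Q, s) using Autocorrelation (ACF) and Partial Autocorrelation (PACF) plots. As a rule of thumb, the PACF suggests the AR orders and the ACF suggests the MA orders, while spikes at multiples of the seasonal period point to seasonal terms.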

```python
# Pass the Sales column as a 1-D series to both plots
plot_acf(monthly_sales['Sales'])
plot_pacf(monthly_sales['Sales'])
plt.show()
```

**Output:**

[Image: ACF]

[Image: PACF]

7. Fitting the SARIMA Model

Once we have identified the model parameters, we can fit the SARIMA model using the SARIMAX class from statsmodels. Here we use order (1, 1, 1) with seasonal order (1, 1, 1, 12).

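```python
# Non-seasonal orders (p, d, q) and seasonal orders (P, D, Q, s); s = 12 for monthly data
p, d, q = 1, 1, 1
P, D, Q, s = 1, 1, 1, 12

model = SARIMAX(monthly_sales, order=(p, d, q), seasonal_order=(P, D, Q, s))
results = model.fit()
```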

8. Generating Forecasts

With the SARIMA model fitted, we can forecast future sales values, here for the next 12 months.

```python
forecast_periods = 12
forecast = results.forecast(steps=forecast_periods)

plt.figure(figsize=(10, 6))
plt.plot(monthly_sales, label='Observed')
plt.plot(forecast, label='Forecast', color='red')
plt.title("Sales Forecast")
plt.xlabel("Date")
plt.ylabel("Sales")
plt.legend()
plt.show()
```

**Output:**

[Image: Forecasts]

9. Evaluating the Model

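We summarize the forecast error using Mean Absolute Error (MAE) and Mean Squared Error (MSE). Note that the code below compares the forecast for the next 12 months with the last 12 observed months, which cover a different period. The scores therefore indicate how closely the forecast tracks the most recent seasonal cycle rather than true out-of-sample accuracy.

```python
# Take last 12 months as observed values
observed = monthly_sales.tail(forecast_periods)

mae = mean_absolute_error(observed, forecast)
mse = mean_squared_error(observed, forecast)

print("MAE:", mae)
print("MSE:", mse)
```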

**Output:**

MAE: 10611.591984026598
MSE: 151953342.15188608
