Python | ARIMA Model for Time Series Forecasting

Last Updated : 19 Feb, 2020

A time series is a sequence of data points indexed in time order. The observations can be daily, monthly, or even yearly. A classic example is the AirPassengers dataset, which records the number of passengers of an airline per month from 1949 to 1960.

Time Series Forecasting

Time series forecasting is the process of using a statistical model to predict future values of a time series based on past results.

Some Use Cases

Components of a Time Series:
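A common breakdown, and the one the ETS (Error, Trend, Seasonality) decomposition later in this article works with, splits a series into a trend, a seasonal pattern, and a residual term. The sketch below builds a synthetic multiplicative series from those three parts. Every number in it is made up for illustration and none of it comes from the AirPassengers data.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly index: 4 years of month-start timestamps
months = pd.date_range("2000-01-01", periods=48, freq="MS")
t = np.arange(len(months))

trend = 100.0 + 2.0 * t                                   # steady upward drift
seasonal = 1.0 + 0.2 * np.sin(2 * np.pi * t / 12)         # pattern repeating every 12 months
rng = np.random.default_rng(0)
residual = rng.normal(loc=1.0, scale=0.01, size=len(t))   # small multiplicative noise

# Multiplicative model: observed = trend * seasonal * residual
observed = pd.Series(trend * seasonal * residual, index=months, name="observed")

# Dividing out the known trend and seasonal factors leaves only the residual
recovered_residual = observed.to_numpy() / (trend * seasonal)
print(observed.head())
```

In a real series the three parts are not known in advance, which is why a decomposition routine has to estimate them from the observed values.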

Importing required libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

Read the AirPassengers dataset

airline = pd.read_csv('AirPassengers.csv', index_col ='Month', parse_dates = True)

Print the first five rows of the dataset

airline.head()

ETS Decomposition

result = seasonal_decompose(airline['# Passengers'], model ='multiplicative')

ETS plot

result.plot()

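**Output:** a four-panel plot showing the observed series together with its trend, seasonal, and residual components.

ARIMA Model for Time Series Forecasting

ARIMA stands for AutoRegressive Integrated Moving Average. A non-seasonal ARIMA model is specified by three order parameters (p, d, q): p is the order of the autoregressive part, d is the degree of differencing, and q is the order of the moving-average part.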
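To make the d parameter concrete: differencing replaces each value with its change from the previous value, and d counts how many times that is applied. The following is a minimal sketch with made-up series, using NumPy's `np.diff` rather than anything from statsmodels.

```python
import numpy as np

t = np.arange(10, dtype=float)

# A series with a linear trend: one round of differencing removes the trend
linear = 5.0 + 3.0 * t
first_diff = np.diff(linear, n=1)       # every element equals the slope, 3.0

# A series with a quadratic trend needs two rounds of differencing
quadratic = t ** 2
second_diff = np.diff(quadratic, n=2)   # every element equals 2.0

print(first_diff)
print(second_diff)
```

Each round of differencing shortens the series by one observation, so `first_diff` has 9 elements and `second_diff` has 8.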

To install the library

pip install pmdarima

Import the library

from pmdarima import auto_arima

Ignore harmless warnings

import warnings
warnings.filterwarnings("ignore")

Fit auto_arima function to AirPassengers dataset

stepwise_fit = auto_arima(airline['# Passengers'],
                          start_p = 1, start_q = 1,
                          max_p = 3, max_q = 3, m = 12,
                          start_P = 0, seasonal = True,
                          d = None, D = 1, trace = True,
                          error_action ='ignore',    # we don't want to know if an order does not work
                          suppress_warnings = True,  # we don't want convergence warnings
                          stepwise = True)           # set to stepwise
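With `trace = True`, `auto_arima` prints the candidate orders it tries along with an information criterion for each (AIC by default), and keeps the candidate with the lowest value. The selection rule itself can be sketched as follows. The log-likelihoods are hypothetical numbers chosen for illustration, not results from fitting the AirPassengers data.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical (log-likelihood, number of estimated parameters) per candidate order
candidates = {
    "(1,1,0)x(0,1,0,12)": (-510.0, 2),
    "(0,1,1)x(0,1,1,12)": (-505.0, 3),
    "(0,1,1)x(2,1,1,12)": (-502.5, 5),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: AIC={score:.1f}")
print("selected:", best)
```

The `2 * n_params` term penalizes extra parameters, so a larger model is only preferred when its fit improves enough to offset the penalty.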

To print the summary

stepwise_fit.summary()

**Output:** ![](https://media.geeksforgeeks.org/wp-content/uploads/20200131180914/Screenshot-2020-01-31-at-6.07.51-PM.png)

**Code : Fit ARIMA Model to AirPassengers dataset**

Split data into train / test sets

train = airline.iloc[:len(airline)-12]
test = airline.iloc[len(airline)-12:] # set one year (12 months) for testing
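Unlike a random split, a time series split keeps the rows in chronological order so that the model is always tested on data that comes after what it was trained on. The sketch below repeats the same slicing on a stand-in DataFrame. The frame, its `value` column, and its contents are placeholders, assuming 144 monthly rows starting in January 1949.

```python
import numpy as np
import pandas as pd

# Stand-in for the airline DataFrame: 144 monthly rows with placeholder values
index = pd.date_range("1949-01-01", periods=144, freq="MS")
frame = pd.DataFrame({"value": np.arange(144)}, index=index)

train = frame.iloc[:len(frame) - 12]   # everything except the last 12 months
test = frame.iloc[len(frame) - 12:]    # the last 12 months

print(len(train), len(test))
print(train.index.max(), test.index.min())
```

The last training timestamp falls strictly before the first test timestamp, so no future information leaks into the training set.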

Fit a SARIMAX(0, 1, 1)x(2, 1, 1, 12) on the training set

from statsmodels.tsa.statespace.sarimax import SARIMAX

model = SARIMAX(train['# Passengers'], order = (0, 1, 1), seasonal_order =(2, 1, 1, 12))

result = model.fit()
result.summary()

**Output:** ![](https://media.geeksforgeeks.org/wp-content/uploads/20200131182833/Screenshot-2020-01-31-at-6.28.16-PM.png)

**Code : Predictions of ARIMA Model against the test set**

start = len(train)
end = len(train) + len(test) - 1

Predictions for one-year against the test set

predictions = result.predict(start, end, typ = 'levels').rename("Predictions")

plot predictions and actual values

predictions.plot(legend = True)
test['# Passengers'].plot(legend = True)

**Output:** ![](https://media.geeksforgeeks.org/wp-content/uploads/20200131190748/Screenshot-2020-01-31-at-7.00.46-PM.png)

**Code : Evaluate the model using MSE and RMSE**

Load specific evaluation tools

from sklearn.metrics import mean_squared_error
from statsmodels.tools.eval_measures import rmse
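What these two metrics compute can be sketched by hand with NumPy. The arrays below are made-up values, not the article's test set or predictions.

```python
import numpy as np

# Made-up actual and predicted values for illustration
actual = np.array([100.0, 110.0, 120.0, 130.0])
predicted = np.array([102.0, 108.0, 123.0, 126.0])

errors = actual - predicted          # [-2, 2, -3, 4]
mse_value = np.mean(errors ** 2)     # mean of the squared errors
rmse_value = np.sqrt(mse_value)      # square root of the MSE

print(mse_value)
print(rmse_value)
```

RMSE is expressed in the same units as the series (passengers, in this article), which usually makes it easier to interpret than MSE.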

Calculate root mean squared error

rmse(test["# Passengers"], predictions)

Calculate mean squared error

mean_squared_error(test["# Passengers"], predictions)

**Output:** ![](https://media.geeksforgeeks.org/wp-content/uploads/20200131185027/Screenshot-2020-01-31-at-6.49.56-PM.png) ![](https://media.geeksforgeeks.org/wp-content/uploads/20200131185156/Screenshot-2020-01-31-at-6.51.44-PM.png)

**Code : Forecast using ARIMA Model**

Train the model on the full dataset

model = SARIMAX(airline['# Passengers'], order = (0, 1, 1), seasonal_order =(2, 1, 1, 12))
result = model.fit()

Forecast for the next 3 years

forecast = result.predict(start = len(airline), end = (len(airline)-1) + 3 * 12, typ = 'levels').rename('Forecast')
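The `start` and `end` arguments are zero-based observation positions, and both endpoints are included, so the expression above covers exactly 36 steps past the end of the data. A small sketch of that arithmetic, assuming the series has 144 monthly observations ending in December 1960 (the variable names here are illustrative only):

```python
import pandas as pd

n_obs = 144            # assumed length of the series: 12 years of monthly data
horizon = 3 * 12       # forecast 3 years ahead

start = n_obs                      # first position after the last observation
end = (n_obs - 1) + horizon        # last forecast position
n_steps = end - start + 1          # both endpoints are inclusive

# Month-start timestamps the forecast would cover under the assumption above
future_index = pd.date_range("1961-01-01", periods=n_steps, freq="MS")

print(start, end, n_steps)
print(future_index[0], future_index[-1])
```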

Plot the forecast values

airline['# Passengers'].plot(figsize = (12, 5), legend = True)
forecast.plot(legend = True)

**Output:** a line plot of the observed passenger counts followed by the 36-month forecast.