Vector Autoregression (VAR) for Multivariate Time Series

Last Updated : 14 May, 2024

Vector Autoregression (VAR) is a statistical tool used to investigate the dynamic relationships between multiple time series variables. Unlike univariate autoregressive models, which forecast a single variable from its own previous values, VAR models capture the interdependence of several variables. They do this by modeling each variable as a function of its own past values and the past values of the other variables in the system. In this article, we explore the fundamentals of Vector Autoregression.


What is Vector Autoregression?

Vector Autoregression was introduced to macroeconomics by economist Christopher Sims in 1980 as an alternative to large structural models. It built on earlier work on the dynamic interactions among economic time series, including Clive Granger's 1969 notion of causality based on predictability. VAR models gained significant momentum in econometrics and macroeconomics during the 1980s.

Vector Autoregression (VAR) is a multivariate extension of autoregression (AR) models. While traditional AR models analyze the relationship between a single variable and its lagged values, VAR models consider multiple variables simultaneously. In a VAR model, each variable is regressed on its own lagged values as well as lagged values of other variables in the system.

Mathematical Intuition of VAR Equations

VAR models are mathematically represented as a system of simultaneous equations, where each equation describes the behavior of one variable as a function of its own lagged values and the lagged values of all other variables in the system.

Mathematically, a VAR(p) model with 'p' lags can be represented as:

Y_t = c + \Phi_1 Y_{t-1} + \Phi_2 Y_{t-2} + \dots + \Phi_p Y_{t-p} + \varepsilon_t

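Here,

  * Y_t is the k × 1 vector of the k variables observed at time t.
  * c is a k × 1 vector of constants (intercepts).
  * \Phi_1, \dots, \Phi_p are k × k coefficient matrices, one for each lag.
  * \varepsilon_t is a k × 1 vector of error terms, assumed to be white noise.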
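To make the notation concrete, here is a sketch of the smallest interesting case: a VAR(1) with two variables, written out one equation at a time. The element names y_{1,t}, y_{2,t}, c_i and \phi_{ij} are illustrative labels for the entries of Y_t, c and \Phi_1; they are assumptions made for this example.

```latex
\begin{aligned}
y_{1,t} &= c_1 + \phi_{11}\, y_{1,t-1} + \phi_{12}\, y_{2,t-1} + \varepsilon_{1,t} \\
y_{2,t} &= c_2 + \phi_{21}\, y_{1,t-1} + \phi_{22}\, y_{2,t-1} + \varepsilon_{2,t}
\end{aligned}
```

The off-diagonal coefficients \phi_{12} and \phi_{21} are what distinguish a VAR from two separate AR(1) models: they let the past of one variable enter the equation of the other.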

Assumptions underlying the VAR model

VAR analysis is subject to several assumptions and requirements to ensure the validity and reliability of the results:

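  1. **Linearity:** Relationships between the variables are linear.
  2. **Stationarity:** Each time series is stationary; non-stationary series are typically differenced first.
  3. **No Perfect Multicollinearity:** No variable is an exact linear combination of the others.
  4. **No Autocorrelation in Residuals:** Residuals are not serially correlated.
  5. **Homoscedasticity:** Residual variance is constant over time.
  6. **No Endogeneity:** The error terms are uncorrelated with the lagged regressors, for example because no relevant factor has been omitted.
  7. **Exogeneity:** Any explanatory variables added from outside the system are not influenced by the variables in it; the system variables themselves are all treated as endogenous.
  8. **Sufficient Observations:** Enough observations are available to estimate the parameters, whose number grows quickly with the number of variables and lags.
  9. **Weak Exogeneity:** Regressors may be determined within the system, provided they are not contemporaneously correlated with the error terms.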

Steps to Implement VAR on Time Series Model

The code conducts Vector Autoregression (VAR) analysis on randomly generated time series data, including stationarity testing, VAR modeling, forecasting, and visualization of the forecasted outcomes.

Step 1: Importing necessary libraries

```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.api import VAR
from statsmodels.tsa.stattools import adfuller
```

Step 2: Generate Sample Data

```python
# Sample data generation
np.random.seed(0)
dates = pd.date_range(start='2024-01-01', periods=100)
data = pd.DataFrame(np.random.randn(100, 3), index=dates, columns=['A', 'B', 'C'])
```
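Independent random noise has no relationships between the series for a VAR to pick up. As an optional alternative, here is a sketch that simulates data from a known VAR(1) process, so the fitted model has real dynamics to recover. The coefficient matrix `A`, the column names and the sample size are hypothetical values chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical VAR(1) coefficient matrix (assumption, for illustration only).
# It is lower triangular, so its eigenvalues are 0.5 and 0.3; both lie inside
# the unit circle, which makes the simulated process stationary.
A = np.array([[0.5, 0.0],
              [0.4, 0.3]])

rng = np.random.default_rng(0)
n_obs = 200
values = np.zeros((n_obs, 2))
for t in range(1, n_obs):
    # Each observation depends on the previous one plus fresh noise
    values[t] = A @ values[t - 1] + rng.standard_normal(2)

dates = pd.date_range(start='2024-01-01', periods=n_obs)
dynamic_data = pd.DataFrame(values, index=dates, columns=['X', 'Y'])
print(dynamic_data.head())
```

In this setup the past of `X` feeds into `Y` (through the 0.4 entry), but the past of `Y` does not feed into `X`.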

Step 3: Function to plot time series

```python
# Function to plot time series
def plot_series(data):
    fig, axes = plt.subplots(nrows=3, ncols=1, figsize=(10, 8))
    for i, col in enumerate(data.columns):
        data[col].plot(ax=axes[i], title=col)
        axes[i].set_ylabel('Values')
        axes[i].set_xlabel('Date')
    plt.tight_layout()
    plt.show()

plot_series(data)
```

**Output:**

[Figure: Generated Sample Data]

Step 4: Function to check stationarity

Checking for stationarity in time series data is crucial for VAR (Vector Autoregression) modeling because VAR assumes that the time series variables are stationary. Stationarity implies that the statistical properties of the time series remain constant over time, such as mean, variance, and autocorrelation.

```python
# Check stationarity of time series using ADF test
def check_stationarity(timeseries):
    result = adfuller(timeseries)
    print('ADF Statistic:', result[0])
    print('p-value:', result[1])
    print('Critical Values:')
    for key, value in result[4].items():
        print('\t%s: %.3f' % (key, value))
```

Step 5: VAR analysis

This part defines a function `var_analysis(data)` that runs the VAR analysis on the given dataset in four steps: checking the stationarity of each series, fitting the VAR model, forecasting future values, and visualizing the forecast. Finally, it calls `var_analysis()` with the sample data to execute the analysis.

In the third step, the code reads the lag order of the fitted model (`lag_order`) and passes the last `lag_order` observations to `forecast()` to generate forecasts for the next 10 steps (`steps=10`). In the fourth step, the forecasted values are plotted after the original series. The forecast index (`forecast_index`) covers the 10 days following the last observation; since the sample data ends on 2024-04-09, it starts on 2024-04-10.

```python
# Section for VAR analysis
def var_analysis(data):
    # Step 1: Check stationarity of each series
    print("Step 1: Checking stationarity")
    for col in data.columns:
        print('Stationarity test for', col)
        check_stationarity(data[col])

    # Step 2: Applying VAR model
    print("\nStep 2: Applying VAR model")
    model = VAR(data)
    results = model.fit()

    # Step 3: Forecasting
    print("\nStep 3: Forecasting")
    lag_order = results.k_ar
    forecast = results.forecast(data.values[-lag_order:], steps=10)

    # Step 4: Visualizing forecast
    print("\nStep 4: Visualizing forecast")
    # Start the forecast index on the day after the last observation
    forecast_start = data.index[-1] + pd.Timedelta(days=1)
    forecast_index = pd.date_range(start=forecast_start, periods=10)
    forecast_data = pd.DataFrame(forecast, index=forecast_index, columns=data.columns)
    plot_series(pd.concat([data, forecast_data]))

# Perform VAR analysis
var_analysis(data)
```

**Output:**

    Step 1: Checking stationarity
    Stationarity test for A
    ADF Statistic: -8.43759993424834
    p-value: 1.7990274249398063e-13
    Critical Values:
        1%: -3.498
        5%: -2.891
        10%: -2.583
    Stationarity test for B
    ADF Statistic: -11.229664527662438
    p-value: 1.9214648218450937e-20
    Critical Values:
        1%: -3.498
        5%: -2.891
        10%: -2.583
    Stationarity test for C
    ADF Statistic: -9.028783852793346
    p-value: 5.516998045646418e-15
    Critical Values:
        1%: -3.498
        5%: -2.891
        10%: -2.583

    Step 2: Applying VAR model

    Step 3: Forecasting

    Step 4: Visualizing forecast

[Figure: Forecast for the next 10 steps]

Output Explanation

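The output shows the results of the Augmented Dickey-Fuller (ADF) test for each variable in the dataset. The null hypothesis of the test is that the series has a unit root, that is, it is non-stationary.

For all three variables (A, B, and C), the ADF statistic is well below the 1% critical value of -3.498 and the p-value is far below 0.05, so the null hypothesis is rejected. All three series are therefore stationary according to the test, which is expected because they were generated as random noise.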

Applications of VAR Models

  1. **Economic Forecasting:** VAR models are widely used in economics to forecast the behavior of economic variables such as GDP, inflation, and interest rates.
  2. **Causal Inference:** Impulse responses from a VAR show how a shock to one variable propagates to the others over time. With suitable identifying assumptions, researchers use them to assess the effect of one variable on another, which is particularly valuable in policy evaluation.
  3. **Financial Markets:** VAR models can be used to model and forecast financial indices, stock prices, and other asset prices.