In the previous document you simulated data according to an AR(1) model. Now you will simulate data for a vector autoregressive model of order 1, a VAR(1) model. This is an autoregressive model for multiple dependent variables, each regressed on its own previous value (autoregression) and on the previous values of the other variables (cross-regression).
Study equations 3.2 and 3.3. Because there are multiple dependent variables, the equations are now specified in terms of vectors and matrices.
Now go to the section ‘Example’ on https://en.wikipedia.org/wiki/Vector_autoregression. You will see two sets of equations there: one in matrix form, and one consisting of two equations that are not in matrix form (each regression equation has a single dependent variable).
Use matrix algebra (do this by hand, with pen and paper) to get from the matrix specification at the top, to the two equations at the bottom. For this you will need matrix addition “Entrywise sum”: https://en.wikipedia.org/wiki/Matrix_addition, and matrix multiplication “matrix product”: https://www.mathsisfun.com/algebra/matrix-multiplying.html.
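If you want to check your pen-and-paper work, the short sketch below shows in R that the matrix form and the two written-out equations give the same result. The numbers and the names c_ex, Phi_ex, and y_prev are made up for this illustration only (they are not the values from the Wikipedia example).
# check: matrix form versus the two element-wise regression equations
c_ex <- c(1, 2)                                         # the two intercepts
Phi_ex <- matrix(c(.5, .2, .1, .4), 2, 2, byrow=TRUE)   # coefficient matrix
y_prev <- c(3, 4)                                       # y at time t-1
c_ex + Phi_ex %*% y_prev                                # matrix form
c_ex[1] + Phi_ex[1,1]*y_prev[1] + Phi_ex[1,2]*y_prev[2] # equation for variable 1
c_ex[2] + Phi_ex[2,1]*y_prev[1] + Phi_ex[2,2]*y_prev[2] # equation for variable 2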
T <- 100 # number of repeated measurements (time points) to simulate
Make four objects:
one vector called int, containing the two intercepts of your variables
one 2 by 2 matrix called Phi, containing the two autoregression coefficients on the diagonal, and the two cross-lagged coefficients on the off-diagonal.
one 2 by 2 matrix called Sigma, containing the variances of the residuals of your data on the diagonal, and the covariance between the residuals on the off-diagonal. Tip: use the function cov2cor() on this matrix to see what the correlation matrix of the residuals is.
one vector called mures, containing two elements that are both equal to zero (the means for your residuals).
int <- c(0,0)
Phi <- matrix(c(.6,.3,0,.5),2,2,byrow=TRUE)
Sigma <- matrix(c(.3,.12,.12,.3),2,2,byrow=TRUE)
cov2cor(Sigma)
## [,1] [,2]
## [1,] 1.0 0.4
## [2,] 0.4 1.0
mures <- c(0,0)
Call all the objects you have made to see what they look like.
Next, generate the residuals for your two variables. For this, install the package ‘MASS’, load it, and then use its function mvrnorm() to generate T draws from the bivariate normal distribution.
#install.packages("MASS") # install the package
library("MASS") ##load the package
?mvrnorm ## search for help on function mvrnorm
resids <- mvrnorm(T, mures, Sigma)
Call object resids to see what it looks like.
y <- matrix(NA, T, 2) # empty T by 2 matrix that will hold the simulated data
Call y to see what it looks like.
The best way would be to generate the first observations from the means and covariance matrix of the VAR(1) process, as we did in the AR(1) simulations. However, this is a bit complicated now, so I will help you with this later. For now, set the first observation for both variables to zero.
y[1,1:2]<-c(0,0)
Call y to see what it looks like.
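(As an aside: if you already want to try the approach mentioned above, a minimal sketch is given here. It assumes the process is stationary, i.e., all eigenvalues of Phi are smaller than 1 in absolute value, and uses the standard expressions for the stationary mean and covariance of a VAR(1) process.)
# optional sketch: draw the first observation from the stationary distribution
# stationary mean: mu = (I - Phi)^-1 %*% int
# stationary covariance: vec(Sigmay) = (I - kronecker(Phi, Phi))^-1 %*% vec(Sigma)
muy <- c(solve(diag(2) - Phi) %*% int)
Sigmay <- matrix(solve(diag(4) - kronecker(Phi, Phi)) %*% c(Sigma), 2, 2)
y[1,] <- mvrnorm(1, muy, Sigmay)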
Write the regression equation for repeated measurement t in R code:
with y, with repeated measurement t selected (for the rows), as the dependent variable (as in exercise 2)
with y, with the repeated measurement right before t selected (for the rows), as the predictor variable (as in exercise 3)
add the residuals by adding object resids, with repeated measurement t selected (for the rows)
don’t forget to match the names of your parameters and the residuals in the equation with those that you used in exercise 4
for matrix summation in R, use +. For matrix multiplication in R, use %*%.
# y[t,]<-int+Phi%*%y[t-1,]+resids[t,]
Now specify that t=3. Run the regression equation code you wrote in the previous exercise. Call your dependent variable y to see what it looks like. Now do this for t=4.
t=2
y[t,]<-int+Phi%*%y[t-1,]+resids[t,]
t=3
y[t,]<-int+Phi%*%y[t-1,]+resids[t,]
t=4
y[t,]<-int+Phi%*%y[t-1,]+resids[t,]
If you keep doing this until you have reached t=T, you will have simulated your own VAR(1) data. However, this is of course quite tedious. That is why we are going to make R do this for you, by specifying a ‘for loop’.
Specify a for loop around your regression equation that goes from t=2 until t=T. Simulate your data by running this for loop with the regression equation inside.
for(t in 2:T){
y[t,]<-int+Phi%*%y[t-1,]+resids[t,]
}
ts.plot(y)  # plot both simulated time series
hist(y[,1]) # histogram of the first variable
hist(y[,2]) # histogram of the second variable
Now make lagged versions of your two variables. That is, make a new matrix for each column of y, with 1 column and T rows, and set all its elements to NA. Then make it such that y1tmin1 has NA as its first element, the first element of y[,1] as its second element, and so on. Do the same thing for variable y[,2] and y2tmin1.
Put y and the two lagged variables in a data frame called ‘dataset’, with y in the first two columns and y1tmin1 and y2tmin1 in the third and fourth columns.
y1tmin1 <- matrix(NA, T, 1)
y1tmin1[2:T] <- y[1:(T-1), 1]
y2tmin1 <- matrix(NA, T, 1)
y2tmin1[2:T] <- y[1:(T-1), 2]
dataset <- data.frame(y, y1tmin1, y2tmin1)
Call object dataset to see what it looks like.
Inspect the function lm() (lm stands for linear model) to see how the function works. Look at the examples provided in the help file. Perform the regression analysis for each of the two variables. Note that data.frame() has named the two columns of y X1 and X2, so these are the names of your dependent variables.
myregression1<-lm(X1 ~ 1 + y1tmin1 + y2tmin1, data=dataset) ## the 1 you add to the regression equation is used to add an intercept to the model.
myregression2<-lm(X2 ~ 1 + y1tmin1 + y2tmin1, data=dataset) ## the 1 you add to the regression equation is used to add an intercept to the model.
Call myregression1 and myregression2 to see the results. Use summary() to get more detailed results. Interpret the results.
myregression1
##
## Call:
## lm(formula = X1 ~ 1 + y1tmin1 + y2tmin1, data = dataset)
##
## Coefficients:
## (Intercept) y1tmin1 y2tmin1
## 0.03243 0.61506 0.32461
summary(myregression1)
##
## Call:
## lm(formula = X1 ~ 1 + y1tmin1 + y2tmin1, data = dataset)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.70565 -0.37913 0.05457 0.43911 1.77142
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.03243 0.06041 0.537 0.59262
## y1tmin1 0.61506 0.08323 7.390 5.42e-11 ***
## y2tmin1 0.32461 0.11091 2.927 0.00427 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5997 on 96 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.6153, Adjusted R-squared: 0.6073
## F-statistic: 76.77 on 2 and 96 DF, p-value: < 2.2e-16
myregression2
##
## Call:
## lm(formula = X2 ~ 1 + y1tmin1 + y2tmin1, data = dataset)
##
## Coefficients:
## (Intercept) y1tmin1 y2tmin1
## -0.01188 0.06098 0.50155
summary(myregression2)
##
## Call:
## lm(formula = X2 ~ 1 + y1tmin1 + y2tmin1, data = dataset)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.2770 -0.3656 -0.0022 0.4467 1.1682
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.01188 0.06007 -0.198 0.844
## y1tmin1 0.06098 0.08276 0.737 0.463
## y2tmin1 0.50155 0.11028 4.548 1.58e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5963 on 96 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.31, Adjusted R-squared: 0.2956
## F-statistic: 21.56 on 2 and 96 DF, p-value: 1.843e-08
install package vars
load package vars
use function VAR() on your matrix y (by doing this you fit the same models as in exercise 12)
#install.packages('vars')
library('vars')
## Loading required package: strucchange
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
## Loading required package: sandwich
## Loading required package: urca
## Loading required package: lmtest
myvar <- VAR(y)
## Warning in VAR(y): No column names supplied in y, using: y1, y2 , instead.
myvar
##
## VAR Estimation Results:
## =======================
##
## Estimated coefficients for equation y1:
## =======================================
## Call:
## y1 = y1.l1 + y2.l1 + const
##
## y1.l1 y2.l1 const
## 0.61506447 0.32461476 0.03243003
##
##
## Estimated coefficients for equation y2:
## =======================================
## Call:
## y2 = y1.l1 + y2.l1 + const
##
## y1.l1 y2.l1 const
## 0.06098184 0.50155014 -0.01188301
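To get more detailed output (standard errors, t values, and the residual covariance matrix), you could apply summary() to the fitted object; comparing the estimates with the int, Phi, and Sigma you used to simulate the data is a useful sanity check. A minimal sketch:
summary(myvar) # detailed estimation results for both equations
Phi # the estimated lagged coefficients should be close to these true values
int # and the estimated constants close to these intercepts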