Updated on: 2022-07-26
The purpose of this project was to learn about Generalized Autoregressive Conditional Heteroskedasticity (GARCH) models and how to implement them in R to measure volatility of stock returns. It introduces the packages in R that is used for building GARCH models and briefly touches on basic n-ahead forecasting to evaluate the models’ out-of-sample forecast performance.
library(dplyr)
library(forecast) # For ARIMA models, forecasting and evaluation
library(PerformanceAnalytics) # For portfolio performance and risk analysis
library(PortfolioAnalytics) # For portfolio optimization and analysis
library(quantmod) # For obtaining historical prices from Yahoo Finance
library(rugarch) # For univariate GARCH models
library(urca) # For unit root tests
In typical time series analysis, we usually require that the time series data is stationary with a constant mean and constant variance. However, we may have stationarity with changing conditional variance (conditional heteroskedasticity), commonly seen in financial data. In such data, the plots would show periods of volatility clustering, where there are periods of high and low volatility and variance is not constant in these periods. For instance, let us take a look at the weekly returns of Apple Inc. (AAPL), which we will be using for the rest of this project.
# Load AAPL price data from Yahoo Finance using quantmod package
# Set start and end dates for data retrieval
<- as.Date("2010-01-01")
startdate <- as.Date("2022-07-01")
enddate
# Retrieve price data
<- quantmod::getSymbols(Symbols = "AAPL",
AAPL_price src = "yahoo",
from = startdate, to = enddate,
periodicity = "weekly",
auto.assign = F)
# View the first and last 6 observations in AAPL_price
c(head(AAPL_price), tail(AAPL_price))
## AAPL.Open AAPL.High AAPL.Low AAPL.Close AAPL.Volume
## 2010-01-01 7.622500 7.699643 7.466071 7.520714 2124925600
## 2010-01-08 7.510714 7.607143 7.289286 7.479643 2543086000
## 2010-01-15 7.533214 7.698214 7.352500 7.431071 2544382400
## 2010-01-22 7.385000 7.632500 7.041429 7.117500 6710648000
## 2010-01-29 7.181429 7.221429 6.794643 6.858929 4067151200
## 2010-02-05 6.879643 7.133929 6.816071 7.095357 2882171600
## 2022-05-20 139.089996 144.339996 132.610001 143.779999 542369300
## 2022-05-27 145.389999 151.740005 145.259995 151.210007 341331600
## 2022-06-03 146.899994 149.869995 142.529999 142.639999 351400100
## 2022-06-10 140.279999 140.759995 129.039993 130.059998 498086200
## 2022-06-17 130.070007 138.589996 129.809998 138.270004 361363800
## 2022-06-24 139.899994 143.490005 133.770004 136.720001 391615000
## AAPL.Adjusted
## 2010-01-01 6.430343
## 2010-01-08 6.395227
## 2010-01-15 6.353699
## 2010-01-22 6.085590
## 2010-01-29 5.864506
## 2010-02-05 6.066657
## 2022-05-20 143.779999
## 2022-05-27 151.210007
## 2022-06-03 142.639999
## 2022-06-10 130.059998
## 2022-06-17 138.270004
## 2022-06-24 136.720001
# Check number of observations
nrow(AAPL_price)
## [1] 652
# Check for missing data
colSums(is.na(AAPL_price))
## AAPL.Open AAPL.High AAPL.Low AAPL.Close AAPL.Volume
## 0 0 0 0 0
## AAPL.Adjusted
## 0
# Calculate log/continuous returns using Adjusted column (column 6)
# Remove first row of calculated returns since returns cannot be calculated for first observation
<- PerformanceAnalytics::Return.calculate(prices = AAPL_price[, 6], method = "log")[-1,] * 100
rAAPL colnames(rAAPL) <- "Returns"
# Check that first row has been removed
head(rAAPL)
## Returns
## 2010-01-08 -0.5475950
## 2010-01-15 -0.6514768
## 2010-01-22 -4.3113482
## 2010-01-29 -3.7005431
## 2010-02-05 3.3889462
## 2010-02-12 2.1215958
nrow(rAAPL)
## [1] 651
# Chart weekly log-returns over time
plot.zoo(rAAPL, main = "Weekly Log-Returns of AAPL", xlab = "Time", ylab = "Log-Return (in %)")
If we were to only look at the period in early 2020 (COVID-19
outbreak), we see that there is a period where large absolute returns
are followed by large absolute returns, which then starts to return to
“normal”. I charted the monthly volatilty (measured using annualized
standard deviation) of rAAPL
to better present the
fluctuations in returns that might not be easily seen in the weekly
returns plot.
# Plot rolling volatility of AAPL returns
# width = 4 to approximately calculate annualized sd using a monthly rolling-window
::chart.RollingPerformance(R = rAAPL, width = 4, FUN = "sd.annualized", scale = 52, main = "1-Month Rolling Volatility of AAPL Returns") PerformanceAnalytics
We can check if the series is stationary using an Augmented Dickey Fuller Test and Kwiatkowski-Phillips-Schmidt-Shin Test.
# ADF Test with Drift, lags selected based on AIC
%>% urca::ur.df(type = "drift", selectlags = "AIC") %>% summary() rAAPL
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -17.0125 -2.1538 0.2441 2.2200 16.4332
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.4726270 0.1479833 3.194 0.00147 **
## z.lag.1 -0.9994788 0.0556699 -17.954 < 2e-16 ***
## z.diff.lag -0.0008002 0.0394159 -0.020 0.98381
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.711 on 646 degrees of freedom
## Multiple R-squared: 0.5001, Adjusted R-squared: 0.4986
## F-statistic: 323.1 on 2 and 646 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -17.9537 161.1678
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau2 -3.43 -2.86 -2.57
## phi1 6.43 4.59 3.78
# KPSS Test
%>% urca::ur.kpss(type = "mu", lags = "long") %>% summary() rAAPL
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 19 lags.
##
## Value of test-statistic is: 0.0525
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
The test-statistic value of tau2
-17.95 against the
critical value of -2.86 at the 5% significance level means that we can
reject the null hypothesis (alternative hypothesis is that the time
series does not have unit root). The KPSS test-statistic 0.052 against
the 5% critical value of 0.46 means that we cannot reject the null
hypothesis (alternative hypothesis that the time series is not
stationary).
Despite varying variances at different periods, the ADF and KPSS tests shows that the series is stationary. The problem of conditional heteroskedasticity is not addressed in ARIMA and other models that assumed constant variance. We can use GARCH models to estimate this volatility to better manage risk and improve forecasts of returns. Before discussing GARCH, it may be helpful to talk about the ARCH model since GARCH was an improvement of it.
ARCH or AutoRegressive Conditional Heteroskedasticity models can be thought of as an Autoregressive model. Suppose that the returns was modelled as such (remains true for all the GARCH models):
\[r_t = \mu + \varepsilon_t\]
where \(\mu\) can be a constant or represented with ARMA models, and \(\varepsilon_t\) is the residual at time \(t\). \(\varepsilon_t\) can be separated into a time-varying volatility or standard deviation \(\sigma_t\) and a white noise error term (or called innovations/shocks) \(z_t\) which is i.i.d. with mean of 0 and variance of 1, which we write as \(\varepsilon_t = \sigma_t z_t\).
In most cases, \(z_t\) is assumed to have a normal distribution, but with the fat tails and slight skew that is typical of financial returns, it is also possible to use a Student’s t-distribution or a skewed t-distribution.
# Plot distribution of AAPL's weekly log returns
::chart.Histogram(R = rAAPL,
PerformanceAnalyticsmain = "Distribution of AAPL Weekly Log-Returns",
methods = c("add.density", "add.normal"),
colorset = c("blue", "red", "black"),
ylim = c(0, 0.15))
legend(x = "topleft", legend = c("Log-Return", "Density", "Normal"), col = c("blue", "red", "black"), lty = 1)
# Look at skewness and kurtosis of AAPL returns
::table.Distributions(R = rAAPL) PerformanceAnalytics
An ARCH(1) model is then formulated as:
\[Var(\varepsilon_t) = \sigma_t^2 = \omega + \alpha_1 \varepsilon^2_{t-1}\]
What the formula essentially say is that the conditional variance in the current period (\(\sigma_t^2\)) is affected by the errors in the previous period. \(\alpha_0 > 0\) since variance of first period should also be non-negative and \(\alpha_1 > 0\) so that larger error terms in the previous period means higher conditional variance but \(\alpha_1 < 1\) to prevent “explosive” variance. This can be extended to an ARCH(q) model where we have \(q\) lags of error terms.
Moving forward, I would estimate the models using returns data up to end-April 2022 and use the data in May and June 2022 to evaluate the models.
# Split the data for training and testing models
<- rAAPL["/2022-04",]
trainset nrow(trainset)
## [1] 643
<- rAAPL["2022-05/",]
testset nrow(testset)
## [1] 8
I used an ARMA model to estimate the mean equation before generating the ARCH model.
# Find mean equation using auto.arima from forecast package
<- forecast::auto.arima(y = trainset,
meaneqn max.p = 2, max.q = 2,
stepwise = F, approximation = F, trace = F,
ic = "aic", method = "ML")
summary(meaneqn)
## Series: trainset
## ARIMA(0,0,0) with non-zero mean
##
## Coefficients:
## mean
## 0.4965
## s.e. 0.1440
##
## sigma^2 = 13.36: log likelihood = -1745.25
## AIC=3494.5 AICc=3494.52 BIC=3503.43
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set -4.804446e-13 3.65206 2.751707 -Inf Inf 0.981016 -0.007041325
Based on the ARMA output, a ARMA(0, 0) with non-zero mean has the
lowest AIC. With this, we can create the model specification for the
ARCH model using ugarchspec
.
# Create ARCH(1) model specification using rugarch package, arguments to be in a list
# Check ?ugarchspec for the arguments
# Specify ARCH(1) model by setting order of GARCH terms to 0
<- list(model = "sGARCH", garchOrder = c(1, 0))
archmodel
# Specify mean model as found using auto.arima
<- list(armaOrder = c(0, 0), include.mean = T)
meanmodel
# Assume a normal distribution of the error term z_t
<- rugarch::ugarchspec(variance.model = archmodel, mean.model = meanmodel, distribution.model = "norm") archspec
# Fit the data to the ARCH(1) specification
<- rugarch::ugarchfit(spec = archspec, data = trainset, solver = "hybrid")
fitARCH
fitARCH
##
## *---------------------------------*
## * GARCH Model Fit *
## *---------------------------------*
##
## Conditional Variance Dynamics
## -----------------------------------
## GARCH Model : sGARCH(1,0)
## Mean Model : ARFIMA(0,0,0)
## Distribution : norm
##
## Optimal Parameters
## ------------------------------------
## Estimate Std. Error t value Pr(>|t|)
## mu 0.51948 0.142665 3.6413 0.000271
## omega 12.00707 0.907333 13.2334 0.000000
## alpha1 0.10810 0.059532 1.8159 0.069385
##
## Robust Standard Errors:
## Estimate Std. Error t value Pr(>|t|)
## mu 0.51948 0.14860 3.4959 0.000472
## omega 12.00707 1.34500 8.9272 0.000000
## alpha1 0.10810 0.07645 1.4140 0.157349
##
## LogLikelihood : -1742.841
##
## Information Criteria
## ------------------------------------
##
## Akaike 5.4303
## Bayes 5.4511
## Shibata 5.4303
## Hannan-Quinn 5.4384
##
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.02494 0.8745
## Lag[2*(p+q)+(p+q)-1][2] 0.08917 0.9273
## Lag[4*(p+q)+(p+q)-1][5] 4.18949 0.2314
## d.o.f=0
## H0 : No serial correlation
##
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.6875 4.070e-01
## Lag[2*(p+q)+(p+q)-1][2] 17.0268 2.689e-05
## Lag[4*(p+q)+(p+q)-1][5] 28.0623 1.109e-07
## d.o.f=1
##
## Weighted ARCH LM Tests
## ------------------------------------
## Statistic Shape Scale P-Value
## ARCH Lag[2] 32.48 0.500 2.000 1.207e-08
## ARCH Lag[4] 34.00 1.397 1.611 2.633e-09
## ARCH Lag[6] 35.04 2.222 1.500 3.179e-09
##
## Nyblom stability test
## ------------------------------------
## Joint Statistic: 0.5946
## Individual Statistics:
## mu 0.07162
## omega 0.15271
## alpha1 0.23159
##
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic: 0.846 1.01 1.35
## Individual Statistic: 0.35 0.47 0.75
##
## Sign Bias Test
## ------------------------------------
## t-value prob sig
## Sign Bias 0.5830 0.5601
## Negative Sign Bias 0.4720 0.6371
## Positive Sign Bias 0.5776 0.5637
## Joint Effect 3.5984 0.3082
##
##
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
## group statistic p-value(g-1)
## 1 20 28.88 0.06787
## 2 30 38.23 0.11734
## 3 40 48.45 0.14279
## 4 50 52.41 0.34310
##
##
## Elapsed time : 0.151643
The output is split into several tables. The first table
Optimal Parameters
gives us the coefficients, standard
errors and significance for the mean and ARCH equations, as well as the
robust errors. Using the values, we would write the equations:
\[\hat{r}_t = 0.519 + \hat{\varepsilon}_t \\ \widehat{Var(\varepsilon_t)} = \hat{\sigma}_t^2 = 12.007 + 0.108 \varepsilon_{t-1}^2\]
However, coefficient for \(\varepsilon_{t-1}^2\) is insignificant at
the 5% significance level. The second table
Information Criteria
shows the different information
criteria values, which can be compared across models to check which fits
the data better. The third table
Weighted Ljung-Box Test on Standardized Residuals
shows
results for serial correlation in the residuals and based on the test,
the null hypothesis could not be rejected. The next table that interests
us is the Weighted ARCH LM Tests
and the low p-values means
that we reject the null hypothesis (alternative is that there is ARCH
effects). We may need to increase the order of ARCH terms to account for
the rest of the ARCH effects. The last table
Adjusted Pearson Goodness-of-Fit Test
shows that the
assumption of normally distributed errors cannot be rejected, but it is
still good practice to try other distributions.
I created an ARCH(p) model to account for remaining ARCH effects, still using the assumed normal distribution of errors.
# Specify an ARCH(2) model
<- list(model = "sGARCH", garchOrder = c(2, 0))
arch2model
# Assume normal distribution of error term z_t
<- rugarch::ugarchspec(variance.model = arch2model, mean.model = meanmodel, distribution.model = "norm")
arch2spec
<- ugarchfit(spec = arch2spec, data = trainset, solver = "hybrid")
fitARCH2
fitARCH2
##
## *---------------------------------*
## * GARCH Model Fit *
## *---------------------------------*
##
## Conditional Variance Dynamics
## -----------------------------------
## GARCH Model : sGARCH(2,0)
## Mean Model : ARFIMA(0,0,0)
## Distribution : norm
##
## Optimal Parameters
## ------------------------------------
## Estimate Std. Error t value Pr(>|t|)
## mu 0.650963 0.138547 4.6985 0.000003
## omega 9.989957 0.922085 10.8341 0.000000
## alpha1 0.057729 0.045654 1.2645 0.206057
## alpha2 0.199779 0.065030 3.0721 0.002126
##
## Robust Standard Errors:
## Estimate Std. Error t value Pr(>|t|)
## mu 0.650963 0.155456 4.1874 0.000028
## omega 9.989957 1.209668 8.2584 0.000000
## alpha1 0.057729 0.048738 1.1845 0.236219
## alpha2 0.199779 0.070242 2.8442 0.004453
##
## LogLikelihood : -1731.135
##
## Information Criteria
## ------------------------------------
##
## Akaike 5.3970
## Bayes 5.4248
## Shibata 5.3969
## Hannan-Quinn 5.4078
##
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.1618 0.6875
## Lag[2*(p+q)+(p+q)-1][2] 0.1915 0.8594
## Lag[4*(p+q)+(p+q)-1][5] 3.3677 0.3440
## d.o.f=0
## H0 : No serial correlation
##
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.00948 0.9224
## Lag[2*(p+q)+(p+q)-1][5] 0.29446 0.9843
## Lag[4*(p+q)+(p+q)-1][9] 1.37920 0.9649
## d.o.f=2
##
## Weighted ARCH LM Tests
## ------------------------------------
## Statistic Shape Scale P-Value
## ARCH Lag[3] 0.2177 0.500 2.000 0.6408
## ARCH Lag[5] 0.4174 1.440 1.667 0.9077
## ARCH Lag[7] 1.2924 2.315 1.543 0.8619
##
## Nyblom stability test
## ------------------------------------
## Joint Statistic: 0.6254
## Individual Statistics:
## mu 0.17240
## omega 0.08037
## alpha1 0.15925
## alpha2 0.13576
##
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic: 1.07 1.24 1.6
## Individual Statistic: 0.35 0.47 0.75
##
## Sign Bias Test
## ------------------------------------
## t-value prob sig
## Sign Bias 0.09764 0.92225
## Negative Sign Bias 1.23187 0.21845
## Positive Sign Bias 1.44054 0.15020
## Joint Effect 8.95549 0.02989 **
##
##
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
## group statistic p-value(g-1)
## 1 20 23.78 0.20470
## 2 30 33.28 0.26647
## 3 40 41.98 0.34313
## 4 50 65.01 0.06248
##
##
## Elapsed time : 0.1133151
ARCH LM Tests show that we cannot reject the null hypothesis, so we have sufficiently accounted for ARCH effects in the model.
I tried an ARCH(2) model with a Student’s t-distribution to account for the fat-tailed distribution.
# Assume a normal distribution of the error term z_t
<- rugarch::ugarchspec(variance.model = arch2model, mean.model = meanmodel, distribution.model = "std")
arch2spec_std
<- ugarchfit(spec = arch2spec_std, data = trainset, solver = "hybrid")
fitARCH2_std
fitARCH2_std
##
## *---------------------------------*
## * GARCH Model Fit *
## *---------------------------------*
##
## Conditional Variance Dynamics
## -----------------------------------
## GARCH Model : sGARCH(2,0)
## Mean Model : ARFIMA(0,0,0)
## Distribution : std
##
## Optimal Parameters
## ------------------------------------
## Estimate Std. Error t value Pr(>|t|)
## mu 0.667203 0.131293 5.0818 0.000000
## omega 9.877722 1.122702 8.7982 0.000000
## alpha1 0.064211 0.053893 1.1914 0.233482
## alpha2 0.207263 0.075341 2.7510 0.005942
## shape 7.546520 2.149598 3.5107 0.000447
##
## Robust Standard Errors:
## Estimate Std. Error t value Pr(>|t|)
## mu 0.667203 0.140562 4.7467 0.000002
## omega 9.877722 1.121571 8.8070 0.000000
## alpha1 0.064211 0.051120 1.2561 0.209084
## alpha2 0.207263 0.074721 2.7738 0.005540
## shape 7.546520 2.075020 3.6368 0.000276
##
## LogLikelihood : -1720.172
##
## Information Criteria
## ------------------------------------
##
## Akaike 5.3660
## Bayes 5.4007
## Shibata 5.3659
## Hannan-Quinn 5.3795
##
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.1812 0.6703
## Lag[2*(p+q)+(p+q)-1][2] 0.2107 0.8477
## Lag[4*(p+q)+(p+q)-1][5] 3.3750 0.3428
## d.o.f=0
## H0 : No serial correlation
##
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.03283 0.8562
## Lag[2*(p+q)+(p+q)-1][5] 0.29956 0.9838
## Lag[4*(p+q)+(p+q)-1][9] 1.36826 0.9656
## d.o.f=2
##
## Weighted ARCH LM Tests
## ------------------------------------
## Statistic Shape Scale P-Value
## ARCH Lag[3] 0.2427 0.500 2.000 0.6223
## ARCH Lag[5] 0.4306 1.440 1.667 0.9039
## ARCH Lag[7] 1.2882 2.315 1.543 0.8627
##
## Nyblom stability test
## ------------------------------------
## Joint Statistic: 1.203
## Individual Statistics:
## mu 0.12363
## omega 0.17708
## alpha1 0.09359
## alpha2 0.29712
## shape 0.61475
##
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic: 1.28 1.47 1.88
## Individual Statistic: 0.35 0.47 0.75
##
## Sign Bias Test
## ------------------------------------
## t-value prob sig
## Sign Bias 0.04524 0.96393
## Negative Sign Bias 1.15905 0.24687
## Positive Sign Bias 1.52698 0.12726
## Joint Effect 8.71932 0.03327 **
##
##
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
## group statistic p-value(g-1)
## 1 20 12.65 0.8562
## 2 30 26.10 0.6202
## 3 40 31.03 0.8148
## 4 50 46.50 0.5750
##
##
## Elapsed time : 0.1592081
With the t-distribution ARCH(2) model, the coefficient of \(\varepsilon_{t-1}^2\) is still insignificant, but we have lower AIC.
Finally, I created an ARCH(2) model with a skewed t-distribution to account for the heavy tail and skewed distribution.
# Use skewed t-distribution
<- ugarchspec(variance.model = arch2model, mean.model = meanmodel, distribution.model = "sstd")
arch2spec_sstd
<- rugarch::ugarchfit(spec = arch2spec_sstd, data = trainset, solver = "hybrid")
fitARCH2_sstd
fitARCH2_sstd
##
## *---------------------------------*
## * GARCH Model Fit *
## *---------------------------------*
##
## Conditional Variance Dynamics
## -----------------------------------
## GARCH Model : sGARCH(2,0)
## Mean Model : ARFIMA(0,0,0)
## Distribution : sstd
##
## Optimal Parameters
## ------------------------------------
## Estimate Std. Error t value Pr(>|t|)
## mu 0.615281 0.138353 4.4472 0.000009
## omega 9.851054 1.115402 8.8318 0.000000
## alpha1 0.073452 0.055048 1.3343 0.182095
## alpha2 0.190350 0.073117 2.6034 0.009231
## skew 0.933738 0.051096 18.2742 0.000000
## shape 7.826487 2.310906 3.3868 0.000707
##
## Robust Standard Errors:
## Estimate Std. Error t value Pr(>|t|)
## mu 0.615281 0.148502 4.1432 0.000034
## omega 9.851054 1.122155 8.7787 0.000000
## alpha1 0.073452 0.055413 1.3256 0.184988
## alpha2 0.190350 0.072292 2.6331 0.008462
## skew 0.933738 0.048882 19.1018 0.000000
## shape 7.826487 2.222069 3.5222 0.000428
##
## LogLikelihood : -1719.394
##
## Information Criteria
## ------------------------------------
##
## Akaike 5.3667
## Bayes 5.4084
## Shibata 5.3665
## Hannan-Quinn 5.3829
##
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.1818 0.6698
## Lag[2*(p+q)+(p+q)-1][2] 0.2158 0.8446
## Lag[4*(p+q)+(p+q)-1][5] 3.3947 0.3397
## d.o.f=0
## H0 : No serial correlation
##
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.1139 0.7358
## Lag[2*(p+q)+(p+q)-1][5] 0.4651 0.9629
## Lag[4*(p+q)+(p+q)-1][9] 1.5191 0.9540
## d.o.f=2
##
## Weighted ARCH LM Tests
## ------------------------------------
## Statistic Shape Scale P-Value
## ARCH Lag[3] 0.2610 0.500 2.000 0.6094
## ARCH Lag[5] 0.4191 1.440 1.667 0.9072
## ARCH Lag[7] 1.2610 2.315 1.543 0.8678
##
## Nyblom stability test
## ------------------------------------
## Joint Statistic: 1.3748
## Individual Statistics:
## mu 0.11719
## omega 0.18642
## alpha1 0.09345
## alpha2 0.28928
## skew 0.10917
## shape 0.61556
##
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic: 1.49 1.68 2.12
## Individual Statistic: 0.35 0.47 0.75
##
## Sign Bias Test
## ------------------------------------
## t-value prob sig
## Sign Bias 0.09865 0.92145
## Negative Sign Bias 1.03059 0.30312
## Positive Sign Bias 1.52180 0.12855
## Joint Effect 8.33019 0.03966 **
##
##
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
## group statistic p-value(g-1)
## 1 20 14.39 0.7607
## 2 30 26.56 0.5952
## 3 40 26.30 0.9400
## 4 50 44.95 0.6381
##
##
## Elapsed time : 0.229322
The ARCH(2) with skewed t-distribution has slightly higher AIC than the ARCH(2) with t-distribution, but the skew coefficient is significant.
# Plot some data from the ARCH(2) with t-distribution
par(mfrow = c(2, 2))
# Plot 1: Actual return series with conditional SD
# Plot 3: Conditional volatility vs absolute returns
# Plot 8: Distribution of residuals
# Plot 9: Q-Q plot of residuals to check for normality
<- c(1, 3, 8, 9)
plotnum
for (i in plotnum) {
plot(fitARCH2_std, which = i)
}
In order to capture the effects of volatility clustering, an ARCH model with many lags may be required. To solve this issue, the standard Generalized ARCH (sGARCH) can be used. The model can be thought of as an Autoregressive Moving Average Model, and is the basic model of in the GARCH variation of models. A GARCH(1, 1) model can be written as:
\[\sigma_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 ~, \text{ where } \alpha_0,~ \alpha_1,~ \beta_1 > 0\]
Essentially, the conditional variance in the current period is affected by the errors and the conditional variance in the previous period. Similar to ARCH, the model can be extended to a GARCH(p, q) model with \(p\) lags of conditional variance and \(q\) lags of error terms.
Create an GARCH(1, 1) model specification with normally distributed errors:
# Specify GARCH(1,1) model
<- list(model = "sGARCH", garchOrder = c(1, 1))
garchmodel
# Assume a normal distribution of the error term z_t
<- rugarch::ugarchspec(variance.model = garchmodel, mean.model = meanmodel, distribution.model = "norm")
garchspec
# Fit GARCH(1, 1)
<- rugarch::ugarchfit(spec = garchspec, data = trainset, solver = "hybrid")
fitGARCH11
fitGARCH11
##
## *---------------------------------*
## * GARCH Model Fit *
## *---------------------------------*
##
## Conditional Variance Dynamics
## -----------------------------------
## GARCH Model : sGARCH(1,1)
## Mean Model : ARFIMA(0,0,0)
## Distribution : norm
##
## Optimal Parameters
## ------------------------------------
## Estimate Std. Error t value Pr(>|t|)
## mu 0.60537 0.142198 4.2573 0.000021
## omega 3.73165 1.531777 2.4362 0.014844
## alpha1 0.10892 0.044753 2.4339 0.014938
## beta1 0.61147 0.141684 4.3157 0.000016
##
## Robust Standard Errors:
## Estimate Std. Error t value Pr(>|t|)
## mu 0.60537 0.148316 4.0816 0.000045
## omega 3.73165 1.504923 2.4796 0.013152
## alpha1 0.10892 0.039491 2.7582 0.005813
## beta1 0.61147 0.128058 4.7749 0.000002
##
## LogLikelihood : -1734.833
##
## Information Criteria
## ------------------------------------
##
## Akaike 5.4085
## Bayes 5.4363
## Shibata 5.4084
## Hannan-Quinn 5.4193
##
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.1829 0.6689
## Lag[2*(p+q)+(p+q)-1][2] 0.2539 0.8219
## Lag[4*(p+q)+(p+q)-1][5] 3.0361 0.4005
## d.o.f=0
## H0 : No serial correlation
##
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.7269 0.39389
## Lag[2*(p+q)+(p+q)-1][5] 8.3103 0.02467
## Lag[4*(p+q)+(p+q)-1][9] 9.8487 0.05404
## d.o.f=2
##
## Weighted ARCH LM Tests
## ------------------------------------
## Statistic Shape Scale P-Value
## ARCH Lag[3] 1.968 0.500 2.000 0.1606
## ARCH Lag[5] 2.137 1.440 1.667 0.4418
## ARCH Lag[7] 2.355 2.315 1.543 0.6425
##
## Nyblom stability test
## ------------------------------------
## Joint Statistic: 0.3917
## Individual Statistics:
## mu 0.12627
## omega 0.08493
## alpha1 0.07316
## beta1 0.09560
##
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic: 1.07 1.24 1.6
## Individual Statistic: 0.35 0.47 0.75
##
## Sign Bias Test
## ------------------------------------
## t-value prob sig
## Sign Bias 0.2338 0.8153
## Negative Sign Bias 0.3893 0.6972
## Positive Sign Bias 1.4266 0.1542
## Joint Effect 5.5742 0.1343
##
##
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
## group statistic p-value(g-1)
## 1 20 25.90 0.1331
## 2 30 26.19 0.6152
## 3 40 40.98 0.3836
## 4 50 59.88 0.1372
##
##
## Elapsed time : 0.09278607
With a GARCH(1, 1) model, the \(\alpha_1\) is statistically significant, standardized residuals do not have serial correlation and there are no ARCH effects left after including the GARCH term. The Adjusted Pearson Goodness-of-Fit Test statistic also shows that we cannot reject that the errors follow a normal distribution. We can write the formula as:
\[\hat{r}_t = 0.605 + \hat{\sigma}_t \hat{z}_t \\ \widehat{Var(\varepsilon_t)} = \hat{\sigma}_t^2 = 3.732 + 0.109 \varepsilon_{t-1}^2 + 0.611 \sigma_{t-1}^2\]
As per the ARCH models, I created GARCH(1, 1) with the assumption that errors follow a t-distribution and a skewed t-distribution.
# GARCH(1, 1) with errors that follow a t-distribution
<- rugarch::ugarchspec(variance.model = garchmodel, mean.model = meanmodel, distribution.model = "std")
garchspec_std
<- rugarch::ugarchfit(spec = garchspec_std, data = trainset, solver = "hybrid")
fitGARCH11_std
fitGARCH11_std
##
## *---------------------------------*
## * GARCH Model Fit *
## *---------------------------------*
##
## Conditional Variance Dynamics
## -----------------------------------
## GARCH Model : sGARCH(1,1)
## Mean Model : ARFIMA(0,0,0)
## Distribution : std
##
## Optimal Parameters
## ------------------------------------
## Estimate Std. Error t value Pr(>|t|)
## mu 0.644531 0.133241 4.8373 0.000001
## omega 1.335765 1.162309 1.1492 0.250460
## alpha1 0.076522 0.039344 1.9449 0.051782
## beta1 0.825073 0.115850 7.1219 0.000000
## shape 6.635424 1.642166 4.0407 0.000053
##
## Robust Standard Errors:
## Estimate Std. Error t value Pr(>|t|)
## mu 0.644531 0.143570 4.48933 0.000007
## omega 1.335765 1.613021 0.82811 0.407606
## alpha1 0.076522 0.049609 1.54252 0.122947
## beta1 0.825073 0.160723 5.13352 0.000000
## shape 6.635424 1.703154 3.89596 0.000098
##
## LogLikelihood : -1720.528
##
## Information Criteria
## ------------------------------------
##
## Akaike 5.3671
## Bayes 5.4018
## Shibata 5.3670
## Hannan-Quinn 5.3806
##
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.1283 0.7202
## Lag[2*(p+q)+(p+q)-1][2] 0.2301 0.8359
## Lag[4*(p+q)+(p+q)-1][5] 3.0580 0.3966
## d.o.f=0
## H0 : No serial correlation
##
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.1381 0.71018
## Lag[2*(p+q)+(p+q)-1][5] 9.0970 0.01564
## Lag[4*(p+q)+(p+q)-1][9] 10.7614 0.03431
## d.o.f=2
##
## Weighted ARCH LM Tests
## ------------------------------------
## Statistic Shape Scale P-Value
## ARCH Lag[3] 2.091 0.500 2.000 0.1482
## ARCH Lag[5] 2.396 1.440 1.667 0.3901
## ARCH Lag[7] 2.507 2.315 1.543 0.6110
##
## Nyblom stability test
## ------------------------------------
## Joint Statistic: 0.966
## Individual Statistics:
## mu 0.08553
## omega 0.23612
## alpha1 0.15839
## beta1 0.23547
## shape 0.66597
##
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic: 1.28 1.47 1.88
## Individual Statistic: 0.35 0.47 0.75
##
## Sign Bias Test
## ------------------------------------
## t-value prob sig
## Sign Bias 0.1688 0.8660
## Negative Sign Bias 0.6056 0.5450
## Positive Sign Bias 1.3342 0.1826
## Joint Effect 5.4915 0.1391
##
##
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
## group statistic p-value(g-1)
## 1 20 13.95 0.7865
## 2 30 32.63 0.2929
## 3 40 34.14 0.6910
## 4 50 41.21 0.7776
##
##
## Elapsed time : 0.09906483
Using the t-distribution led to insignificant \(\omega\) and \(\alpha_1\).
# GARCH(1, 1) with errors that follow a skewed t-distribution
<- rugarch::ugarchspec(variance.model = garchmodel, mean.model = meanmodel, distribution.model = "sstd")
garchspec_sstd
<- rugarch::ugarchfit(spec = garchspec_sstd, data = trainset, solver = "hybrid")
fitGARCH11_sstd
fitGARCH11_sstd
##
## *---------------------------------*
## * GARCH Model Fit *
## *---------------------------------*
##
## Conditional Variance Dynamics
## -----------------------------------
## GARCH Model : sGARCH(1,1)
## Mean Model : ARFIMA(0,0,0)
## Distribution : sstd
##
## Optimal Parameters
## ------------------------------------
## Estimate Std. Error t value Pr(>|t|)
## mu 0.563131 0.140625 4.0045 0.000062
## omega 1.081351 0.854691 1.2652 0.205802
## alpha1 0.068029 0.031637 2.1503 0.031531
## beta1 0.851328 0.086314 9.8631 0.000000
## skew 0.910202 0.050148 18.1504 0.000000
## shape 6.879183 1.756956 3.9154 0.000090
##
## Robust Standard Errors:
## Estimate Std. Error t value Pr(>|t|)
## mu 0.563131 0.144220 3.9047 0.000094
## omega 1.081351 1.043376 1.0364 0.300018
## alpha1 0.068029 0.034996 1.9439 0.051908
## beta1 0.851328 0.105308 8.0841 0.000000
## skew 0.910202 0.045698 19.9176 0.000000
## shape 6.879183 1.794090 3.8344 0.000126
##
## LogLikelihood : -1719.081
##
## Information Criteria
## ------------------------------------
##
## Akaike 5.3657
## Bayes 5.4074
## Shibata 5.3656
## Hannan-Quinn 5.3819
##
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.09793 0.7543
## Lag[2*(p+q)+(p+q)-1][2] 0.21648 0.8441
## Lag[4*(p+q)+(p+q)-1][5] 3.12267 0.3851
## d.o.f=0
## H0 : No serial correlation
##
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.06918 0.79253
## Lag[2*(p+q)+(p+q)-1][5] 9.76845 0.01055
## Lag[4*(p+q)+(p+q)-1][9] 11.51797 0.02331
## d.o.f=2
##
## Weighted ARCH LM Tests
## ------------------------------------
## Statistic Shape Scale P-Value
## ARCH Lag[3] 1.988 0.500 2.000 0.1585
## ARCH Lag[5] 2.324 1.440 1.667 0.4039
## ARCH Lag[7] 2.437 2.315 1.543 0.6255
##
## Nyblom stability test
## ------------------------------------
## Joint Statistic: 1.0998
## Individual Statistics:
## mu 0.07835
## omega 0.22466
## alpha1 0.15568
## beta1 0.21385
## skew 0.09560
## shape 0.68766
##
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic: 1.49 1.68 2.12
## Individual Statistic: 0.35 0.47 0.75
##
## Sign Bias Test
## ------------------------------------
## t-value prob sig
## Sign Bias 0.2614 0.7939
## Negative Sign Bias 0.6847 0.4938
## Positive Sign Bias 1.2174 0.2239
## Joint Effect 5.7490 0.1245
##
##
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
## group statistic p-value(g-1)
## 1 20 15.32 0.7020
## 2 30 27.59 0.5399
## 3 40 33.77 0.7072
## 4 50 37.33 0.8887
##
##
## Elapsed time : 0.1600549
# Plot some data from the GARCH(1, 1) with t-distribution
par(mfrow = c(2, 2))
for (i in plotnum) {
plot(fitGARCH11_std, which = i)
}
GJR-GARCH models can capture the asymmetric patterns in volatility due to positive and negative shocks (such as bad/good financial results, etc.). The volatility asymmetry is not captured by the standard GARCH models. We write the GJR-GARCH(1, 1) as:
\[\sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + \gamma_1 \varepsilon_{t-1}^2 I_{t-1} + \beta_1 \sigma_{t-1}^2\]
\(I_{t-1} = 0\) if \(\varepsilon_{t-1}^2 \ge 0\) and \(I_{t-1} = 1\) if \(\varepsilon_{t-1}^2 < 0\). The inclusion of the dummy variable \(I_{t-1}\) simply means that we expect higher volatility in the current period, if there was a negative shock in the previous period.
Create a GJR-GARCH(1, 1) model specification with normally distributed errors:
# Specify GJR-GARCH(1,1) model
<- list(model = "gjrGARCH", garchOrder = c(1, 1))
gjrmodel
# Assume a normal distribution of the error term z_t
<- rugarch::ugarchspec(variance.model = gjrmodel, mean.model = meanmodel, distribution.model = "norm")
gjrspec
# Fit GARCH(1, 1)
<- rugarch::ugarchfit(spec = gjrspec, data = trainset, solver = "hybrid")
fitGJR11
fitGJR11
##
## *---------------------------------*
## * GARCH Model Fit *
## *---------------------------------*
##
## Conditional Variance Dynamics
## -----------------------------------
## GARCH Model : gjrGARCH(1,1)
## Mean Model : ARFIMA(0,0,0)
## Distribution : norm
##
## Optimal Parameters
## ------------------------------------
## Estimate Std. Error t value Pr(>|t|)
## mu 0.54378 1.1933 0.455677 0.64862
## omega 4.48858 5.8324 0.769599 0.44154
## alpha1 0.00000 4.2990 0.000000 1.00000
## beta1 0.50477 1.3321 0.378920 0.70475
## gamma1 0.31532 3.3832 0.093203 0.92574
##
## Robust Standard Errors:
## Estimate Std. Error t value Pr(>|t|)
## mu 0.54378 91.101 0.005969 0.99524
## omega 4.48858 441.601 0.010164 0.99189
## alpha1 0.00000 330.322 0.000000 1.00000
## beta1 0.50477 102.073 0.004945 0.99605
## gamma1 0.31532 259.861 0.001213 0.99903
##
## LogLikelihood : -1722.357
##
## Information Criteria
## ------------------------------------
##
## Akaike 5.3728
## Bayes 5.4075
## Shibata 5.3727
## Hannan-Quinn 5.3863
##
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.5627 0.4532
## Lag[2*(p+q)+(p+q)-1][2] 0.6598 0.6231
## Lag[4*(p+q)+(p+q)-1][5] 3.9271 0.2633
## d.o.f=0
## H0 : No serial correlation
##
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 3.544 0.05975
## Lag[2*(p+q)+(p+q)-1][5] 7.739 0.03421
## Lag[4*(p+q)+(p+q)-1][9] 8.960 0.08284
## d.o.f=2
##
## Weighted ARCH LM Tests
## ------------------------------------
## Statistic Shape Scale P-Value
## ARCH Lag[3] 1.581 0.500 2.000 0.2086
## ARCH Lag[5] 1.626 1.440 1.667 0.5599
## ARCH Lag[7] 1.753 2.315 1.543 0.7694
##
## Nyblom stability test
## ------------------------------------
## Joint Statistic: 1.9467
## Individual Statistics:
## mu 0.2152
## omega 0.1254
## alpha1 0.6448
## beta1 0.1288
## gamma1 0.1718
##
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic: 1.28 1.47 1.88
## Individual Statistic: 0.35 0.47 0.75
##
## Sign Bias Test
## ------------------------------------
## t-value prob sig
## Sign Bias 0.1138 0.9094
## Negative Sign Bias 1.0726 0.2839
## Positive Sign Bias 1.5283 0.1269
## Joint Effect 3.8146 0.2822
##
##
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
## group statistic p-value(g-1)
## 1 20 29.01 0.06589
## 2 30 46.25 0.02214
## 3 40 60.02 0.01686
## 4 50 65.32 0.05935
##
##
## Elapsed time : 0.2745059
With the GJR-GARCH(1, 1) with normally distributed errors, none of the coefficients in the models are significant. This might hint at a mis-specification of models. Let’s try the assumption of t-distribution and skewed t-distribution.
# Assume Student's t-distribution of error term z_t
<- rugarch::ugarchspec(variance.model = gjrmodel, mean.model = meanmodel, distribution.model = "std")
gjrspec_std
# Fit GARCH(1, 1)
<- rugarch::ugarchfit(spec = gjrspec_std, data = trainset, solver = "hybrid")
fitGJR11_std
fitGJR11_std
##
## *---------------------------------*
## * GARCH Model Fit *
## *---------------------------------*
##
## Conditional Variance Dynamics
## -----------------------------------
## GARCH Model : gjrGARCH(1,1)
## Mean Model : ARFIMA(0,0,0)
## Distribution : std
##
## Optimal Parameters
## ------------------------------------
## Estimate Std. Error t value Pr(>|t|)
## mu 0.59694 0.131810 4.5288 0.000006
## omega 3.76777 1.259606 2.9912 0.002779
## alpha1 0.00000 0.094145 0.0000 1.000000
## beta1 0.56513 0.111797 5.0550 0.000000
## gamma1 0.29505 0.135978 2.1698 0.030020
## shape 8.29014 2.631474 3.1504 0.001631
##
## Robust Standard Errors:
## Estimate Std. Error t value Pr(>|t|)
## mu 0.59694 0.150087 3.9773 0.000070
## omega 3.76777 1.500677 2.5107 0.012049
## alpha1 0.00000 0.155180 0.0000 1.000000
## beta1 0.56513 0.094059 6.0082 0.000000
## gamma1 0.29505 0.175437 1.6818 0.092608
## shape 8.29014 3.352420 2.4729 0.013403
##
## LogLikelihood : -1711.757
##
## Information Criteria
## ------------------------------------
##
## Akaike 5.3429
## Bayes 5.3846
## Shibata 5.3428
## Hannan-Quinn 5.3591
##
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.5652 0.4522
## Lag[2*(p+q)+(p+q)-1][2] 0.6509 0.6268
## Lag[4*(p+q)+(p+q)-1][5] 3.8003 0.2800
## d.o.f=0
## H0 : No serial correlation
##
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 3.187 0.07422
## Lag[2*(p+q)+(p+q)-1][5] 7.249 0.04512
## Lag[4*(p+q)+(p+q)-1][9] 8.326 0.11120
## d.o.f=2
##
## Weighted ARCH LM Tests
## ------------------------------------
## Statistic Shape Scale P-Value
## ARCH Lag[3] 1.677 0.500 2.000 0.1953
## ARCH Lag[5] 1.723 1.440 1.667 0.5357
## ARCH Lag[7] 1.771 2.315 1.543 0.7656
##
## Nyblom stability test
## ------------------------------------
## Joint Statistic: 1.802
## Individual Statistics:
## mu 0.13648
## omega 0.30992
## alpha1 0.39793
## beta1 0.25869
## gamma1 0.09779
## shape 0.28461
##
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic: 1.49 1.68 2.12
## Individual Statistic: 0.35 0.47 0.75
##
## Sign Bias Test
## ------------------------------------
## t-value prob sig
## Sign Bias 0.1049 0.91645
## Negative Sign Bias 0.9505 0.34220
## Positive Sign Bias 1.7135 0.08711 *
## Joint Effect 4.0820 0.25274
##
##
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
## group statistic p-value(g-1)
## 1 20 22.04 0.2823255
## 2 30 33.19 0.2701523
## 3 40 56.53 0.0343318
## 4 50 89.12 0.0004023
##
##
## Elapsed time : 0.4018228
# Assume skewed t-distribution of error term z_t
<- rugarch::ugarchspec(variance.model = gjrmodel, mean.model = meanmodel, distribution.model = "sstd")
gjrspec_sstd
# Fit GARCH(1, 1)
<- rugarch::ugarchfit(spec = gjrspec_sstd, data = trainset, solver = "hybrid")
fitGJR11_sstd
fitGJR11_sstd
##
## *---------------------------------*
## * GARCH Model Fit *
## *---------------------------------*
##
## Conditional Variance Dynamics
## -----------------------------------
## GARCH Model : gjrGARCH(1,1)
## Mean Model : ARFIMA(0,0,0)
## Distribution : sstd
##
## Optimal Parameters
## ------------------------------------
## Estimate Std. Error t value Pr(>|t|)
## mu 0.53099 0.135904 3.9071 0.000093
## omega 3.83977 1.280450 2.9988 0.002711
## alpha1 0.00000 0.085656 0.0000 1.000000
## beta1 0.56348 0.116819 4.8235 0.000001
## gamma1 0.29048 0.128188 2.2661 0.023447
## skew 0.90785 0.050240 18.0703 0.000000
## shape 8.51249 2.712575 3.1382 0.001700
##
## Robust Standard Errors:
## Estimate Std. Error t value Pr(>|t|)
## mu 0.53099 0.148100 3.5854 0.000337
## omega 3.83977 1.367220 2.8085 0.004978
## alpha1 0.00000 0.127236 0.0000 1.000000
## beta1 0.56348 0.095472 5.9020 0.000000
## gamma1 0.29048 0.147930 1.9636 0.049571
## skew 0.90785 0.048060 18.8899 0.000000
## shape 8.51249 3.194274 2.6649 0.007701
##
## LogLikelihood : -1710.23
##
## Information Criteria
## ------------------------------------
##
## Akaike 5.3413
## Bayes 5.3899
## Shibata 5.3411
## Hannan-Quinn 5.3602
##
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.6114 0.4343
## Lag[2*(p+q)+(p+q)-1][2] 0.7120 0.6019
## Lag[4*(p+q)+(p+q)-1][5] 3.9143 0.2650
## d.o.f=0
## H0 : No serial correlation
##
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 3.158 0.07556
## Lag[2*(p+q)+(p+q)-1][5] 7.126 0.04833
## Lag[4*(p+q)+(p+q)-1][9] 8.185 0.11855
## d.o.f=2
##
## Weighted ARCH LM Tests
## ------------------------------------
## Statistic Shape Scale P-Value
## ARCH Lag[3] 1.594 0.500 2.000 0.2068
## ARCH Lag[5] 1.643 1.440 1.667 0.5556
## ARCH Lag[7] 1.699 2.315 1.543 0.7807
##
## Nyblom stability test
## ------------------------------------
## Joint Statistic: 1.9299
## Individual Statistics:
## mu 0.1385
## omega 0.2983
## alpha1 0.4199
## beta1 0.2464
## gamma1 0.1113
## skew 0.1295
## shape 0.2823
##
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic: 1.69 1.9 2.35
## Individual Statistic: 0.35 0.47 0.75
##
## Sign Bias Test
## ------------------------------------
## t-value prob sig
## Sign Bias 0.07627 0.9392
## Negative Sign Bias 0.98712 0.3240
## Positive Sign Bias 1.58021 0.1146
## Joint Effect 3.87723 0.2750
##
##
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
## group statistic p-value(g-1)
## 1 20 14.70 0.741571
## 2 30 33.56 0.255617
## 3 40 50.56 0.101653
## 4 50 77.61 0.005714
##
##
## Elapsed time : 0.5618382
With the t-distribution and skewed t-distribution assumption, only the coefficient of \(\varepsilon_{t-1}^2\) is insignificant and \(\gamma_1\) is insignificant when using robust errors.
# Plot some data from the GJR-GARCH(1, 1) with t-distribution
par(mfrow = c(2, 2))
for (i in plotnum) {
plot(fitGJR11_std, which = i)
}
EGARCH models, similar to GJR-GARCH models, resolves the asymmetric patterns of volatility to positive and negative shocks. An EGARCH(1, 1) model is written as:
\[\ln(\sigma_t^2) = \omega + \bigg[ \alpha_1 z_{t-1} + \gamma_1 \bigg( |z_{t-1}| - E|z_{t-1}| \bigg) \bigg] + \beta_1 \ln(\sigma_{t-1}^2)\]
\(\gamma_1\) is the coefficient of the EGARCH term \(\gamma_1 \bigg( |z_{t-1}| - E|z_{t-1}| \bigg)\). If negative shocks cause higher volatility than positive shocks, then \(\alpha_1\) should be negative. This is because, if \(z_{t-1} \ge 0\), we would have \((-\alpha_1 + \gamma_1)z_{t-1} - \gamma_1 E(|z_{t-1}|)\) and if \(z_{t-1} < 0\), we would have \((\alpha_1 + \gamma_1)z_{t-1} - \gamma_1 E(|z_{t-1}|)\), which creates higher conditional volatility when there is a negative shock.
Create an EGARCH(1, 1) model specification with normally distributed errors:
# Specify EGARCH(1,1) model
<- list(model = "eGARCH", garchOrder = c(1, 1))
egarchmodel
# Assume a normal distribution of the error term z_t
<- rugarch::ugarchspec(variance.model = egarchmodel, mean.model = meanmodel, distribution.model = "norm")
egarchspec
# Fit GARCH(1, 1)
<- rugarch::ugarchfit(spec = egarchspec, data = trainset, solver = "hybrid")
fitEGARCH11
fitEGARCH11
##
## *---------------------------------*
## * GARCH Model Fit *
## *---------------------------------*
##
## Conditional Variance Dynamics
## -----------------------------------
## GARCH Model : eGARCH(1,1)
## Mean Model : ARFIMA(0,0,0)
## Distribution : norm
##
## Optimal Parameters
## ------------------------------------
## Estimate Std. Error t value Pr(>|t|)
## mu 0.53455 0.135192 3.9540 0.000077
## omega 1.04537 0.204634 5.1085 0.000000
## alpha1 -0.27588 0.056221 -4.9071 0.000001
## beta1 0.58685 0.080277 7.3103 0.000000
## gamma1 0.16260 0.067583 2.4059 0.016131
##
## Robust Standard Errors:
## Estimate Std. Error t value Pr(>|t|)
## mu 0.53455 0.147463 3.6250 0.000289
## omega 1.04537 0.199551 5.2386 0.000000
## alpha1 -0.27588 0.057280 -4.8164 0.000001
## beta1 0.58685 0.075056 7.8188 0.000000
## gamma1 0.16260 0.065427 2.4852 0.012947
##
## LogLikelihood : -1721.632
##
## Information Criteria
## ------------------------------------
##
## Akaike 5.3705
## Bayes 5.4053
## Shibata 5.3704
## Hannan-Quinn 5.3840
##
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.4971 0.4808
## Lag[2*(p+q)+(p+q)-1][2] 0.6019 0.6476
## Lag[4*(p+q)+(p+q)-1][5] 5.1678 0.1402
## d.o.f=0
## H0 : No serial correlation
##
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 2.877 0.0898311
## Lag[2*(p+q)+(p+q)-1][5] 15.004 0.0004434
## Lag[4*(p+q)+(p+q)-1][9] 18.009 0.0006453
## d.o.f=2
##
## Weighted ARCH LM Tests
## ------------------------------------
## Statistic Shape Scale P-Value
## ARCH Lag[3] 0.5821 0.500 2.000 0.4455
## ARCH Lag[5] 1.0638 1.440 1.667 0.7141
## ARCH Lag[7] 1.7477 2.315 1.543 0.7705
##
## Nyblom stability test
## ------------------------------------
## Joint Statistic: 1.2164
## Individual Statistics:
## mu 0.1887
## omega 0.1932
## alpha1 0.7019
## beta1 0.1783
## gamma1 0.1430
##
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic: 1.28 1.47 1.88
## Individual Statistic: 0.35 0.47 0.75
##
## Sign Bias Test
## ------------------------------------
## t-value prob sig
## Sign Bias 0.5684 0.5700
## Negative Sign Bias 0.8028 0.4224
## Positive Sign Bias 0.9806 0.3272
## Joint Effect 2.0914 0.5537
##
##
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
## group statistic p-value(g-1)
## 1 20 24.03 0.19502
## 2 30 38.42 0.11338
## 3 40 46.21 0.19904
## 4 50 63.30 0.08231
##
##
## Elapsed time : 0.2226751
Based on the output, we can write the formula as:
\[\hat{r}_t = 0.535 + \hat{\sigma}_t \hat{z}_t \\ \ln(\hat{\sigma}_t^2) = 1.045 - 0.275z_{t-1} + 0.163 \bigg( |z_{t-1}| - E|z_{t-1}| \bigg) + 0.587 \ln(\sigma_{t-1}^2)\]
All the coefficients are significant, standardized residuals are not serially correlated and ARCH effects have been sufficiently captured. I have also estimated the EGARCH with assumed t-distribution and skewed t-distribution of errors.
# Assume a t-distribution of the error term z_t
<- rugarch::ugarchspec(variance.model = egarchmodel, mean.model = meanmodel, distribution.model = "std")
egarchspec_std
# Fit GARCH(1, 1)
<- rugarch::ugarchfit(spec = egarchspec_std, data = trainset, solver = "hybrid")
fitEGARCH11_std
fitEGARCH11_std
##
## *---------------------------------*
## * GARCH Model Fit *
## *---------------------------------*
##
## Conditional Variance Dynamics
## -----------------------------------
## GARCH Model : eGARCH(1,1)
## Mean Model : ARFIMA(0,0,0)
## Distribution : std
##
## Optimal Parameters
## ------------------------------------
## Estimate Std. Error t value Pr(>|t|)
## mu 0.58466 0.122449 4.7748 0.000002
## omega 0.84197 0.264839 3.1792 0.001477
## alpha1 -0.24192 0.063613 -3.8030 0.000143
## beta1 0.66407 0.104998 6.3246 0.000000
## gamma1 0.16515 0.073542 2.2457 0.024726
## shape 8.46226 2.552073 3.3158 0.000914
##
## Robust Standard Errors:
## Estimate Std. Error t value Pr(>|t|)
## mu 0.58466 0.128296 4.5572 0.000005
## omega 0.84197 0.307692 2.7364 0.006211
## alpha1 -0.24192 0.055913 -4.3267 0.000015
## beta1 0.66407 0.121533 5.4641 0.000000
## gamma1 0.16515 0.068635 2.4062 0.016119
## shape 8.46226 2.699698 3.1345 0.001721
##
## LogLikelihood : -1712.211
##
## Information Criteria
## ------------------------------------
##
## Akaike 5.3444
## Bayes 5.3860
## Shibata 5.3442
## Hannan-Quinn 5.3605
##
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.5006 0.4792
## Lag[2*(p+q)+(p+q)-1][2] 0.6018 0.6477
## Lag[4*(p+q)+(p+q)-1][5] 4.9355 0.1583
## d.o.f=0
## H0 : No serial correlation
##
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 2.583 0.108018
## Lag[2*(p+q)+(p+q)-1][5] 13.408 0.001183
## Lag[4*(p+q)+(p+q)-1][9] 15.906 0.002152
## d.o.f=2
##
## Weighted ARCH LM Tests
## ------------------------------------
## Statistic Shape Scale P-Value
## ARCH Lag[3] 0.7679 0.500 2.000 0.3809
## ARCH Lag[5] 1.0585 1.440 1.667 0.7156
## ARCH Lag[7] 1.5424 2.315 1.543 0.8128
##
## Nyblom stability test
## ------------------------------------
## Joint Statistic: 1.1313
## Individual Statistics:
## mu 0.1426
## omega 0.3173
## alpha1 0.4081
## beta1 0.2791
## gamma1 0.1515
## shape 0.2131
##
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic: 1.49 1.68 2.12
## Individual Statistic: 0.35 0.47 0.75
##
## Sign Bias Test
## ------------------------------------
## t-value prob sig
## Sign Bias 0.7249 0.4688
## Negative Sign Bias 0.6355 0.5253
## Positive Sign Bias 1.3163 0.1885
## Joint Effect 2.4070 0.4923
##
##
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
## group statistic p-value(g-1)
## 1 20 22.72 0.24976
## 2 30 33.66 0.25206
## 3 40 57.40 0.02889
## 4 50 51.01 0.39445
##
##
## Elapsed time : 0.591047
# Assume a skewed t-distribution of the error term z_t
<- rugarch::ugarchspec(variance.model = egarchmodel, mean.model = meanmodel, distribution.model = "sstd")
egarchspec_sstd
# Fit GARCH(1, 1)
<- rugarch::ugarchfit(spec = egarchspec_sstd, data = trainset, solver = "hybrid")
fitEGARCH11_sstd
fitEGARCH11_sstd
##
## *---------------------------------*
## * GARCH Model Fit *
## *---------------------------------*
##
## Conditional Variance Dynamics
## -----------------------------------
## GARCH Model : eGARCH(1,1)
## Mean Model : ARFIMA(0,0,0)
## Distribution : sstd
##
## Optimal Parameters
## ------------------------------------
## Estimate Std. Error t value Pr(>|t|)
## mu 0.52678 0.134316 3.9219 0.000088
## omega 0.85533 0.284003 3.0117 0.002598
## alpha1 -0.23630 0.063857 -3.7004 0.000215
## beta1 0.66023 0.112343 5.8769 0.000000
## gamma1 0.16247 0.073062 2.2237 0.026166
## skew 0.91305 0.050314 18.1472 0.000000
## shape 8.65667 2.662249 3.2516 0.001147
##
## Robust Standard Errors:
## Estimate Std. Error t value Pr(>|t|)
## mu 0.52678 0.146255 3.6018 0.000316
## omega 0.85533 0.333274 2.5665 0.010274
## alpha1 -0.23630 0.057426 -4.1148 0.000039
## beta1 0.66023 0.131672 5.0142 0.000001
## gamma1 0.16247 0.065391 2.4846 0.012970
## skew 0.91305 0.047507 19.2195 0.000000
## shape 8.65667 2.775810 3.1186 0.001817
##
## LogLikelihood : -1710.853
##
## Information Criteria
## ------------------------------------
##
## Akaike 5.3432
## Bayes 5.3919
## Shibata 5.3430
## Hannan-Quinn 5.3621
##
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.5486 0.4589
## Lag[2*(p+q)+(p+q)-1][2] 0.6675 0.6199
## Lag[4*(p+q)+(p+q)-1][5] 5.0387 0.1500
## d.o.f=0
## H0 : No serial correlation
##
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 2.487 0.114803
## Lag[2*(p+q)+(p+q)-1][5] 13.257 0.001297
## Lag[4*(p+q)+(p+q)-1][9] 15.720 0.002390
## d.o.f=2
##
## Weighted ARCH LM Tests
## ------------------------------------
## Statistic Shape Scale P-Value
## ARCH Lag[3] 0.7273 0.500 2.000 0.3937
## ARCH Lag[5] 1.0128 1.440 1.667 0.7291
## ARCH Lag[7] 1.5156 2.315 1.543 0.8182
##
## Nyblom stability test
## ------------------------------------
## Joint Statistic: 1.3488
## Individual Statistics:
## mu 0.1522
## omega 0.3067
## alpha1 0.4119
## beta1 0.2714
## gamma1 0.1355
## skew 0.1176
## shape 0.2086
##
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic: 1.69 1.9 2.35
## Individual Statistic: 0.35 0.47 0.75
##
## Sign Bias Test
## ------------------------------------
## t-value prob sig
## Sign Bias 0.5094 0.6106
## Negative Sign Bias 0.6706 0.5027
## Positive Sign Bias 1.1665 0.2439
## Joint Effect 1.9281 0.5875
##
##
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
## group statistic p-value(g-1)
## 1 20 16.19 0.6445
## 2 30 31.60 0.3375
## 3 40 46.33 0.1956
## 4 50 56.30 0.2206
##
##
## Elapsed time : 0.3819659
# Plot some data from the EGARCH(1, 1) with t-distribution
par(mfrow = c(2, 2))
for (i in plotnum) {
plot(fitEGARCH11_std, which = i)
}
In this section, I have estimated 4 different models, with 3 different assumed distribution of the error \(z_t\). I have created a table comparing their AIC values below.
Models | Normal Distribution | t-Distribution | Skewed t-Distribution |
---|---|---|---|
ARCH(2) | 5.397 | 5.366 | 5.367 |
GARCH(1, 1) | 5.409 | 5.367 | 5.366 |
GJR-GARCH(1, 1) | 5.373 | 5.343 | 5.341 |
EGARCH(1, 1) | 5.371 | 5.344 | 5.343 |
Despite the rather low AIC values of the GJR models, the \(\gamma\) coefficient for the \(\varepsilon_{t-1}^2 I_{t-1}\) term was insignificant when looking at the robust errors. However, I would still consider this model in the forecast evaluation in the next section.
In this section, I evaluated the out-of-sample forecast performance of the 12 models created in Section 3. However, before jumping into the forecast, there is a need to highlight the difficulty of evaluating GARCH models. The difficulty lies in using an appropriate proxy of volatility, since it is not observable at a point in time. The paper by Wennström (2014) describes some proxies of volatility, such as squared returns, using moving averages of variance and cumulative squared intra-day returns. In the paper, and another by Schmidt (2021), the High-Low Range proxy was used and the formula is given by:
\[v_t = \ln \bigg( \frac{H_t}{L_t} \bigg)\]
I extracted the weekly high and weekly low prices of AAPL of the test period and calculate the volatility proxy.
# Extract high (column 2) and low (column 3) prices within test period
<- AAPL_price["2022-05/", 2:3]
proxy
$Volatility <- log(proxy[, 1] / proxy[, 2]) * 100
proxy
proxy
## AAPL.High AAPL.Low Volatility
## 2022-05-06 159.44 138.80 13.863362
## 2022-05-13 149.77 136.60 9.204382
## 2022-05-20 144.34 132.61 8.475910
## 2022-05-27 151.74 145.26 4.364336
## 2022-06-03 149.87 142.53 5.021572
## 2022-06-10 140.76 129.04 8.693390
## 2022-06-17 138.59 129.81 6.544808
## 2022-06-24 143.49 133.77 7.014344
In the following sub-sections, I evaluated the models against the volatility proxy using Mean Squared Error, Root Mean Squared Error and Mean Absolute Error, grouped by the model types (ARCH, GARCH, GJR and EGARCH).
The formula for each evaluation metric is:
\[ MSE = \frac{1}{n} \sum_{n=1}^N (v_{n+1}^2 - \hat{\sigma}_{n+1}^2)^2 \\ RMSE = \sqrt{MSE} \\ MAE = \frac{1}{n} \sum_{n=1}^N |v_{n+1}^2 - \hat{\sigma}_{n+1}^2| \]
# Forecast ARCH(2) with normal distribution of innovations 8 periods ahead
<- rugarch::ugarchforecast(fitORspec = fitARCH2, n.ahead = 8)
fcstARCH2_nd
# Forecast ARCH(2) with t-distribution of innovations 8 periods ahead
<- rugarch::ugarchforecast(fitORspec = fitARCH2_std, n.ahead = 8)
fcstARCH2_std
# Forecast ARCH(2) with t-distribution of innovations 8 periods ahead
<- rugarch::ugarchforecast(fitORspec = fitARCH2_sstd, n.ahead = 8) fcstARCH2_sstd
Store forecasted conditional standard deviation of ARCH models into a data frame:
<- cbind(sigma(fcstARCH2_nd), sigma(fcstARCH2_std), sigma(fcstARCH2_sstd)) %>%
fcstARCH2 `colnames<-`(c("ARCH2_ND", "ARCH2_STD", "ARCH2_SSTD"))
fcstARCH2
## ARCH2_ND ARCH2_STD ARCH2_SSTD
## T+1 3.533897 3.549828 3.553096
## T+2 3.947907 3.972137 3.918734
## T+3 3.658503 3.674590 3.658155
## T+4 3.725102 3.743649 3.709058
## T+5 3.669469 3.684593 3.661807
## T+6 3.679605 3.695164 3.668056
## T+7 3.669006 3.683521 3.659467
## T+8 3.670421 3.684966 3.660027
Calculate evaluation metrics for each ARCH model:
<- NULL
evalARCH
for (i in 1:3) {
<- mean((proxy$Volatility^2 - fcstARCH2[,i]^2)^2)
mse
<- sqrt(mse)
rmse
<- mean(abs(proxy$Volatility^2 - fcstARCH2[,i]^2))
mae
<- rbind(evalARCH, c(MSE = mse, RMSE = rmse, MAE= mae))
evalARCH
}
rownames(evalARCH) <- colnames(fcstARCH2)
evalARCH
## MSE RMSE MAE
## ARCH2_ND 5830.373 76.35688 56.41940
## ARCH2_STD 5816.254 76.26437 56.29428
## ARCH2_SSTD 5831.107 76.36168 56.48219
# Forecast GARCH(1, 1) with normal distribution of innovations 8 periods ahead
<- rugarch::ugarchforecast(fitORspec = fitGARCH11, n.ahead = 8)
fcstGARCH11_nd
# Forecast GARCH(1, 1) with t-distribution of innovations 8 periods ahead
<- rugarch::ugarchforecast(fitORspec = fitGARCH11_std, n.ahead = 8)
fcstGARCH11_std
# Forecast GARCH(1, 1) with t-distribution of innovations 8 periods ahead
<- rugarch::ugarchforecast(fitORspec = fitGARCH11_sstd, n.ahead = 8) fcstGARCH11_sstd
Store forecasted conditional standard deviation of GARCH models into a data frame:
<- cbind(sigma(fcstGARCH11_nd), sigma(fcstGARCH11_std), sigma(fcstGARCH11_sstd)) %>%
fcstGARCH11 `colnames<-`(c("GARCH11_ND", "GARCH11_STD", "GARCH11_SSTD"))
fcstGARCH11
## GARCH11_ND GARCH11_STD GARCH11_SSTD
## T+1 3.685383 3.711890 3.687765
## T+2 3.676414 3.709187 3.685680
## T+3 3.669939 3.706749 3.683762
## T+4 3.665268 3.704549 3.681998
## T+5 3.661899 3.702564 3.680376
## T+6 3.659470 3.700774 3.678883
## T+7 3.657719 3.699159 3.677511
## T+8 3.656457 3.697702 3.676249
Calculate evaluation metrics for each GARCH model:
<- NULL
evalGARCH
for (i in 1:3) {
<- mean((proxy$Volatility^2 - fcstGARCH11[,i]^2)^2)
mse
<- sqrt(mse)
rmse
<- mean(abs(proxy$Volatility^2 - fcstGARCH11[,i]^2))
mae
<- rbind(evalGARCH, c(MSE = mse, RMSE = rmse, MAE= mae))
evalGARCH
}
rownames(evalGARCH) <- colnames(fcstGARCH11)
evalGARCH
## MSE RMSE MAE
## GARCH11_ND 5821.056 76.29584 56.63477
## GARCH11_STD 5793.129 76.11260 56.35842
## GARCH11_SSTD 5812.432 76.23930 56.52492
# Forecast GJR-GARCH(1, 1) with normal distribution of innovations 8 periods ahead
<- rugarch::ugarchforecast(fitORspec = fitGJR11, n.ahead = 8)
fcstGJR11_nd
# Forecast GJR-GARCH(1, 1) with t-distribution of innovations 8 periods ahead
<- rugarch::ugarchforecast(fitORspec = fitGJR11_std, n.ahead = 8)
fcstGJR11_std
# Forecast GJR-GARCH(1, 1) with t-distribution of innovations 8 periods ahead
<- rugarch::ugarchforecast(fitORspec = fitGJR11_sstd, n.ahead = 8) fcstGJR11_sstd
Store forecasted conditional standard deviation of GJR-GARCH models into a data frame:
<- cbind(sigma(fcstGJR11_nd), sigma(fcstGJR11_std), sigma(fcstGJR11_sstd)) %>%
fcstGJR11 `colnames<-`(c("GJR11_ND", "GJR11_STD", "GJR11_SSTD"))
fcstGJR11
## GJR11_ND GJR11_STD GJR11_SSTD
## T+1 4.266492 4.238812 4.202596
## T+2 4.067775 4.070917 4.033529
## T+3 3.930610 3.946910 3.910168
## T+4 3.837050 3.856104 3.820965
## T+5 3.773795 3.790063 3.756919
## T+6 3.731303 3.742287 3.711183
## T+7 3.702886 3.707864 3.678655
## T+8 3.683941 3.683137 3.655590
Calculate evaluation metrics for each GJR-GARCH model:
<- NULL
evalGJR
for (i in 1:3) {
<- mean((proxy$Volatility^2 - fcstGJR11[,i]^2)^2)
mse
<- sqrt(mse)
rmse
<- mean(abs(proxy$Volatility^2 - fcstGJR11[,i]^2))
mae
<- rbind(evalGJR, c(MSE = mse, RMSE = rmse, MAE= mae))
evalGJR
}
rownames(evalGJR) <- colnames(fcstGJR11)
evalGJR
## MSE RMSE MAE
## GJR11_ND 5519.740 74.29495 55.03242
## GJR11_STD 5525.771 74.33553 54.99476
## GJR11_SSTD 5556.277 74.54044 55.25315
# Forecast EGARCH(1, 1) with normal distribution of innovations 8 periods ahead
<- rugarch::ugarchforecast(fitORspec = fitEGARCH11, n.ahead = 8)
fcstEGARCH11_nd
# Forecast EGARCH(1, 1) with t-distribution of innovations 8 periods ahead
<- rugarch::ugarchforecast(fitORspec = fitEGARCH11_std, n.ahead = 8)
fcstEGARCH11_std
# Forecast EGARCH(1, 1) with t-distribution of innovations 8 periods ahead
<- rugarch::ugarchforecast(fitORspec = fitEGARCH11_sstd, n.ahead = 8) fcstEGARCH11_sstd
Store forecasted conditional standard deviation of EGARCH models into a data frame:
<- cbind(sigma(fcstEGARCH11_nd), sigma(fcstEGARCH11_std), sigma(fcstEGARCH11_sstd)) %>%
fcstEGARCH11 `colnames<-`(c("EGARCH11_ND", "EGARCH11_STD", "EGARCH11_SSTD"))
fcstEGARCH11
## EGARCH11_ND EGARCH11_STD EGARCH11_SSTD
## T+1 4.614321 4.553992 4.519062
## T+2 4.137426 4.169141 4.151596
## T+3 3.880849 3.931721 3.925516
## T+4 3.737751 3.781577 3.783040
## T+5 3.656243 3.685053 3.691820
## T+6 3.609240 3.622320 3.632802
## T+7 3.581939 3.581253 3.594355
## T+8 3.566013 3.554239 3.569195
Calculate evaluation metrics for each EGARCH model:
<- NULL
evalEGARCH
for (i in 1:3) {
<- mean((proxy$Volatility^2 - fcstEGARCH11[,i]^2)^2)
mse
<- sqrt(mse)
rmse
<- mean(abs(proxy$Volatility^2 - fcstEGARCH11[,i]^2))
mae
<- rbind(evalEGARCH, c(MSE = mse, RMSE = rmse, MAE= mae))
evalEGARCH
}
rownames(evalEGARCH) <- colnames(fcstEGARCH11)
evalEGARCH
## MSE RMSE MAE
## EGARCH11_ND 5413.873 73.57903 55.15565
## EGARCH11_STD 5425.776 73.65987 55.07382
## EGARCH11_SSTD 5439.500 73.75296 55.09558
Summarizing the evaluation metrics of all the models:
rbind(evalARCH, evalGARCH, evalGJR, evalEGARCH)
## MSE RMSE MAE
## ARCH2_ND 5830.373 76.35688 56.41940
## ARCH2_STD 5816.254 76.26437 56.29428
## ARCH2_SSTD 5831.107 76.36168 56.48219
## GARCH11_ND 5821.056 76.29584 56.63477
## GARCH11_STD 5793.129 76.11260 56.35842
## GARCH11_SSTD 5812.432 76.23930 56.52492
## GJR11_ND 5519.740 74.29495 55.03242
## GJR11_STD 5525.771 74.33553 54.99476
## GJR11_SSTD 5556.277 74.54044 55.25315
## EGARCH11_ND 5413.873 73.57903 55.15565
## EGARCH11_STD 5425.776 73.65987 55.07382
## EGARCH11_SSTD 5439.500 73.75296 55.09558
The EGARCH(1, 1) model with assumed normal distribution of innovations had the best MSE and RMSE, but it was surprising that the GJR-GARCH(1, 1) with assumed t-distribution of innovations had the best MAE. The EGARCH and GJR-GARCH models seem to be able to capture the higher volatility in the first 2 forecast periods better than the standard ARCH and GARCH models, likely due to the negative returns in those periods. However, all models were weak in capturing the volatility of returns past the 4th period in the forecast period using the basic forecast function. Since forecasting many periods ahead is bound to fail regardless of the type of models used, it is better to keep the forecast period short when using such models.
While the project intended to show how ARCH and GARCH (and its
variants) can be used in R to model volatility, I believe there is much
to be desired from the forecasting results in Section 4. Due to the
innovations (\(z_t\)) that cannot be
observed, the forecasts does not include the 8-period ahead returns. The
bootstrapping method of forecasting in the
rugarch
package may be a solution, which I
may tackle in another project and determine its feasibility and
accuracy. I am also curious about the performance of rolling estimation
since there can be structural changes to AAPL, such as business
environment, company prospects, new innovations, etc. that could affect
the estimation of the models.
Costa, F. J. M. (2017). Forecasting volatility using GARCH models. Retrieved from https://core.ac.uk/download/pdf/132797589.pdf
Schmidt, L. (2021). Volatility Forecasting Performance of GARCH Models: A Study on Nordic Indices During COVID-19. Retrieved from https://www.diva-portal.org/smash/get/diva2:1566342/FULLTEXT01.pdf
Wennström, A. (2014). Volatility Forecasting Performance: Evaluation of GARCH type volatility models on Nordic equity indices. Retrieved from https://www.math.kth.se/matstat/seminarier/reports/M-exjobb14/140616b.pdf
Wikipedia. (n.d.) Autoregressive conditional heteroskedasticity. Retrieved 15 July 2022, from https://en.wikipedia.org/wiki/Autoregressive_conditional_heteroskedasticity.
Zivot, E. (2013). Univariate GARCH. Presentation, University of Washington. Retrieved from https://faculty.washington.edu/ezivot/econ589/ch18-garch.pdf