The purpose of this project was to understand non-stationary processes in time series analysis and the difference between deterministic and stochastic trends in time series variables. It also sought to find out whether the treatment used to achieve stationarity (differencing, de-trending, or including a trend in the regression) affects the accuracy of out-of-sample forecasts for time series with a deterministic trend.
The project simulated random walk and ARIMA models, adding drift and/or trend terms to produce non-stationary time series data.
library(forecast) # For ARIMA modelling, forecasting and evaluation
library(urca) # For unit root and stationarity tests
In this section, the different non-stationary processes are introduced and simulated.
I began the analysis of non-stationary processes with the random walk (RW) model, as it lays the foundation for the models discussed later.
The RW model has the form \(y_t = y_{t-1} + \varepsilon_t\), where \(\varepsilon_t \sim N(0, \sigma^2)\). This means that the current value is based on its previous value and a stochastic white noise component.
The model can easily be rewritten in the form \(y_t = y_0 + \sum_{s=1}^t \varepsilon_s\), with \(E(y_t) = y_0\) and \(V(y_t) = t \sigma^2\). The mean depends only on \(y_0\); taking \(y_0 = 0\) without loss of generality, \(y_t\) is driven entirely by accumulated random shocks and cannot be accurately predicted. The variance depends on and increases with time.
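Since the shocks \(\varepsilon_s\) are independent with common variance \(\sigma^2\), the variance expression follows in one line:

\[V(y_t) = V \bigg( y_0 + \sum_{s=1}^t \varepsilon_s \bigg) = \sum_{s=1}^t V(\varepsilon_s) = t \sigma^2\]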
Using stats::arima.sim(), we can simulate the RW model as follows:
# Number of observations
N = 500
# Model to simulate
# For RW model, only need d = 1 in (p,d,q) to include unit root
M = list(order = c(0, 1, 0))
set.seed(123)
RW <- stats::arima.sim(model = M, n = N)
# Plot RW
plot(x = RW, main = "Random Walk", ylab = "y", lwd = 2)
Using urca::ur.df() and urca::ur.kpss(), the Augmented Dickey-Fuller (ADF) test for a unit root and the KPSS test for stationarity can be performed:
RW %>% urca::ur.df(type = "none", selectlags = "AIC") %>% summary()
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression none
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.6493 -0.5908 0.0341 0.6822 3.1660
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## z.lag.1 -0.003077 0.006604 -0.466 0.642
## z.diff.lag -0.050654 0.045091 -1.123 0.262
##
## Residual standard error: 0.9734 on 497 degrees of freedom
## Multiple R-squared: 0.003255, Adjusted R-squared: -0.0007556
## F-statistic: 0.8116 on 2 and 497 DF, p-value: 0.4447
##
##
## Value of test-statistic is: -0.4659
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau1 -2.58 -1.95 -1.62
# KPSS requires a deterministic component, either drift (mu) or trend (tau)
RW %>% urca::ur.kpss(type = "mu", lags = "short") %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 4.0069
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
The null hypothesis of the ADF test cannot be rejected (there is a unit root) and the null hypothesis of the KPSS test was rejected (not stationary).
The second non-stationary process is the random walk model with drift (or constant). When plotted, the RW model with drift looks like a trending series: a positive/negative constant produces an upward/downward trending RW series.
This model can be represented by \(y_t = \alpha + y_{t-1} + \varepsilon_t\), where \(\varepsilon_t \sim N(0, \sigma^2)\). This is similar to the RW model, with the current value also affected by a drift (or constant) term.
It is easily shown that the model can be rewritten as \(y_t = t \alpha + y_0 + \sum_{s = 1}^t \varepsilon_s\), with \(E(y_t) = t \alpha + y_0\) and \(V(y_t) = t \sigma^2\). With \(y_0 = 0\) WLOG, the mean increases with time. As with the RW model, the variance of the RW model with drift is time-dependent and increases with time.
Using stats::arima.sim(), we can simulate the RW model with drift as follows:
# Use same number of observations (N) and model (M) as RW model
# Include a non-zero mean value and optionally the standard deviation
# RW with positive constant
set.seed(234)
RW_posDrift <- stats::arima.sim(model = M, n = N, mean = 0.5, sd = 10)
# RW with negative constant
set.seed(234)
RW_negDrift <- stats::arima.sim(model = M, n = N, mean = -0.5, sd = 10)
# Plot RW with drift
plot(x = cbind(RW_posDrift, RW_negDrift),
main = "Random Walk with Drift", ylab = "y",
plot.type = "single",
col = c("black", "red"), lwd = 2)
legend(x = "topleft", legend = c("RW with positive drift", "RW with negative drift"),
col = c("black", "red"), lwd = 2)
Perform ADF and KPSS tests for the RW model with positive constant:
RW_posDrift %>% urca::ur.df(type = "drift", selectlags = "AIC") %>% summary()
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -30.3934 -6.1937 0.6209 6.4644 28.8047
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.657396 0.543495 1.210 0.227
## z.lag.1 -0.002492 0.004093 -0.609 0.543
## z.diff.lag -0.041102 0.044913 -0.915 0.361
##
## Residual standard error: 9.716 on 496 degrees of freedom
## Multiple R-squared: 0.002576, Adjusted R-squared: -0.001445
## F-statistic: 0.6406 on 2 and 496 DF, p-value: 0.5274
##
##
## Value of test-statistic is: -0.6088 0.7418
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau2 -3.43 -2.86 -2.57
## phi1 6.43 4.59 3.78
RW_posDrift %>% urca::ur.kpss(type = "mu", lags = "short") %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 6.9035
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
The null hypothesis of the ADF test cannot be rejected (there is a unit root) and the null hypothesis of the KPSS test was rejected (not stationary).
A simple model with a deterministic trend, represented by \(y_t = \delta t + \varepsilon_t\) where \(\varepsilon_t \sim N(0, \sigma^2)\), is different from stochastic trend processes such as the RW model with and without drift. This model is non-stationary because a time component is involved, with \(E(y_t) = \delta t\), so the mean is time-dependent. By removing the time trend (de-trending), the series becomes stationary since \(\varepsilon_t\) is a white noise process.
When plotted, values still vary up and down but return to the trend, never drifting away from it permanently.
To simulate a model with deterministic trend, a time trend needs to be created:
# Use same number of observations as RW model; observations begin from time 0
tt <- 0:N
set.seed(345)
# Set sd = 10 so that the ups and downs can be seen clearly
DT <- as.ts(tt + rnorm(n = N, mean = 0, sd = 10))
plot(x = DT, main = "Deterministic Trend", ylab = "y", lwd = 2)
Perform ADF and KPSS tests:
DT %>% urca::ur.df(type = "trend", selectlags = "AIC") %>% summary()
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression trend
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -27.801 -7.013 -0.048 6.966 27.995
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.37932 0.92604 -1.489 0.137
## z.lag.1 -1.08758 0.06430 -16.915 <2e-16 ***
## tt 1.09096 0.06460 16.888 <2e-16 ***
## z.diff.lag 0.06248 0.04492 1.391 0.165
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.1 on 495 degrees of freedom
## Multiple R-squared: 0.5133, Adjusted R-squared: 0.5104
## F-statistic: 174 on 3 and 495 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -16.9147 98.8768 143.0639
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau3 -3.96 -3.41 -3.12
## phi2 6.09 4.68 4.03
## phi3 8.27 6.25 5.34
DT %>% urca::ur.kpss(type = "tau", lags = "short") %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: tau with 5 lags.
##
## Value of test-statistic is: 0.0227
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.119 0.146 0.176 0.216
The null hypothesis of the ADF test was rejected (there is no unit root) and the null hypothesis of the KPSS test cannot be rejected (stationary with deterministic trend).
Lastly, we can have a non-stationary process combining both deterministic and stochastic trend terms. A simple example of this process is a random walk with drift and deterministic trend of the form \(y_t = \alpha_0 + \delta t + y_{t-1} + \varepsilon_t\), where \(\varepsilon_t \sim N(0, \sigma^2)\). Recursively expanding the lags of \(y_t\) in the equation yields \(y_t = t \alpha_0 + \delta (t + (t-1) + (t-2) + \dotsc + 3 + 2 + 1) + y_0 + \sum_{s=1}^t \varepsilon_s\). The mean is then \(E(y_t) = t \alpha_0 + \delta \bigg( \frac{t(t+1)}{2} \bigg) + y_0\) and the variance is \(V(y_t) = t \sigma^2\). Both the mean and the variance depend on time.
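As a check on the recursion above, this process can also be simulated directly from its defining equation; the parameter values below are illustrative, not the ones used elsewhere in this project:

```r
# Simulate y_t = alpha + delta * t + y_{t-1} + e_t directly from the recursion
set.seed(1)
alpha <- 0.5; delta <- 0.1; n <- 200
eps <- rnorm(n)
y_sim <- numeric(n)
y_sim[1] <- alpha + delta * 1 + eps[1]  # take y_0 = 0
for (s in 2:n) y_sim[s] <- alpha + delta * s + y_sim[s - 1] + eps[s]
plot(as.ts(y_sim), main = "RW with Drift and Trend (direct recursion)", ylab = "y", lwd = 2)
```

The quadratic growth of the mean predicted by \(E(y_t)\) shows up as an increasingly steep upward curve in the plot.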
Simulating a random walk with drift and deterministic trend requires a time trend and a RW model with drift:
# Create simulated series by adding the time trend (tt) to RW_posDrift
RW_posDrift_DT <- tt + RW_posDrift
plot(x = RW_posDrift_DT, main = "Random Walk with Drift and Deterministic Trend", ylab = "y", lwd = 2)
Perform ADF and KPSS tests:
RW_posDrift_DT %>% urca::ur.df(type = "trend", selectlags = "AIC") %>% summary()
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression trend
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -28.178 -6.354 0.361 6.425 27.404
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.546801 1.107479 -0.494 0.6217
## z.lag.1 -0.019132 0.008249 -2.319 0.0208 *
## tt 0.033256 0.013870 2.398 0.0169 *
## z.diff.lag -0.035204 0.044788 -0.786 0.4322
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.673 on 495 degrees of freedom
## Multiple R-squared: 0.01331, Adjusted R-squared: 0.007329
## F-statistic: 2.226 on 3 and 495 DF, p-value: 0.08435
##
##
## Value of test-statistic is: -2.3195 5.8417 2.879
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau3 -3.96 -3.41 -3.12
## phi2 6.09 4.68 4.03
## phi3 8.27 6.25 5.34
RW_posDrift_DT %>% urca::ur.kpss(type = "tau", lags = "short") %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: tau with 5 lags.
##
## Value of test-statistic is: 1.4357
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.119 0.146 0.176 0.216
The null hypothesis of the ADF test cannot be rejected (there is a unit root) and the null hypothesis of the KPSS test was rejected (not stationary).
Plot the RW, RW with positive drift, deterministic trend and RW with positive drift and deterministic trend:
plot(cbind(RW, RW_posDrift, DT, RW_posDrift_DT),
main = "Non-Stationary Processes", ylab = "y",
plot.type = "single",
col = c("black", "red", "blue", "green"), lwd = 2)
legend(x = "topleft",
legend = c("RW", "RW with positive drift", "Deterministic trend", "RW with positive drift and deterministic trend"),
col = c("black", "red", "blue", "green"), lwd = 2)
The ADF and KPSS tests were conducted for each model, and only the model with a deterministic trend was stationary after accounting for the time trend in the tests (so it is a trend stationary process). In the next section, I discuss how non-stationary processes can be made stationary by differencing, de-trending, or both.
In this section, I discussed how each of the non-stationary processes can be made stationary, in order of their presentation in Section 3.
Time series with a stochastic trend, such as the RW with and without drift, can be transformed into a stationary process by taking differences. Based on the RW equation in Section 3.1, subtracting \(y_{t-1}\) from both sides of the equation would yield:
\[y_t - y_{t-1} = y_{t-1} - y_{t-1} + \varepsilon_t \\ \Delta y_t = \varepsilon_t\]
Since \(\varepsilon_t\) is a white noise process, the RW model is stationary after taking the first difference. Such variables are integrated of order one or \(I(1)\). If a variable is not stationary after taking the first difference, but stationary after the second difference, it is an \(I(2)\) variable and so on.
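The order of integration can also be estimated programmatically: forecast::ndiffs() repeatedly applies a unit root or stationarity test (KPSS by default) to the differenced series until stationarity is achieved. A minimal sketch on a constructed I(1) series:

```r
library(forecast)  # already loaded above
set.seed(123)
rw <- as.ts(cumsum(rnorm(500)))  # an I(1) series by construction
# Estimate the number of differences needed for stationarity
ndiffs(rw, test = "kpss")        # expected to return 1 for an I(1) series
```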
Taking the first difference of RW and RW with drift:
RW_diff <- diff(x = RW, lag = 1, differences = 1)
RWdrift_diff <- diff(x = RW_posDrift, lag = 1, differences = 1)
plot(x = cbind(RW_diff, RWdrift_diff),
main = "First-Differenced Random Walk with and without Drift",
ylab = "Differenced y", plot.type = "single",
col = c("black", "red"), lwd = 2)
legend(x = "topleft",
legend = c("Differenced RW", "Differenced RW with drift"),
col = c("black", "red"), lwd = 2)
Perform ADF and KPSS tests on differenced RW:
RW_diff %>% urca::ur.df(type = "drift", selectlags = "AIC") %>% summary()
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.71472 -0.64227 -0.02945 0.63278 3.06546
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.04028 0.04366 0.923 0.357
## z.lag.1 -1.11728 0.06517 -17.143 <2e-16 ***
## z.diff.lag 0.05922 0.04487 1.320 0.187
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9731 on 495 degrees of freedom
## Multiple R-squared: 0.529, Adjusted R-squared: 0.5271
## F-statistic: 277.9 on 2 and 495 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -17.143 146.9431
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau2 -3.44 -2.87 -2.57
## phi1 6.47 4.61 3.79
RW_diff %>% urca::ur.kpss(type = "mu", lags = "short") %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 0.0918
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
The null hypothesis of the ADF test was rejected (no unit root) and the null hypothesis of the KPSS test cannot be rejected (stationary). The test results would be the same for the differenced RW with drift.
For time series variables with a deterministic trend, an assumption is needed about the form of the time trend: linear, exponential, or quadratic. Removing the time trend makes the variable stationary. Based on the model with a deterministic trend in Section 3.3, if we know the true data generating process, we can simply subtract the linear time trend from the equation, leaving a stochastic white noise process:
\[y_t - \delta t = \varepsilon_t\]
Another way is to regress \(y_t\) on a time trend and obtain the residuals (or subtract the fitted \(y_t\) from the actual \(y_t\)), which should be a white noise process.
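A sketch of the regression approach, using the DT and tt objects from above (trend_fit and DT_detrend_reg are illustrative names, not used elsewhere in this project):

```r
# De-trend by regressing the series on the time trend and taking residuals
trend_fit <- lm(DT ~ tt)
DT_detrend_reg <- as.ts(residuals(trend_fit))
```

The residuals differ slightly from subtracting the known trend (DT - tt) because the intercept and slope are estimated from the sample rather than fixed at their true values.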
Plot the time trend together with the series before and after removing the deterministic trend:
DT_detrend <- DT - tt
plot(x = cbind(DT, tt, DT_detrend),
main = "Before and After Detrending", ylab = "y",
plot.type = "single",
col = c("black", "red", "blue"), lwd = 2)
legend(x = "topleft", legend = c("Deterministic trend", "Time Trend", "Detrend"),
col = c("black", "red", "blue"), lwd = 2)
Perform ADF and KPSS tests on model after de-trending:
DT_detrend %>% urca::ur.df(type = "drift", selectlags = "AIC") %>% summary()
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -27.3901 -7.0606 -0.2946 6.9805 27.5018
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.37891 0.45290 -0.837 0.403
## z.lag.1 -1.08262 0.06414 -16.879 <2e-16 ***
## z.diff.lag 0.05986 0.04486 1.334 0.183
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.11 on 496 degrees of freedom
## Multiple R-squared: 0.5122, Adjusted R-squared: 0.5102
## F-statistic: 260.4 on 2 and 496 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -16.8786 142.4435
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau2 -3.43 -2.86 -2.57
## phi1 6.43 4.59 3.78
DT_detrend %>% urca::ur.kpss(type = "mu", lags = "short") %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 0.1653
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
The null hypothesis of the ADF test was rejected (no unit root) and the null hypothesis of the KPSS test cannot be rejected (stationary).
Suppose instead that we subtracted the previous value of \(y\) from its current value, where \(y_{t-1} = \delta (t-1) + \varepsilon_{t-1}\). This transforms the trend stationary equation into \(\Delta y_t = \delta + \Delta \varepsilon_t\). Although \(\varepsilon_t\) is a stochastic white noise process, the transformed error \(\Delta \varepsilon_t = \varepsilon_t - \varepsilon_{t-1}\) introduces a problem: the series is over-differenced, with a non-invertible moving average component, so the shock at time \(t\) affects not only the current value but also next period's value.
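The effect of over-differencing shows up in the autocorrelations: differencing pure white noise produces an MA(1) process with a unit root in the moving average polynomial, whose theoretical lag-1 autocorrelation is \(-0.5\). A small sketch:

```r
# Differencing white noise induces negative lag-1 autocorrelation (about -0.5)
set.seed(345)
wn <- rnorm(500)
acf(diff(wn), lag.max = 10, main = "ACF of Differenced White Noise")
```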
Take the first difference of the model with a deterministic trend:
DT_diff <- diff(x = DT, lag = 1, differences = 1)
# Remove first row of DT_detrend as differencing causes us to lose one observation
plot(x = cbind(DT_detrend[-1], DT_diff),
main = "De-trending vs Differencing of Deterministic Trend Process", ylab = "",
plot.type = "single",
col = c("black", "red"), lwd = c(2, 1), lty = c(1, 2))
legend(x = "topleft",
legend = c("De-trended", "Differenced"),
col = c("black", "red"), lwd = c(2, 1), lty = c(1, 2))
Perform ADF and KPSS tests for differenced variable with deterministic trend:
DT_diff %>% urca::ur.df(type = "drift", selectlags = "AIC") %>% summary()
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -37.106 -7.956 -0.125 8.239 41.565
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.94301 0.54353 3.575 0.000385 ***
## z.lag.1 -1.95836 0.07329 -26.720 < 2e-16 ***
## z.diff.lag 0.32165 0.04259 7.553 2.07e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 12.02 on 495 degrees of freedom
## Multiple R-squared: 0.7677, Adjusted R-squared: 0.7667
## F-statistic: 817.8 on 2 and 495 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -26.7196 356.9702
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau2 -3.44 -2.87 -2.57
## phi1 6.47 4.61 3.79
DT_diff %>% urca::ur.kpss(type = "mu", lags = "short") %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 0.0094
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
The ADF and KPSS tests show that the differenced model with a deterministic trend is still stationary and has no unit root, but the standard deviation of the differenced variable is greater than that of the de-trended variable.
For variables with both a stochastic and a deterministic trend, differencing is required to remove the stochastic trend, and regressing on a time trend to obtain the residuals removes the deterministic trend.
With the RW with drift and deterministic trend in Section 3.4, we can first regress \(y_t\) on \(t\) and a constant and obtain the residuals, which removes the deterministic trend, then take the first difference of the residuals to remove the stochastic trend, leaving behind the white noise process.
Transform RW with drift and deterministic trend into a stationary variable:
reg_y <- lm(formula = RW_posDrift_DT ~ tt)
delta_y <- diff(x = residuals(reg_y), lag = 1, differences = 1)
# Remove first observation from residuals as differencing causes one observation to be lost
plot(cbind(as.ts(residuals(reg_y)[-1]), delta_y),
main = "De-trending Only and With Differencing", ylab = "",
plot.type = "single",
col = c("black", "red"), lwd = 2)
legend(x = "top",
legend = c("After De-trending", "After De-trending and Differencing"),
col = c("black", "red"), lwd = 2)
Perform ADF and KPSS tests on de-trended and differenced variable:
delta_y %>% urca::ur.df(type = "drift", selectlags = "AIC") %>% summary()
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -30.2421 -6.1277 0.6785 6.5750 28.9570
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.168568 0.434222 -0.388 0.698
## z.lag.1 -1.049504 0.064636 -16.237 <2e-16 ***
## z.diff.lag 0.009195 0.044744 0.205 0.837
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.686 on 495 degrees of freedom
## Multiple R-squared: 0.5222, Adjusted R-squared: 0.5202
## F-statistic: 270.5 on 2 and 495 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -16.2371 131.8235
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau2 -3.44 -2.87 -2.57
## phi1 6.47 4.61 3.79
delta_y %>% urca::ur.kpss(type = "mu", lags = "short") %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 0.1056
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
The null hypothesis of the ADF test was rejected (no unit root) and the null hypothesis of the KPSS test cannot be rejected (stationary). In fact, the test statistics would be the same as the test for the differenced RW with positive drift, since the underlying data generating process is simply an addition of a time trend to the RW with positive drift.
In Section 4, methods to transform non-stationary processes into stationary ones were shown. In this section, I use a slightly more complex model, an ARIMA with both deterministic and stochastic trends, to see how the methods for achieving stationarity affect forecasting accuracy.
I simulated a simple ARIMA(2, 1, 1) model with a quadratic time trend:
# Number of observations
N = 300
# Model to simulate, need to include coefficients of AR and MA terms
M = list(order = c(2, 1, 1), ar = c(1.02, -0.43), ma = -0.41)
set.seed(456)
stoch_proc <- stats::arima.sim(model = M, n = N, mean = 0.23, sd = 3)
t <- as.ts(0:N)
y <- stoch_proc + 1.39 * t - 0.0027 * t^2
plot(y, main = "Simulated ARIMA(2,1,1) with Quadratic Time Trend", lwd = 2)
y %>% urca::ur.df(type = "trend", selectlags = "AIC") %>% summary()
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression trend
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.8261 -2.1125 -0.0418 2.1212 7.0733
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.885355 0.531138 3.550 0.000449 ***
## z.lag.1 -0.029787 0.009481 -3.142 0.001851 **
## tt 0.025825 0.008823 2.927 0.003688 **
## z.diff.lag 0.525837 0.048483 10.846 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.953 on 295 degrees of freedom
## Multiple R-squared: 0.3004, Adjusted R-squared: 0.2933
## F-statistic: 42.23 on 3 and 295 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -3.1416 5.6093 5.1115
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau3 -3.98 -3.42 -3.13
## phi2 6.15 4.71 4.05
## phi3 8.34 6.30 5.36
y %>% urca::ur.kpss(type = "tau", lags = "short") %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: tau with 5 lags.
##
## Value of test-statistic is: 0.5585
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.119 0.146 0.176 0.216
The null hypothesis for the ADF test cannot be rejected (there is unit root) and the null hypothesis for the KPSS test was rejected (not stationary).
I estimated the model as though the true data generating process were not known. forecast::auto.arima() allows level data to be input, with differencing done internally by the function. The only decisions left are whether a trend is required and whether differencing is still needed afterwards.
Firstly, I split the data into training and testing sets.
train_y <- as.ts(y[0:281])
train_t <- as.ts(t[0:281])
test_y <- ts(data = y[282:301], start = 282)
test_t <- ts(data = t[282:301], start = 282)
Secondly, determine if fitting a trend is sufficient to obtain residuals that follow a white noise process.
linear_trend <- lm(train_y ~ train_t)
quad_trend <- lm(train_y ~ train_t + I(train_t^2))
plot(cbind(train_y, fitted(linear_trend), fitted(quad_trend)),
main = "Actual Training Data and Fitted Trend", ylab = "y",
plot.type = "single",
col = c("black", "red", "blue"), lwd = 2)
legend(x = "topleft",
legend = c("Actual", "Linear trend", "Quadratic trend"),
col = c("black", "red", "blue"), lwd = 2)
We can see that the quadratic trend seems to fit the data better. The residuals from the quad_trend model are plotted, along with the residuals of linear_trend for comparison:
plot(as.ts(cbind(residuals(quad_trend), residuals(linear_trend))),
main = "Residuals from Linear Trend and Quadratic Trend", ylab = "",
plot.type = "single",
col = c("black", "red"), lwd = 2)
legend(x = "topleft",
legend = c("Residuals quad_trend", "Residuals linear_trend"),
col = c("black", "red"), lwd = 2)
Perform ADF and KPSS tests on the residuals of quad_trend:
residuals(quad_trend) %>% urca::ur.df(type = "drift", selectlags = "AIC") %>% summary()
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.7222 -2.1701 -0.0496 2.1245 7.4842
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.003176 0.177263 0.018 0.985718
## z.lag.1 -0.040382 0.011518 -3.506 0.000531 ***
## z.diff.lag 0.544500 0.049788 10.936 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.96 on 276 degrees of freedom
## Multiple R-squared: 0.3121, Adjusted R-squared: 0.3072
## F-statistic: 62.62 on 2 and 276 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -3.5058 6.1455
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau2 -3.44 -2.87 -2.57
## phi1 6.47 4.61 3.79
residuals(quad_trend) %>% urca::ur.kpss(type = "mu", lags = "short") %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 0.4081
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
The null hypothesis of the ADF test was rejected (there is no unit root) and the null hypothesis of the KPSS test cannot be rejected (stationary). However, the residuals do not look like a white noise process, thus differencing may be required after de-trending.
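One way to check this more formally (assuming the quad_trend object fitted above) is a Ljung-Box test on the residuals; a small p-value would indicate remaining autocorrelation, consistent with differencing still being required after de-trending:

```r
# Ljung-Box test for autocorrelation in the de-trended residuals
Box.test(residuals(quad_trend), lag = 10, type = "Ljung-Box")
```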
Estimate an ARIMA model using only differencing of the level data:
model1 <- forecast::auto.arima(y = train_y, ic = "aic",
                               seasonal = FALSE, stepwise = FALSE, approximation = FALSE)
model1
## Series: train_y
## ARIMA(3,1,2) with drift
##
## Coefficients:
## ar1 ar2 ar3 ma1 ma2 drift
## 1.6858 -1.2848 0.2715 -1.1324 0.6023 1.0153
## s.e. 0.2430 0.3076 0.1577 0.2279 0.1429 0.2441
##
## sigma^2 = 8.272: log likelihood = -690.49
## AIC=1394.99 AICc=1395.4 BIC=1420.43
An ARIMA(3,1,2) model with drift was estimated when the training data was simply differenced.
Estimate an ARIMA model, adding quadratic trend as external regressor:
model2 <- forecast::auto.arima(y = train_y, ic = "aic",
                               xreg = cbind(train_t, train_t^2),
                               seasonal = FALSE, stepwise = FALSE, approximation = FALSE)
model2
## Series: train_y
## Regression with ARIMA(3,0,2) errors
##
## Coefficients:
## ar1 ar2 ar3 ma1 ma2 train_t train_t^2
## 2.2379 -1.9398 0.6895 -0.7205 0.2962 1.6258 -0.0022
## s.e. 0.0915 0.1566 0.0784 0.1174 0.1074 0.1548 0.0006
##
## sigma^2 = 8.173: log likelihood = -692.42
## AIC=1400.84 AICc=1401.37 BIC=1429.95
By adding the time trend, an ARIMA(3,0,2) model was estimated, and no differencing was done since the data was stationary after de-trending, as the ADF and KPSS tests showed earlier.
Estimate an ARIMA model, adding quadratic trend as external regressor and explicitly include differencing:
model3 <- forecast::auto.arima(y = train_y, ic = "aic", d = 1,
                               xreg = cbind(train_t, train_t^2),
                               seasonal = FALSE, stepwise = FALSE, approximation = FALSE)
model3
## Series: train_y
## Regression with ARIMA(2,1,3) errors
##
## Coefficients:
## ar1 ar2 ma1 ma2 ma3 train_t train_t^2
## 1.3340 -0.8226 -0.7864 0.3211 0.1282 1.6218 -0.0022
## s.e. 0.0745 0.0732 0.0983 0.0894 0.0808 0.4611 0.0014
##
## sigma^2 = 8.237: log likelihood = -689.38
## AIC=1394.75 AICc=1395.28 BIC=1423.83
By including differencing in the model2 specification, an ARIMA(2,1,3) model was estimated.
I forecasted the next 20 periods with the three models estimated in Section 5.2:
f_model1 <- forecast::forecast(object = model1, h = 20, level = 95)
f_model2 <- forecast::forecast(object = model2, h = 20, level = 95, xreg = cbind(test_t, test_t^2))
f_model3 <- forecast::forecast(object = model3, h = 20, level = 95, xreg = cbind(test_t, test_t^2))
Plot f_model1 against the actual values and calculate the forecast accuracy measures:
plot(cbind(y, f_model1$mean, f_model1$lower, f_model1$upper),
main = "Actual and Forecasted Values of model1", ylab = "y",
plot.type = "single",
col = c("black", "red", "gray", "gray"), lwd = 2)
legend(x = "topleft",
legend = c("Actual", "Forecasted", "Lower and Upper Interval"),
col = c("black", "red", "gray"), lwd = 2)
data.frame(forecast::accuracy(object = f_model1, x = test_y))
Plot f_model2 against the actual values and calculate the forecast accuracy measures:
plot(cbind(y, f_model2$mean, f_model2$lower, f_model2$upper),
main = "Actual and Forecasted Values of model2", ylab = "y",
plot.type = "single",
col = c("black", "red", "gray", "gray"), lwd = 2)
legend(x = "topleft",
legend = c("Actual", "Forecasted", "Lower and Upper Interval"),
col = c("black", "red", "gray"), lwd = 2)
data.frame(forecast::accuracy(object = f_model2, x = test_y))
Plot f_model3 against the actual values and calculate the forecast accuracy measures:
plot(cbind(y, f_model3$mean, f_model3$lower, f_model3$upper),
main = "Actual and Forecasted Values of model3", ylab = "y",
plot.type = "single",
col = c("black", "red", "gray", "gray"), lwd = 2)
legend(x = "topleft",
legend = c("Actual", "Forecasted", "Lower and Upper Interval"),
col = c("black", "red", "gray"), lwd = 2)
data.frame(forecast::accuracy(object = f_model3, x = test_y))
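The test-set RMSE of the three forecasts can also be collected side by side for comparison; a sketch using the forecast objects above:

```r
# Extract the test-set RMSE of each forecast for a direct comparison
sapply(list(model1 = f_model1, model2 = f_model2, model3 = f_model3),
       function(f) forecast::accuracy(object = f, x = test_y)["Test set", "RMSE"])
```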
Based on the accuracy measures (from the Test set rows), model3 had better forecasting performance than model1 and model2. Including the trend in model estimation improved forecasts of variables with a deterministic trend. However, estimating the trend can be difficult at times. For example, if the fitted quadratic trend predicts that an inflection point has occurred even though the variable will continue on an upward trend for some future period, the forecasts would be impacted and would probably show a downward trend.
This project used simulations of random walk models to determine and visualize the difference between deterministic and stochastic trends. An ARIMA model with deterministic and stochastic trends was simulated to see if including the trend in model estimation could improve forecasts. These are simple examples; real-world data is much more complicated, with relationships between variables changing due to exogenous factors or shocks. In such cases, periodic re-estimation of the model with rolling windows may improve model estimation and forecasts, which I will test in another project.