The purpose of this project was to understand non-stationary processes in time series analysis and the difference between deterministic and stochastic trends in time series variables. It also sought to find out whether the treatment used to achieve stationarity (differencing, de-trending, or including a trend in the regression) affects the accuracy of out-of-sample forecasts for time series with a deterministic trend.
The project simulated random walk and ARIMA models, adding drift and/or trend terms to produce non-stationary time series data.
library(forecast) # For ARIMA modelling, forecasting and evaluation
library(urca) # For unit root and stationarity tests
In this section, the different non-stationary processes are introduced and simulated.
I began the analysis of non-stationary processes with the random walk (RW) model, as it lays the foundation for the models discussed later.
The RW model has the form \(y_t = y_{t-1} + \varepsilon_t\), where \(\varepsilon_t \sim N(0, \sigma^2)\). This means that the current value is based on its previous value and a stochastic white noise component.
The model can easily be rewritten in the form \(y_t = y_0 + \sum_{s=1}^t \varepsilon_s\), with \(E(y_t) = y_0\) and \(V(y_t) = t \sigma^2\). The mean depends only on \(y_0\); taking \(y_0 = 0\) without loss of generality, \(y_t\) is driven entirely by accumulated random shocks and cannot be accurately predicted. The variance depends on and increases with time.
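Since the shocks \(\varepsilon_s\) are independent with common variance \(\sigma^2\), the variance expression follows in one line:

\[V(y_t) = V \bigg( y_0 + \sum_{s=1}^t \varepsilon_s \bigg) = \sum_{s=1}^t V(\varepsilon_s) = t \sigma^2\]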
Using stats::arima.sim(), we can simulate the RW model as follows:
# Number of observations
N = 500
# Model to simulate
# For RW model, only need d = 1 in (p,d,q) to include unit root
M = list(order = c(0, 1, 0))
set.seed(123)
RW <- stats::arima.sim(model = M, n = N)
# Plot RW
plot(x = RW, main = "Random Walk", ylab = "y", lwd = 2)
Using urca::ur.df() and urca::ur.kpss(), the Augmented Dickey-Fuller (ADF) test for a unit root and the KPSS test for stationarity can be performed:
RW %>% urca::ur.df(type = "none", selectlags = "AIC") %>% summary()
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression none
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.6493 -0.5908 0.0341 0.6822 3.1660
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## z.lag.1 -0.003077 0.006604 -0.466 0.642
## z.diff.lag -0.050654 0.045091 -1.123 0.262
##
## Residual standard error: 0.9734 on 497 degrees of freedom
## Multiple R-squared: 0.003255, Adjusted R-squared: -0.0007556
## F-statistic: 0.8116 on 2 and 497 DF, p-value: 0.4447
##
##
## Value of test-statistic is: -0.4659
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau1 -2.58 -1.95 -1.62
# KPSS requires a deterministic component, either drift (mu) or trend (tau)
RW %>% urca::ur.kpss(type = "mu", lags = "short") %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 4.0069
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
The null hypothesis of the ADF test cannot be rejected (there is a unit root) and the null hypothesis of the KPSS test was rejected (not stationary).
The second non-stationary process is the random walk model with drift (or constant). When plotted, the RW model with drift looks like a trending series: a positive/negative constant produces an upward/downward trending RW series.
This model can be represented by \(y_t = \alpha + y_{t-1} + \varepsilon_t\), where \(\varepsilon_t \sim N(0, \sigma^2)\). This is similar to the RW model, with the current value also affected by a drift (or constant) term.
It is easily shown that the model can be rewritten as \(y_t = t \alpha + y_0 + \sum_{s = 1}^t \varepsilon_s\), with \(E(y_t) = t \alpha + y_0\) and \(V(y_t) = t \sigma^2\). With \(y_0 = 0\) WLOG, the mean increases with time. As with the RW model, the variance of the RW model with drift is time-dependent and increases with time.
Using stats::arima.sim(), we can simulate the RW model with drift as follows:
# Use same number of observations (N) and model (M) as RW model
# Include a non-zero mean value and optionally the standard deviation
# RW with positive constant
set.seed(234)
RW_posDrift <- stats::arima.sim(model = M, n = N, mean = 0.5, sd = 10)
# RW with negative constant
set.seed(234)
RW_negDrift <- stats::arima.sim(model = M, n = N, mean = -0.5, sd = 10)
# Plot RW with drift
plot(x = cbind(RW_posDrift, RW_negDrift),
main = "Random Walk with Drift", ylab = "y",
plot.type = "single",
col = c("black", "red"), lwd = 2)
legend(x = "topleft", legend = c("RW with positive drift", "RW with negative drift"),
col = c("black", "red"), lwd = 2)
Perform ADF and KPSS tests for the RW model with positive constant:
RW_posDrift %>% urca::ur.df(type = "drift", selectlags = "AIC") %>% summary()
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -30.3934 -6.1937 0.6209 6.4644 28.8047
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.657396 0.543495 1.210 0.227
## z.lag.1 -0.002492 0.004093 -0.609 0.543
## z.diff.lag -0.041102 0.044913 -0.915 0.361
##
## Residual standard error: 9.716 on 496 degrees of freedom
## Multiple R-squared: 0.002576, Adjusted R-squared: -0.001445
## F-statistic: 0.6406 on 2 and 496 DF, p-value: 0.5274
##
##
## Value of test-statistic is: -0.6088 0.7418
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau2 -3.43 -2.86 -2.57
## phi1 6.43 4.59 3.78
RW_posDrift %>% urca::ur.kpss(type = "mu", lags = "short") %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 6.9035
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
The null hypothesis of the ADF test cannot be rejected (there is a unit root) and the null hypothesis of the KPSS test was rejected (not stationary).
A simple model with a deterministic trend, represented by \(y_t = \delta t + \varepsilon_t\) where \(\varepsilon_t \sim N(0, \sigma^2)\), is different from stochastic trend processes such as the RW model with and without drift. This model is non-stationary because a time component is involved, with \(E(y_t) = \delta t\), so the mean is time-dependent. By removing the time trend (de-trending), the series becomes stationary since \(\varepsilon_t\) is a white noise process.
When plotted, values still vary up and down but return to the trend, never drifting away from it permanently.
To simulate a model with deterministic trend, a time trend needs to be created:
# Use same number of observations as RW model; observations begin from time 0
tt <- 0:N
set.seed(345)
# Set sd = 10 so that the ups and downs can be seen clearly
DT <- as.ts(tt + rnorm(n = N, mean = 0, sd = 10))
plot(x = DT, main = "Deterministic Trend", ylab = "y", lwd = 2)
Perform ADF and KPSS tests:
DT %>% urca::ur.df(type = "trend", selectlags = "AIC") %>% summary()
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression trend
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -27.801 -7.013 -0.048 6.966 27.995
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.37932 0.92604 -1.489 0.137
## z.lag.1 -1.08758 0.06430 -16.915 <2e-16 ***
## tt 1.09096 0.06460 16.888 <2e-16 ***
## z.diff.lag 0.06248 0.04492 1.391 0.165
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.1 on 495 degrees of freedom
## Multiple R-squared: 0.5133, Adjusted R-squared: 0.5104
## F-statistic: 174 on 3 and 495 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -16.9147 98.8768 143.0639
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau3 -3.96 -3.41 -3.12
## phi2 6.09 4.68 4.03
## phi3 8.27 6.25 5.34
DT %>% urca::ur.kpss(type = "tau", lags = "short") %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: tau with 5 lags.
##
## Value of test-statistic is: 0.0227
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.119 0.146 0.176 0.216
The null hypothesis of the ADF test was rejected (there is no unit root) and the null hypothesis of the KPSS test cannot be rejected (stationary with deterministic trend).
Lastly, we can have a non-stationary process combining both deterministic and stochastic trend terms. A simple example of this process is a random walk with drift and deterministic trend of the form \(y_t = \alpha_0 + \delta t + y_{t-1} + \varepsilon_t\), where \(\varepsilon_t \sim N(0, \sigma^2)\). Recursively expanding the lags of \(y_t\) in the equation yields \(y_t = t \alpha_0 + \delta (t + (t-1) + (t-2) + \dotsc + 3 + 2 + 1) + y_0 + \sum_{s=1}^t \varepsilon_s\). The mean is then \(E(y_t) = t \alpha_0 + \delta \bigg( \frac{t(t+1)}{2} \bigg) + y_0\) and the variance is \(V(y_t) = t \sigma^2\). Both the mean and the variance depend on time.
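As a check on the recursion above, this process can also be simulated directly from its defining equation; the parameter values below are illustrative, not the ones used elsewhere in this project:

```r
# Simulate y_t = alpha + delta * t + y_{t-1} + e_t directly from the recursion
set.seed(1)
alpha <- 0.5; delta <- 0.1; n <- 200
eps <- rnorm(n)
y_sim <- numeric(n)
y_sim[1] <- alpha + delta * 1 + eps[1]  # take y_0 = 0
for (s in 2:n) y_sim[s] <- alpha + delta * s + y_sim[s - 1] + eps[s]
plot(as.ts(y_sim), main = "RW with Drift and Trend (direct recursion)", ylab = "y", lwd = 2)
```

The quadratic growth of the mean predicted by \(E(y_t)\) shows up as an increasingly steep upward curve in the plot.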
Simulating a random walk with drift and deterministic trend requires a time trend and a RW model with drift:
# Create simulated series by adding the time trend (tt) to RW_posDrift
RW_posDrift_DT <- tt + RW_posDrift
plot(x = RW_posDrift_DT, main = "Random Walk with Drift and Deterministic Trend", ylab = "y", lwd = 2)
Perform ADF and KPSS tests:
RW_posDrift_DT %>% urca::ur.df(type = "trend", selectlags = "AIC") %>% summary()
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression trend
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -28.178 -6.354 0.361 6.425 27.404
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.546801 1.107479 -0.494 0.6217
## z.lag.1 -0.019132 0.008249 -2.319 0.0208 *
## tt 0.033256 0.013870 2.398 0.0169 *
## z.diff.lag -0.035204 0.044788 -0.786 0.4322
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.673 on 495 degrees of freedom
## Multiple R-squared: 0.01331, Adjusted R-squared: 0.007329
## F-statistic: 2.226 on 3 and 495 DF, p-value: 0.08435
##
##
## Value of test-statistic is: -2.3195 5.8417 2.879
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau3 -3.96 -3.41 -3.12
## phi2 6.09 4.68 4.03
## phi3 8.27 6.25 5.34
RW_posDrift_DT %>% urca::ur.kpss(type = "tau", lags = "short") %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: tau with 5 lags.
##
## Value of test-statistic is: 1.4357
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.119 0.146 0.176 0.216
The null hypothesis of the ADF test cannot be rejected (there is a unit root) and the null hypothesis of the KPSS test was rejected (not stationary).
Plot the RW, RW with positive drift, deterministic trend and RW with positive drift and deterministic trend:
plot(cbind(RW, RW_posDrift, DT, RW_posDrift_DT),
main = "Non-Stationary Processes", ylab = "y",
plot.type = "single",
col = c("black", "red", "blue", "green"), lwd = 2)
legend(x = "topleft",
legend = c("RW", "RW with positive drift", "Deterministic trend", "RW with positive drift and deterministic trend"),
col = c("black", "red", "blue", "green"), lwd = 2)
The ADF and KPSS tests were conducted for each model, and only the model with a deterministic trend was stationary after accounting for the time trend in the tests (so it is a trend stationary process). In the next section, I discuss how non-stationary processes can be made stationary by differencing, de-trending, or both.
In this section, I discussed how each of the non-stationary processes can be made stationary, in order of their presentation in Section 3.
Time series with a stochastic trend, such as the RW with and without drift, can be transformed into a stationary process by taking differences. Based on the RW equation in Section 3.1, subtracting \(y_{t-1}\) from both sides of the equation would yield:
\[y_t - y_{t-1} = y_{t-1} - y_{t-1} + \varepsilon_t \\ \Delta y_t = \varepsilon_t\]
Since \(\varepsilon_t\) is a white noise process, the RW model is stationary after taking the first difference. Such variables are integrated of order one or \(I(1)\). If a variable is not stationary after taking the first difference, but stationary after the second difference, it is an \(I(2)\) variable and so on.
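The order of integration can also be estimated programmatically: forecast::ndiffs() repeatedly applies a unit root or stationarity test (KPSS by default) to the differenced series until stationarity is achieved. A minimal sketch on a constructed I(1) series:

```r
library(forecast)  # already loaded above
set.seed(123)
rw <- as.ts(cumsum(rnorm(500)))  # an I(1) series by construction
# Estimate the number of differences needed for stationarity
ndiffs(rw, test = "kpss")        # expected to return 1 for an I(1) series
```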
Taking the first difference of RW and RW with drift:
RW_diff <- diff(x = RW, lag = 1, differences = 1)
RWdrift_diff <- diff(x = RW_posDrift, lag = 1, differences = 1)
plot(x = cbind(RW_diff, RWdrift_diff),
main = "First-Differenced Random Walk with and without Drift",
ylab = "Differenced y", plot.type = "single",
col = c("black", "red"), lwd = 2)
legend(x = "topleft",
legend = c("Differenced RW", "Differenced RW with drift"),
col = c("black", "red"), lwd = 2)
Perform ADF and KPSS tests on differenced RW:
RW_diff %>% urca::ur.df(type = "drift", selectlags = "AIC") %>% summary()
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.71472 -0.64227 -0.02945 0.63278 3.06546
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.04028 0.04366 0.923 0.357
## z.lag.1 -1.11728 0.06517 -17.143 <2e-16 ***
## z.diff.lag 0.05922 0.04487 1.320 0.187
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9731 on 495 degrees of freedom
## Multiple R-squared: 0.529, Adjusted R-squared: 0.5271
## F-statistic: 277.9 on 2 and 495 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -17.143 146.9431
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau2 -3.44 -2.87 -2.57
## phi1 6.47 4.61 3.79
RW_diff %>% urca::ur.kpss(type = "mu", lags = "short") %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 0.0918
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
The null hypothesis of the ADF test was rejected (no unit root) and the null hypothesis of the KPSS test cannot be rejected (stationary). The test results would be the same for the differenced RW with drift.
For time series variables with a deterministic trend, an assumption is needed about the form of the time trend: linear, exponential, or quadratic. Removing the time trend makes the variable stationary. Based on the model with a deterministic trend in Section 3.3, if we know the true data generating process, we can simply subtract the linear time trend from the equation, leaving a stochastic white noise process:
\[y_t - \delta t = \varepsilon_t\]
Another way is to regress \(y_t\) on a time trend and obtain the residuals (or subtract the fitted \(y_t\) from the actual \(y_t\)), which should be a white noise process.
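A sketch of the regression approach, using the DT and tt objects from above (trend_fit and DT_detrend_reg are illustrative names, not used elsewhere in this project):

```r
# De-trend by regressing the series on the time trend and taking residuals
trend_fit <- lm(DT ~ tt)
DT_detrend_reg <- as.ts(residuals(trend_fit))
```

The residuals differ slightly from subtracting the known trend (DT - tt) because the intercept and slope are estimated from the sample rather than fixed at their true values.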
Plot the time trend together with the series before and after removing the deterministic trend:
DT_detrend <- DT - tt
plot(x = cbind(DT, tt, DT_detrend),
main = "Before and After Detrending", ylab = "y",
plot.type = "single",
col = c("black", "red", "blue"), lwd = 2)
legend(x = "topleft", legend = c("Deterministic trend", "Time Trend", "Detrend"),
col = c("black", "red", "blue"), lwd = 2)
Perform ADF and KPSS tests on model after de-trending:
DT_detrend %>% urca::ur.df(type = "drift", selectlags = "AIC") %>% summary()
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -27.3901 -7.0606 -0.2946 6.9805 27.5018
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.37891 0.45290 -0.837 0.403
## z.lag.1 -1.08262 0.06414 -16.879 <2e-16 ***
## z.diff.lag 0.05986 0.04486 1.334 0.183
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.11 on 496 degrees of freedom
## Multiple R-squared: 0.5122, Adjusted R-squared: 0.5102
## F-statistic: 260.4 on 2 and 496 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -16.8786 142.4435
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau2 -3.43 -2.86 -2.57
## phi1 6.43 4.59 3.78
DT_detrend %>% urca::ur.kpss(type = "mu", lags = "short") %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 0.1653
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
The null hypothesis of the ADF test was rejected (no unit root) and the null hypothesis of the KPSS test cannot be rejected (stationary).
Suppose instead that we subtracted the previous value of \(y\) from its current value, where \(y_{t-1} = \delta (t-1) + \varepsilon_{t-1}\). This transforms the trend stationary equation into \(\Delta y_t = \delta + \Delta \varepsilon_t\). Although \(\varepsilon_t\) is a stochastic white noise process, the transformed error \(\Delta \varepsilon_t = \varepsilon_t - \varepsilon_{t-1}\) introduces a problem: the series is over-differenced, with a non-invertible moving average component, so the shock at time \(t\) affects not only the current value but also next period's value.
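The effect of over-differencing shows up in the autocorrelations: differencing pure white noise produces an MA(1) process with a unit root in the moving average polynomial, whose theoretical lag-1 autocorrelation is \(-0.5\). A small sketch:

```r
# Differencing white noise induces negative lag-1 autocorrelation (about -0.5)
set.seed(345)
wn <- rnorm(500)
acf(diff(wn), lag.max = 10, main = "ACF of Differenced White Noise")
```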
Take the first difference of the model with a deterministic trend:
DT_diff <- diff(x = DT, lag = 1, differences = 1)
# Remove first row of DT_detrend as differencing causes us to lose one observation
plot(x = cbind(DT_detrend[-1], DT_diff),
main = "De-trending vs Differencing of Deterministic Trend Process", ylab = "",
plot.type = "single",
col = c("black", "red"), lwd = c(2, 1), lty = c(1, 2))
legend(x = "topleft",
legend = c("De-trended", "Differenced"),
col = c("black", "red"), lwd = c(2, 1), lty = c(1, 2))
Perform ADF and KPSS tests for differenced variable with deterministic trend:
DT_diff %>% urca::ur.df(type = "drift", selectlags = "AIC") %>% summary()
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -37.106 -7.956 -0.125 8.239 41.565
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.94301 0.54353 3.575 0.000385 ***
## z.lag.1 -1.95836 0.07329 -26.720 < 2e-16 ***
## z.diff.lag 0.32165 0.04259 7.553 2.07e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 12.02 on 495 degrees of freedom
## Multiple R-squared: 0.7677, Adjusted R-squared: 0.7667
## F-statistic: 817.8 on 2 and 495 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -26.7196 356.9702
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau2 -3.44 -2.87 -2.57
## phi1 6.47 4.61 3.79
DT_diff %>% urca::ur.kpss(type = "mu", lags = "short") %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 0.0094
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
The ADF and KPSS tests show that the differenced model with a deterministic trend is still stationary and has no unit root, but the standard deviation of the differenced variable is greater than that of the de-trended variable.
For variables with both a stochastic and a deterministic trend, differencing is required to remove the stochastic trend, and regressing on a time trend to obtain the residuals removes the deterministic trend.
With the RW with drift and deterministic trend in Section 3.4, we can first regress \(y_t\) on \(t\) and a constant and obtain the residuals, which removes the deterministic trend, then take the first difference of the residuals to remove the stochastic trend, leaving behind the white noise process.
Transform RW with drift and deterministic trend into a stationary variable:
reg_y <- lm(formula = RW_posDrift_DT ~ tt)
delta_y <- diff(x = residuals(reg_y), lag = 1, differences = 1)
# Remove first observation from residuals as differencing causes one observation to be lost
plot(cbind(as.ts(residuals(reg_y)[-1]), delta_y),
main = "De-trending Only and With Differencing", ylab = "",
plot.type = "single",
col = c("black", "red"), lwd = 2)
legend(x = "top",
legend = c("After De-trending", "After De-trending and Differencing"),
col = c("black", "red"), lwd = 2)
Perform ADF and KPSS tests on de-trended and differenced variable:
delta_y %>% urca::ur.df(type = "drift", selectlags = "AIC") %>% summary()
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -30.2421 -6.1277 0.6785 6.5750 28.9570
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.168568 0.434222 -0.388 0.698
## z.lag.1 -1.049504 0.064636 -16.237 <2e-16 ***
## z.diff.lag 0.009195 0.044744 0.205 0.837
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.686 on 495 degrees of freedom
## Multiple R-squared: 0.5222, Adjusted R-squared: 0.5202
## F-statistic: 270.5 on 2 and 495 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -16.2371 131.8235
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau2 -3.44 -2.87 -2.57
## phi1 6.47 4.61 3.79
delta_y %>% urca::ur.kpss(type = "mu", lags = "short") %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 0.1056
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
The null hypothesis of the ADF test was rejected (no unit root) and the null hypothesis of the KPSS test cannot be rejected (stationary). In fact, the test statistics would be the same as the test for the differenced RW with positive drift, since the underlying data generating process is simply an addition of a time trend to the RW with positive drift.
In Section 4, methods to transform non-stationary processes into stationary ones were shown. In this section, I use a slightly more complex model, an ARIMA with both deterministic and stochastic trends, to see how the methods for achieving stationarity affect forecasting accuracy.
I simulated a simple ARIMA(2, 1, 1) model with a quadratic time trend:
# Number of observations
N = 300
# Model to simulate, need to include coefficients of AR and MA terms
M = list(order = c(2, 1, 1), ar = c(1.02, -0.43), ma = -0.41)
set.seed(456)
stoch_proc <- stats::arima.sim(model = M, n = N, mean = 0.23, sd = 3)
t <- as.ts(0:N)
y <- stoch_proc + 1.39 * t - 0.0027 * t^2
plot(y, main = "Simulated ARIMA(2,1,1) with Quadratic Time Trend", lwd = 2)
y %>% urca::ur.df(type = "trend", selectlags = "AIC") %>% summary()
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression trend
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.8261 -2.1125 -0.0418 2.1212 7.0733
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.885355 0.531138 3.550 0.000449 ***
## z.lag.1 -0.029787 0.009481 -3.142 0.001851 **
## tt 0.025825 0.008823 2.927 0.003688 **
## z.diff.lag 0.525837 0.048483 10.846 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.953 on 295 degrees of freedom
## Multiple R-squared: 0.3004, Adjusted R-squared: 0.2933
## F-statistic: 42.23 on 3 and 295 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -3.1416 5.6093 5.1115
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau3 -3.98 -3.42 -3.13
## phi2 6.15 4.71 4.05
## phi3 8.34 6.30 5.36
y %>% urca::ur.kpss(type = "tau", lags = "short") %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: tau with 5 lags.
##
## Value of test-statistic is: 0.5585
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.119 0.146 0.176 0.216
The null hypothesis for the ADF test cannot be rejected (there is unit root) and the null hypothesis for the KPSS test was rejected (not stationary).
I estimated the model as though the true data generating process were not known. forecast::auto.arima() allows level data to be input, with differencing done internally by the function. The only decisions left are whether a trend is required and whether differencing is still needed afterwards.
Firstly, I split the data into training and testing sets.
train_y <- as.ts(y[0:281])
train_t <- as.ts(t[0:281])
test_y <- ts(data = y[282:301], start = 282)
test_t <- ts(data = t[282:301], start = 282)
Secondly, determine if fitting a trend is sufficient to obtain residuals that follow a white noise process.
linear_trend <- lm(train_y ~ train_t)
quad_trend <- lm(train_y ~ train_t + I(train_t^2))
plot(cbind(train_y, fitted(linear_trend), fitted(quad_trend)),
main = "Actual Training Data and Fitted Trend", ylab = "y",
plot.type = "single",
col = c("black", "red", "blue"), lwd = 2)
legend(x = "topleft",
legend = c("Actual", "Linear trend", "Quadratic trend"),
col = c("black", "red", "blue"), lwd = 2)
We can see that the quadratic trend seems to fit the data better. The residuals from the quad_trend model are plotted, along with the residuals of linear_trend for comparison:
plot(as.ts(cbind(residuals(quad_trend), residuals(linear_trend))),
main = "Residuals from Linear Trend and Quadratic Trend", ylab = "",
plot.type = "single",
col = c("black", "red"), lwd = 2)
legend(x = "topleft",
legend = c("Residuals quad_trend", "Residuals linear_trend"),
col = c("black", "red"), lwd = 2)
Perform ADF and KPSS tests on the residuals of quad_trend:
residuals(quad_trend) %>% urca::ur.df(type = "drift", selectlags = "AIC") %>% summary()
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.7222 -2.1701 -0.0496 2.1245 7.4842
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.003176 0.177263 0.018 0.985718
## z.lag.1 -0.040382 0.011518 -3.506 0.000531 ***
## z.diff.lag 0.544500 0.049788 10.936 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.96 on 276 degrees of freedom
## Multiple R-squared: 0.3121, Adjusted R-squared: 0.3072
## F-statistic: 62.62 on 2 and 276 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -3.5058 6.1455
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau2 -3.44 -2.87 -2.57
## phi1 6.47 4.61 3.79
residuals(quad_trend) %>% urca::ur.kpss(type = "mu", lags = "short") %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 0.4081
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
The null hypothesis of the ADF test was rejected (there is no unit root) and the null hypothesis of the KPSS test cannot be rejected (stationary). However, the residuals do not look like a white noise process, thus differencing may be required after de-trending.
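One way to check this more formally (assuming the quad_trend object fitted above) is a Ljung-Box test on the residuals; a small p-value would indicate remaining autocorrelation, consistent with differencing still being required after de-trending:

```r
# Ljung-Box test for autocorrelation in the de-trended residuals
Box.test(residuals(quad_trend), lag = 10, type = "Ljung-Box")
```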
Estimate an ARIMA model using only differencing of the level data:
model1 <- forecast::auto.arima(y = train_y, ic = "aic",
                               seasonal = FALSE, stepwise = FALSE, approximation = FALSE)
model1
## Series: train_y
## ARIMA(3,1,2) with drift
##
## Coefficients:
## ar1 ar2 ar3 ma1 ma2 drift
## 1.6858 -1.2848 0.2715 -1.1324 0.6023 1.0153
## s.e. 0.2430 0.3076 0.1577 0.2279 0.1429 0.2441
##
## sigma^2 = 8.272: log likelihood = -690.49
## AIC=1394.99 AICc=1395.4 BIC=1420.43
An ARIMA(3,1,2) model with drift was estimated when the training data was simply differenced.
Estimate an ARIMA model, adding quadratic trend as external regressor:
model2 <- forecast::auto.arima(y = train_y, ic = "aic",
                               xreg = cbind(train_t, train_t^2),
                               seasonal = FALSE, stepwise = FALSE, approximation = FALSE)
model2
## Series: train_y
## Regression with ARIMA(3,0,2) errors
##
## Coefficients:
## ar1 ar2 ar3 ma1 ma2 train_t train_t^2
## 2.2379 -1.9398 0.6895 -0.7205 0.2962 1.6258 -0.0022
## s.e. 0.0915 0.1566 0.0784 0.1174 0.1074 0.1548 0.0006
##
## sigma^2 = 8.173: log likelihood = -692.42
## AIC=1400.84 AICc=1401.37 BIC=1429.95
By adding the time trend, an ARIMA(3,0,2) model was estimated, and no differencing was done since the data was stationary after de-trending, as the ADF and KPSS tests showed earlier.
Estimate an ARIMA model, adding quadratic trend as external regressor and explicitly include differencing:
model3 <- forecast::auto.arima(y = train_y, ic = "aic", d = 1,
                               xreg = cbind(train_t, train_t^2),
                               seasonal = FALSE, stepwise = FALSE, approximation = FALSE)
model3
## Series: train_y
## Regression with ARIMA(2,1,3) errors
##
## Coefficients:
## ar1 ar2 ma1 ma2 ma3 train_t train_t^2
## 1.3340 -0.8226 -0.7864 0.3211 0.1282 1.6218 -0.0022
## s.e. 0.0745 0.0732 0.0983 0.0894 0.0808 0.4611 0.0014
##
## sigma^2 = 8.237: log likelihood = -689.38
## AIC=1394.75 AICc=1395.28 BIC=1423.83
By including differencing in the model2 specification, an ARIMA(2,1,3) model was estimated.
I forecasted the next 20 periods with the three models estimated in Section 5.2:
f_model1 <- forecast::forecast(object = model1, h = 20, level = 95)
f_model2 <- forecast::forecast(object = model2, h = 20, level = 95, xreg = cbind(test_t, test_t^2))
f_model3 <- forecast::forecast(object = model3, h = 20, level = 95, xreg = cbind(test_t, test_t^2))
Plot f_model1 against the actual values and calculate the forecast accuracy measures:
plot(cbind(y, f_model1$mean, f_model1$lower, f_model1$upper),
main = "Actual and Forecasted Values of model1", ylab = "y",
plot.type = "single",
col = c("black", "red", "gray", "gray"), lwd = 2)
legend(x = "topleft",
legend = c("Actual", "Forecasted", "Lower and Upper Interval"),
col = c("black", "red", "gray"), lwd = 2)
data.frame(forecast::accuracy(object = f_model1, x = test_y))
Plot f_model2 against the actual values and calculate the forecast accuracy measures:
plot(cbind(y, f_model2$mean, f_model2$lower, f_model2$upper),
main = "Actual and Forecasted Values of model2", ylab = "y",
plot.type = "single",
col = c("black", "red", "gray", "gray"), lwd = 2)
legend(x = "topleft",
legend = c("Actual", "Forecasted", "Lower and Upper Interval"),
col = c("black", "red", "gray"), lwd = 2)
data.frame(forecast::accuracy(object = f_model2, x = test_y))
Plot f_model3 against the actual values and calculate the forecast accuracy measures:
plot(cbind(y, f_model3$mean, f_model3$lower, f_model3$upper),
main = "Actual and Forecasted Values of model3", ylab = "y",
plot.type = "single",
col = c("black", "red", "gray", "gray"), lwd = 2)
legend(x = "topleft",
legend = c("Actual", "Forecasted", "Lower and Upper Interval"),
col = c("black", "red", "gray"), lwd = 2)
data.frame(forecast::accuracy(object = f_model3, x = test_y))
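The test-set RMSE of the three forecasts can also be collected side by side for comparison; a sketch using the forecast objects above:

```r
# Extract the test-set RMSE of each forecast for a direct comparison
sapply(list(model1 = f_model1, model2 = f_model2, model3 = f_model3),
       function(f) forecast::accuracy(object = f, x = test_y)["Test set", "RMSE"])
```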
Based on the accuracy measures (from the Test set rows), model3 had better forecasting performance than model1 and model2. Including the trend in model estimation improved forecasts of variables with a deterministic trend. However, estimating the trend can be difficult at times. For example, if the fitted quadratic trend predicts that an inflection point has occurred even though the variable will continue on an upward trend for some future period, the forecasts would be impacted and would probably show a downward trend.
This project used simulations of random walk models to determine and visualize the difference between deterministic and stochastic trends. An ARIMA model with deterministic and stochastic trends was simulated to see if including the trend in model estimation could improve forecasts. These are simple examples; real-world data is much more complicated, with relationships between variables changing due to exogenous factors or shocks. In such cases, periodic re-estimation of the model with rolling windows may improve model estimation and forecasts, which I will test in another project.