Chapter 6 Time-series Forecasting

Time-series forecasting is a supervised machine learning approach that uses historical data to predict future occurrences. This is particularly helpful in the terrorism context for long-term strategic planning. For this analysis, the forecasting goals and corresponding data are chosen as shown below:

Table 6.1: Scope of Analysis
Forecasting Goal Frequency Chosen Country/Region
Predict future number of attacks By Months Afghanistan
Predict future number of fatalities By Months Iraq
Predict future number of attacks By Months SAHEL region

For each analysis, we first select the appropriate data, examine the seasonal components and then split the data into training and test sets to evaluate the performance of Auto Arima, Neural Network, TBATS and ETS models with seven different metrics. To examine whether an ensemble prediction can improve the overall accuracy, we take the average of all the predictions and compute Theil’s U statistic. In the last part of the analysis, we use all the data points (train + test) to make a forecast for the chosen future period.

6.1 Afghanistan (Predict future attacks)

6.1.1 Data preparation

Exploratory data analysis showed that the number of attacks exhibits a visible pattern from the year 2000 onwards, so the data is selected from 2000 to 2016. To obtain a monthly time-series frequency for all years, we use the complete() function from the tidyr package to turn implicit missing values into explicit missing values. In other words, we add the missing months and assign zero, as shown in the code below:

# Packages used in this chapter
library(dplyr)      # filter(), group_by(), summarise(), mutate(), select()
library(tidyr)      # complete(), full_seq(), replace_na()
library(imputeTS)   # na.kalman()
library(forecast)   # auto.arima(), nnetar(), tbats(), ets(), accuracy()
library(TSstudio)   # ts_split()
library(DescTools)  # TheilU(); also provides Year()
library(tibble)     # add_column()

dft <- df %>%
  filter(year >= 2000 & country == "Afghanistan") %>%
  group_by(year, month) %>%
  summarise(total_count = n()) %>%
  ungroup() %>%
  group_by(year) %>%
  # Add missing months and assign 0 where no occurrences
  tidyr::complete(month = full_seq(1:12, 1L), 
                  fill = list(total_count = 0)) %>%
  ungroup()

dft <- dft %>%
  mutate(month_year = paste(year, month, sep="-"),
         month_year = zoo::as.yearmon(month_year)) %>%
  select(month_year, total_count)

# Create a ts object starting at the first observed year
dft <- ts(dft[, 2], 
          start = Year(min(dft$month_year)), 
          frequency = 12) # 1=annual, 4=quarterly, 12=monthly

# Impute any remaining missing values by Kalman smoothing (imputeTS)
dft <- na.kalman(dft)

6.1.2 Seasonality analysis

First, we take a look at the time plot to get an idea of how the number of attacks has changed over time. In the plot below, the observations (number of attacks) are plotted against the time of observation.

Figure 6.1: Attack frequency by year- Afghanistan

The seasonal plot presents the same data as the time plot above, but grouped by the seasonal component (i.e. months) in which the attacks were observed.

Figure 6.2: Seasonal pattern within year- Afghanistan

From the seasonal patterns within a year, as shown in the plot above, we can see that 2015 (followed by 2012) was the deadliest year in terms of the number of terror attacks. In both years, a spike is visible in May.

Figure 6.3: Seasonal pattern (boxplot)- Afghanistan

From the boxplot, we can confirm that May contributes the most terrorist incidents across all the years (2000-2016) in Afghanistan. We can see an upward trend in the number of attacks starting from February and peaking in May.
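For reference, plots like Figures 6.2 and 6.3 can be produced with the forecast package; the sketch below is one possible way (the labels are our own choices), assuming dft is the univariate monthly ts object created above:

# Seasonal line plot, one line per year (cf. Figure 6.2)
forecast::ggseasonplot(dft, year.labels = TRUE)

# Month-wise boxplot (cf. Figure 6.3)
boxplot(as.numeric(dft) ~ cycle(dft),
        xlab = "Month", ylab = "Number of attacks")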

Additive or multiplicative time-series decomposition is helpful to describe the trend and seasonal components within the data. It also helps in understanding anomalies, as shown in the plot below:

Figure 6.4: Time-series decomposition- Afghanistan

Time-series decomposition comprises three components depending on observed patterns:

  • a seasonal component,
  • a trend-cycle component and
  • a remainder component

The seasonal component, as shown in the plot above, represents a pattern that recurs within a fixed period of time. The trend-cycle component contains both trend and cycle, and the remainder component contains everything else in the time series. The remainder is also called the random component or noise; it represents the residuals of the original time series after the seasonal and trend components have been removed (Anomaly.io, 2015; Hyndman & Athanasopoulos, 2018).
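As a sketch, a decomposition like Figure 6.4 can be reproduced with base R's decompose() (classical decomposition), assuming dft is a univariate monthly ts; stl() is a common alternative that estimates the seasonal component by loess:

# Classical additive decomposition: observed, trend, seasonal, random
plot(decompose(dft, type = "additive"))
# Alternative: plot(stl(dft, s.window = "periodic"))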

6.1.3 Correlation test

There are several methods to identify correlation between a series and its lags, such as ACF, PACF and lag plots. In a lag plot, the series and a lagged copy of itself are presented as a scatterplot. In simple words, a lag is the series shifted back by a fixed number of time steps. We use the lag plot method for this analysis, which allows us to quickly visualize three things:

  • outliers
  • randomness and
  • auto-correlation.

The plot shown below represents nine different lags. Although we can see a few outliers, there is no randomness in the data. To further explain this, we can see a positive linear trend going upward from left to right in all nine plots. This positive linear trend is an indication that positive auto-correlation is present in our data.

Figure 6.5: Correlation test

Specifically, lags 1, 2, 3 and 9 show strong positive auto-correlation. The presence of auto-correlation can be problematic for some models.
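A nine-panel lag plot like Figure 6.5 can be drawn with gglagplot() from the forecast package; a minimal sketch:

# Scatterplots of the series against lags 1 to 9
forecast::gglagplot(dft, lags = 9, do.lines = FALSE)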

6.1.4 Modelling

In this part of the analysis, we split the data into training and test sets in order to evaluate the performance of four different models before making the actual forecasts.

6.1.4.1 Train-Test Split

set.seed(84)

# horizon (look ahead period)
horizon <- 12

# create split for train and test set
data <- ts_split(dft, sample.out = horizon)

# Split the data into training and testing sets
train <- data$train
test  <- data$test

We have chosen a 12-month look-ahead period (horizon), so the test set contains the last 12 months of our data, i.e. all the months of 2016, on which we will evaluate the performance of the models.
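For readers without the TSstudio package, the same split can be expressed with base R's window(), assuming the series ends in December 2016:

# Equivalent split using stats::window()
train <- window(dft, end = c(2015, 12))
test  <- window(dft, start = c(2016, 1))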

6.1.4.2 Auto Arima

fit_arima <- auto.arima(train)

Figure 6.6: Auto Arima: residuals

A quick look at the residuals from Auto Arima suggests that their mean is very close to zero; however, the histogram shows that the residuals do not follow a normal distribution. This means that forecasts from this method will probably be quite good, but prediction intervals computed assuming a normal distribution may be inaccurate (Hyndman & Athanasopoulos, 2018).
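Diagnostics like Figure 6.6 can be obtained with checkresiduals() from the forecast package, which also reports a Ljung-Box test for remaining autocorrelation in the residuals:

# Residual time plot, ACF, histogram and Ljung-Box test
checkresiduals(fit_arima)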

# Accuracy check/ Forecast evaluation
fc_arima <- forecast(fit_arima, h = horizon)

Figure 6.7: Auto Arima: Actual vs Fitted vs Forecasted

From the plot above, it is observed that the Auto Arima model's fitted values track the training data quite closely, but its forecasts sit somewhat apart from the actual values (test data, year 2016).
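A plot along the lines of Figure 6.7 can be sketched with autoplot()/autolayer() from the forecast and ggplot2 packages (the series labels are our own):

library(ggplot2)
autoplot(train, series = "Actual (train)") +
  autolayer(fitted(fit_arima), series = "Fitted") +
  autolayer(fc_arima$mean, series = "Forecast") +
  autolayer(test, series = "Actual (test)")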

Next, we examine the pattern in actual vs fitted and forecasted values for the remaining three models.

6.1.4.3 Neural Network

fit_nn <- nnetar(train, repeats = 5)
# Accuracy check/ Forecast evaluation 
fc_nn <- forecast(fit_nn, h = horizon)

Figure 6.8: Neural Net: Actual vs Fitted vs Forecasted

6.1.4.4 TBATS

fit_tbats <- tbats(train)
# Accuracy check/ Forecast evaluation 
fc_tbats <- forecast(fit_tbats, h = horizon)

Figure 6.9: TBATS: Actual vs Fitted vs Forecasted

6.1.4.5 ETS

fit_ets <- ets(train)
# Accuracy check/ Forecast evaluation 
fc_ets <- forecast(fit_ets, h = horizon)

Figure 6.10: ETS: Actual vs Fitted vs Forecasted

6.1.5 Evaluating models’ Performance

To compare the performance of all four models on the test data, I have extracted the mean accuracy from each model and arranged the models by the MAPE metric, which is the most commonly used. We will also look at six other metrics to get a better idea of each model's performance.

Of the seven metrics shown in the table below, ME (Mean Error), RMSE (Root Mean Squared Error) and MAE (Mean Absolute Error) are scale-dependent errors, whereas MPE (Mean Percentage Error) and MAPE (Mean Absolute Percentage Error) are percentage errors, and ACF1 denotes the first-order autocorrelation of the forecast errors. Researchers (Hyndman & Athanasopoulos, 2018) suggest that percentage errors have the advantage of being unit-free and so are frequently used to compare forecast performance between data sets.
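For reference, these error measures are simple functions of the forecast errors; hand-rolled versions (a sketch, not the exact forecast::accuracy() implementation) look like this:

# a = actual values, p = predicted values
me   <- function(a, p) mean(a - p)                   # scale-dependent
rmse <- function(a, p) sqrt(mean((a - p)^2))         # scale-dependent
mae  <- function(a, p) mean(abs(a - p))              # scale-dependent
mpe  <- function(a, p) mean((a - p) / a) * 100       # percentage error
mape <- function(a, p) mean(abs((a - p) / a)) * 100  # undefined if any a == 0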

metrics  <- rbind(as.data.frame(round(accuracy(fc_arima$mean, test), 3)),
                  as.data.frame(round(accuracy(fc_nn$mean, test), 3)),
                  as.data.frame(round(accuracy(fc_tbats$mean, test), 3)),
                  as.data.frame(round(accuracy(fc_ets$mean, test), 3))) %>% 
  add_column(models = c("Auto Arima", "NeuralNet", "TBATS", "ETS"), 
             .before = "ME") %>% 
  arrange(MAPE)
Table 6.2: Performance comparison of all models (Afghanistan)
models ME RMSE MAE MPE MAPE ACF1 Theil’s U
TBATS -6.374 23.34 18.93 -7.424 15.29 -0.303 0.645
ETS -6.615 24.75 19.50 -7.781 15.79 -0.315 0.684
NeuralNet -24.098 33.63 26.68 -21.038 22.64 -0.170 0.899
Auto Arima -23.698 34.26 27.21 -20.991 23.22 -0.064 0.953

Based on the MAPE metric, we can see that the TBATS and ETS models achieve higher accuracy (MAPE ≈ 15) and outperform the Auto Arima and Neural Network models. TBATS (exponential smoothing state space model with Box-Cox transformation, ARMA errors, trend and seasonal components) and ETS (exponential smoothing state space model) both use exponential smoothing methods. Specifically, the TBATS modelling approach offers several key advantages, such as handling typical nonlinear features and allowing any auto-correlation in the residuals to be taken into account (Livera, Hyndman, & Snyder, 2011).

In addition to the MAPE metric, which is chosen to identify the best model, we also look at Theil's U statistic to estimate how good or bad a model is. In simple words, Theil's U statistic compares the performance of the model with a naïve/random walk model (U = 1). If Theil's U equals one, the forecasting method is only as good as the naïve model (guessing). A value greater than one means the forecasting method is even worse than guessing. Similarly, a value less than one indicates that the forecasting method is better than the naïve model and worth considering (Oracle, n.d.).
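One common formulation of Theil's U (the U2 form, which we believe corresponds to DescTools::TheilU with the default type = 2) can be written as a one-liner; treat this as a sketch rather than the package's exact implementation:

# a = actual values, p = predicted values
theil_u2 <- function(a, p) sqrt(sum((a - p)^2)) / sqrt(sum(a^2))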

From the comparison, we can see that all four models have Theil's U scores below one, with TBATS and ETS achieving comparatively good scores of about 0.65 and 0.68, against roughly 0.9 for Neural Network and 0.95 for Auto Arima.

6.1.6 Ensemble

As stated in the literature review, much research focuses on a single-model approach or on using the best single model out of all the models. Instead of throwing out weak models, I employ a simple ensemble approach (averaging the predictions of all four models) to improve the overall accuracy on the test set. This is a well-known approach used in machine learning competitions such as those on Kaggle (Jacob van Veen, Nguyen, Dat, & Segnini, 2015). The code below extracts the predictions from all four models and averages them into an ensemble prediction. We then calculate Theil's U score on the ensemble predictions using a simple function in the DescTools package, supplying the actual and predicted observations as shown below:

# extract predictions from all four models and get average
ensemble <- rowMeans(
  cbind(fc_arima$mean, fc_nn$mean, fc_tbats$mean, fc_ets$mean))

# Compute Theil's U statistic (a = actual values, p= predicted values)
cat(paste("Theil's U score on Ensemble: ", 
          round(TheilU(a = test, p = ensemble),3)))
Theil's U score on Ensemble:  0.204

Although the TBATS model is our best single model, the ensemble prediction obtained by averaging the forecasts of the other, weaker models is even better. We can see that the ensemble approach significantly improves the overall accuracy, as measured by a Theil's U score of about 0.2. Recent theoretical work also supports the ensemble approach in time-series forecasting. Researchers (Hyndman & Athanasopoulos, 2018), in their book "Forecasting: Principles and Practice", suggest that applying several different methods to the same time series and then averaging the forecasts often yields better performance than any single best model.

To summarize, the TBATS model may not be the best model on other data; however, the ensemble approach and the corresponding Theil's U score can be used in time-series forecasting to improve accuracy and to justify the reliability of the final predictions.

6.1.7 Forecast future number of attacks

Having evaluated the performance of all four models, the next step is to generate a forecast using all the data points, i.e. 2000-2016. The forecast horizon can be changed based on business requirements and by observing the predictions. As shown in the code chunk below, we first generate forecasts from all four models and then visualize the results with plots.

f_horizon <- 18
# run model on full data
fore_arima <- forecast(auto.arima(dft), h = f_horizon, level = c(80, 95))
fore_nn <- forecast(nnetar(dft, repeats = 5), h = f_horizon, 
                    level = c(80, 95), PI = TRUE)
fore_tbats <- forecast(tbats(dft), h = f_horizon, level = c(80, 95))
fore_ets <- forecast(ets(dft), h = f_horizon, level = c(80, 95))

Figure 6.11: Auto Arima forecast (Afghanistan)

Figure 6.12: Neural Network forecast (Afghanistan)

Figure 6.13: TBATS forecast (Afghanistan)

Figure 6.14: ETS forecast (Afghanistan)

Forecasting results are often represented by a mean value together with 80% and 95% prediction intervals. The mean value of the forecast is taken as the final forecast. Next, we extract the forecasts for the chosen horizon and add the ensemble predictions as the predicted future attacks in Afghanistan.

tbl_arima   <- timetk::tk_tbl(round(fore_arima$mean)) 
tbl_nn      <- timetk::tk_tbl(round(fore_nn$mean))
tbl_tbats   <- timetk::tk_tbl(round(fore_tbats$mean))
tbl_ets     <- timetk::tk_tbl(round(fore_ets$mean))

tbl <- tbl_arima %>% 
    left_join(tbl_nn, by = "index") %>% 
    left_join(tbl_tbats, by = "index") %>% 
    left_join(tbl_ets, by = "index")

names(tbl) <- c("Time_period", "Arima", "NN", "TBATS", "ETS")
tbl$Ensemble <- round(rowMeans(tbl[,2:5]))
Table 6.3: Table of predicted future number of attacks in Afghanistan
Time_period Arima NN TBATS ETS Ensemble
Jan 2017 124 125 118 120 122
Feb 2017 137 139 117 114 127
Mar 2017 136 138 120 121 129
Apr 2017 136 139 134 133 136
May 2017 142 143 144 148 144
Jun 2017 134 138 141 142 139
Jul 2017 140 138 133 137 137
Aug 2017 149 143 132 134 140
Sep 2017 140 138 132 133 136
Oct 2017 154 145 124 124 137
Nov 2017 141 136 116 119 128
Dec 2017 147 137 116 115 129
Jan 2018 149 137 118 120 131
Feb 2018 151 140 117 114 130
Mar 2018 152 140 120 121 133
Apr 2018 154 141 134 133 140
May 2018 155 142 144 148 147
Jun 2018 156 142 141 142 145

6.2 Iraq (Predict future fatalities)

For this analysis, we use the exact same approach as before to estimate the number of fatalities in Iraq.

6.2.1 Data preparation

I have selected the data from 2004 to 2016 to make it appropriate for modeling. Wherever an incident is part of multiple attacks, different sources report different figures. To overcome this issue, I have grouped the data on specific variables and then taken the maximum reported value, as shown in the code chunk below:

dft <- df %>%
  filter(year >= 2004 & country == "Iraq") %>%
  replace_na(list(nkill = 0)) %>% 
  group_by(group_name, region, year, month) %>% 
  # For incidents flagged as part of multiple attacks, keep only the rows
  # with the maximum reported kill count; keep single incidents as they are
  filter(if_else(part_of_multiple_attacks == 1, 
                 nkill == max(nkill), nkill == nkill)) %>%
  ungroup() %>%
  distinct(group_name, region, country, year, month, nkill,
           nwound, part_of_multiple_attacks) %>%
  group_by(year, month) %>%
  summarise(total_count = sum(nkill)) %>%
  ungroup() %>%
  group_by(year) %>%
  # Add missing months and assign 0 where no occurrences
  tidyr::complete(month = full_seq(1:12, 1L), 
                  fill = list(total_count = 0)) %>%
  ungroup()

dft <- dft %>%
  mutate(month_year = paste(year, month, sep="-"),
         month_year = zoo::as.yearmon(month_year)) %>%
  select(month_year, total_count)

# Create a ts object
dft <- ts(dft[, 2], 
          start = Year(min(dft$month_year)), 
          frequency = 12) # 1=annual, 4=quarterly, 12=monthly

dft <- na.kalman(dft)

6.2.2 Seasonality analysis

Figure 6.15: Fatalities frequency by year- Iraq

From the time plot above, we can see an unusual spike of 2,426 deaths in June 2014. This corresponds to major ISIL incidents in which 1,500 people were reportedly killed in a single incident, followed by another single incident involving 600 deaths.

Figure 6.16: Seasonality Plots - Iraq

From the seasonal components, we can see that the number of fatalities is higher mid-year. Specifically, July accounts for the most, followed by April and May. An interesting observation from the second plot above is that the month-to-month variation in fatalities between 2008 and 2013 is quite steady, whereas in the years after 2013 we can see an upward trend as well as noticeable month-to-month differences in the number of fatalities.

From the boxplot, we can also see extreme outliers (in the statistical sense) in June, August and October, indicating very high numbers of fatalities in single incidents.

6.2.3 Correlation test

Figure 6.17: Correlation test

From the lag plot, we can see a slightly positive linear pattern as well as a few outliers in all nine lags; however, there is no randomness in the data. The linear pattern also suggests that auto-correlation is present. In statistical terms, correlation measures the extent of a linear relationship between two variables; in the same way, auto-correlation measures the linear relationship between lagged values of a time series, as shown in the plot above.
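The same autocorrelation structure can be confirmed with an ACF plot; a minimal sketch using the forecast package:

# Spikes outside the dashed bands indicate significant autocorrelation
forecast::ggAcf(dft)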

6.2.4 Modelling

set.seed(84)
# horizon (look ahead period)
horizon <- 18

# create split for train and test set
data <- ts_split(dft, sample.out = horizon)
# Split the data into training and testing sets
train <- data$train
test  <- data$test

# Run models
fit_arima <- auto.arima(train)
fit_nn <- nnetar(train, repeats = 5)
fit_tbats <- tbats(train)
fit_ets <- ets(train, lambda = BoxCox.lambda(train))

#Get validation forecasts
fc_arima <- forecast(fit_arima, h = horizon)
fc_nn <- forecast(fit_nn, h = horizon)
fc_tbats <- forecast(fit_tbats, h = horizon)
fc_ets <- forecast(fit_ets, h = horizon)

metrics  <- rbind(as.data.frame(round(accuracy(fc_arima$mean, test), 3)),
                  as.data.frame(round(accuracy(fc_nn$mean, test), 3)),
                  as.data.frame(round(accuracy(fc_tbats$mean, test), 3)),
                  as.data.frame(round(accuracy(fc_ets$mean, test), 3))) %>% 
  add_column(models = c("Auto Arima", "NeuralNet", "TBATS", "ETS"), 
             .before = "ME") %>% 
  arrange(MAPE)
Table 6.4: Performance comparison of all models (Iraq)
models ME RMSE MAE MPE MAPE ACF1 Theil’s U
Auto Arima 116.42 271.3 201.8 5.843 31.34 0.054 0.822
ETS 103.37 266.5 201.7 3.261 32.16 0.052 0.810
TBATS 102.52 266.1 201.7 3.096 32.21 0.052 0.809
NeuralNet -10.82 255.1 195.8 -18.967 38.47 -0.018 0.817

From the model comparison based on the MAPE metric, we can see that the Auto Arima model performs better on this data. The corresponding Theil's U score is about 0.8 for all the models, which means forecasts from the chosen model are better than random guessing.

Next, we calculate Theil's U score on the ensemble predictions to see how much improvement can be achieved compared to the best single model.

6.2.5 Ensemble

# extract predictions from all four models and create ensemble
preds <- as.data.frame(
  cbind(fc_arima$mean, fc_nn$mean, fc_tbats$mean, fc_ets$mean))
preds$ensemble <- rowMeans(preds)

# Compute Theil's U statistic (a = actual values, p= predicted values)
cat(paste("Theil's U score on Ensemble: ", 
          round(TheilU(a = test, p = preds$ensemble),3)))
Theil's U score on Ensemble:  0.399

As expected, we can see a significant improvement in forecasting accuracy from averaging the predictions of all four models. To reiterate, a Theil's U score below one means the predictions are better than a random guess (naïve model).

6.2.6 Forecast future fatalities

In the validation part, the data was split into train and test sets in order to evaluate the performance of the different models. For the forecast, we run the models on all the data points.

# look ahead period 
f_horizon <- 12
# run model on full data
fore_arima <- forecast(auto.arima(dft), h = f_horizon, level = c(80, 95))
fore_nn <- forecast(nnetar(dft, repeats = 5), h = f_horizon, 
                    level = c(80, 95), PI = TRUE)
fore_tbats <- forecast(tbats(dft), h = f_horizon, level = c(80, 95))
fore_ets <- forecast(ets(dft, lambda = BoxCox.lambda(dft)), 
                     h = f_horizon, level = c(80, 95))

Figure 6.18: Auto Arima forecast (Iraq)

Figure 6.19: Neural Network forecast (Iraq)

Figure 6.20: TBATS Forecast (Iraq)

Figure 6.21: ETS Forecast (Iraq)

tbl_arima   <- timetk::tk_tbl(round(fore_arima$mean)) 
tbl_nn      <- timetk::tk_tbl(round(fore_nn$mean))
tbl_tbats   <- timetk::tk_tbl(round(fore_tbats$mean))
tbl_ets     <- timetk::tk_tbl(round(fore_ets$mean))

tbl <- tbl_arima %>% 
    left_join(tbl_nn, by = "index") %>% 
    left_join(tbl_tbats, by = "index") %>% 
    left_join(tbl_ets, by = "index")

names(tbl) <- c("Time_period", "Arima", "NN", "TBATS", "ETS")
tbl$Ensemble <- round(rowMeans(tbl[,2:5]))
Table 6.5: Table of predicted future fatalities in Iraq
Time_period Arima NN TBATS ETS Ensemble
Jan 2017 722 878 605 597 700
Feb 2017 647 559 605 597 602
Mar 2017 668 521 605 597 598
Apr 2017 662 900 605 597 691
May 2017 664 527 605 597 598
Jun 2017 663 521 605 597 596
Jul 2017 663 900 605 597 691
Aug 2017 663 521 605 597 596
Sep 2017 663 521 605 597 596
Oct 2017 663 592 605 597 614
Nov 2017 663 484 605 597 587
Dec 2017 663 521 605 597 596

We can see flat forecasts from the ETS and TBATS models on this data, which means that the estimated trend and seasonality are too weak for future observations to have different conditional means under those models; in that case, both models essentially project the last estimated level forward. We also computed the Theil's U score for the ensemble on the test set, which is ~0.4. By using the ensemble approach and the corresponding Theil's U score during model evaluation, we can ensure the reliability of the forecasted values on unseen data.
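A flat forecast can be diagnosed by inspecting which specification was selected, e.g. an ETS model without trend and seasonal terms such as ETS(A,N,N); a quick check (a sketch):

# Print the fitted model specifications behind the flat forecasts
fore_ets$model
fore_tbats$model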

6.3 SAHEL Region (Predict future attacks)

The Sahel region in Africa stretches from east to west across the continent. At present, the region draws considerable political attention due to indications of a possible geographical expansion of ISIL (Liautaud, 2018). To estimate the future number of attacks in this region, I have selected data from the year 2000 onwards, filtered to the eight countries that fall within the Sahel region, as shown in the data preparation step.

6.3.1 Data preparation

sahel_region <- c("Mauritania", "Mali", "Burkina Faso", 
                  "Niger", "Nigeria", "Chad", "Sudan", "Eritrea")

dft <- df %>%
  filter(year >= 2000 & country %in% sahel_region) %>%
  group_by(year, month) %>%
  summarise(total_count = n()) %>%
  ungroup() %>%
  group_by(year) %>%
  tidyr::complete(month = full_seq(1:12, 1L), fill = list(total_count = 0)) %>%
  ungroup()

dft <- dft %>%
  mutate(month_year = paste(year, month, sep="-"),
         month_year = zoo::as.yearmon(month_year)) %>%
  select(month_year, total_count)

# Create a ts object
dft <- ts(dft[, 2], start = Year(min(dft$month_year)), 
          frequency = 12) # 1=annual, 4=quarterly, 12=monthly
dft <- na.kalman(dft)

6.3.2 Seasonality analysis

Figure 6.22: Attack frequency by year- SAHEL Region

From the attack frequency by year, it is observed that the number of attacks has increased sharply over the last decade, reaching a peak during 2014-2015. Several researchers (Crone, 2017; Onuoha & Oyewole, 2018) have noted that Boko Haram affiliated itself with Islamic State in 2015, and that a large number of smaller groups from across the region have also declared their affiliation with Islamic State.

Figure 6.23: Seasonal pattern (heatmap) - SAHEL Region

From the heatmap above, we can see a sudden increase in the number of attacks from the year 2012, with more than 50 attacks a month on average. Let's have a look at the seasonal components to see if there is any cyclical pattern.

Figure 6.24: Seasonality pattern (boxplot) - SAHEL Region

In comparison with the number of attacks in Afghanistan and the number of fatalities in Iraq, we see the opposite trend in the SAHEL region: the months at the beginning and end of the year (January to March and October to December) show a higher number of attacks throughout the period (2000-2016), whereas in Afghanistan and Iraq the peaks were mostly observed in the middle of the year.
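This month-of-year pattern can also be checked numerically by averaging attacks per calendar month across 2000-2016; a sketch, assuming dft is the univariate monthly ts created above:

# Mean number of attacks per calendar month (1 = Jan, ..., 12 = Dec)
round(tapply(as.numeric(dft), cycle(dft), mean), 1)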

6.3.3 Correlation test

Figure 6.25: Correlation test

Similar to the correlation tests for Iraq and Afghanistan, a positive linear trend is visible in all nine lags, with lags 1 and 2 suggesting strong auto-correlation.

6.3.4 Modelling

set.seed(84)
# horizon (look ahead period)
horizon <-  18

# create split for train and test set
data <- ts_split(dft, sample.out = horizon)
# Split the data into training and testing sets
train <- data$train
test  <- data$test

# Run models
fit_arima <- auto.arima(train)
fit_nn <- nnetar(train, repeats = 5)
fit_tbats <- tbats(train)
fit_ets <- ets(train)

# Get validation forecasts (used for Table 6.6 and the ensemble below)
fc_arima <- forecast(fit_arima, h = horizon)
fc_nn <- forecast(fit_nn, h = horizon)
fc_tbats <- forecast(fit_tbats, h = horizon)
fc_ets <- forecast(fit_ets, h = horizon)
Table 6.6: Performance comparison of all models (SAHEL Region)
models ME RMSE MAE MPE MAPE ACF1 Theil’s U
Auto Arima -2.010 19.89 17.61 -10.75 26.46 -0.143 0.733
NeuralNet -4.881 22.28 19.73 -16.01 30.62 -0.117 0.811
TBATS -9.321 22.66 19.82 -22.25 32.11 -0.198 0.809
ETS -11.439 23.67 21.11 -25.90 34.97 -0.178 0.901

From the model comparison based on the MAPE metric, we can see that Auto Arima, followed by Neural Network, performs better on this data, and all four models have Theil's U scores below one.

6.3.5 Ensemble

# extract predictions from all four models and get average
ensemble <- rowMeans(cbind(fc_arima$mean, fc_nn$mean, fc_tbats$mean, fc_ets$mean))
# Compute Theil's U statistic (a = actual values, p= predicted values)
cat(paste("Theil's U score on Ensemble: ", 
          round(TheilU(a = test, p = ensemble),3)))
Theil's U score on Ensemble:  0.3

The ensemble prediction further improves the accuracy, as measured by a Theil's U score of 0.3.

6.3.6 Forecast future attacks

# look ahead period 
f_horizon <- 18
# run model on full data
fore_arima <- forecast(auto.arima(dft), h = f_horizon, level = c(80, 95))
fore_nn <- forecast(nnetar(dft, repeats = 5), h = f_horizon, 
                    level = c(80, 95), PI = TRUE)
fore_tbats <- forecast(tbats(dft), h = f_horizon, level = c(80, 95))
fore_ets <- forecast(ets(dft), h = f_horizon, level = c(80, 95))

Figure 6.26: Auto Arima forecast (SAHEL Region)

Figure 6.27: Neural Network forecast (SAHEL Region)

Figure 6.28: TBATS forecast (SAHEL Region)

Figure 6.29: ETS forecast (SAHEL Region)

From the plots above, we can see that the Auto Arima, Neural Network and TBATS models are able to capture the observed trend, whereas the ETS model fails and generates flat predictions. However, as computed before, the ensemble approach compensates for the losses from any weak models.

tbl_arima   <- timetk::tk_tbl(round(fore_arima$mean)) 
tbl_nn      <- timetk::tk_tbl(round(fore_nn$mean))
tbl_tbats   <- timetk::tk_tbl(round(fore_tbats$mean))
tbl_ets     <- timetk::tk_tbl(round(fore_ets$mean))

tbl <- tbl_arima %>% 
    left_join(tbl_nn, by = "index") %>% 
    left_join(tbl_tbats, by = "index") %>% 
    left_join(tbl_ets, by = "index")

names(tbl) <- c("Time_period", "Arima", "NN", "TBATS", "ETS")
tbl$Ensemble <- round(rowMeans(tbl[,2:5]))
Table 6.7: Table of predicted future number of attacks in SAHEL Region
Time_period Arima NN TBATS ETS Ensemble
Jan 2017 88 80 73 67 77
Feb 2017 73 71 76 67 72
Mar 2017 79 82 77 67 76
Apr 2017 78 63 76 67 71
May 2017 80 85 74 67 76
Jun 2017 73 80 71 67 73
Jul 2017 85 89 67 67 77
Aug 2017 72 62 64 67 66
Sep 2017 74 90 63 67 74
Oct 2017 75 86 64 67 73
Nov 2017 74 90 66 67 74
Dec 2017 77 63 70 67 69
Jan 2018 77 83 73 67 75
Feb 2018 76 81 76 67 75
Mar 2018 80 92 77 67 79
Apr 2018 75 64 76 67 70
May 2018 82 92 74 67 79
Jun 2018 80 80 71 67 74

Summary

To summarize this chapter, we analyzed the seasonal components within the time series at a monthly frequency for Afghanistan, Iraq and the SAHEL region. We found an upward trend in the number of attacks from February to May in Afghanistan throughout all the years. Similarly, in Iraq we found higher fatalities during July, followed by April and May, whereas in the SAHEL region this pattern is completely the opposite.

Using time-series forecasting, we estimated the future number of attacks and fatalities with four different models at a monthly frequency. We also illustrated the importance of the ensemble method and evaluated predicted vs actual values using Theil's U statistic, which indicates a significant improvement in forecasting accuracy over the best single model. Comparing the same models on different time-series data shows that the best single model on one series may not be the best single model on another.

Apart from predicting at a monthly frequency and forecasting the number of attacks and fatalities, the scope of this analysis can be extended to predict the number of injuries, to forecast at a quarterly frequency, and to cover any country, using the Shiny app that is also an integral part of this research.