Chapter 6 Time-series Forecasting

Time-series forecasting is a supervised machine learning approach that uses historical data to predict future occurrences. This is particularly helpful in the terrorism context for long-term strategic planning. For this analysis, the forecasting goals and corresponding data are chosen as shown below:

Table 6.1: Scope of Analysis
Forecasting Goal Frequency Chosen Country/Region
Predict future number of attacks By Months Afghanistan
Predict future number of fatalities By Months Iraq
Predict future number of attacks By Months SAHEL region

For each analysis, we first select the appropriate data, examine the seasonal components and then split the data into training and test sets to evaluate the performance of Auto Arima, Neural Network, TBATS and ETS models with seven different metrics. To examine whether an ensemble prediction can improve the overall accuracy, we take the average of all the predictions and compute Theil’s U statistic. In the last part of the analysis, we use all the data points (train + test) to make a forecast for the chosen future period.

6.1 Afghanistan (Predict future attacks)

6.1.1 Data preparation

Exploratory data analysis showed that the number of attacks exhibits a visible pattern from the year 2000 onwards, so the data is selected from 2000 to 2016. To obtain a monthly time-series frequency for all years, we use the complete() function from the tidyr package to turn implicit missing values into explicit missing values. In other words, we add the missing months and assign zero, as shown in the code below:

# Packages used in this chapter
library(dplyr)      # filter(), group_by(), summarise(), mutate(), select()
library(tidyr)      # complete(), full_seq(), replace_na()
library(imputeTS)   # na.kalman()
library(forecast)   # auto.arima(), nnetar(), tbats(), ets(), accuracy()
library(TSstudio)   # ts_split()
library(DescTools)  # TheilU(); also provides Year()
library(tibble)     # add_column()

dft <- df %>%
  filter(year >= 2000 & country == "Afghanistan") %>%
  group_by(year, month) %>%
  summarise(total_count = n()) %>%
  ungroup() %>%
  group_by(year) %>%
  # Add missing months and assign 0 where no occurrences
  tidyr::complete(month = full_seq(1:12, 1L), 
                  fill = list(total_count = 0)) %>%
  ungroup()

dft <- dft %>%
  mutate(month_year = paste(year, month, sep="-"),
         month_year = zoo::as.yearmon(month_year)) %>%
  select(month_year, total_count)

# Create a ts object starting at the first observed year
dft <- ts(dft[, 2], 
          start = Year(min(dft$month_year)), 
          frequency = 12) # 1=annual, 4=quarterly, 12=monthly

# Impute any remaining missing values by Kalman smoothing (imputeTS)
dft <- na.kalman(dft)

6.1.2 Seasonality analysis

First, we take a look at the time plot to get an idea of how the number of attacks has changed over time. In the plot below, the observations (number of attacks) are plotted against the time of observation.

Figure 6.1: Attack frequency by year- Afghanistan

The seasonal plot presents the same data as the time plot above, but grouped by the seasonal component (i.e. months) in which the attacks were observed.

Figure 6.2: Seasonal pattern within year- Afghanistan

From the seasonal patterns within a year, as shown in the plot above, we can see that 2015 (followed by 2012) was the deadliest year in terms of the number of terror attacks. In both years, a spike is visible in May.

Figure 6.3: Seasonal pattern (boxplot)- Afghanistan

From the boxplot, we can confirm that May contributes the most terrorist incidents across all the years (2000-2016) in Afghanistan. We can see an upward trend in the number of attacks starting from February and peaking in May.
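For reference, plots like Figures 6.2 and 6.3 can be produced with the forecast package; the sketch below is one possible way (the labels are our own choices), assuming dft is the univariate monthly ts object created above:

# Seasonal line plot, one line per year (cf. Figure 6.2)
forecast::ggseasonplot(dft, year.labels = TRUE)

# Month-wise boxplot (cf. Figure 6.3)
boxplot(as.numeric(dft) ~ cycle(dft),
        xlab = "Month", ylab = "Number of attacks")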

Additive or multiplicative time-series decomposition is helpful to describe the trend and seasonal components within the data. It also helps in understanding anomalies, as shown in the plot below:

Figure 6.4: Time-series decomposition- Afghanistan

Time-series decomposition comprises three components depending on observed patterns:

  • a seasonal component,
  • a trend-cycle component and
  • a remainder component

The seasonal component, as shown in the plot above, represents a pattern that recurs within a fixed period of time. The trend-cycle component contains both trend and cycle, and the remainder component contains everything else in the time series. The remainder is also called the random component or noise; it represents the residuals of the original time series after the seasonal and trend components have been removed (Anomaly.io, 2015; Hyndman & Athanasopoulos, 2018).
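As a sketch, a decomposition like Figure 6.4 can be reproduced with base R's decompose() (classical decomposition), assuming dft is a univariate monthly ts; stl() is a common alternative that estimates the seasonal component by loess:

# Classical additive decomposition: observed, trend, seasonal, random
plot(decompose(dft, type = "additive"))
# Alternative: plot(stl(dft, s.window = "periodic"))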

6.1.3 Correlation test

There are several methods to identify correlation between a series and its lags, such as ACF, PACF and lag plots. In a lag plot, the series and a lagged copy of itself are presented as a scatterplot. In simple words, a lag is the series shifted back by a fixed number of time steps. We use the lag plot method for this analysis, which allows us to quickly visualize three things:

  • outliers
  • randomness and
  • auto-correlation.

The plot shown below represents nine different lags. Although we can see a few outliers, there is no randomness in the data. To further explain this, we can see a positive linear trend going upward from left to right in all nine plots. This positive linear trend is an indication that positive auto-correlation is present in our data.

Figure 6.5: Correlation test

Specifically, lags 1, 2, 3 and 9 show strong positive auto-correlation. The presence of auto-correlation can be problematic for some models.
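A nine-panel lag plot like Figure 6.5 can be drawn with gglagplot() from the forecast package; a minimal sketch:

# Scatterplots of the series against lags 1 to 9
forecast::gglagplot(dft, lags = 9, do.lines = FALSE)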

6.1.4 Modelling

In this part of the analysis, we split the data into training and test sets in order to evaluate the performance of four different models before making the actual forecasts.

6.1.4.1 Train-Test Split

set.seed(84)

# horizon (look ahead period)
horizon <- 12

# create split for train and test set
data <- ts_split(dft, sample.out = horizon)

# Split the data into training and testing sets
train <- data$train
test  <- data$test

We have chosen a 12-month look-ahead period (horizon), so the test set contains the last 12 months of our data, i.e. all the months of 2016, on which we will evaluate the performance of the models.
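For readers without the TSstudio package, the same split can be expressed with base R's window(), assuming the series ends in December 2016:

# Equivalent split using stats::window()
train <- window(dft, end = c(2015, 12))
test  <- window(dft, start = c(2016, 1))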

6.1.4.2 Auto Arima

fit_arima <- auto.arima(train)

Figure 6.6: Auto Arima: residuals

A quick look at the residuals from Auto Arima suggests that their mean is very close to zero; however, the histogram shows that the residuals do not follow a normal distribution. This means that forecasts from this method will probably be quite good, but prediction intervals computed assuming a normal distribution may be inaccurate (Hyndman & Athanasopoulos, 2018).
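Diagnostics like Figure 6.6 can be obtained with checkresiduals() from the forecast package, which also reports a Ljung-Box test for remaining autocorrelation in the residuals:

# Residual time plot, ACF, histogram and Ljung-Box test
checkresiduals(fit_arima)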

# Accuracy check/ Forecast evaluation
fc_arima <- forecast(fit_arima, h = horizon)

Figure 6.7: Auto Arima: Actual vs Fitted vs Forecasted

From the plot above, it is observed that the Auto Arima model's fitted values track the training data quite closely, but its forecasts sit somewhat apart from the actual values (test data, year 2016).
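A plot along the lines of Figure 6.7 can be sketched with autoplot()/autolayer() from the forecast and ggplot2 packages (the series labels are our own):

library(ggplot2)
autoplot(train, series = "Actual (train)") +
  autolayer(fitted(fit_arima), series = "Fitted") +
  autolayer(fc_arima$mean, series = "Forecast") +
  autolayer(test, series = "Actual (test)")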

Next, we examine the pattern in actual vs fitted and forecasted values for the remaining three models.

6.1.4.3 Neural Network

fit_nn <- nnetar(train, repeats = 5)
# Accuracy check/ Forecast evaluation 
fc_nn <- forecast(fit_nn, h = horizon)

Figure 6.8: Neural Net: Actual vs Fitted vs Forecasted

6.1.4.4 TBATS

fit_tbats <- tbats(train)
# Accuracy check/ Forecast evaluation 
fc_tbats <- forecast(fit_tbats, h = horizon)

Figure 6.9: TBATS: Actual vs Fitted vs Forecasted

6.1.4.5 ETS

fit_ets <- ets(train)
# Accuracy check/ Forecast evaluation 
fc_ets <- forecast(fit_ets, h = horizon)

Figure 6.10: ETS: Actual vs Fitted vs Forecasted

6.1.5 Evaluating models’ Performance

To compare the performance of all four models on the test data, I have extracted the mean accuracy from each model and arranged the models by the MAPE metric, which is the most commonly used. We will also look at six other metrics to get a better idea of each model's performance.

Of the seven metrics shown in the table below, ME (Mean Error), RMSE (Root Mean Squared Error) and MAE (Mean Absolute Error) are scale-dependent errors, whereas MPE (Mean Percentage Error) and MAPE (Mean Absolute Percentage Error) are percentage errors, and ACF1 denotes the first-order autocorrelation of the forecast errors. Researchers (Hyndman & Athanasopoulos, 2018) suggest that percentage errors have the advantage of being unit-free and so are frequently used to compare forecast performance between data sets.
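For reference, these error measures are simple functions of the forecast errors; hand-rolled versions (a sketch, not the exact forecast::accuracy() implementation) look like this:

# a = actual values, p = predicted values
me   <- function(a, p) mean(a - p)                   # scale-dependent
rmse <- function(a, p) sqrt(mean((a - p)^2))         # scale-dependent
mae  <- function(a, p) mean(abs(a - p))              # scale-dependent
mpe  <- function(a, p) mean((a - p) / a) * 100       # percentage error
mape <- function(a, p) mean(abs((a - p) / a)) * 100  # undefined if any a == 0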

metrics  <- rbind(as.data.frame(round(accuracy(fc_arima$mean, test), 3)),
                  as.data.frame(round(accuracy(fc_nn$mean, test), 3)),
                  as.data.frame(round(accuracy(fc_tbats$mean, test), 3)),
                  as.data.frame(round(accuracy(fc_ets$mean, test), 3))) %>% 
  add_column(models = c("Auto Arima", "NeuralNet", "TBATS", "ETS"), 
             .before = "ME") %>% 
  arrange(MAPE)
Table 6.2: Performance comparison of all models (Afghanistan)
models ME RMSE MAE MPE MAPE ACF1 Theil’s U
TBATS -6.374 23.34 18.93 -7.424 15.29 -0.303 0.645
ETS -6.615 24.75 19.50 -7.781 15.79 -0.315 0.684
NeuralNet -24.098 33.63 26.68 -21.038 22.64 -0.170 0.899
Auto Arima -23.698 34.26 27.21 -20.991 23.22 -0.064 0.953

Based on the MAPE metric, we can see that the TBATS and ETS models achieve higher accuracy (MAPE ≈ 15) and outperform the Auto Arima and Neural Network models. TBATS (exponential smoothing state space model with Box-Cox transformation, ARMA errors, trend and seasonal components) and ETS (exponential smoothing state space model) both use exponential smoothing methods. Specifically, the TBATS modelling approach offers several key advantages, such as handling typical nonlinear features and allowing any auto-correlation in the residuals to be taken into account (Livera, Hyndman, & Snyder, 2011).

In addition to the MAPE metric, which is chosen to identify the best model, we also look at Theil's U statistic to estimate how good or bad a model is. In simple words, Theil's U statistic compares the performance of the model with a naïve/random walk model (U = 1). If Theil's U equals one, the forecasting method is only as good as the naïve model (guessing). A value greater than one means the forecasting method is even worse than guessing. Similarly, a value less than one indicates that the forecasting method is better than the naïve model and worth considering (Oracle, n.d.).
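One common formulation of Theil's U (the U2 form, which we believe corresponds to DescTools::TheilU with the default type = 2) can be written as a one-liner; treat this as a sketch rather than the package's exact implementation:

# a = actual values, p = predicted values
theil_u2 <- function(a, p) sqrt(sum((a - p)^2)) / sqrt(sum(a^2))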

From the comparison, we can see that all four models have Theil's U scores below one, with TBATS and ETS achieving comparatively good scores of about 0.65 and 0.68, against roughly 0.9 for Neural Network and 0.95 for Auto Arima.

6.1.6 Ensemble

As stated in the literature review, much research focuses on a single-model approach or on using the best single model out of all the models. Instead of throwing out weak models, I employ a simple ensemble approach (averaging the predictions of all four models) to improve the overall accuracy on the test set. This is a well-known approach used in machine learning competitions such as those on Kaggle (Jacob van Veen, Nguyen, Dat, & Segnini, 2015). The code below extracts the predictions from all four models and averages them into an ensemble prediction. We then calculate Theil's U score on the ensemble predictions using a simple function in the DescTools package, supplying the actual and predicted observations as shown below:

# extract predictions from all four models and get average
ensemble <- rowMeans(
  cbind(fc_arima$mean, fc_nn$mean, fc_tbats$mean, fc_ets$mean))

# Compute Theil's U statistic (a = actual values, p= predicted values)
cat(paste("Theil's U score on Ensemble: ", 
          round(TheilU(a = test, p = ensemble),3)))
Theil's U score on Ensemble:  0.204

Although the TBATS model is our best single model, the ensemble prediction obtained by averaging the forecasts of the other, weaker models is even better. We can see that the ensemble approach significantly improves the overall accuracy, as measured by a Theil's U score of about 0.2. Recent theoretical work also supports the ensemble approach in time-series forecasting. Researchers (Hyndman & Athanasopoulos, 2018), in their book "Forecasting: Principles and Practice", suggest that applying several different methods to the same time series and then averaging the forecasts often yields better performance than any single best model.

To summarize, the TBATS model may not be the best model on other data; however, the ensemble approach and the corresponding Theil's U score can be used in time-series forecasting to improve accuracy and to justify the reliability of the final predictions.

6.1.7 Forecast future number of attacks

Having evaluated the performance of all four models, the next step is to generate a forecast using all the data points, i.e. 2000-2016. The forecast horizon can be changed based on business requirements and by observing the predictions. As shown in the code chunk below, we first generate forecasts from all four models and then visualize the results with plots.

f_horizon <- 18
# run model on full data
fore_arima <- forecast(auto.arima(dft), h = f_horizon, level = c(80, 95))
fore_nn <- forecast(nnetar(dft, repeats = 5), h = f_horizon, 
                    level = c(80, 95), PI = TRUE)
fore_tbats <- forecast(tbats(dft), h = f_horizon, level = c(80, 95))
fore_ets <- forecast(ets(dft), h = f_horizon, level = c(80, 95))

Figure 6.11: Auto Arima forecast (Afghanistan)

Figure 6.12: Neural Network forecast (Afghanistan)

Figure 6.13: TBATS forecast (Afghanistan)

Figure 6.14: ETS forecast (Afghanistan)

Forecasting results are often represented by a mean value together with 80% and 95% prediction intervals. The mean value of the forecast is taken as the final forecast. Next, we extract the forecasts for the chosen horizon and add the ensemble predictions as the predicted future attacks in Afghanistan.

tbl_arima   <- timetk::tk_tbl(round(fore_arima$mean)) 
tbl_nn      <- timetk::tk_tbl(round(fore_nn$mean))
tbl_tbats   <- timetk::tk_tbl(round(fore_tbats$mean))
tbl_ets     <- timetk::tk_tbl(round(fore_ets$mean))

tbl <- tbl_arima %>% 
    left_join(tbl_nn, by = "index") %>% 
    left_join(tbl_tbats, by = "index") %>% 
    left_join(tbl_ets, by = "index")

names(tbl) <- c("Time_period", "Arima", "NN", "TBATS", "ETS")
tbl$Ensemble <- round(rowMeans(tbl[,2:5]))
Table 6.3: Table of predicted future number of attacks in Afghanistan
Time_period Arima NN TBATS ETS Ensemble
Jan 2017 124 125 118 120 122
Feb 2017 137 139 117 114 127
Mar 2017 136 138 120 121 129
Apr 2017 136 139 134 133 136
May 2017 142 143 144 148 144
Jun 2017 134 138 141 142 139
Jul 2017 140 138 133 137 137
Aug 2017 149 143 132 134 140
Sep 2017 140 138 132 133 136
Oct 2017 154 145 124 124 137
Nov 2017 141 136 116 119 128
Dec 2017 147 137 116 115 129
Jan 2018 149 137 118 120 131
Feb 2018 151 140 117 114 130
Mar 2018 152 140 120 121 133
Apr 2018 154 141 134 133 140
May 2018 155 142 144 148 147
Jun 2018 156 142 141 142 145

6.2 Iraq (Predict future fatalities)

For this analysis, we use the exact same approach as before to estimate the number of fatalities in Iraq.

6.2.1 Data preparation

I have selected the data from 2004 to 2016 to make it appropriate for modeling. Wherever an incident is part of multiple attacks, different sources report different figures. To overcome this issue, I have grouped the data on specific variables and then taken the maximum reported value, as shown in the code chunk below:

dft <- df %>%
  filter(year >= 2004 & country == "Iraq") %>%
  replace_na(list(nkill = 0)) %>% 
  group_by(group_name, region, year, month) %>% 
  # For incidents flagged as part of multiple attacks, keep only the rows
  # with the maximum reported kill count; keep single incidents as they are
  filter(if_else(part_of_multiple_attacks == 1, 
                 nkill == max(nkill), nkill == nkill)) %>%
  ungroup() %>%
  distinct(group_name, region, country, year, month, nkill,
           nwound, part_of_multiple_attacks) %>%
  group_by(year, month) %>%
  summarise(total_count = sum(nkill)) %>%
  ungroup() %>%
  group_by(year) %>%
  # Add missing months and assign 0 where no occurrences
  tidyr::complete(month = full_seq(1:12, 1L), 
                  fill = list(total_count = 0)) %>%
  ungroup()

dft <- dft %>%
  mutate(month_year = paste(year, month, sep="-"),
         month_year = zoo::as.yearmon(month_year)) %>%
  select(month_year, total_count)

# Create a ts object
dft <- ts(dft[, 2], 
          start = Year(min(dft$month_year)), 
          frequency = 12) # 1=annual, 4=quarterly, 12=monthly

dft <- na.kalman(dft)

6.2.2 Seasonality analysis

Figure 6.15: Fatalities frequency by year- Iraq

From the time plot above, we can see an unusual spike of 2,426 deaths in June 2014. This corresponds to major ISIL incidents in which 1,500 people were reportedly killed in a single incident, followed by another single incident involving 600 deaths.

Figure 6.16: Seasonality Plots - Iraq

From the seasonal components, we can see that the number of fatalities is higher mid-year. Specifically, July accounts for the most, followed by April and May. An interesting observation from the second plot above is that the month-to-month variation in fatalities between 2008 and 2013 is quite steady, whereas in the years after 2013 we can see an upward trend as well as noticeable month-to-month differences in the number of fatalities.

From the boxplot, we can also see extreme outliers (in the statistical sense) in June, August and October, indicating very high numbers of fatalities in single incidents.

6.2.3 Correlation test

Figure 6.17: Correlation test

From the lag plot, we can see a slightly positive linear pattern as well as a few outliers in all nine lags; however, there is no randomness in the data. The linear pattern also suggests that auto-correlation is present. In statistical terms, correlation measures the extent of a linear relationship between two variables; in the same way, auto-correlation measures the linear relationship between lagged values of a time series, as shown in the plot above.
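The same autocorrelation structure can be confirmed with an ACF plot; a minimal sketch using the forecast package:

# Spikes outside the dashed bands indicate significant autocorrelation
forecast::ggAcf(dft)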

6.2.4 Modelling

set.seed(84)
# horizon (look ahead period)
horizon <- 18

# create split for train and test set
data <- ts_split(dft, sample.out = horizon)
# Split the data into training and testing sets
train <- data$train
test  <- data$test

# Run models
fit_arima <- auto.arima(train)
fit_nn <- nnetar(train, repeats = 5)
fit_tbats <- tbats(train)
fit_ets <- ets(train, lambda = BoxCox.lambda(train))

#Get validation forecasts
fc_arima <- forecast(fit_arima, h = horizon)
fc_nn <- forecast(fit_nn, h = horizon)
fc_tbats <- forecast(fit_tbats, h = horizon)
fc_ets <- forecast(fit_ets, h = horizon)

metrics  <- rbind(as.data.frame(round(accuracy(fc_arima$mean, test), 3)),
                  as.data.frame(round(accuracy(fc_nn$mean, test), 3)),
                  as.data.frame(round(accuracy(fc_tbats$mean, test), 3)),
                  as.data.frame(round(accuracy(fc_ets$mean, test), 3))) %>% 
  add_column(models = c("Auto Arima", "NeuralNet", "TBATS", "ETS"), 
             .before = "ME") %>% 
  arrange(MAPE)
Table 6.4: Performance comparison of all models (Iraq)
models ME RMSE MAE MPE MAPE ACF1 Theil’s U
Auto Arima 116.42 271.3 201.8 5.843 31.34 0.054 0.822
ETS 103.37 266.5 201.7 3.261 32.16 0.052 0.810
TBATS 102.52 266.1 201.7 3.096 32.21 0.052 0.809
NeuralNet -10.82 255.1 195.8 -18.967 38.47 -0.018 0.817

From the model comparison based on the MAPE metric, we can see that the Auto Arima model performs better on this data. The corresponding Theil's U score is about 0.8 for all the models, which means forecasts from the chosen model are better than random guessing.

Next, we calculate Theil's U score on the ensemble predictions to see how much improvement can be achieved compared to the best single model.

6.2.5 Ensemble

# extract predictions from all four models and create ensemble
preds <- as.data.frame(
  cbind(fc_arima$mean, fc_nn$mean, fc_tbats$mean, fc_ets$mean))
preds$ensemble <- rowMeans(preds)

# Compute Theil's U statistic (a = actual values, p= predicted values)
cat(paste("Theil's U score on Ensemble: ", 
          round(TheilU(a = test, p = preds$ensemble),3)))
Theil's U score on Ensemble:  0.399

As expected, we can see a significant improvement in forecasting accuracy from averaging the predictions of all four models. To reiterate, a Theil's U score below one means the predictions are better than a random guess (naïve model).

6.2.6 Forecast future fatalities

In the validation part, the data was split into train and test sets in order to evaluate the performance of the different models. For the forecast, we run the models on all the data points.

# look ahead period 
f_horizon <- 12
# run model on full data
fore_arima <- forecast(auto.arima(dft), h = f_horizon, level = c(80, 95))
fore_nn <- forecast(nnetar(dft, repeats = 5), h = f_horizon, 
                    level = c(80, 95), PI = TRUE)
fore_tbats <- forecast(tbats(dft), h = f_horizon, level = c(80, 95))
fore_ets <- forecast(ets(dft, lambda = BoxCox.lambda(dft)), 
                     h = f_horizon, level = c(80, 95))

Figure 6.18: Auto Arima forecast (Iraq)

Figure 6.19: Neural Network forecast (Iraq)

Figure 6.20: TBATS Forecast (Iraq)

Figure 6.21: ETS Forecast (Iraq)

tbl_arima   <- timetk::tk_tbl(round(fore_arima$mean)) 
tbl_nn      <- timetk::tk_tbl(round(fore_nn$mean))
tbl_tbats   <- timetk::tk_tbl(round(fore_tbats$mean))
tbl_ets     <- timetk::tk_tbl(round(fore_ets$mean))

tbl <- tbl_arima %>% 
    left_join(tbl_nn, by = "index") %>% 
    left_join(tbl_tbats, by = "index") %>% 
    left_join(tbl_ets, by = "index")

names(tbl) <- c("Time_period", "Arima", "NN", "TBATS", "ETS")
tbl$Ensemble <- round(rowMeans(tbl[,2:5]))
Table 6.5: Table of predicted future fatalities in Iraq
Time_period Arima NN TBATS ETS Ensemble
Jan 2017 722 878 605 597 700
Feb 2017 647 559 605 597 602
Mar 2017 668 521 605 597 598
Apr 2017 662 900 605 597 691
May 2017 664 527 605 597 598
Jun 2017 663 521 605 597 596
Jul 2017 663 900 605 597 691
Aug 2017 663 521 605 597 596
Sep 2017 663 521 605 597 596
Oct 2017 663 592 605 597 614
Nov 2017 663 484 605 597 587
Dec 2017 663 521 605 597 596

We can see flat forecasts from the ETS and TBATS models on this data, which means that the estimated trend and seasonality are too weak for future observations to have different conditional means under those models; in that case, both models essentially project the last estimated level forward. We also computed the Theil's U score for the ensemble on the test set, which is ~0.4. By using the ensemble approach and the corresponding Theil's U score during model evaluation, we can ensure the reliability of the forecasted values on unseen data.
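A flat forecast can be diagnosed by inspecting which specification was selected, e.g. an ETS model without trend and seasonal terms such as ETS(A,N,N); a quick check (a sketch):

# Print the fitted model specifications behind the flat forecasts
fore_ets$model
fore_tbats$model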

6.3 SAHEL Region (Predict future attacks)

The Sahel region in Africa stretches from east to west across the continent. At present, the region draws considerable political attention due to indications of a possible geographical expansion of ISIL (Liautaud, 2018). To estimate the future number of attacks in this region, I have selected data from the year 2000 onwards, filtered to the eight countries that fall within the Sahel region, as shown in the data preparation step.

6.3.1 Data preparation

sahel_region <- c("Mauritania", "Mali", "Burkina Faso", 
                  "Niger", "Nigeria", "Chad", "Sudan", "Eritrea")

dft <- df %>%
  filter(year >= 2000 & country %in% sahel_region) %>%
  group_by(year, month) %>%
  summarise(total_count = n()) %>%
  ungroup() %>%
  group_by(year) %>%
  tidyr::complete(month = full_seq(1:12, 1L), fill = list(total_count = 0)) %>%
  ungroup()

dft <- dft %>%
  mutate(month_year = paste(year, month, sep="-"),
         month_year = zoo::as.yearmon(month_year)) %>%
  select(month_year, total_count)

# Create a ts object
dft <- ts(dft[, 2], start = Year(min(dft$month_year)), 
          frequency = 12) # 1=annual, 4=quarterly, 12=monthly
dft <- na.kalman(dft)

6.3.2 Seasonality analysis

Figure 6.22: Attack frequency by year- SAHEL Region

From the attack frequency by year, it is observed that the number of attacks has increased sharply over the last decade, reaching a peak during 2014-2015. Several researchers (Crone, 2017; Onuoha & Oyewole, 2018) have noted that Boko Haram affiliated itself with Islamic State in 2015, and that a large number of smaller groups from across the region have also declared their affiliation with Islamic State.

Figure 6.23: Seasonal pattern (heatmap) - SAHEL Region

From the heatmap above, we can see a sudden increase in the number of attacks from the year 2012, with more than 50 attacks a month on average. Let's have a look at the seasonal components to see if there is any cyclical pattern.

Figure 6.24: Seasonality pattern (boxplot) - SAHEL Region

In comparison with the number of attacks in Afghanistan and the number of fatalities in Iraq, we see the opposite trend in the SAHEL region: the months at the beginning and end of the year (January to March and October to December) show a higher number of attacks throughout the period (2000-2016), whereas in Afghanistan and Iraq the peaks were mostly observed in the middle of the year.
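This month-of-year pattern can also be checked numerically by averaging attacks per calendar month across 2000-2016; a sketch, assuming dft is the univariate monthly ts created above:

# Mean number of attacks per calendar month (1 = Jan, ..., 12 = Dec)
round(tapply(as.numeric(dft), cycle(dft), mean), 1)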

6.3.3 Correlation test

Figure 6.25: Correlation test

Similar to the correlation tests for Iraq and Afghanistan, a positive linear trend is visible in all nine lags, with lags 1 and 2 suggesting strong auto-correlation.

6.3.4 Modelling

set.seed(84)
# horizon (look ahead period)
horizon <-  18

# create split for train and test set
data <- ts_split(dft, sample.out = horizon)
# Split the data into training and testing sets
train <- data$train
test  <- data$test

# Run models
fit_arima <- auto.arima(train)
fit_nn <- nnetar(train, repeats = 5)
fit_tbats <- tbats(train)
fit_ets <- ets(train)

# Get validation forecasts (used for Table 6.6 and the ensemble below)
fc_arima <- forecast(fit_arima, h = horizon)
fc_nn <- forecast(fit_nn, h = horizon)
fc_tbats <- forecast(fit_tbats, h = horizon)
fc_ets <- forecast(fit_ets, h = horizon)
Table 6.6: Performance comparison of all models (SAHEL Region)
models ME RMSE MAE MPE MAPE ACF1 Theil’s U
Auto Arima -2.010 19.89 17.61 -10.75 26.46 -0.143 0.733
NeuralNet -4.881 22.28 19.73 -16.01 30.62 -0.117 0.811
TBATS -9.321 22.66 19.82 -22.25 32.11 -0.198 0.809
ETS -11.439 23.67 21.11 -25.90 34.97 -0.178 0.901

From the model comparison based on the MAPE metric, we can see that Auto Arima, followed by Neural Network, performs better on this data, and all four models have Theil's U scores below one.

6.3.5 Ensemble

# extract predictions from all four models and get average
ensemble <- rowMeans(cbind(fc_arima$mean, fc_nn$mean, fc_tbats$mean, fc_ets$mean))
# Compute Theil's U statistic (a = actual values, p= predicted values)
cat(paste("Theil's U score on Ensemble: ", 
          round(TheilU(a = test, p = ensemble),3)))
Theil's U score on Ensemble:  0.3

The ensemble prediction further improves the accuracy, as measured by a Theil's U score of 0.3.

6.3.6 Forecast future attacks

# look ahead period 
f_horizon <- 18
# run model on full data
fore_arima <- forecast(auto.arima(dft), h = f_horizon, level = c(80, 95))
fore_nn <- forecast(nnetar(dft, repeats = 5), h = f_horizon, 
                    level = c(80, 95), PI = TRUE)
fore_tbats <- forecast(tbats(dft), h = f_horizon, level = c(80, 95))
fore_ets <- forecast(ets(dft), h = f_horizon, level = c(80, 95))

Figure 6.26: Auto Arima forecast (SAHEL Region)

Figure 6.27: Neural Network forecast (SAHEL Region)

Figure 6.28: TBATS forecast (SAHEL Region)

Figure 6.29: ETS forecast (SAHEL Region)

From the plots above, we can see that the Auto Arima, Neural Network and TBATS models are able to capture the observed trend, whereas the ETS model fails and generates flat predictions. However, as computed before, the ensemble approach compensates for the losses from any weak models.

tbl_arima   <- timetk::tk_tbl(round(fore_arima$mean)) 
tbl_nn      <- timetk::tk_tbl(round(fore_nn$mean))
tbl_tbats   <- timetk::tk_tbl(round(fore_tbats$mean))
tbl_ets     <- timetk::tk_tbl(round(fore_ets$mean))

tbl <- tbl_arima %>% 
    left_join(tbl_nn, by = "index") %>% 
    left_join(tbl_tbats, by = "index") %>% 
    left_join(tbl_ets, by = "index")

names(tbl) <- c("Time_period", "Arima", "NN", "TBATS", "ETS")
tbl$Ensemble <- round(rowMeans(tbl[,2:5]))
Table 6.7: Table of predicted future number of attacks in SAHEL Region
Time_period Arima NN TBATS ETS Ensemble
Jan 2017 88 80 73 67 77
Feb 2017 73 71 76 67 72
Mar 2017 79 82 77 67 76
Apr 2017 78 63 76 67 71
May 2017 80 85 74 67 76
Jun 2017 73 80 71 67 73
Jul 2017 85 89 67 67 77
Aug 2017 72 62 64 67 66
Sep 2017 74 90 63 67 74
Oct 2017 75 86 64 67 73
Nov 2017 74 90 66 67 74
Dec 2017 77 63 70 67 69
Jan 2018 77 83 73 67 75
Feb 2018 76 81 76 67 75
Mar 2018 80 92 77 67 79
Apr 2018 75 64 76 67 70
May 2018 82 92 74 67 79
Jun 2018 80 80 71 67 74

Summary

To summarize this chapter, we analyzed the seasonal components within the time series at a monthly frequency for Afghanistan, Iraq and the SAHEL region. We found an upward trend in the number of attacks from February to May in Afghanistan throughout all the years. Similarly, in Iraq we found higher fatalities during July, followed by April and May, whereas in the SAHEL region this pattern is completely the opposite.

Using time-series forecasting, we estimated the future number of attacks and fatalities with four different models at a monthly frequency. We also illustrated the importance of the ensemble method and evaluated predicted vs actual values using Theil's U statistic, which indicates a significant improvement in forecasting accuracy over the best single model. Comparing the same models on different time-series data shows that the best single model on one series may not be the best single model on another.

Apart from predicting at a monthly frequency and forecasting the number of attacks and fatalities, the scope of this analysis can be extended to predict the number of injuries, to forecast at a quarterly frequency, and to cover any country, using the Shiny app that is also an integral part of this research.