Forecasting Volatility of Ether
An empirical evaluation of volatility models and their
capacity to forecast one-day-ahead volatility of Ether
University of Gothenburg
School of Business, Economics and Law
Department: Graduate School
Master’s Thesis in Finance
Spring 2023
Authors:
Johannes Marmdal
199603084493
Adam Törnqvist
199502069694
Supervisor:
Oben K. Bayrak
Abstract
This study evaluates the performance of volatility models in forecasting one-day-ahead
volatility of the cryptocurrency Ether. The selected models are: GARCH, EGARCH,
GJR-GARCH, SMA9, SMA20, and EWMA. We investigate both in-sample perfor-
mance and out-of-sample performance. In-sample performance concerns only the set
of GARCH models, where the parameters of the models are estimated and the de-
gree of goodness-of-fit is evaluated using Akaike Information Criterion and Bayesian
Information Criterion. For out-of-sample performance, we use Realized Volatility as
a measure of ex-post volatility. The models are evaluated by conducting the Diebold-
Mariano test for statistical difference between the models, and two loss functions:
mean squared errors (MSE) and mean absolute errors (MAE). The results from the
in-sample performance show that GARCH minimizes AIC and BIC using Student’s t-
distribution as well as BIC using the Gaussian distribution. The best model in terms
of AIC using the Gaussian distribution was found to be GJR-GARCH. The out-of-
sample results show that EGARCH is the best performing model using MSE, while
SMA9 is the optimal model using MAE. However, the models are not statistically
different and either one may be considered for forecasting purposes.
Keywords: Forecast, Volatility, Ether, GARCH, EWMA, SMA
Acknowledgement
We would like to thank our supervisor Oben K. Bayrak for valuable insights and
support during the process of writing our thesis.
Contents
1 Introduction
2 Background
2.1 Volatility
2.2 The Cryptocurrency Market
2.3 Ethereum
3 Literature Review
3.1 Volatility
3.2 Volatility in Cryptocurrencies
4 Data
5 Methodology
5.1 Realized Volatility
5.2 Maximum Likelihood Estimation (MLE)
5.3 GARCH(1,1)
5.4 EGARCH(1,1)
5.5 GJR-GARCH(1,1)
5.6 Simple Moving Average (SMA)
5.7 Exponentially Weighted Moving Average (EWMA)
5.8 In-Sample Performance
5.9 Out-of-Sample Performance
6 Results
6.1 In-Sample Results
6.2 Out-of-Sample Results
7 Discussion
8 Conclusion
References
Appendix
1 Introduction
Volatility research has been an active area of interest for many years, because of
its importance in making informed decisions and managing risk effectively (Poon &
Granger, 2003). Practitioners use volatility in various ways, including pricing derivatives
and assessing portfolio performance through measures such as the Sharpe ratio
(Sharpe, 1998). Furthermore, since the implementation of the Basel Accord in 1996,
volatility forecasting has become a compulsory risk-management exercise for finan-
cial institutions, further emphasizing the importance of studying volatility (Poon &
Granger, 2003). As a consequence of the central role of volatility in finance, there
is a vast amount of literature on volatility for traditional assets. At the same time,
a new asset class has emerged since Nakamoto (2008) released the whitepaper for
Bitcoin, which launched Bitcoin and ultimately laid the foundation for the
digital currencies known as cryptocurrencies. Since then, scholars have taken an interest
in researching the volatility of cryptocurrencies (Chu et al., 2017; Dyhrberg, 2016a,
2016b; Katsiampa, 2017). The current literature focuses on Bitcoin, and
few, if any, studies within the asset class evaluate different volatility models
against each other. This creates an opportunity for us, as we aspire to generate
new insights into the field by evaluating how well volatility models forecast
the one-day-ahead volatility of Ether.
In this study, we employ the following volatility models: SMA, EWMA, GARCH,
EGARCH, and GJR-GARCH. The set of GARCH models is evaluated both under
the Gaussian and Student’s t distributions. Our choice of the SMA order is 9 and 20
(SMA9, SMA20). SMA and EWMA are simple time series models that are calculated
based on historical returns, and there is thus no estimation using maximum likelihood,
unlike the set of GARCH models. As a consequence, the evaluation of goodness-of-fit
to the underlying data, that is, the in-sample analysis, focuses solely on the set of GARCH
models. The in-sample analysis involves estimating the GARCH parameters and
comparing the models using the Akaike Information Criterion (AIC) and the Bayesian
Information Criterion (BIC), neither of which applies to SMA and EWMA.
All models are evaluated out-of-sample, meaning that the models produce forecasts
on new, unseen data. Realized Volatility is used as an ex-post measure of actual
volatility (see Figure 5 in the Appendix). The models are then evaluated using the
Diebold-Mariano test and the loss functions mean squared error (MSE) and mean
absolute error (MAE). The DM-test checks whether the forecast errors of two models
differ significantly, and the loss functions measure the distance between the models’
predicted values and ex-post actual volatility.
Ultimately, we ask the following question: which of the different volatility models
produces the best one-day-ahead forecasts of volatility for Ether? The predictive
accuracy of the models is evaluated both in-sample and out-of-sample.
The GARCH model has the strongest in-sample performance for both probability
distributions. GARCH was found to minimize AIC and BIC using Student’s t-
distribution and BIC using the Gaussian distribution, whereas GJR-GARCH min-
imizes AIC under the Gaussian distribution. The strong result explains to some
degree why GARCH is a popular choice for modeling volatility.
The out-of-sample result shows that EGARCH is the best model for forecasting the
one-day-ahead volatility of Ether using MSE and the DM-test. This may be attributed to the
capacity of EGARCH to capture volatility asymmetry and persistence in volatility.
Using MAE and the DM-test, the SMA9 is the best forecasting model. The success
may be credited to the model quickly adapting to underlying volatility changes.
The subsequent sections of this thesis are structured as follows. The first section is the
background which delves into volatility, the cryptocurrency market, and the object of
analysis, Ether. The literature review is composed of two parts. The first part exam-
ines the existing literature on volatility in traditional assets, while the second assesses
the existing literature related to volatility in cryptocurrencies. Next, the data section
explains how we acquired the data to write the thesis. Thereafter is the methodol-
ogy section which presents the Realized Volatility, Maximum Likelihood Estimation,
the selected models and outlines the procedure for evaluating in-sample performance
and out-of-sample performance. After the methodology, the results are presented, organized
into in-sample results and out-of-sample results. Thereafter follows a discussion
that focuses on the particular choices made during the process of writing the
thesis, its limitations, and how these choices may be changed as suggestions for further
research. Finally, the conclusion summarizes the key findings of the thesis.
2 Background
The purpose of this section is to provide context and establish the foundation for
the research that subsequently will be presented in this thesis. The first part covers
volatility, which is a discussion of stylized facts about volatility as well as a distinction
between volatility, standard deviation, and risk. Next, we provide information about
the broader cryptocurrency market to provide context for this relatively new asset
class for the reader. Finally, we discuss the object of analysis for this thesis, which is
Ether, the cryptocurrency used on the Ethereum blockchain.
2.1 Volatility
According to Poon and Granger (2003), volatility is often referred to as standard
deviation in finance, or variance computed from a set of observations as:
$$\hat{\sigma}^2 = \frac{1}{N-1} \sum_{t=1}^{N} (R_t - \bar{R})^2 \tag{1}$$
where $R_t$ is the asset return at time $t$ and $\bar{R}$ is the mean return. Volatility is therefore a measure of the degree
of variation of an asset’s price over time. It is commonly used to quantify risk,
even though risk is typically associated with small or negative returns, whereas
volatility makes no such distinction. Standard deviation, on the other hand, is a
statistical measure that represents the typical deviation of a set of numbers from their
mean. The sample standard deviation σ̂ is then a distribution-free parameter that
represents the second moment characteristic of the sample.
Some regular patterns and characteristics are commonly observed in financial data
across time (Poon & Granger, 2003). These patterns are important to take into
consideration for proper model specification, estimation, and forecasting. Poon and
Granger (2003) mention five important characteristics of financial time series:
1. Fat-tailed distributions. Riskier assets have probability distributions that have
higher probability of extreme outcomes than a normal distribution. There is an
increased likelihood of extreme events, i.e. large price changes.
2. Volatility clustering. Financial time series typically show that large price vari-
ations are followed by large price variations and small price variations are typ-
ically followed by small price variations. This behavior is favorable because it
suggests that volatility is predictable.
3. Asymmetry. Volatility tends to be greater when the price declines than when
the price increases. This means that a negative return tends to raise subsequent
volatility more than a positive return of the same magnitude.
4. Mean reversion. Volatility tends to return to a long-term average level over
time, i.e. a period of high volatility will eventually fade, and after a period of low
volatility, volatility will eventually rise to its long-term average level.
5. Long memory. There is a tendency for past values of volatility to have an effect
on future values, even for long lags.
2.2 The Cryptocurrency Market
The cryptocurrency market began with the launch of Bitcoin in 2009, created by an
individual or group of individuals using the pseudonym Satoshi Nakamoto (Nakamoto,
2008). Bitcoin was the first decentralized digital currency, and it used a technology
called blockchain to record and verify transactions. This technology allows for the
creation of a digital ledger that is distributed across a network of computers, mak-
ing it difficult to hack or manipulate. The launch of Bitcoin sparked interest in the
potential of other digital currencies, and since then, thousands of cryptocurrencies
have been created (CoinMarketCap, 2023). The market has grown significantly in
size, with a total market capitalization of over 1 trillion USD as of April 2023, despite
being worth significantly less than at its peak. Moreover, the market is always
open, and the currencies may be traded at any time (FOREX, 2023). Cryptocur-
rencies are attractive because they offer an alternative to traditional fiat currencies
and financial systems, allowing for faster and cheaper cross-border transactions, and
greater financial privacy.
There are several examples of how the cryptocurrency market has risen in popularity
and size. The American bank Wells Fargo (2022) compares the current high growth
to that of the hyper-adoption phase of the internet in the mid-1990s. Retail adoption
has also increased. By retail adoption, we mean both retail traders and investors,
as opposed to professional traders and investors, and businesses accepting
digital currencies as payment. Referring to the former, Makarov and Schoar (2020)
show that in 2020, there were 50 million investors trading bitcoin and other cryptocurrencies.
Some examples of companies accepting some form of cryptocurrency as
payment are PayPal, Virgin Group, and Whole Foods (Haqqi, 2022). Further, as the
market matures, investing in the new asset class has become more attractive to institutional
investors: institutional investments into the cryptocurrency asset
class increased five-fold in 2021 to 13.65 billion USD (Thomas & Sabater, 2022).
Another signal that digital currencies play a role in the future economy is the fact
that over 100 nation states are exploring Central Bank Digital Currencies (CBDCs)
as a method of payment (Georgieva, 2022).
2.3 Ethereum
Ethereum is a blockchain platform launched in 2015 and created by Vitalik Buterin; its native cryptocurrency is Ether.
It is similar to Bitcoin, but it also includes a programming language that allows for
the creation of smart contracts and decentralized applications (Buterin et al., 2014).
The Ethereum network has grown in size and has become one of the most widely
used blockchain platforms. Currently, Ether is the second-largest cryptocurrency by
market capitalization according to CoinMarketCap (2023), and proponents consider
it to be special because of its ability to support smart contracts and decentralized
applications, which has led to the development of a thriving ecosystem of decentralized
finance (DeFi) and non-fungible tokens (NFTs) on top of its blockchain.
In recent years, Ethereum has undergone several significant changes and develop-
ments. One of the most notable changes has been the transition from a Proof of Work
(PoW) consensus mechanism to a Proof of Stake (PoS) mechanism. The Ethereum
2.0 upgrade, which began rolling out in December 2020 and was completed in Septem-
ber 2022, moved the network from a PoW to a PoS mechanism, which is designed
to improve the network’s scalability, security, and energy efficiency (Ethereum Foun-
dation, 2023). This is a major change in the trajectory of the Ethereum network.
PoS is a different approach to reaching a consensus on the state of the blockchain. In
PoW, miners compete to solve complex mathematical problems to validate transactions
and receive (for example) Bitcoin in return. In contrast, PoS validators are
chosen in proportion to their economic stake in the network, meaning that validators
are chosen based on the amount of Ether they hold and are willing to “stake” as
collateral. This is important because one of the main criticisms of cryptocurrencies
is the amount of energy that is used by PoW mining. With PoS, however, there is no
need for powerful computing resources, and the Crypto Carbon Ratings
Institute estimates that electricity consumption drops by more than 99% as a result of
transitioning to PoS (Ethereum Foundation, 2021).
3 Literature Review
Volatility forecasting in financial markets has been an active area of interest for many
years. The ability to accurately forecast volatility is crucial for investors, traders,
and risk managers, as it allows them to make informed decisions and manage risk
effectively (Poon & Granger, 2003). This literature review aspires to provide an
overview of the various methods used to forecast volatility in financial markets and
to evaluate their performance. The models that are used for forecasting volatility for
Ether will be further elaborated on in the methodology section. The first section of
this literature review covers research related to volatility forecasting for traditional
assets and the second section covers the current state of volatility forecasting in
cryptocurrencies.
3.1 Volatility
Since understanding volatility is crucial for making informed investment decisions, there
is a lot of research covering the topic in finance. The literature indicates that increasing
the complexity of volatility models does not necessarily result in better forecasts
(Brailsford & Faff, 1996). Some studies find that complex models outperform
simpler ones, and others find the reverse. Broadly speaking, there are two
approaches to forecasting volatility: there are time series forecasting models, and
there is implied volatility from options (Poon & Granger, 2003). For the purpose of
this thesis, only time series forecasting models will be applied.
The simplest historical price model is the random walk (Poon & Granger, 2003). The
model assumes that the best forecast for volatility is the previous value of volatility.
Random walk therefore assumes that volatility changes randomly over time and is
unpredictable. One slightly more complex approach to modeling volatility is the
simple moving average (SMA) model. The volatility forecast is computed from an
average of the past values of volatility (Poon & Granger, 2003). Choosing the number
of past values to include is essential because small windows will contain too much noise
and large windows are insensitive to changes in the volatility. Compared with
GARCH, a common volatility model that is elaborated upon below,
SMA has been superior in various settings, for example in Brooks (1998),
forecasting daily volatility on the DJ, and McMillan et al. (2000), forecasting daily
and weekly volatility for the FTSE100.
An extension of the SMA model is the Exponentially Weighted Moving Average
(EWMA) model. Using the EWMA, more recent observations are given a higher
weight than older observations, because the weights decrease exponentially as the
distance from the present increases (Poon & Granger, 2003). In other words, the
model reacts more quickly to changes in the underlying time series, thus mitigating
the insensitivity of large windows in the SMA model. EWMA has been superior to
GARCH in various settings, for example, Boudoukh et al. (1997) forecasting daily
volatility for 3-month T-bills, Brooks (1998) forecasting daily volatility on the DJ, Taylor
(1986) forecasting daily volatility for several types of asset classes, Tse (1991) forecasting
daily volatility in Japan, and Tse and Tung (1992) forecasting daily volatility
in the Singaporean stock market.
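The two averaging schemes described above can be sketched as follows. This is only a minimal illustration, assuming that both forecasts are built from squared daily returns as a variance proxy and that the EWMA is seeded with the first squared return; the function names, the example series, and the decay values are hypothetical and are not taken from this thesis.

```python
def sma_variance_forecast(returns, window):
    """Equal-weight mean of the last `window` squared returns (assumed variance proxy)."""
    if len(returns) < window:
        raise ValueError("need at least `window` observations")
    recent = returns[-window:]
    return sum(r * r for r in recent) / window


def ewma_variance_forecast(returns, decay=0.94):
    """Exponentially weighted variance: s_t = decay * s_{t-1} + (1 - decay) * r_t^2.

    The recursion is seeded with the first squared return (an assumption).
    """
    if not returns:
        raise ValueError("need at least one observation")
    s = returns[0] ** 2
    for r in returns[1:]:
        s = decay * s + (1.0 - decay) * r * r
    return s


# Hypothetical daily log returns: a calm period followed by one large shock.
example_returns = [0.01, -0.01, 0.01, -0.01, 0.10]
sma_forecast = sma_variance_forecast(example_returns, window=5)
ewma_forecast = ewma_variance_forecast(example_returns, decay=0.5)
```

In this toy series the EWMA forecast reacts more strongly to the final shock than the five-day SMA, because the most recent observation carries the largest weight.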
One of the most widely used volatility forecasting models is the Generalized Autoregressive
Conditional Heteroskedasticity (GARCH) model, introduced by Bollerslev
(1986) as an extension of the autoregressive conditional heteroskedasticity
(ARCH) model of Engle (1982). One main difference between the models is that lagged conditional
variances are allowed in the conditional variance equation in the GARCH
model whereas in the ARCH model, the conditional variance equation is only a linear
function of prior sample variances. Bollerslev (1986) argues that the lag structure
in the GARCH model is more flexible compared to the lag structure in the ARCH
model.
Akgiray (1989) compares the capacity of four different methods to forecast the volatility
of stock returns. The simple historical average was utilized as the benchmark
forecast method and the exponentially weighted moving average forecast as
the second method; the third and fourth methods were ARCH and GARCH. The
results showed that ARCH and GARCH produced the most accurate forecasts of the
24 monthly return volatilities among the four methods included in the study. Further
comparison between ARCH and GARCH indicates that the GARCH model is
more accurate when forecasting the 24 monthly return volatilities. In another study,
West and Cho (1995) examined the effectiveness of different models for forecasting
the volatility of currency exchange rates, utilizing bilateral weekly data for the US dollar. The
study compared a total of six models that included both ARCH and GARCH. The
findings revealed that when the models were evaluated for a one-week-ahead forecast,
the GARCH model demonstrated the most accurate prediction out of all the models.
However, when the models were assessed for twelve-week and twenty-four-week
ahead forecasts, no evidence was found to suggest that any one model produced more
accurate predictions than the others.
Andersen and Bollerslev (1998) studied the capacity of the GARCH(1,1) model
to forecast the conditional variance of the Deutschemark-U.S. Dollar (DM–$) and
Japanese Yen-U.S. Dollar (Y–$) spot exchange rates. Their results demonstrate that when
squared returns sampled at a daily frequency are used as the ex-post estimate
of volatility, the GARCH(1,1) model performed poorly. However, increasing the
sampling frequency led to better performance by the model, with five-minute
sampling leading to the best performance. Moreover, Hansen and
Lunde (2005) compared 330 different types of GARCH models with regard to their
capacity to forecast the one-day-ahead conditional variance of the DM–$ spot exchange
rate and IBM stock returns. The results for the DM–$ spot exchange
rate show that the GARCH(1,1) model, which was proposed by Bollerslev (1986),
is not outperformed by the other models. For IBM stock returns, however,
the GARCH(1,1) model is outperformed by other models, specifically models that
contain a leverage effect.
3.2 Volatility in Cryptocurrencies
Bitcoin paved the way for the cryptocurrency market and as a result, the majority
of research and analysis regarding volatility in the crypto space has centered on the
largest cryptocurrency in the world in terms of market capitalization, Bitcoin.
Several studies have used extensions of the GARCH model to analyze the volatility
of Bitcoin. For example, Glaser et al. (2014) study the volatility of Bitcoin using
an ARCH model, while Dyhrberg (2016a) employs the Exponential GARCH, which
allows for non-linear modeling of volatility and includes exponential functions of the
residuals in the GARCH equation. In addition, Dyhrberg (2016b) and Bouri et al.
(2017) utilize the Threshold GARCH model, which enables a non-linear relationship
between the conditional volatility and the past residuals, thereby capturing asymme-
try in the conditional volatility.
These studies show that research in the asset class utilizes advanced extensions of the
GARCH model, rather than the basic GARCH specification. However, it is worth
noting that these studies have only employed a single autoregressive conditional
heteroskedasticity model. In contrast, this paper aims to compare and evaluate multiple
models to arrive at a more comprehensive understanding of the volatility of Ether.
Katsiampa (2017) investigated the volatility of Bitcoin using the simple GARCH and
five extensions of the model and found that the Asymmetric Component GARCH
was the best-performing model based on its goodness-of-fit to the data. However,
the evaluation was limited to an in-sample basis. In-sample performance tests how
well a model fits the data during the model-fitting process, while a strong forecasting
model should provide accurate forecasts on new, unseen data. Moreover, Chu et al.
(2017) evaluated twelve GARCH models for seven different cryptocurrencies on an
in-sample basis. They find that the normal distribution produces the best fit for
all models except TGARCH and AVGARCH. However, different models
worked well with different cryptocurrencies, and the IGARCH and GJR-GARCH
seemed to perform the best for the set of cryptocurrencies studied.
Baur and Dimpfl (2018) study the asymmetric volatility effects of the 20 largest
cryptocurrencies and find that volatility was higher after a positive shock in returns
than after a negative shock. This is in contrast to the volatility asymmetry discussed in the
background section, where volatility tends to be greater on the downside. The authors
attribute this difference to uninformed investors buying due to fear of missing out
and the presence of pump and dump schemes in the cryptocurrency market.
The GARCH model and its variants have gained significant popularity in modeling
volatility across various asset classes, including cryptocurrencies, and have shown
promising results. In this study, we aim to forecast the one-day-ahead volatility of
Ether by employing different models, including GARCH, EGARCH, GJR-GARCH,
simple moving average (SMA) with 9 and 20 lags, and exponentially weighted moving
average (EWMA) models.
Our choice of models is based on this literature review, where it has been found that
GARCH and EGARCH models are commonly used to model volatility in financial
assets, while the GJR-GARCH model is preferred for assets with asymmetric volatility.
Additionally, we found no studies in the cryptocurrency volatility literature that
have used SMA and EWMA models for forecasting. Therefore, by incorporating these
simpler models, we aim to provide a more accessible approach to understanding the
volatility patterns of Ether.
Overall, this thesis aspires to contribute to the existing literature on cryptocurrency
volatility by exploring a range of modeling techniques and evaluating their forecasting
accuracy for Ether.
4 Data
We obtain Ether historical price data from finnhub.io using their API. We consider
00:00 UTC as the start of a new day. The dataset includes prices of Ether sampled
at a five-minute frequency from the first of January 2018 00:00 UTC to the first of
January 2023 00:00 UTC. Unfortunately, 34 days in the sample had incomplete
observations. To tackle this problem, all 34
days with missing data are excluded entirely.
To get the return of the cryptocurrency, we use the following equation:
$$r_{i,t} = \ln\left(\frac{P_{i,t}}{P_{i-1,t}}\right) \tag{2}$$
where $r_{i,t}$ is defined as the logarithmic return of the cryptocurrency at interval $i$ on
day $t$, $P_{i,t}$ is the closing price of interval $i$ on day $t$, and $P_{i-1,t}$ is the closing price of
interval $i-1$ on day $t$.
We compute log returns for all five-minute intervals using equation (2). The daily log
returns are acquired by summing all five-minute intervals that constitute a day. Since
the cryptocurrency market is open 24 hours a day, the total number of five-minute
intervals for a day is 288. This procedure is repeated for all days in our sample and,
after the exclusion of 34 days, the daily log returns amount to 1792 observations.
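The aggregation from interval returns to a daily return can be illustrated with a short sketch. It assumes a plain list of consecutive closing prices, where the first element is the close preceding the first interval, so that 289 prices would yield the 288 returns of a full day; the function names and the price values are hypothetical.

```python
import math


def log_returns(prices):
    """Log return for each consecutive pair of closing prices, as in equation (2)."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def daily_log_return(prices):
    """Daily log return as the sum of the interval log returns."""
    return sum(log_returns(prices))


# Hypothetical closing prices for three consecutive intervals.
example_prices = [100.0, 101.0, 99.5, 102.0]
interval_returns = log_returns(example_prices)
daily_return = daily_log_return(example_prices)
```

Because log returns are additive, the summed value equals the log of the last price divided by the first price.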
Table 1 presents the descriptive statistics for the log returns of Ether. The sample
consists of 1792 observations, with a mean log return of 0.00 and a standard deviation
of 0.05. The minimum and maximum log returns are -45% and 25.5% respectively,
indicating a large range of variation in returns. The skewness and kurtosis coefficients
are -0.722 and 9.291, respectively, indicating a negatively skewed and leptokurtic
distribution of returns. A negatively skewed distribution of log returns means that
the distribution has a longer tail to the left of the mean than to the right, thus
suggesting more extreme negative returns than positive returns, which is in line with
the min and max log returns. A leptokurtic distribution implies that the daily log
returns have more extreme values (positive or negative) than the normal distribution.
Table 1: Descriptive Statistics of Log Returns
              Obs    Mean    Std     Min      Max     Skewness   Kurtosis
Log Returns   1792   0.000   0.050   -0.450   0.255   -0.722     9.291
Table 2 displays the results of various diagnostic tests for the daily log returns of
Ether. The Ljung-Box test is used to test for autocorrelation, while the augmented
Dickey-Fuller test is used to test for stationarity. The Jarque-Bera test is used to
test for normality in the distribution of returns. The results from the Ljung-Box test
show that the daily log returns exhibit significant autocorrelation for up to 10 lags at
the 5% significance level. The result from the Augmented Dickey-Fuller test shows that
the daily log returns are stationary at the 5% significance level. Finally, the result from
the Jarque-Bera test shows that the daily log returns are non-normal, as indicated by
the test statistic with a value of 3110.700, in addition to the negative skewness and
excess kurtosis from Table 1. Overall, the results suggest that the daily log returns
of Ether are non-normal, stationary, and exhibit significant autocorrelation up to 10
lags.
Table 2: Tests for Autocorrelation, Stationarity, and Normality
              Ljung-Box (10)   Augmented Dickey-Fuller   Jarque-Bera
Log Returns   23.209**         -11.551**                 3110.700**
Note: ** p < 0.05.
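As a rough consistency check, the Jarque-Bera statistic can be recomputed from the moments in Table 1. The sketch below assumes the standard formula JB = n/6 · (S² + (K − 3)²/4) and assumes that the kurtosis in Table 1 is the raw (non-excess) kurtosis; neither assumption is stated in the thesis, and the function name is hypothetical.

```python
def jarque_bera(n_obs, skewness, kurtosis):
    """Jarque-Bera statistic from sample size, skewness, and raw kurtosis."""
    return n_obs / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


# Rounded inputs taken from Table 1.
jb_statistic = jarque_bera(1792, -0.722, 9.291)
```

With these rounded inputs the result lands close to the reported value of 3110.700; the small gap is consistent with rounding of the skewness and kurtosis.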
Figure 1: Daily Log Returns
5 Methodology
This section presents Realized Volatility, the ex-post estimate for actual volatility
that we compare the performance of the models against. Thereafter the maximum
likelihood estimation is discussed, and the volatility models that are utilized for fore-
casting the one-day-ahead volatility of Ether are presented. Lastly, the Methodology
section concludes with a discussion of the methods used to evaluate the predictive
power of these models, both in-sample and out-of-sample.
5.1 Realized Volatility
According to Andersen et al. (2003), the estimates of realized volatility are unbiased
and efficient estimates for actual volatility. Therefore, we use realized volatility in
equation (3) as an ex-post estimate for actual volatility. Since the price of Ether is
sampled at a five-minute frequency, the daily number of five-minute squared returns is
equal to 288. The reason for using five-minute intervals is to limit microstructure noise,
which results in biased Realized Volatility (Dimpfl & Peter, 2021); according to
Poon and Granger (2003), sampling at five-minute intervals addresses this problem. We follow equation
(2) to acquire the log returns ri,t for each five-minute interval. Then, the returns are
squared and finally summed up to acquire the Realized Variance:
$$RV_t = \sum_{i=1}^{n} r_{i,t}^2 \tag{3}$$
where $r_{i,t}^2$ is the square of the logarithmic return for five-minute interval $i$ on day $t$
for the cryptocurrency, and $n = 288$ is the number of intervals per day. Thus, the Realized Volatility is equal to:
$$\sqrt{RV_t} \tag{4}$$
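Equations (3) and (4) translate directly into code. The following is a minimal sketch with hypothetical function names and a made-up day of 288 five-minute log returns of equal size and alternating sign.

```python
import math


def realized_variance(intraday_returns):
    """Sum of squared intraday log returns for one day, as in equation (3)."""
    return sum(r * r for r in intraday_returns)


def realized_volatility(intraday_returns):
    """Square root of the realized variance, as in equation (4)."""
    return math.sqrt(realized_variance(intraday_returns))


# Hypothetical day: 288 five-minute log returns of +/- 0.1%.
example_day = [0.001 if i % 2 == 0 else -0.001 for i in range(288)]
rv = realized_variance(example_day)
rvol = realized_volatility(example_day)
```

Note that the alternating signs cancel in the daily return but not in the realized variance, which is why the measure captures intraday variation that a daily return would miss.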
5.2 Maximum Likelihood Estimation (MLE)
The set of GARCH models is estimated using Maximum Likelihood Estimation
(MLE) as implemented in the rugarch package (Ghalanos, 2022) in R. The MLE approach
involves finding the values of the parameters that maximize the likelihood function,
which measures how well the observed data fit the assumed distribution (Myung,
2003). In other words, MLE selects the parameter values under which the observed
data are most likely.
We can write the daily log return series for the set of GARCH models as:
$$y_t = \sigma_t \epsilon_t \tag{5}$$
where
$$\epsilon_t \sim \text{i.i.d.}(0, 1) \tag{6}$$
The probability distributions of the error term that we use are Gaussian and Student’s
t. In this thesis, the Student’s t distribution has been utilized due to its relevance in
modeling financial assets. As previous research has indicated, financial assets often
exhibit heavy-tailed characteristics (Loretan & Phillips, 1994; Poon & Granger, 2003),
which makes the Student’s t distribution a suitable choice due to its ability to capture
heavy-tailedness. The Gaussian distribution, also known as the normal distribution,
assumes that the errors follow a bell-shaped curve. The probability density function
(PDF) for the Gaussian distribution is given by:
$$f(y_t) = \frac{1}{\sqrt{2\pi\sigma_t^2}} \exp\left(-\frac{y_t^2}{2\sigma_t^2}\right) \tag{7}$$
To estimate the GARCH models under the Gaussian distribution,
we maximize the following log-likelihood function:
$$L = \sum_{t=1}^{n} \left[ -\frac{1}{2}\log(2\pi) - \frac{1}{2}\log(\sigma_t^2) - \frac{y_t^2}{2\sigma_t^2} \right] \tag{8}$$
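To make the objective in equation (8) concrete, the sketch below evaluates the Gaussian log-likelihood for a given return series and a given conditional variance series. It is an illustration only and not the rugarch implementation; the function name and the inputs are hypothetical, and the optimization over parameters is left out.

```python
import math


def gaussian_log_likelihood(returns, variances):
    """Sum over t of -0.5*log(2*pi) - 0.5*log(sigma_t^2) - y_t^2 / (2*sigma_t^2)."""
    if len(returns) != len(variances):
        raise ValueError("returns and variances must have the same length")
    total = 0.0
    for y, s2 in zip(returns, variances):
        total += -0.5 * math.log(2.0 * math.pi) - 0.5 * math.log(s2) - y * y / (2.0 * s2)
    return total


# Hypothetical inputs: two zero returns with unit conditional variance.
example_ll = gaussian_log_likelihood([0.0, 0.0], [1.0, 1.0])
```

An estimation routine would search over the model parameters that generate the variance series so that this value is as large as possible.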
We follow the procedure from Bollerslev (1987) for the Student’s t distribution with
a PDF given by:
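$$f(y_t) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)\sqrt{\pi(\nu-2)\sigma_t^2}} \left(1 + \frac{y_t^2}{(\nu-2)\sigma_t^2}\right)^{-\frac{\nu+1}{2}} \tag{9}$$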
To estimate the GARCH models under the Student’s t distribution, we maximize the
following log-likelihood function:
$$L = \sum_{t=1}^{n} \left[ \log\Gamma\left(\frac{\nu+1}{2}\right) - \log\Gamma\left(\frac{\nu}{2}\right) - \frac{1}{2}\log[\pi(\nu-2)] - \frac{1}{2}\log(\sigma_t^2) - \frac{\nu+1}{2}\log\left(1 + \frac{y_t^2}{(\nu-2)\sigma_t^2}\right) \right] \tag{10}$$
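As a rough illustration of the two objective functions, the sketch below evaluates the log-likelihoods of equations (8) and (10) for a given series of returns and conditional variances. It is not the rugarch implementation; the function names and the sample data are hypothetical, and equation (10) is read as the sum of the log of the density in equation (9).

```python
import math

def gaussian_loglik(y, sigma2):
    """Equation (8): Gaussian log-likelihood given returns and conditional variances."""
    return sum(
        -0.5 * math.log(2 * math.pi) - 0.5 * math.log(s2) - yt * yt / (2 * s2)
        for yt, s2 in zip(y, sigma2)
    )

def student_t_loglik(y, sigma2, nu):
    """Equation (10): standardized Student's t log-likelihood (requires nu > 2)."""
    const = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(math.pi * (nu - 2)))
    return sum(
        const - 0.5 * math.log(s2)
        - (nu + 1) / 2 * math.log(1 + yt * yt / ((nu - 2) * s2))
        for yt, s2 in zip(y, sigma2)
    )

# Hypothetical daily returns with a constant conditional variance.
y = [0.01, -0.03, 0.02]
sigma2 = [0.0004, 0.0004, 0.0004]
ll_gauss = gaussian_loglik(y, sigma2)
ll_t = student_t_loglik(y, sigma2, nu=5.0)
```

In estimation, the conditional variances would themselves be produced by the GARCH recursion for each candidate parameter vector, and an optimizer would search for the parameters that maximize the chosen function.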
5.3 GARCH(1,1)
The GARCH(1,1) proposed by Bollerslev (1986) may be written as follows:
$$\sigma_{t+1}^2 = \omega + \alpha y_t^2 + \beta\sigma_t^2 \tag{11}$$
The GARCH model has some restrictions to ensure its parameters are positive and the
model produces valid and meaningful forecasts of volatility. Specifically, we require
α > 0, β > 0, and ω > 0 to ensure positivity. In addition, to ensure stationarity, we
require that α + β < 1.
A strength of the GARCH model is its ability to capture the well-documented behavior
of volatility clustering in financial time series (Poon & Granger, 2003; Tsay, 2010).
This means that the model can identify periods of high and low volatility and adjust
the forecast accordingly. Moreover, the tail distribution of a GARCH(1,1) process is
heavier than that of a normal distribution (Tsay, 2010). This implies that extreme
events are more likely to occur in a GARCH(1,1) process than under a normal distribution.
On the other hand, Hansen and Huang (2016) note that the GARCH model
performs poorly in scenarios where volatility “jumps” to a new level over a short
period of time. In such situations, the GARCH model will be slow to catch up to
the new level of volatility.
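The recursion in equation (11) can be illustrated with a minimal sketch. The parameter values are the rounded Gaussian GARCH estimates reported in Table 3; the returns and the initial variance are hypothetical, and the function name is ours.

```python
def garch11_variances(returns, omega, alpha, beta, sigma2_init):
    """Apply sigma2_{t+1} = omega + alpha * y_t^2 + beta * sigma2_t along a return series."""
    if not (omega > 0 and alpha > 0 and beta > 0 and alpha + beta < 1):
        raise ValueError("parameters violate the positivity or stationarity restrictions")
    sigma2 = [sigma2_init]
    for y in returns:
        sigma2.append(omega + alpha * y * y + beta * sigma2[-1])
    return sigma2

returns = [0.01, -0.04, 0.02]  # hypothetical daily log returns
path = garch11_variances(returns, omega=0.0002, alpha=0.0845, beta=0.8343,
                         sigma2_init=0.0025)  # hypothetical starting variance
one_day_ahead_vol = path[-1] ** 0.5  # volatility forecast for the next day
```

The last element of the path is the one-day-ahead variance forecast, so its square root is the quantity compared against Realized Volatility.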
5.4 EGARCH(1,1)
EGARCH, or Exponential Generalized Autoregressive Conditional Heteroskedastic-
ity, was proposed by Nelson (1991) to overcome some of the shortcomings of the
GARCH model. In particular, the EGARCH extends the GARCH model by allow-
ing for asymmetry and leverage effects in the way volatility responds to positive and
negative returns. Nelson (1991) used the natural logarithm of the conditional variance
$\sigma_t^2$ to guarantee a positive conditional variance, instead of imposing the previously
described restrictions of the GARCH model. The asymmetric effect is shown through
the weighted innovation g(ϵt). The EGARCH(1,1) may be written as follows:
$$\ln(\sigma_{t+1}^2) = \omega + \alpha g(\epsilon_t) + \beta \ln(\sigma_t^2) \tag{12}$$
where
$$g(\epsilon_t) = \theta\epsilon_t + \gamma\left[|\epsilon_t| - E|\epsilon_t|\right] \tag{13}$$
In the weighted innovation g(ϵt), θ and γ are real constants. Both ϵt and |ϵt|−E|ϵt|
are zero-mean, independent, and identically distributed (iid) processes with continu-
ous distributions. We follow the example of Tsay (2010) illustrating the asymmetry
of g(ϵt). If ϵt ≥ 0, then g(ϵt) = (θ + γ)ϵt − γE(|ϵt|). Conversely, if ϵt < 0, then
g(ϵt) = (θ − γ)ϵt − γE(|ϵt|). Clearly, the model allows for the conditional variance
of a time series to respond differently depending on whether the shock in return is
positive or negative.
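The asymmetry of the weighted innovation can be checked numerically. The sketch below assumes a standard normal ϵt, for which E|ϵt| equals the square root of 2/π; the values of θ and γ are hypothetical and chosen only for illustration.

```python
import math

def weighted_innovation(eps, theta, gamma, expected_abs=math.sqrt(2 / math.pi)):
    """Equation (13): g(eps) = theta * eps + gamma * (|eps| - E|eps|)."""
    return theta * eps + gamma * (abs(eps) - expected_abs)

def egarch11_next_log_variance(log_sigma2, eps, omega, alpha, beta, theta, gamma):
    """Equation (12): ln(sigma2_{t+1}) = omega + alpha * g(eps_t) + beta * ln(sigma2_t)."""
    return omega + alpha * weighted_innovation(eps, theta, gamma) + beta * log_sigma2

theta, gamma = -0.1, 0.2  # hypothetical constants
g_pos = weighted_innovation(1.0, theta, gamma)   # response to a positive shock
g_neg = weighted_innovation(-1.0, theta, gamma)  # response to a negative shock of equal size

# Because the recursion is in logs, the implied variance is positive for any parameters.
next_var = math.exp(egarch11_next_log_variance(math.log(0.0025), -1.0,
                                               omega=-0.5, alpha=1.0, beta=0.9,
                                               theta=theta, gamma=gamma))
```

With a negative θ, a negative shock produces a larger value of g than a positive shock of the same magnitude, which is the asymmetric response described above.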
5.5 GJR-GARCH(1,1)
Another method of modeling volatility asymmetry is by the GJR-GARCH model, or
the Glosten-Jagannathan-Runkle Generalized Autoregressive Conditional Heteroscedas-
ticity model (Glosten et al., 1993). The key difference to EGARCH is the way they
model the impact of negative shocks on volatility. While EGARCH models the im-
pact of negative shocks on the conditional variance by using the natural logarithm, the
GJR-GARCH measures the impact of negative shocks directly through an additional
parameter. The GJR-GARCH(1,1) may be written as follows:
$$\sigma_{t+1}^2 = \omega + (\alpha + \gamma I_t) y_t^2 + \beta\sigma_t^2 \tag{14}$$
$$I_t = \begin{cases} 1, & \text{if } y_t < 0 \\ 0, & \text{if } y_t \geq 0 \end{cases} \tag{15}$$
$I_t$ is an indicator variable used to capture the asymmetry in the model: it takes
a value of one if the return of the previous period is negative and a value
of zero otherwise. Due to the indicator variable
and its associated coefficient γ, previous periods with negative returns lead to
greater estimates of volatility.
We again impose the restrictions ω > 0, α > 0, γ > 0, and β > 0 to ensure a positive
conditional variance for all t.
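A one-step sketch of the GJR-GARCH(1,1) update shows how the indicator works. The parameters are the rounded Gaussian GJR-GARCH estimates from Table 3; the return of ±5% and the current variance are hypothetical.

```python
def gjr_garch11_next_variance(y, sigma2, omega, alpha, gamma, beta):
    """One step of the GJR-GARCH(1,1) recursion; the indicator is one only for negative returns."""
    indicator = 1 if y < 0 else 0
    return omega + (alpha + gamma * indicator) * y * y + beta * sigma2

params = dict(omega=0.0003, alpha=0.0615, gamma=0.0648, beta=0.8066)
sigma2_today = 0.0025  # hypothetical current conditional variance

after_gain = gjr_garch11_next_variance(0.05, sigma2_today, **params)
after_loss = gjr_garch11_next_variance(-0.05, sigma2_today, **params)
extra_variance = after_loss - after_gain  # equals gamma * y^2
```

The two forecasts differ by exactly γ times the squared return, which is the additional variance attributed to a negative shock.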
5.6 Simple Moving Average (SMA)
The simple moving average (SMA) is an estimate of volatility based on the average
value of a window of past volatilities (Poon & Granger, 2003). The formula for the
SMA model may be written as follows:
$$\sigma_{t+1}^2 = \frac{1}{N}\sum_{i=1}^{N} \sigma_{t-i+1}^2 \tag{16}$$
where N is the SMA order, or the rolling window length. The procedure is that the
order of the model determines how many past observations are included in the model,
i.e. the window. As a day passes, the window moves one day ahead which means
that the model incorporates data for a new day and drops the data for the oldest
day. The choice of the SMA order is essentially arbitrary (Brailsford & Faff, 1996;
Christoffersen, 2011; MCMillan et al., 2000). An excessively large window produces
a smooth model that may not respond to sudden changes in volatility levels. Conversely,
including too few lags makes the model overly sensitive to noise
(Christoffersen, 2011). Consequently, we utilize two SMAs, a 9-day SMA and a
20-day SMA, to see how accurately a short-term and a longer-term SMA forecast the
one-day-ahead volatility of Ether.
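The rolling window in equation (16) might be sketched as follows. The input is assumed to be a list of daily variance observations (for example, daily realized variances); the numbers are hypothetical and scaled for readability.

```python
def sma_forecast(past_variances, window):
    """Equation (16): mean of the last `window` variance observations."""
    if len(past_variances) < window:
        raise ValueError("not enough observations for the chosen window")
    recent = past_variances[-window:]
    return sum(recent) / window

def rolling_sma_forecasts(variances, window):
    """One-day-ahead forecasts for every day on which a full window is available."""
    return [sma_forecast(variances[:t], window)
            for t in range(window, len(variances) + 1)]

daily_variances = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical values
forecasts = rolling_sma_forecasts(daily_variances, window=3)
```

Each new forecast adds the newest observation and drops the oldest, so a window of 9 reacts faster to changes than a window of 20.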
5.7 Exponentially Weighted Moving Average (EWMA)
Similar to SMA, EWMA also belongs to a group of models that produce historical
based forecasts. The EWMA equation may be written as follows:
$$\sigma_{t+1}^2 = \lambda\sigma_t^2 + (1-\lambda) r_t^2 \tag{17}$$
The difference that distinguishes EWMA from SMA is that EWMA assigns more
weight to recent observations (Taylor, 1986). Thus, EWMA should perform well if
there are occasional changes in past volatilities. The degree of the weight assigned to
each past observation is determined by a smoothing parameter λ. A large value of λ
gives more weight to the previous variance estimate, so the weights on past
observations decay slowly. λ must be below one because a
value of one would mean that the most recent observation never updates the EWMA.
Similar to the SMA model, the forecast is updated each day as new data arrive. The
only difference is that, for EWMA, the smoothing parameter λ dictates how
quickly the weights on past values decay.
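Unrolling equation (17) shows that the squared return k days back receives the weight (1 − λ)λ^k. The sketch below applies the recursion and lists the first few implied weights; the returns and starting variance are hypothetical, and λ = 0.89 is the value reported in the results section.

```python
def ewma_variances(returns, lam, sigma2_init):
    """Apply sigma2_{t+1} = lam * sigma2_t + (1 - lam) * r_t^2 along a return series."""
    if not 0 < lam < 1:
        raise ValueError("lambda must lie strictly between zero and one")
    sigma2 = [sigma2_init]
    for r in returns:
        sigma2.append(lam * sigma2[-1] + (1 - lam) * r * r)
    return sigma2

def ewma_weights(lam, n_lags):
    """Implied weight on the squared return k days back, for k = 0, ..., n_lags - 1."""
    return [(1 - lam) * lam ** k for k in range(n_lags)]

lam = 0.89
path = ewma_variances([0.02, -0.01], lam, sigma2_init=0.0025)
weights = ewma_weights(lam, 5)
```

The weights decline geometrically, so recent squared returns count more than older ones, and the decline is slower the closer λ is to one.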
5.8 In-Sample Performance
In-sample performance relates to the accuracy of a model when it is applied to the
same data that is used to estimate the model parameters. In-sample performance
measures how well the model fits the data. We evaluate in-sample performance over the
entire sample. It is important to note that this procedure does not apply to SMA
and EWMA because these models do not involve any estimation of parameters or
likelihood functions. To assess the in-sample performance, we use two information
criteria: the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC).
The Akaike (1974) formula is written as follows:
$$AIC = -2\log(L) + 2k \tag{18}$$
where k is defined as the number of parameters estimated by the model and L is
defined as the maximized value of the likelihood function. The model that minimizes the AIC
value thus shows superior in-sample performance.
Moreover, we follow the formula made by Schwarz (1978) for the Bayesian information
criterion which is given by:
$$BIC = -2\log(L) + k\ln(n) \tag{19}$$
where n is defined as the number of observations. The difference between the criteria
consequently lies in how they penalize the number of parameters, with
BIC penalizing complex models relatively more.
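To see how the two penalties can lead to different choices, consider the following sketch of equations (18) and (19). The log-likelihood values are hypothetical; n = 1792 corresponds to the 1742 observations of the first period plus the 50 out-of-sample days.

```python
import math

def aic(log_likelihood, k):
    """Equation (18): -2 log(L) + 2k."""
    return -2 * log_likelihood + 2 * k

def bic(log_likelihood, k, n):
    """Equation (19): -2 log(L) + k ln(n)."""
    return -2 * log_likelihood + k * math.log(n)

n = 1792
small = {"loglik": 2900.0, "k": 3}  # hypothetical three-parameter model
large = {"loglik": 2901.5, "k": 4}  # hypothetical four-parameter model

aic_prefers_large = aic(large["loglik"], large["k"]) < aic(small["loglik"], small["k"])
bic_prefers_small = bic(small["loglik"], small["k"], n) < bic(large["loglik"], large["k"], n)
```

In this constructed case, the extra parameter improves the log-likelihood enough to lower AIC but not enough to offset the larger BIC penalty of ln(1792), roughly 7.5 per parameter.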
5.9 Out-of-Sample Performance
We follow the procedure of Stock and Watson (2020) to get out-of-sample forecasts
for our set of GARCH models. The data is split into two periods. The first period
contains the initial 1742 observations and is used to fit the models. In other words, the
GARCH models’ first estimation will differ slightly from the in-sample estimation,
because we now use 50 fewer observations.
We adopt the same method as Taylor (1986) to find the optimal EWMA by estimating
λ through minimizing the mean squared error using the observations within the first
period. The procedure is that we produce forecasts with different values of λ, ranging
from 0.87 to 0.99, and then choose the value that has the lowest MSE, as shown
in the Appendix.
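The search over λ could be sketched as a simple grid search. The step size of 0.01, the sample data, and the starting variance are assumptions made for illustration; the thesis only states the range of the grid.

```python
import math

def ewma_mse(returns, realized_vol, lam, sigma2_init):
    """MSE between EWMA one-day-ahead volatility forecasts and realized volatility."""
    sigma2 = sigma2_init
    squared_errors = []
    for r, rv_next in zip(returns[:-1], realized_vol[1:]):
        sigma2 = lam * sigma2 + (1 - lam) * r * r  # variance forecast for the next day
        squared_errors.append((math.sqrt(sigma2) - rv_next) ** 2)
    return sum(squared_errors) / len(squared_errors)

def best_lambda(returns, realized_vol, grid, sigma2_init):
    """Return the grid value with the lowest MSE."""
    return min(grid, key=lambda lam: ewma_mse(returns, realized_vol, lam, sigma2_init))

grid = [round(0.87 + 0.01 * i, 2) for i in range(13)]  # 0.87, 0.88, ..., 0.99

# Hypothetical daily returns and realized volatilities.
returns = [0.02, -0.05, 0.01, 0.03, -0.02, 0.04]
realized_vol = [0.03, 0.05, 0.02, 0.03, 0.025, 0.04]
chosen = best_lambda(returns, realized_vol, grid, sigma2_init=0.0009)
```

Only observations from the first period would enter this search, so that the chosen λ is not tuned on the days used for out-of-sample evaluation.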
The second period consists of the last 50 days of the sample. With the estimated
parameters from the first period, the GARCH models now produce forecasts on new,
unseen data. We use an expanding window approach to get the forecasts for the
GARCH models. The procedure is that we obtain the first forecast solely based
on observations from the first period. To get the second forecast, the models are
re-estimated after adding the first observation of the second period
to all observations from the first period, hence the expanding
window approach. This iterative process continues until the models
have been re-estimated to generate the forecast for the final observation.
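The expanding window loop can be outlined as below. The stand-in estimator simply averages squared returns and is not a GARCH fit; in practice, the `fit_and_forecast` argument (a hypothetical name) would re-estimate the model and return its one-day-ahead forecast.

```python
def expanding_window_forecasts(series, n_test, fit_and_forecast):
    """Re-estimate on all data up to each forecast origin and forecast one day ahead."""
    n_train = len(series) - n_test
    forecasts = []
    for origin in range(n_train, len(series)):
        history = series[:origin]  # grows by one observation at every step
        forecasts.append(fit_and_forecast(history))
    return forecasts

def sample_variance_forecast(history):
    """Stand-in for a model fit: the mean squared return observed so far."""
    return sum(y * y for y in history) / len(history)

returns = [0.01, -0.02, 0.03, -0.01, 0.02, -0.04]  # hypothetical daily returns
forecasts = expanding_window_forecasts(returns, n_test=2,
                                       fit_and_forecast=sample_variance_forecast)
```

With 1742 initial observations and 50 test days, the loop would run 50 times, and the final estimation would use all but the last observation.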
Afterward, we follow the procedure and notations of the DM-test of Diebold and
Mariano (1995) to evaluate which of the two models performs the best out-of-sample.
The DM-test is a statistical test used to determine whether the forecast errors of the
models are different. We can think of the test as a test for model superiority based on
statistical significance, rather than chance or luck. The forecast errors are calculated
in the following way:
$$\epsilon_{it} = \hat{\sigma}_{it} - \sqrt{RV_t} \tag{20}$$
where $\hat{\sigma}_{it}$ is the estimated volatility from model i on day t, and $\sqrt{RV_t}$ is the Realized
Volatility on day t.
The DM-test is based on the loss differential, which is given by the equation below:
$$d_t = \epsilon_{it}^2 - \epsilon_{jt}^2 \tag{21}$$
where $\epsilon_{it}^2$ is defined as the squared error of forecasting model i at time t and $\epsilon_{jt}^2$ is
the squared error of forecasting model j at time t. The null and alternative hypotheses
are written as follows:
$$H_0 : E[d_t] = 0 \tag{22}$$
$$H_1 : E[d_t] \neq 0 \tag{23}$$
and the Diebold and Mariano (1995) test statistic is given by:
$$DM = \frac{\bar{d}}{\sqrt{\frac{2\pi \hat{f}_d(0)}{T}}} \tag{24}$$
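A simplified version of the test statistic can be sketched as follows. For one-step-ahead forecasts, we assume that the spectral density term 2πf_d(0) is estimated by the sample variance of the loss differential (lag zero only); this is a common simplification and may differ from the estimator used to produce the reported results. The forecast errors are hypothetical.

```python
import math

def dm_statistic(errors_i, errors_j):
    """DM statistic under squared-error loss, using only the lag-0 autocovariance."""
    d = [ei * ei - ej * ej for ei, ej in zip(errors_i, errors_j)]
    T = len(d)
    d_bar = sum(d) / T
    gamma0 = sum((x - d_bar) ** 2 for x in d) / T  # stands in for 2 * pi * f_d(0)
    return d_bar / math.sqrt(gamma0 / T)

# Hypothetical forecast errors of two models over four days.
errors_i = [0.01, -0.02, 0.015, -0.01]
errors_j = [0.02, -0.03, 0.01, -0.02]
dm = dm_statistic(errors_i, errors_j)
```

A negative value indicates that model i has the lower average squared error. Since the statistic is asymptotically standard normal under the null, an absolute value above roughly 1.96 corresponds to rejection at the 5% level.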
If we reject the null hypothesis, the next step is to evaluate the accuracy of the
models in comparison to Realized Volatility by using loss functions. However, there
is no consensus on which loss function is suitable for assessing volatility models,
as noted by Bollerslev et al. (1994), Diebold and Lopez (1996), and Lopez (2001).
Consequently, we employ two different loss functions - MSE and MAE - to provide a
comprehensive evaluation. To do so, we adopt the formulas proposed by Hansen and
Lunde (2005):
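$$MSE = \frac{1}{N}\sum_{t=1}^{N}\left(\hat{\sigma}_{it} - \sqrt{RV_t}\right)^2 \tag{25}$$
$$MAE = \frac{1}{N}\sum_{t=1}^{N}\left|\hat{\sigma}_{it} - \sqrt{RV_t}\right| \tag{26}$$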
where σ̂it is the estimated volatility for model i on day t. The model that minimizes
MSE and MAE is thus the best performing model.
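The two loss functions can rank models differently, which the following sketch illustrates with hypothetical numbers. Model A is exact on three days but misses one volatility spike badly; model B is moderately wrong every day.

```python
def mse(forecasts, realized_vol):
    """Equation (25): mean squared error against realized volatility."""
    return sum((f - rv) ** 2 for f, rv in zip(forecasts, realized_vol)) / len(forecasts)

def mae(forecasts, realized_vol):
    """Equation (26): mean absolute error against realized volatility."""
    return sum(abs(f - rv) for f, rv in zip(forecasts, realized_vol)) / len(forecasts)

realized_vol = [0.02, 0.03, 0.025, 0.08]  # hypothetical, with a spike on the last day
model_a = [0.02, 0.03, 0.025, 0.02]       # misses only the spike
model_b = [0.04, 0.05, 0.045, 0.06]       # off by 0.02 every day

a_wins_on_mae = mae(model_a, realized_vol) < mae(model_b, realized_vol)
b_wins_on_mse = mse(model_b, realized_vol) < mse(model_a, realized_vol)
```

Squaring gives the single large miss of model A a disproportionate weight, so model A is preferred under MAE while model B is preferred under MSE.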
6 Results
This thesis presents two sets of results: in-sample and out-of-sample. The in-sample
results display plots of the GARCH models fitted to the time series, along with the
estimated parameters and measures of goodness-of-fit using AIC and BIC. The out-of-
sample results evaluate the performance of all models using the Diebold-Mariano test,
MSE, and MAE. The best performing model, determined by statistical significance
and lowest MSE and MAE, is presented.
6.1 In-Sample Results
In this section, we present the in-sample results for our GARCH models with Gaussian
and Student’s t distributions. We estimated the parameters of three different models:
GARCH, EGARCH, and GJR-GARCH. Table 3 shows the estimated parameters and
goodness-of-fit measures for each model with a Gaussian distribution, while Table 4
shows the results for each model with a Student’s t-distribution.
Table 3: In-Sample Performance with Gaussian Distribution
Model        ω         α         β        γ        AIC       BIC
GARCH        0.0002    0.0845    0.8343   -        -3.2316   -3.2132
EGARCH      -0.5879   -0.0615    0.9008   0.1920   -3.2332   -3.2118
GJR-GARCH    0.0003    0.0615    0.8066   0.0648   -3.2340   -3.2125
For the GARCH model with a Gaussian distribution, the estimated parameter values
for ω, α, and β are 0.0002, 0.0845, and 0.8343 respectively. ω is the constant,
which together with α and β determines the long-run average value of the conditional
variance, ω/(1 − α − β). α represents the
weight given to the past squared returns in determining the forecasted
volatility. A higher α value indicates a stronger reaction of the
forecasted volatility to past shocks. Finally, the β in the GARCH model represents the weight
given to the lagged conditional variance term in determining volatility. A
higher β value indicates a stronger dependence on past conditional variances and
recent volatility levels.
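As a rough worked example, the standard expression for the unconditional variance of a stationary GARCH(1,1) process, ω/(1 − α − β), can be evaluated with the rounded Gaussian estimates above. Because ω is rounded to four decimals, the result is only indicative.

```python
import math

def garch11_long_run_variance(omega, alpha, beta):
    """Unconditional variance of a stationary GARCH(1,1): omega / (1 - alpha - beta)."""
    if alpha + beta >= 1:
        raise ValueError("alpha + beta must be below one for a finite long-run variance")
    return omega / (1 - alpha - beta)

omega, alpha, beta = 0.0002, 0.0845, 0.8343  # rounded Gaussian GARCH estimates
persistence = alpha + beta
long_run_var = garch11_long_run_variance(omega, alpha, beta)
long_run_vol = math.sqrt(long_run_var)  # implied long-run daily volatility
```

Under these rounded inputs, α + β is about 0.92 and the implied long-run daily volatility is close to 5%.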
The EGARCH model estimates the constant ω, which can take both positive and
negative values. The negative value of ω suggests that volatility is less persistent, but
it does not necessarily imply mean-reverting behavior. A negative value of α indicates
the presence of a leverage effect, where negative returns are associated with higher
volatility than positive returns, a finding that contrasts with the result of Baur and
Dimpfl (2018) discussed in the literature review. If we consider β = 0.9008 as close
to one, then the value of β suggests that the conditional variance will have a
long-lasting impact on future values of the conditional variance, suggesting that volatility
persistence is high. The last parameter of EGARCH, γ, also relates to
the leverage effect: it captures the asymmetric response of volatility to positive
and negative shocks. A positive value indicates the presence of volatility asymmetry,
similar to a negative α.
In our GJR-GARCH model, the intercept term ω is estimated to be 0.0003. The
average volatility, before considering impacts from past returns and past conditional
volatility, is thus 0.0003. The parameter α is estimated to be 0.0615, suggesting that
past returns have a positive effect on current volatility. Similarly, the parameter β is
estimated to be 0.8066, indicating that past volatility has a positive effect on current
volatility. Finally, the parameter γ is estimated to be 0.0648, suggesting a weak
asymmetry effect, with negative shocks having a slightly larger impact on volatility
than positive shocks.
Overall, we find that the goodness-of-fit for the models using the Gaussian distribu-
tion varies depending on which metric we use. Using AIC, GJR-GARCH produces
the lowest value followed by EGARCH. The superiority of GJR-GARCH regarding
goodness-of-fit is similar to the findings of Chu et al. (2017), although they use more
models and other cryptocurrencies. On the other hand, if we look at BIC, which
penalizes complex models relatively more, we find that GARCH produces the lowest
value followed by GJR-GARCH.
For the Student’s t-distribution, we have estimated the parameters of GARCH,
EGARCH, and GJR-GARCH, and the results are presented in Table 4. We can see
that by switching probability distributions, the value of α is greater than under
the Gaussian distribution, indicating a higher degree of volatility persistence.
Furthermore, Table 4 presents the degrees of freedom ν, whose low estimated values of
about 3.3 suggest heavier tails than those of the normal distribution. This
characteristic is also shown in Figure 2, Figure 3, and Figure 4.
Table 4: In-Sample Performance with Student’s t-distribution
Model        ω         α         β        γ        ν        AIC       BIC
GARCH        0.0002    0.1364    0.8345   -        3.3401   -3.3845   -3.3631
EGARCH      -0.3575   -0.0228    0.9399   0.2607   3.3335   -3.3836   -3.3591
GJR-GARCH    0.0002    0.1272    0.8274   0.0226   3.3475   -3.3836   -3.3591
Figure 2: GARCH
Figure 3: EGARCH
Figure 4: GJR-GARCH
In the GARCH model with the Student’s t-distribution, the parameter estimates
for ω and β are similar to those obtained under the Gaussian distribution. For
EGARCH, consistent among the distributions is that ω is negative, again suggesting
less persistent volatility. Moreover, the value of γ is also positive under Student’s t-
distribution, suggesting again that negative returns have a larger impact on volatility
than positive returns and that the distribution of returns is leptokurtic.
The GJR-GARCH model also produces similar parameter estimates using Student’s
t-distribution compared to Gaussian. However, we can see that γ is lower, albeit
positive, using the Student’s t-distribution. This means that the leverage effect is
weaker under the Student’s t-distribution, implying that negative returns do not have as
strong an impact on future volatility.
Regarding the goodness-of-fit, we can observe that the AIC and BIC values for the
models with the Student’s t-distribution are lower than those with the Gaussian distri-
bution. This result suggests a better goodness-of-fit using the Student’s t-distribution.
In general, the result is in contrast to Chu et al. (2017), who found that the Gaussian
distribution was superior for almost all models and cryptocurrencies. In terms of
AIC, we can see that the GARCH model has the lowest value, followed by EGARCH
and GJR-GARCH, which produce the same value. Similarly for BIC, GARCH has
the lowest value, followed by EGARCH and GJR-GARCH, which produce the same
value. This result suggests that GARCH provides the best fit to the data among
the three models with the Student’s t-distribution.
In conclusion, we observe that the GARCH model has the strongest in-sample
performance overall: it minimizes AIC and BIC under the Student’s t-distribution and
BIC under the Gaussian distribution. This result is in contrast to both Katsiampa
(2017) and Chu et al. (2017), where GARCH was not found to exhibit strong
goodness-of-fit. This does not necessarily mean superiority in the subsequent
out-of-sample forecasting. It does, however, explain to some degree why the GARCH model
is such a popular choice for modeling financial time series, given its ability to capture
volatility clustering and its flexibility in allowing for different distributions of returns.
6.2 Out-of-Sample Results
This section evaluates how accurate the models are in producing the one-day-ahead
forecasts of Ether’s volatility compared to the ex-post measure, Realized Volatility. First,
the EWMA that minimizes MSE in the in-sample stage is found to have
a λ of 0.89. GARCH, EGARCH, and GJR-GARCH are denoted as T-GARCH,
T-EGARCH, and T-GJR when using Student’s t-distribution.
Table 5 shows the result of the DM-test. Three models stand
out as differing significantly from most other models. Two of them are
T-GARCH and T-GJR-GARCH, each of which differs significantly from every model except
two: SMA20 and each other. The third model, SMA9, also differs significantly from six
other models.
EGARCH, on the other hand, is the only model that demonstrates statistically sig-
nificant differences against five other models. Meanwhile, GARCH, GJR-GARCH,
and T-EGARCH are statistically different from four other models. Finally, EWMA
shows significance against three other models. SMA20 is not statistically different
from any other model at the 5% significance level, and as a consequence, we drop this
model altogether from further analysis.
Table 5: Diebold-Mariano Test
            GARCH      EGARCH     GJR-GARCH  SMA9       SMA20      EWMA       T-GARCH    T-EGARCH   T-GJR
GARCH       -          3.5232**   0.8010     2.9127**   0.7160     1.8213     −4.0128**  0.6572     −3.9485**
EGARCH      3.5232**   -          −3.8546**  1.8708     −0.1185    0.2761     4.0652**   −2.5480**  −4.0783**
GJR-GARCH   0.8010     3.8546**   -          2.7983**   0.6136     1.5359     −3.6993**  0.2870     −3.7451**
SMA9        2.9127**   1.8708     2.7983**   -          −1.8138    −2.4140**  −4.7221**  −2.938**   −4.7157**
SMA20       0.7160     −0.1185    0.6136     −1.8138    -          0.4788     −1.9321    −0.5866    −1.8787
EWMA        1.8213     0.2761     1.5359     −2.4140**  0.4788     -          −5.3108**  −1.8051    −4.9891**
T-GARCH     −4.0128**  −4.0652**  −3.6883**  −4.7221**  −1.9321    −5.3108**  -          4.9165**   0.4015
T-EGARCH    0.6572     −2.5480**  0.2870     −2.938**   −0.5866    −1.8051    4.9165**   -          −4.8763**
T-GJR       −3.9485**  −4.0783**  −3.7451**  −4.7157**  −1.8787    −4.9891**  0.4015     −4.8763**  -
Note: ** p < 0.05.
Table 6 shows the MSE and MAE for all models and a ranking of the models. The
first loss function of discussion is MSE. We find that EGARCH is the model that
minimizes MSE. Consequently, we can say that EGARCH is superior to all models
except SMA9 and EWMA because we could not reject the Diebold-Mariano null
hypothesis for these models. EWMA ranks second using MSE, while we can only
remain 95% confident that the model has a superior predictive ability against SMA9,
T-GARCH, and T-GJR-GARCH. GJR-GARCH ranks third, which along with the
DM-test shows that the model is inferior to EGARCH and superior to SMA9, T-
GARCH, and T-GJR-GARCH. The model that produces the fourth lowest MSE is
T-EGARCH. We may thus say with 95% confidence that the model is inferior
to EGARCH and superior to SMA9, T-GARCH, and T-GJR-GARCH.
SMA9 has the fifth lowest MSE and is statistically better than GARCH, T-GJR-GARCH,
and T-GARCH and statistically less accurate than GJR-GARCH, EWMA,
and T-EGARCH. GARCH is ranked sixth by MSE, a rather low rank
considering the strong goodness-of-fit the model exhibited in the previous section, which
emphasizes the importance of evaluating the model on an out-of-sample basis. The
GARCH model is thus superior to T-GJR-GARCH and T-GARCH, while inferior
to EGARCH and SMA9. Finally, the two worst models are clearly T-GJR-GARCH and
T-GARCH, where T-GJR-GARCH ranks 7th and T-GARCH ranks 8th.
By looking at the MSE result, it is worth noting that the models are more accurate
using the Gaussian distribution. This result suggests that the Student’s t-distribution
may not be an appropriate choice for modeling the conditional volatility for this
particular type of asset. The main takeaway should be, however, that the EGARCH
model is the best performing model forecasting the one-day-ahead volatility of Ether
using MSE and the DM-test.
Moving on to the MAE metric, we observe that the SMA9 is the model with the
lowest MAE, indicating that the model has the best predictive accuracy. EWMA
is ranked second, followed by EGARCH, T-EGARCH, GJR-GARCH, GARCH, T-
GJR-GARCH, and finally T-GARCH.
Table 6: Mean Squared Errors & Mean Absolute Errors
       GARCH      EGARCH     GJR-GARCH  SMA9       EWMA       T-GARCH    T-EGARCH   T-GJR
MSE    0.000563   0.000472   0.000556   0.000560   0.000550   0.000717   0.000559   0.000716
Rank   6          1          3          5          2          8          4          7
MAE    0.021890   0.020120   0.021744   0.015398   0.0196902  0.024191   0.021613   0.024163
Rank   6          3          5          1          2          8          4          7
Interestingly, the rankings based on MAE differ noticeably from those based on MSE.
For example, SMA9, a relatively simple model, ranks first using MAE but fifth
using MSE, while EWMA ranks second under both loss functions. This suggests that simpler models that do
not have the same complexity as the GARCH models may outperform in predictive
accuracy, at least when considering MAE.
Due to the difference in results between MSE and MAE, we consider the different
properties of these two loss functions. MAE weights all errors in proportion to their
size and is less sensitive to outliers or extreme values, as it only considers the
absolute difference between the predicted and actual values. MSE, in contrast, squares
each error, which gives large errors disproportionate weight and makes MSE more
sensitive to outliers or extreme values that may not be representative of the overall
performance of the model. Consequently, even large errors will not have as much impact on the
overall value of the MAE. A possible explanation for why simpler models excel using
MAE is thus that outliers have less of an impact on the overall value of MAE; the
outliers can be seen in Realized Volatility (see Figure 5 in Appendix).
Our results suggest that EGARCH and SMA9 models are the best performing mod-
els for predicting the one-day-ahead volatility of Ether, based on their low MSE and
MAE values combined with the DM-test. The success of the EGARCH model may
be attributed to its ability to both capture asymmetry and persistence in volatility,
while the SMA9 model benefits from its ability to quickly adapt to changing market
conditions. The literature review demonstrated that EGARCH is a popular choice
for modeling volatility (Chu et al., 2017; Dyhrberg, 2016a; Katsiampa, 2017), and the
result thus reinforces both its popularity and the advantage of applying models that
contain a leverage effect, as discussed by Bollerslev (1986). Moreover, the robust
performance of SMA9 aligns with the findings of Brooks (1998) and MCMillan et al.
(2000). Additionally, the EWMA model demonstrates impressive results, consistent
with the findings of Brooks (1998), Tse (1991), and Tse and Tung (1992). Inter-
estingly, when reviewing the literature we found that both SMA and EWMA were
underutilized in previous research. The strong performance of these models
suggests that they merit greater attention in future
studies.
The DM-test shows that the highest ranked models, EGARCH and SMA9, are not
significantly different from each other, indicating that either model may be considered
depending on the specific needs of the application. Moreover, we see that model
accuracy decreases by applying the Student’s t-distribution, as suggested by both
MSE and MAE. Finally, given the central role of GARCH in the literature and its
promising in-sample result, it is worth noting that the model was found to be
relatively weak, ranked 6th by both loss functions.
7 Discussion
This thesis aimed to empirically evaluate the selected volatility models and iden-
tify which model has superior predictive accuracy in forecasting the one-day-ahead
volatility of Ether. The contribution of this thesis depends on the choices made dur-
ing the research process, such as selecting models. This section discusses some of
these limitations and provides suggestions for future research.
While the forecast horizon of this thesis is one day, a similar methodology could
be used with an increased forecast horizon, such as one week or one month. The
difference in forecast horizon could entirely change the concluding remarks about
superior predictive models.
We considered five years of price data as sufficient because Ethereum was created in
2015, and the sample period constitutes a majority of the asset’s lifetime. However,
increasing the sample size could lead to different results. Additionally, we chose to
set the out-of-sample forecasting period to the last 50 days. While this choice was intended
to ensure the similarity of the in-sample and out-of-sample models, varying the
period to 100, 200, or 300 days could further test the robustness of the
results.
The primary objective of this work was to evaluate numerous volatility models, and we
have identified several models that demonstrate superior predictive ability. However,
there are numerous other volatility models that could be evaluated, as discussed
in the literature review. Thus, there is an opportunity to build on our work by
selecting strong models from this thesis in addition to incorporating different models
and loss functions. Moreover, conducting a multivariate volatility study using several
variables simultaneously, such as forecasting the volatility of Ether while incorporating
the comovements of other cryptocurrency assets like Bitcoin, could lead to superior
predictive accuracy. Although this would increase the statistical analysis’s complexity,
it would be a valuable direction for future research.
8 Conclusion
This thesis is an evaluation of the forecast accuracy of the one-day-ahead volatility of
Ether, the native currency of the Ethereum blockchain. Volatility research has been
an active area of interest for many years, because of its importance in making informed
decisions and managing risk effectively. Although there is abundant volatility litera-
ture on traditional assets and some literature on Bitcoin, we contribute by evaluating
volatility models on the second biggest cryptocurrency, Ether. The evaluated models
include SMA9, SMA20, EWMA, GARCH, EGARCH, GJR-GARCH, T-GARCH, T-
EGARCH, and T-GJR-GARCH. We ask and answer the following question: which of
the selected models produces the best one-day-ahead forecasts of volatility for Ether?
The question is answered on in-sample performance and out-of-sample performance.
The in-sample results show that GARCH produces the best goodness-of-fit. GARCH
produces the lowest value of AIC and BIC using Student’s t-distribution and mini-
mizes BIC for the Gaussian distribution. GJR-GARCH produces the lowest value of
AIC under the Gaussian distribution.
The out-of-sample forecast accuracy is not aligned with the in-sample result. There,
it is shown that considering MSE along with the DM-test, the EGARCH model is the
most accurate in predicting the one-day-ahead volatility of Ether. Finally, considering
MAE and the DM-test, we find that the SMA9 model shows the best forecast accuracy.
The superior models are not statistically different, and consequently, either SMA9 or
EGARCH may be considered for forecasting purposes.
References
Akaike, H. (1974). A new look at the statistical model identification. IEEE transac-
tions on automatic control, 19 (6), 716–723. https://doi.org/10.1109/TAC.
1974.1100705
Akgiray, V. (1989). Conditional heteroscedasticity in time series of stock returns:
Evidence and forecasts. Journal of business, 55–80.
Andersen, T. G., & Bollerslev, T. (1998). Answering the skeptics: Yes, standard
volatility models do provide accurate forecasts. International economic review,
885–905. https://doi.org/10.2307/2527343
Andersen, T. G., Bollerslev, T., Diebold, F. X., & Labys, P. (2003). Modeling and
forecasting realized volatility. Econometrica, 71 (2), 579–625. https://doi.org/
10.1111/1468-0262.00418
Baur, D. G., & Dimpfl, T. (2018). Asymmetric volatility in cryptocurrencies. Eco-
nomics Letters, 173, 148–151. https://doi.org/10.1016/j.econlet.2018.10.008
Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Jour-
nal of econometrics, 31 (3), 307–327. https://doi.org/10.1016/0304-4076(86)
90063-1
Bollerslev, T. (1987). A conditionally heteroskedastic time series model for speculative
prices and rates of return. The review of economics and statistics, 542–547.
Bollerslev, T., Engle, R. F., & Nelson, D. B. (1994). Arch models. Handbook of econo-
metrics, 4, 2959–3038. https://doi.org/10.1016/S1573-4412(05)80018-2
Bouri, E., Azzi, G., & Dyhrberg, A. H. (2017). On the return-volatility relationship
in the bitcoin market around the price crash of 2013. Economics, 11 (1). https:
//doi.org/10.5018/economics-ejournal.ja.2017-2
Brailsford, T. J., & Faff, R. W. (1996). An evaluation of volatility forecasting tech-
niques. Journal of Banking & Finance, 20 (3), 419–438. https://doi.org/10.
1016/0378-4266(95)00015-1
Brooks, C. (1998). Predicting stock index volatility: Can market volume help? Journal
of Forecasting, 17 (1), 59–80. https://doi.org/10.1002/(SICI)1099-131X(199801)17:1<59::AID-FOR676>3.0.CO;2-H
Buterin, V., et al. (2014). A next-generation smart contract and decentralized appli-
cation platform. white paper, 3 (37), 2–1.
Christoffersen, P. (2011). Elements of financial risk management. Academic press.
Chu, J., Chan, S., Nadarajah, S., & Osterrieder, J. (2017). Garch modelling of
cryptocurrencies. Journal of Risk and Financial Management, 10 (4), 17.
https://doi.org/10.3390/jrfm10040017
CoinMarketCap. (2023). Coinmarketcap. Retrieved March 30, 2023, from https://
coinmarketcap.com/
Diebold, F. X., & Lopez, J. A. (1996). 8 forecast evaluation and combination. Hand-
book of statistics, 14, 241–268. https://doi.org/10.1016/S0169-7161(96)14010-
4
Diebold, F. X., & Mariano, R. S. (1995). Comparing predictive accuracy. Journal of
Business & Economic Statistics, 253–263. https://doi.org/10.2307/1392185
Dimpfl, T., & Peter, F. J. (2021). Nothing but noise? price discovery across cryp-
tocurrency exchanges. Journal of Financial Markets, 54, 100584. https://doi.
org/10.1016/j.finmar.2020.100584
Dyhrberg, A. H. (2016a). Bitcoin, gold and the dollar–a garch volatility analysis.
Finance Research Letters, 16, 85–92. https://doi.org/10.1016/j.frl.2015.10.008
Dyhrberg, A. H. (2016b). Hedging capabilities of bitcoin. is it the virtual gold? Fi-
nance Research Letters, 16, 139–144. https://doi.org/10.1016/j.frl.2015.10.025
Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of
the variance of united kingdom inflation. Econometrica: Journal of the econo-
metric society, 987–1007. https://doi.org/10.2307/1912773
Ethereum Foundation. (2021). Energy consumption and the Ethereum network.
https://ethereum.org/en/energy-consumption/
Ethereum Foundation. (2023). Proof of stake (PoS).
https://ethereum.org/en/developers/docs/consensus-mechanisms/pos/
FOREX. (2023). Cryptocurrency market hours. https://www.forex.com/ie/markets-
to-trade/cryptocurrency-trading/cryptocurrency-market-hours/
Georgieva, K. (2022). The future of money: Gearing up for central bank digital cur-
rency. https://www.imf.org/en/News/Articles/2022/02/09/sp020922-the-
future-of-money-gearing-up-for-central-bank-digital-currency
Ghalanos, A. (2022). Rugarch: Univariate garch models. [R package version 1.4-9.].
Glaser, F., Zimmermann, K., Haferkorn, M., Weber, M. C., & Siering, M. (2014).
Bitcoin-asset or currency? Revealing users’ hidden intentions. ECIS.
Glosten, L. R., Jagannathan, R., & Runkle, D. E. (1993). On the relation between
the expected value and the volatility of the nominal excess return on stocks.
The journal of finance, 48 (5), 1779–1801. https://doi.org/10.1111/j.1540-
6261.1993.tb05128.x
Hansen, P. R., & Lunde, A. (2005). A forecast comparison of volatility models: Does
anything beat a GARCH(1,1)? Journal of Applied Econometrics, 20 (7), 873–889.
https://doi.org/10.1002/jae.800
Hansen, P. R., & Huang, Z. (2016). Exponential GARCH modeling with realized measures
of volatility. Journal of Business & Economic Statistics, 34 (2), 269–287.
https://doi.org/10.1080/07350015.2015.1038543
Haqqi, T. (2022). 15 major companies that accept Bitcoin.
https://finance.yahoo.com/news/15-major-companies-accept-bitcoin-155558584.html
Katsiampa, P. (2017). Volatility estimation for Bitcoin: A comparison of GARCH models.
Economics Letters, 158, 3–6. https://doi.org/10.1016/j.econlet.2017.06.023
Lopez, J. A. (2001). Evaluating the predictive accuracy of volatility models. Journal
of Forecasting, 20 (2), 87–109.
Loretan, M., & Phillips, P. C. (1994). Testing the covariance stationarity of heavy-
tailed time series: An overview of the theory with applications to several fi-
nancial datasets. Journal of empirical finance, 1 (2), 211–248.
Makarov, I., & Schoar, A. (2020). Trading and arbitrage in cryptocurrency markets.
Journal of Financial Economics, 135 (2), 293–319. https://doi.org/10.1016/j.
jfineco.2019.07.001
McMillan, D., Speight, A., & ap Gwilym, O. (2000). Forecasting UK stock market
volatility. Applied Financial Economics, 10 (4), 435–448.
https://doi.org/10.1080/09603100050031561
Myung, I. J. (2003). Tutorial on maximum likelihood estimation. Journal of mathe-
matical Psychology, 47 (1), 90–100. https://doi.org/10.1016/S0022-2496(02)
00028-7
Nakamoto, S. (2008). Bitcoin: A peer-to-peer electronic cash system. Decentralized
business review, 21260.
Nelson, D. B. (1991). Conditional heteroskedasticity in asset returns: A new approach.
Econometrica: Journal of the Econometric Society, 59 (2), 347–370.
https://doi.org/10.2307/2938260
Poon, S.-H., & Granger, C. W. J. (2003). Forecasting volatility in financial markets:
A review. Journal of economic literature, 41 (2), 478–539. https://doi.org/10.
1257/002205103765762743
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics,
6 (2), 461–464.
Sharpe, W. F. (1998). The Sharpe ratio. Streetwise–the Best of the Journal of Portfolio
Management, 3, 169–185.
Stock, J., & Watson, M. (2020). Introduction to econometrics. (Fourth edition, global
ed., Pearson series in economics).
Taylor, S. J. (1986). Modelling financial time series. Chichester: Wiley.
Thomas, D., & Sabater, A. (2022). Private equity and institutional investors back away
from crypto and DeFi.
https://www.spglobal.com/marketintelligence/en/news-insights/latest-news-headlines/private-equity-and-institutional-investors-back-away-from-crypto-and-defi-73627011
Tsay, R. S. (2010). Analysis of financial time series (Vol. 762). John Wiley & Sons.
Tse, Y. (1991). Stock returns volatility in the Tokyo Stock Exchange. Japan and the
World Economy, 3 (3), 285–298. https://doi.org/10.1016/0922-1425(91)90011-Z
Tse, Y., & Tung, S. (1992). Forecasting volatility in the Singapore stock market. Asia
Pacific Journal of Management, 9, 1–13. https://doi.org/10.1007/BF01732034
Wells Fargo. (2022). Cryptocurrencies - too early or too late? https://saf.
wellsfargoadvisors.com/emx/dctm/Research/wfii/wfii reports/Investment
Strategy/cryptocurrency020722.pdf
West, K. D., & Cho, D. (1995). The predictive ability of several models of exchange
rate volatility. Journal of econometrics, 69 (2), 367–391. https://doi.org/10.
1016/0304-4076(94)01654-I
Appendix
Table 7: Mean Squared Errors for different λ
λ MSE
0.87 0.000576
0.88 0.000574
0.89 0.000562
0.90 0.000572
0.91 0.000584
0.92 0.000598
0.93 0.000614
0.94 0.000633
0.95 0.000666
0.96 0.000682
0.97 0.000714
0.98 0.000760
0.99 0.000813
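Table 7 reports the MSE obtained for each candidate decay factor λ of the EWMA model. A minimal Python sketch of how such a comparison could be set up is given below. It assumes the common recursion σ²_t = λσ²_{t-1} + (1 − λ)r²_{t-1}, a sample-variance starting value, and a loss measured on volatility rather than variance. The function names, the starting value, and the loss scale are illustrative assumptions and not necessarily the implementation used in this study. Only the numbers in TABLE_7 are taken from the table above.

```python
import math


def ewma_variance(returns, lam, init_var=None):
    """One-day-ahead EWMA variance forecasts.

    s2[t] = lam * s2[t-1] + (1 - lam) * r[t-1]**2
    Assumption: the recursion is seeded with the sample variance
    unless init_var is supplied.
    """
    r = [float(x) for x in returns]
    if init_var is None:
        mean = sum(r) / len(r)
        init_var = sum((x - mean) ** 2 for x in r) / len(r)
    s2 = [init_var]
    for t in range(1, len(r)):
        s2.append(lam * s2[-1] + (1.0 - lam) * r[t - 1] ** 2)
    return s2


def mse_by_lambda(returns, proxy_vol, lambdas):
    """MSE between EWMA volatility forecasts and an ex-post proxy, per lambda."""
    result = {}
    for lam in lambdas:
        forecasts = [math.sqrt(v) for v in ewma_variance(returns, lam)]
        sq_errors = [(f - p) ** 2 for f, p in zip(forecasts, proxy_vol)]
        result[lam] = sum(sq_errors) / len(sq_errors)
    return result


# MSE values copied from Table 7.
TABLE_7 = {
    0.87: 0.000576, 0.88: 0.000574, 0.89: 0.000562, 0.90: 0.000572,
    0.91: 0.000584, 0.92: 0.000598, 0.93: 0.000614, 0.94: 0.000633,
    0.95: 0.000666, 0.96: 0.000682, 0.97: 0.000714, 0.98: 0.000760,
    0.99: 0.000813,
}

# The decay factor with the smallest reported MSE.
best_lambda = min(TABLE_7, key=TABLE_7.get)
print(best_lambda, TABLE_7[best_lambda])
```

Applied to the values in Table 7, the minimum is at λ = 0.89 (MSE 0.000562). With actual return and realized-volatility series, mse_by_lambda(returns, proxy_vol, sorted(TABLE_7)) would produce a comparable grid under the assumptions stated above.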
Figure 5: Realized Volatility