Deep Learning Cocoa Price Prediction 
with Weather Data 
William Bergander  
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Supervisor: Nicklas Nordfors 
Master’s thesis in Economics, 30 hec  
Spring 2025 
Graduate School, School of Business, Economics and Law, University of Gothenburg, Sweden 
Abstract 
Climate-induced volatility in global cocoa markets poses significant challenges to producers 
and stakeholders, notably evidenced by the severe price surge in late 2024 following adverse 
weather events. This thesis investigates whether integrating weather variables, specifically 
temperature and precipitation, with historical cocoa futures prices can enhance predictive 
accuracy using Long Short-Term Memory (LSTM) neural networks. Leveraging daily price 
data from the ICE Futures U.S. exchange (1980–2025) and comprehensive meteorological 
data from the ERA5 dataset across key cocoa-producing regions, multiple LSTM models, 
with weather inputs at global and localised scales, were developed and evaluated. This deep learning 
approach to cocoa price prediction, incorporating meteorological inputs, addresses a gap in 
existing forecasting literature. 
Contrary to prior studies and expectations, models incorporating detailed weather indicators 
did not improve forecasting accuracy over a baseline model relying solely on historical 
prices, which achieved a notably high predictive performance (R² ≈ 0.998). A global-scale 
model using average climate indicators matched the baseline model's predictive power (R² ≈ 
0.986), while more localised models underperformed significantly. Robustness tests, 
including permutation importance analyses, confirmed that historical price data 
predominantly drove predictive power, with weather variables providing minimal 
incremental value. This lack of improvement suggests that weather impacts may already 
be priced into market trends due to market efficiency, or that they were too complex to 
capture with the naive LSTM specification used here. 
These results highlight methodological limitations, particularly the absence of cocoa yield 
data, which likely restricted the ability to capture the indirect economic impacts of weather 
conditions accurately. Future research incorporating cocoa yield data within a structured 
causal framework, such as the DeepIV method, could model the economic transmission 
from climate impacts to global cocoa market pricing more precisely. 
  
 
Table of Contents 
Introduction
Literature Review
Theory
Data
  Price
  Weather Variables
  Geographic Data (Polygons)
Methodology
Results
  The Models of Special Interest
  Robustness tests
Discussion
References
Appendix A: Terminology
  Artificial Neural Networks (ANNs)
  Recurrent Neural Networks (RNNs)
  Long Short-Term Memory Networks (LSTMs)
  Dropout Regularisation
  Look-back Window (Time Step)
  Supervised Learning
  Early Stopping
  Permutation Importance
Appendix B: Model Variable Inclusion
 
  
Introduction  
Global cocoa prices have been highly volatile in recent years due to climate-related 
supply shocks. For example, in late 2024, cocoa futures prices nearly tripled, reaching a 
record of approximately $12,900 per tonne after a series of extreme weather events severely 
impacted harvests in West Africa (Smeeton, 2024; Thukral & Tan, 2024). The repeated 
occurrences of drought and heavy rainfall led to multi-year supply shortages, constraining 
global stocks and causing prices to surge by around 136% from mid-2022 to early 2024 
(Kozul-Wright, 2025). The adverse climate conditions in a region where Côte d’Ivoire and 
Ghana produce over half of the world’s cocoa supply (Ritchie, Rosado, & Roser, 2023) 
highlight the susceptibility of cocoa markets to climate shocks. Experts in climate science 
warn that extreme weather events are likely to continue causing sudden increases in food 
commodity prices, including cocoa, as climate change exacerbates weather instability (Bilal 
& Känzig, 2024). 
 
Figure 1 – The Historical Price of Cocoa Futures on the New York Stock Exchange, Aggregated Monthly for aesthetic purposes 
This price volatility has tangible consequences. Sharp swings in cocoa prices threaten the 
livelihoods of millions of smallholder farmers and complicate planning for chocolate 
manufacturers and traders (Voora, Bermúdez, & Larrea, 2020). Accordingly, accurate 
forecasting of agricultural commodity prices is crucial for stakeholders across the supply 
chain to manage risks and make informed decisions (Pandit et al., 2024). Traditional price-
forecasting methods rely solely on historical price patterns, potentially neglecting external 
factors such as weather. Given cocoa’s dependence on climate-sensitive tropical agriculture, 
weather conditions like rainfall, temperature, and drought significantly impact cocoa yields 
and supply levels (Carr & Lockwood, 2011; Schroth, Läderach, Martinez-Valle, & Bunn, 
2016), thus influencing price movements. Market participants often anticipate these impacts, 
adjusting price expectations even before crop losses materialise (Letta, Montalbano, & Tol, 
2022). This suggests that timely meteorological data could contain predictive signals for price 
formation that are not captured by historical prices alone. 
These observations lead directly to the formal research question of this study: “Can future 
global cocoa prices be effectively predicted by combining historical cocoa price data with 
weather indicators, specifically temperature and precipitation?” This question seeks to 
understand whether integrating climate variables can enhance the accuracy and reliability of 
price forecasting models. Given the potential economic benefits and improved market 
stability that accurate forecasting could bring, this research aims to assess the added value of 
incorporating weather data into cocoa price predictions. Intuitively, as weather shocks 
influence supply fluctuations that are quickly factored into market prices, one might expect 
that including such information would enhance forecast accuracy. This thesis empirically 
investigates that premise using a deep learning method, the Long Short-Term Memory 
(LSTM) network. The choice of method is motivated both by the prominence that AI, in the 
form of large language models (LLMs), has gained in society in recent years and by the 
LSTM's ability to capture complex temporal relationships in time-series data (see the 
Methodology section for more). 
This thesis uses LSTM neural networks, a deep learning 
architecture suitable for time-series forecasting, to address the research question. The analysis 
utilises a comprehensive dataset spanning from 1980 to early 2025, including daily cocoa 
futures prices and corresponding weather observations. The weather variables, temperature 
and precipitation, are derived from the ERA5 climate reanalysis dataset, which provides 
extensive global meteorological data (Hersbach et al., 2023). Several LSTM model variants 
are constructed for comparison: a baseline model that uses only lagged cocoa prices as inputs, 
and a set of augmented models that incorporate temperature and precipitation indicators in 
addition to past prices. These weather-augmented models differ in their geographical scope of 
climate data, ranging from local weather measures in key cocoa-growing regions to broader 
regional or global average climate indices. The thesis evaluates each model’s out-of-sample 
forecasting performance using standard accuracy metrics (root mean squared error, mean 
absolute error, and R²), thereby assessing whether the inclusion of weather features yields any 
predictive improvement over the price-only baseline. 
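The three accuracy metrics named above can be illustrated with a minimal sketch. It assumes plain NumPy arrays of actual and predicted prices; the function name `forecast_metrics` and the example prices are hypothetical and are not taken from the thesis's data or code.

```python
import numpy as np


def forecast_metrics(y_true, y_pred):
    """Return RMSE, MAE and R² for one set of out-of-sample forecasts."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred

    rmse = float(np.sqrt(np.mean(errors ** 2)))   # penalises large misses more heavily
    mae = float(np.mean(np.abs(errors)))          # average absolute miss, in price units
    ss_res = float(np.sum(errors ** 2))           # unexplained variation
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))  # total variation around the mean
    r2 = 1.0 - ss_res / ss_tot
    return {"rmse": rmse, "mae": mae, "r2": r2}


# Hypothetical prices in USD per tonne, for illustration only.
actual = [2500.0, 2550.0, 2600.0, 2580.0]
predicted = [2490.0, 2560.0, 2590.0, 2600.0]
metrics = forecast_metrics(actual, predicted)
```

Because RMSE and MAE are expressed in price units while R² is unit-free, reporting all three makes it easier to compare models whose errors differ in size and in distribution.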
Contrary to expectations, the main finding is that including explicit weather variables 
did not produce better forecasts in this context. The simplest LSTM model, which used only 
historical prices, achieved the best predictive accuracy, as indicated by the highest R² (see 
Table 1 in the Results section). Most weather-enriched models performed worse than this 
price-only baseline, suggesting that the added climate information often introduced noise or 
complexity that the model could not readily translate into better price predictions. Only one 
augmented model, using global average temperature and precipitation series, matched the 
baseline model’s performance in terms of R² (see Table 1 in the Results section). In 
all other cases, the baseline outperformed the more complex specifications. These results 
indicate that a straightforward inclusion of daily weather variables provides little to no 
forecasting benefit under the current modelling setup. In short, naive incorporation of climate 
data did not improve cocoa price predictions, hinting that more advanced model structures 
and better data handling and inclusion may be needed to capture the complex economic 
relationships between weather and cocoa markets. 
Finally, the structure of the thesis is as follows: It begins with a Literature Review on 
commodity price forecasting and its connection to weather. The Theory section examines 
how weather affects agricultural markets. The Data section outlines the cocoa price and 
ERA5 weather datasets. Methodology describes the LSTM model and compares the baseline 
to weather-augmented models. Results present forecasting outcomes and robustness checks. 
Discussion interprets findings, considers limitations and market efficiency, and suggests 
future research. Appendix A provides additional technical details, and Appendix B provides 
an overview of model variable inclusion. 
Literature Review 
This chapter reviews the existing literature on agricultural commodity price forecasting, 
emphasising the influence of weather variables on market volatility and predictive modelling 
methods. It begins by outlining the established relationship between climatic variability and 
commodity price fluctuations, highlighting cocoa's particular sensitivity due to its 
concentrated tropical production areas. Subsequently, the discussion examines traditional 
econometric forecasting approaches and underscores their limitations in capturing complex 
market dynamics influenced by weather. The review then transitions to recent advances in 
forecasting methodologies, focusing on the superior capabilities of deep learning approaches, 
particularly Long Short-Term Memory (LSTM) networks. It identifies a critical gap in 
current research: integrating detailed meteorological data with advanced deep learning 
models specifically for global cocoa price prediction. This thesis addresses this gap by 
combining high-resolution weather data with LSTM models, offering potential improvements 
in forecasting accuracy and significant practical implications for market stakeholders. 
Accurate forecasts of agricultural commodity prices are crucial for producers, consumers, and 
policymakers to manage risks in volatile markets (Pandit et al., 2024). A growing body of 
research has examined how weather variability influences commodity prices, consistently 
finding that climatic fluctuations correlate with market volatility (Letta, Montalbano, & Tol, 
2022; Ubilava, 2018). Weather extremes such as droughts, excessive rainfall, and 
temperature anomalies can directly reduce crop yields, thereby tightening supply and driving 
up prices (Chatzopoulos, Pérez Domínguez, & Zampieri, 2020). For example, large-scale 
climate oscillations like the El Niño–Southern Oscillation (ENSO) are known to disrupt 
agricultural production globally, leading to price spikes across multiple crops (Iizumi et al., 
2014; Anderson, Seager, Baethgen, & Cane, 2018). Importantly, commodity markets tend to 
preempt these impacts by quickly adjusting price expectations when adverse weather is 
anticipated. In other words, traders often bid prices up before production losses materialise 
(Letta et al., 2022). 
Cocoa is a commodity especially sensitive to weather fluctuations due to its concentrated 
tropical growing regions. West Africa alone accounts for over half of the world’s cocoa 
output, so abnormal weather in this area can significantly influence global supply and prices 
(ICCO, 2023). Variations in rainfall and temperature affect cocoa tree health and yields; for 
instance, insufficient rainfall can stress trees while excessive moisture fosters fungal diseases, 
which reduce harvests (Schroth, Läderach, Martinez-Valle, & Bunn, 2016). Recent events 
underscore this vulnerability: in 2023, West African cocoa farms experienced a severe 
drought followed by heavy rains that exacerbated pest and disease outbreaks, contributing to 
a steep rise in global cocoa prices (UNCTAD, 2024). Broader climate phenomena have also 
been linked to cocoa market dynamics. Ubilava (2018) found that ENSO-related weather 
patterns significantly affect agricultural price movements, highlighting the value of climate 
indicators in forecasting models for this crop. 
Historically, however, most commodity price forecasting studies have relied on traditional 
econometric models that do not explicitly include weather variables. Time-series approaches 
like Autoregressive Integrated Moving Average (ARIMA) and Vector Auto-Regression 
(VAR) have been widely used for their simplicity, but they often struggle to capture the 
nonlinear and dynamic behaviour of agricultural markets (Ahmed, Atiya, El Gayar, & 
El-Shishiny, 2010). In the case of cocoa, Kamu, Ahmed, and Yusoff (2010) demonstrated that a 
univariate ARIMA model based solely on past prices could achieve moderate predictive 
accuracy but could not account for external drivers such as climatic anomalies or 
other supply shocks. Similarly, Quartey-Papafio, Javed, and Liu (2020) observed a continued 
reliance on ARIMA-type methods in cocoa market analysis. Their study, which forecasted 
cocoa production for the six largest producer countries, showed that while ARIMA models 
were commonly employed, more advanced techniques (like grey prediction models) yielded 
superior accuracy. This persistent use of conventional models underscores a gap in the 
literature: the need for forecasting approaches that integrate exogenous factors, most notably 
weather, rather than relying only on historical price patterns. 
Researchers have increasingly turned to machine learning techniques for commodity price 
forecasting in recent years, focusing on deep learning models capable of modelling complex 
temporal patterns. Long Short-Term Memory (LSTM) neural networks, a class of recurrent 
neural networks, are especially well-suited to time-series data due to their ability to capture 
long-range dependencies and nonlinear relationships (Namin & Namin, 2018). LSTM-based 
models have outperformed traditional statistical methods in various commodity forecasting 
applications, demonstrating higher accuracy and reliability (Sari, Duran, & Kutlu, 2024). One 
advantage of LSTMs is their flexibility in handling multivariate inputs, which can incorporate 
multiple features beyond just past prices (Ly, Traore, & Dia, 2021). This capability is 
particularly relevant for agriculture, where combining price data with weather or other 
external variables could enhance predictions. For example, Olofintuyi, Olajubu, and Olanike 
(2023) report that an LSTM model significantly outperformed a standard recurrent neural 
network in predicting cocoa yields, underscoring the value of deep learning in capturing the 
influence of weather and growth cycles on agricultural output. Likewise, Ouyang et al. 
(2019) found that LSTM models provided markedly more accurate agricultural commodity 
futures price forecasts than ARIMA and VAR benchmarks. They attributed this improvement 
to LSTM’s robustness in handling noisy, non-stationary time series data, characteristics 
typical of commodity markets influenced by irregular weather events. 
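To make the point about multivariate inputs concrete, the following sketch shows how a price-only series and a price-plus-weather series can be sliced into look-back windows of the three-dimensional shape (samples, time steps, features) that LSTM layers typically expect. The helper `make_windows`, the window length of 3 and the synthetic series are assumptions chosen for illustration, not the configuration used in this thesis.

```python
import numpy as np


def make_windows(features, target, look_back):
    """Slice a (T, n_features) array into overlapping look-back windows.

    Returns X with shape (T - look_back, look_back, n_features) and y with
    shape (T - look_back,), where y[i] is the target on the day immediately
    after window i.
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    n_samples = len(features) - look_back
    X = np.stack([features[i:i + look_back] for i in range(n_samples)])
    y = target[look_back:]
    return X, y


# Synthetic data for illustration only: 10 days of price and weather.
rng = np.random.default_rng(0)
price = np.linspace(2500.0, 2590.0, 10)       # hypothetical USD per tonne
temperature = rng.normal(27.0, 1.0, 10)       # hypothetical degrees Celsius
precipitation = rng.gamma(2.0, 2.0, 10)       # hypothetical mm per day

price_only = price.reshape(-1, 1)                                  # baseline inputs
with_weather = np.column_stack([price, temperature, precipitation])  # augmented inputs

X_base, y_base = make_windows(price_only, price, look_back=3)
X_aug, y_aug = make_windows(with_weather, price, look_back=3)
```

The only difference between the baseline and the augmented inputs is the size of the last axis (1 feature versus 3), which is why the same network architecture can in principle be reused across both specifications.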
Parallel streams of research in economics and climate science further reinforce the 
importance of integrating weather data into commodity price forecasting. Hsiang, Meng, and 
Cane (2011) famously linked global climate variability to rises in civil conflicts, illustrating 
how severe weather anomalies can disrupt societies and, by extension, economic stability. In 
a more directly related analysis, Bilal and Känzig (2024) showed that global temperature 
fluctuations predict extreme weather events more strongly than localised temperature 
measures. This suggests that broad climate indices (such as global temperature anomalies or 
oceanic oscillation indicators) carry valuable information and could improve the robustness 
of price prediction models. Additionally, Zelingher and Makowski (2024) demonstrated that 
incorporating unexpected crop production shocks, often driven by adverse weather, 
substantially enhances the accuracy of global commodity price forecasts. Their study found 
that cocoa prices exhibit particularly high volatility in response to weather-induced 
production shortfalls, even more than staples like maize or soybeans. These findings 
underscore that a forecasting approach attuned to climate variability, especially for climate-
sensitive commodities like cocoa, could yield significant benefits in anticipating price 
movements. 
Despite the evident influence of weather on cocoa production and the advances in predictive 
modelling, there remains a notable gap in the literature: very few studies have combined 
detailed weather variables with modern deep learning methods for cocoa price forecasting. In 
other words, prior research has tended to examine climate impacts on cocoa markets and to 
apply sophisticated forecasting techniques separately, without fully leveraging their 
intersection. This thesis addresses that gap by developing an LSTM-based forecasting model 
for global cocoa prices that explicitly integrates rich meteorological information alongside 
historical price data. In particular, the model incorporates precipitation and temperature 
variables from major cocoa-growing regions (as well as relevant global climate indicators) 
and past price trends to predict future price movements. The present work seeks to contribute 
to the literature by uniting high-resolution weather data with a deep learning framework for 
price prediction, an underexplored approach to global cocoa price forecasting that 
aims to improve forecasting accuracy. 
Theory 
This chapter establishes the theoretical foundation linking climate variability to cocoa market 
dynamics. It examines how weather conditions influence cocoa production and, in turn, how 
these production shocks are transmitted to global price movements. I also consider the role of 
market expectations in anticipating climate impacts, which can amplify price volatility. By 
focusing on these linkages, the chapter provides the basis for why incorporating both 
historical price trends and weather indicators may improve the prediction of future cocoa 
prices. The overarching goal is to connect agricultural and economic insights to the study’s 
forecasting question. In particular, I outline how temperature and precipitation, two key 
weather variables, affect cocoa yields, and how yield fluctuations driven by weather 
ultimately shape market prices. The discussion then turns to how commodity markets respond 
to expected supply changes, often adjusting prices before harvest outcomes are realised. 
Finally, the chapter summarises these insights and explains how they motivate the inclusion 
of weather variables alongside past prices in the empirical model. 
Cocoa production is susceptible to climatic conditions, especially rainfall and temperature, 
which regulate critical biological processes of the cocoa tree. Rainfall patterns largely govern 
the tree’s phenology, including flowering and pod development. For instance, the onset of the 
rainy season typically triggers mass flowering events that later translate into higher yields 
(Wibaux et al., 2024). Cocoa trees thrive with roughly 1,500–2,500 mm of rainfall annually, 
well-distributed throughout the year, and even short deviations can significantly affect output. 
Excessive rainfall can waterlog soils and promote fungal diseases that reduce pollination and 
pod set, whereas prolonged drought stresses the trees, curtailing pod development and 
lowering both the quantity and quality of beans (Carr & Lockwood, 2011; Wibaux et al., 
2024). These climate-driven impacts on yields are especially pronounced in West Africa, a 
region that produces about two-thirds of the world’s cocoa, making its output particularly 
vulnerable to swings in precipitation patterns (Voora et al., 2020). 
Temperature is another crucial factor for cocoa physiology. Extended periods of excessive 
heat can disrupt photosynthesis, cause flowers and small pods to abort, and lead to a 
condition known as cherelle wilt, where young pods die off (Carr & Lockwood, 2011). On 
the other extreme, temperatures that are too cool can slow the growth and development of 
pods. High temperatures often become most devastating when coupled with drought, as water 
stress and heat weaken the trees and increase their susceptibility to pests and diseases. Hot 
and dry conditions have been linked to cocoa swollen shoot virus outbreaks and black pod 
disease, which can decimate yields (Beg et al., 2017). In sum, deviations in temperature and 
precipitation from the narrow optimal conditions for cocoa can sharply reduce agricultural 
output. This biological sensitivity directly links weather variability and potential supply 
shocks in the cocoa market. 
Understanding cocoa’s response to weather is essential for anticipating production swings. 
Farmers adopt various adaptation strategies, such as irrigating during dry spells or planting 
shade trees to cool plantations. However, these measures are not always feasible or sufficient 
across the diverse smallholdings that dominate cocoa farming (Carr & Lockwood, 2011). 
Thus, in practice, cocoa yields remain strongly driven by weather conditions. Empirical 
observations support this: for example, extreme weather in West Africa in 2023 (a severe 
drought followed by heavy rains) caused sharp yield losses due to drought stress and fungal 
outbreaks, contributing to a steep rise in global cocoa prices (UNCTAD, 2024). This 
illustrates how local climate anomalies can have outsized effects on worldwide supply. 
Overall, cocoa’s biological vulnerability to temperature and rainfall underscores that these 
variables are logical candidates to include when modelling cocoa price movements. Adverse 
weather often means smaller harvests, which can tighten supply and put upward pressure on 
prices. 
Climate-induced yield fluctuations in key producing regions feed directly into global cocoa 
price dynamics via basic supply and demand forces. When weather extremes sharply 
reduce cocoa output, as with a severe drought or storm-related crop failure, the 
immediate effect is a negative supply shock. Given that Côte d’Ivoire and Ghana alone account 
for over half of the world's cocoa production (Ritchie, Rosado, & Roser, 2023), a local 
harvest shortfall in West Africa can substantially shrink global supply. Even minor supply 
disruptions can trigger disproportionately large price swings in commodity markets with 
relatively inelastic demand. Cocoa demand has been estimated to be highly price-inelastic in 
the short run (with an elasticity of 0.06), meaning consumers and industry cannot easily 
substitute or reduce cocoa use when prices rise (Tothmihaly, 2018). As a result, a production 
drop of just a few percentage points can translate into a much larger percentage increase in 
price. This mechanism explains why weather-driven crop failures often coincide with sharp 
price spikes. For instance, drought conditions that damage a growing season’s crop can send 
futures prices soaring as buyers scramble for limited beans, amplifying volatility across the 
entire chocolate supply chain (Tothmihaly, 2018). In late 2024, such dynamics were on full 
display when a series of climate shocks constrained West African output and cocoa futures 
prices nearly tripled to record highs. In summary, adverse weather events in critical cocoa-
growing areas transmit to the world market by tightening supply and pushing prices upward. 
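The arithmetic behind this mechanism can be sketched as follows. The example assumes that the elasticity of 0.06 cited above is a magnitude, so demand elasticity is entered as -0.06, and that the market must clear the reduced quantity entirely through price; the 3% supply shortfall and the function names are hypothetical.

```python
def price_change_linear(supply_change, elasticity):
    """First-order approximation: % change in price = % change in quantity / elasticity."""
    return supply_change / elasticity


def price_change_isoelastic(supply_change, elasticity):
    """Price change under constant-elasticity demand, Q = A * P**elasticity."""
    return (1.0 + supply_change) ** (1.0 / elasticity) - 1.0


# Assumption: the cited short-run elasticity of 0.06 is a magnitude, so the sign is negative.
DEMAND_ELASTICITY = -0.06
supply_drop = -0.03  # hypothetical 3% fall in the quantity available

approx = price_change_linear(supply_drop, DEMAND_ELASTICITY)
exact = price_change_isoelastic(supply_drop, DEMAND_ELASTICITY)
```

Under the linear approximation, a 3% shortfall implies a price increase of roughly 50%, and the constant-elasticity form gives a somewhat larger figure because the approximation understates large changes. Inventories, substitution and expectations would dampen or amplify this in practice, so the numbers illustrate the direction and rough scale of the effect rather than a forecast.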
Global commodity exchanges play a pivotal role in this transmission by quickly incorporating 
current and expected production information. Cocoa is primarily traded in futures markets, 
such as the Intercontinental Exchange (ICE) in New York and ICE Europe (formerly NYSE 
Liffe) in London, which serve as reference points for international prices (Gilbert, 2016). 
These markets react immediately to news or forecasts of crop outcomes. When reports 
emerge of poor rainfall or extreme heat threatening the upcoming harvest, traders on these 
exchanges bid up futures contracts in anticipation of future shortages. This way, local 
weather developments are rapidly reflected in global cocoa prices. Speculative trading 
behaviour can exacerbate the speed and scale of price response. For example, if reports 
indicate an El Niño or other climate anomaly likely to depress West African yields, 
speculative buying can drive prices higher well before any cocoa pod is lost (Anderson, 
Seager, Baethgen, & Cane, 2018; Iizumi et al., 2014). These futures market dynamics ensure 
that yield shocks are transmitted internationally: a supply shortfall in one part of the world 
causes prices to rise everywhere, distributing the economic impact of regional climate events 
across all market participants. 
Another factor intensifying the price impact of weather shocks is the lack of buffers in the 
supply chain. Cocoa stocks (inventories) are often limited relative to annual consumption, so 
reserves cannot always offset a poor harvest. Likewise, the concentration of production in a 
few countries means diversification is low: if Ghana and Côte d’Ivoire experience adverse 
weather, there are few alternative sources to prevent a global deficit. The result is that 
weather-related supply contractions tend to cause notable jumps in price levels 
(Chatzopoulos, Pérez Domínguez, & Zampieri, 2020). Put differently, fundamental economic 
theory predicts that a leftward shift of the supply curve for an inelastic-demand commodity 
will lead to a steep rise in equilibrium price. Cocoa’s market behaviour in recent episodes of 
drought and heavy rainfall shocks is consistent with this prediction, reinforcing that weather 
is a critical driver of price fluctuations. 
In agricultural markets like cocoa, prices react not only to realised production outcomes but 
also to expectations of future supply and demand. Market participants continuously form 
expectations about upcoming harvest sizes, often using weather information as an early 
signal. If traders expect an ongoing drought or an approaching storm to cut cocoa 
yields drastically, they will incorporate that expectation into today’s pricing by 
bidding up futures contracts or holding back stocks. This anticipatory behaviour means prices 
can rise (or fall) well before the harvest report. Letta, Montalbano, and Tol (2022) provide 
evidence of how strongly expectations influence agricultural market prices: they estimate that 
roughly 85% of the eventual price impact of a drought shock is reflected in prices before the 
actual crop loss occurs. In other words, the market preemptively prices most of the shock 
when credible information about a drought becomes available. Such behaviour illustrates the 
efficient processing of new information in commodity markets and contributes to pronounced 
price volatility. When many traders act on forecasted weather events, prices can swing 
rapidly based on anticipated scenarios that may or may not fully materialise. 
Speculative trading and the herd behaviour of market actors further amplify this volatility. 
If early reports or meteorological forecasts predict unfavourable conditions (such as a 
delayed rainy season or an extreme heat wave), speculators might aggressively 
buy cocoa futures, causing price jumps that feed on themselves. Prices can then overshoot fair value 
if the weather threat later abates, leading to corrections that decrease prices. Thus, weather-
driven expectations can potentially create seesaw price patterns, as markets oscillate between 
bullish and bearish outlooks with each new forecast. The limited ability of most cocoa 
farmers to hedge against these fluctuations due to factors like lack of market access or 
financial tools means that much of the weather risk is borne out in spot prices (Tothmihaly, 
2018). The combination of fundamental supply uncertainty and speculative anticipation 
makes cocoa one of the more volatile agricultural commodities (Ubilava, 2018). Notably, 
even broadly known climate phenomena like the El Niño Southern Oscillation (ENSO) 
introduce volatility: when an El Niño event is predicted, markets often react strongly because 
historically ENSO has brought drier conditions to some cocoa regions, affecting yields 
(Ubilava, 2018; Anderson et al., 2018). In summary, the expectation of future weather impacts 
is a significant driver of short-term price volatility in cocoa markets, as prices adjust both to 
current supply-demand imbalances and to the market’s collective forecasts of upcoming 
production. 
Crucially, the fact that prices respond to anticipated weather implies that there may be 
predictive information in climate indicators that could be harnessed systematically. If 
traders are watching rainfall and temperature data to form price expectations, an 
empirical model that includes those same weather variables might capture some of the early 
signals that pure price-trend models miss. However, it is also possible that markets, being 
forward-looking, already embed most readily available weather information into current 
prices. This tension between the potential value of weather data and the market’s efficiency 
in pricing it is an underlying theme of this thesis’s forecasting approach. The theoretical 
insight remains: weather anomalies drive actual supply changes and speculative dynamics, 
making them a key piece of the cocoa price puzzle and a candidate for inclusion in predictive 
modelling. 
It is essential to distinguish between the immediate effects of local weather shocks and the 
gradual influence of global climate change on the cocoa market. Short-term weather extremes 
such as droughts, floods, or heat waves in key growing regions can abruptly curtail cocoa 
yields, triggering supply shortages and price spikes. These localised disruptions in production 
and trade create short-run volatility in cocoa prices. By contrast, long-term climate change 
(e.g. rising average temperatures and shifting rainfall patterns) operates more slowly but 
steadily, altering the baseline growing conditions for cocoa. Over time, a warming climate 
and changing precipitation regimes can depress agricultural productivity and shift suitable 
cultivation zones, affecting cocoa output trends and necessitating costly adaptations by 
producers. 
Recent research underscores the significant economic consequences of climate variability, 
reinforcing its relevance for climate-sensitive commodities like cocoa. For instance, Bilal and 
Känzig (2024) demonstrate that broad climatic disruptions such as an unexpected increase in 
global average temperature can substantially reduce overall economic output, underscoring 
that climate change is a serious macroeconomic risk rather than a negligible background 
trend. Equally important, their findings reveal that warm, low-income regions, including the 
tropical countries where most cocoa is grown, suffer the most severe economic losses from 
such climate shocks. This suggests that the cocoa sector, concentrated in these vulnerable 
areas, faces acute risks from abrupt weather events and gradual climate shifts. Therefore, in 
analysing weather-driven cocoa price dynamics, it is crucial to account for the dual nature of 
these climate influences: on the one hand, localised weather shocks cause sharp but 
temporary supply disruptions and price spikes, while on the other hand, long-term climate 
changes gradually exert persistent pressure on cocoa production. Recognising this distinction 
is essential for understanding and forecasting cocoa market behaviour, thereby setting the 
stage for examining how specific weather variations translate into yield changes and price 
movements. 
The theoretical considerations above underscore a tight linkage between climate factors and 
cocoa price behaviour. Biologically, temperature and precipitation patterns directly influence 
cocoa yields, meaning that unusual weather can substantially alter the quantity of cocoa beans 
produced. Economically, these yield changes feed into global prices: a weather-induced 
supply shortfall leads to higher prices, especially given the commodity’s inelastic demand 
and concentrated production base. Moreover, commodity markets anticipate these effects. 
Traders use weather information to forecast future supply, often driving prices ahead of 
observed production changes. This convergence of evidence suggests that weather variables 
carry meaningful information about future price movements, above and beyond what is 
contained in past prices alone (Letta et al., 2022; Ubilava, 2018). A forecasting model may be 
improved by incorporating such climate indicators alongside historical price data. 
The insights from this theoretical framework directly motivate the empirical strategy in the 
following chapters. Specifically, they support the hypothesis that combining historical cocoa 
prices with key weather metrics (temperature and rainfall) will yield more accurate 
predictions of future prices than relying on price history alone. By including weather 
variables in the model, the thesis aims to capture the early warning signals of supply shifts 
that pure time-series price models might overlook. At the same time, integrating past prices 
accounts for established trends, seasonality, and any prior information the market has already 
absorbed. The next part of this thesis will translate these concepts into an empirical 
forecasting approach, testing whether leveraging both sets of information, climate indicators 
and price momentum, can enhance the predictive power of global cocoa price forecasts. This bridge from 
theory to practice is formalised to evaluate the research question and provide evidence on the 
value of climate-informed price forecasting for stakeholders in the cocoa market. 
Data 
This chapter describes the datasets used in the empirical analysis of this thesis, detailing their 
sources, characteristics, and how they are structured to support forecasting of global cocoa 
prices. First, the selection of the timeframe and frequency of the data is justified, with 
particular emphasis on the daily data frequency, which allows capturing short-term volatility 
and abrupt price movements critical for the chosen forecasting methodology. The chapter 
then outlines the economic dataset, consisting of daily cocoa futures prices from the ICE 
Futures U.S. exchange, highlighting how the continuous front-month price series was 
constructed. Following this, the weather data are described, derived from the ERA5 
reanalysis dataset and structured into multiple spatial resolutions, ranging from global 
averages to highly localised cocoa-farming areas in West Africa. Lastly, the geographic data 
detailing cocoa farm boundaries is presented, which allows precise alignment of weather data 
to the areas directly affecting cocoa production.  
The analysis in this thesis spans the period from 1 January 1980 to 28 February 2025, 
dictated by the availability of reliable cocoa futures price data for that interval. While this 
sample does not include the notable cocoa price boom of the 1970s (due to a lack of 
consistent data from that earlier era), it does encompass the historically significant late-2024 
price spike. Including this recent extreme event is valuable for analysing the interplay 
between unusual weather shocks and market dynamics. In summary, the chosen timeframe 
balances a long historical window with the inclusion of critical events, thereby supporting a 
robust examination of weather–price relationships. 
A daily frequency is adopted for all variables, as it offers the highest meaningful 
information density for this topic. Daily observations capture short-term volatility in cocoa 
prices and weather, which is essential for identifying immediate market responses to 
environmental conditions. This high-frequency approach also provides a larger dataset for 
training the LSTM neural network model, enabling it to detect fine-grained time patterns that 
would potentially be lost at coarser (e.g. monthly) frequencies. In other words, using daily data allows 
the model to learn from day-to-day fluctuations and abrupt shocks, rather than only long-term 
trends. 
The cocoa price dataset reflects trading activity on the ICE Futures U.S. exchange (New 
York). Consequently, weekends and U.S. exchange holidays, when the cocoa futures market is 
closed, are omitted from both the price and weather time series. This ensures that the two 
datasets remain perfectly aligned in time, avoiding any artificial gaps or asynchronous 
observations. Aside from these intentional gaps and the geographic coverage limitations 
discussed below, the compiled dataset contains no missing values. 
 
Figure 2 Ritchie, H., Rosado, P., & Roser, M. (2023). Cocoa bean production, 1961–2022 [Chart]. Our World in Data. 
https://ourworldindata.org/grapher/cocoa-bean-production, used under the Creative Commons license 
Using cocoa production data from Our World in Data (OWID) (Ritchie et al., 2023), an 
initial sample of 62 cocoa-producing countries was identified. However, due to incomplete 
ERA5 weather data coverage for some regions (e.g. small or data-sparse producers), 10 
countries were excluded from the analysis. These exclusions are justified because the 
dropped countries contribute only marginally to global cocoa output. By concentrating on the 
remaining 52 countries with complete data, the sample preserves all major cocoa-growing 
regions and thus the actors with substantial market influence. In practice, the excluded 
countries have no ERA5 grid data to assign for temperature/precipitation, and their absence 
does not materially affect the global or regional analyses, given their minor production 
volumes. 
The ERA5 weather dataset and the ICE Futures U.S. price series were chosen for their 
extensive historical coverage and recognised reliability. These sources are well-regarded in 
academic and industry research, making them appropriate for a rigorous analysis. 
Additionally, industry literature (e.g. reports by the International Cocoa Organisation) affirms 
that cocoa futures in New York and London markets are the primary venues determining the 
world market price of cocoa. Focusing on New York futures prices (in the absence of 
London data) therefore provides a reasonable representation of global price dynamics. One notable 
limitation of the dataset is the absence of the early-1970s cocoa price surge; if detailed 
weather and price data from that period were available, including them might have revealed 
additional structural relationships and potentially enhanced the neural network’s predictive 
performance. Nonetheless, the chosen data span (1980–2025) is the longest period for 
which consistent, high-quality data could be obtained, and it captures a wide range of market 
conditions, including the recent unprecedented shock. 
Price 
The economic variable of interest is the benchmark cocoa futures price traded on ICE Futures 
U.S. (commodity code CC). Each standard cocoa futures contract represents 10 metric tonnes 
of cocoa beans, and prices are quoted in nominal US dollars per metric tonne. The dataset 
consists of the official daily settlement prices published at the close of each trading day, 
covering the whole sample period from January 1980 through February 2025. To construct a 
continuous price series over this 45-year horizon, the price of the nearest delivery 
month (front-month) contract is recorded for each trading day. As contracts expire and 
trading rolls to the next maturity, the series always follows the front-month contract. This 
rolling procedure yields a single uninterrupted daily price timeline spanning multiple decades 
and contract cycles. 
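The rolling procedure can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually used: the column names `date`, `expiry`, and `settle` are hypothetical, and because the text does not specify a roll rule beyond following the front-month contract, the sketch simply selects, for each trading day, the unexpired contract with the nearest expiry.

```python
import pandas as pd

def front_month_series(quotes: pd.DataFrame) -> pd.Series:
    """Return one settlement price per trading day from the nearest-expiry contract.

    `quotes` is assumed to hold one row per (date, contract) with the
    hypothetical columns 'date', 'expiry' and 'settle'.
    """
    # Keep only contracts that have not yet expired on the quote date.
    live = quotes[quotes["expiry"] >= quotes["date"]]
    # For each date, the front-month contract is the one with the earliest expiry.
    idx = live.groupby("date")["expiry"].idxmin()
    return live.loc[idx].set_index("date")["settle"].sort_index()

# Toy data: two contracts quoted over three days; the March contract
# expires on the second day, so the series rolls to May on the third.
quotes = pd.DataFrame({
    "date": pd.to_datetime(
        ["2024-03-14", "2024-03-14", "2024-03-15", "2024-03-15", "2024-03-18"]),
    "expiry": pd.to_datetime(
        ["2024-03-15", "2024-05-15", "2024-03-15", "2024-05-15", "2024-05-15"]),
    "settle": [7000.0, 6800.0, 7100.0, 6900.0, 6950.0],
})
continuous = front_month_series(quotes)
```

In the toy data the series follows the March contract on the first two days and the May contract afterwards, which is the behaviour described above.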
All prices are kept nominal (no adjustment for inflation or currency changes over time). The 
rationale for using unadjusted prices is to evaluate the raw market responsiveness to weather 
variations, i.e. how the actual prices that traders observe (and react to) move with weather 
events. Deflating prices to constant dollars might remove long-run trends or dampen extreme 
values that are, in fact, relevant to understanding market reactions. Similarly, no outlier 
filtering or smoothing is applied. By preserving these features, the analysis allows the LSTM 
model to learn from the full range of historical price behaviour, including volatility clustering 
and rare shock events. 
Weather Variables   
The weather variables are derived from the ERA5 reanalysis dataset produced by the 
European Centre for Medium-Range Weather Forecasts (ECMWF). ERA5 offers a 
comprehensive, globally gridded record of atmospheric conditions at a spatial resolution of 
0.25° × 0.25° in latitude and longitude. For this study, two types of daily meteorological data 
were extracted to represent weather conditions: 2-meter air temperature and total 
precipitation. Temperature values are expressed in Kelvin (K) and precipitation in meters of 
water equivalent per day (m/day), which are the native units of the ERA5 product. These 
scientific units are retained throughout the analysis to maintain precision and consistency 
with the source data. 
ERA5 data are available hourly and were aggregated to daily values to match the price frequency. The 
weather series was filtered to the exact trading days used in the price dataset to ensure 
synchronisation with the cocoa market data; every weather observation in the dataset 
corresponds to a day when price data is also present. 
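The aggregation and alignment step might look as follows. The sketch rests on assumptions not stated in the text: temperature is averaged over the day, precipitation is summed over the day, and the column names `t2m` and `tp` as well as the two-day trading calendar are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly series for one location over four days (Thursday to Sunday).
hours = pd.date_range("2024-03-14", periods=96, freq="h")
hourly = pd.DataFrame({
    "t2m": 300.0 + np.sin(np.arange(96) * 2 * np.pi / 24),  # Kelvin
    "tp": np.full(96, 0.0001),                              # metres per hour
}, index=hours)

# Assumed aggregation: daily mean temperature, daily summed precipitation.
daily = hourly.resample("D").agg({"t2m": "mean", "tp": "sum"})

# Keep only days on which a settlement price exists (assumed trading calendar).
trading_days = pd.to_datetime(["2024-03-14", "2024-03-15"])
aligned = daily.loc[daily.index.isin(trading_days)]
```

Filtering on the price calendar, rather than on a generic weekday rule, also removes exchange holidays without needing a separate holiday list.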
To capture the impact of weather at different spatial scales, the daily ERA5 variables 
were aggregated or selected across four hierarchical geographic scopes ranging from global 
averages to local farm-level conditions: 
1. Global scale: An arithmetic mean of the weather variable is computed over all ERA5 
grid cells worldwide (land and ocean). This provides a single global climate time 
series (for temperature and precipitation) to indicate broad-scale climate patterns. The 
calculation is straightforward and reproducible, though giving equal weight to each 
grid cell introduces a slight bias toward high-latitude regions (on a regular 
latitude-longitude grid, cells cover less surface area toward the poles, so these 
colder, drier regions are over-represented relative to their area). 
2. Country scale: Country-level averages are computed for each of the 52 cocoa-
producing countries in the dataset. Each country’s ERA5 grid points within its 
national boundaries are averaged to yield a daily temperature series and a daily 
precipitation series specific to that country. (As noted, 10 minor producing countries 
with insufficient ERA5 coverage were excluded to ensure each included country has 
complete data.) This results in a panel of country-specific weather indicators, 
covering all major cocoa origin countries identified via the OWID production data. 
3. Regional grid scale (Ghana & Côte d’Ivoire): At a finer resolution, all individual 
ERA5 grid cells within the top two cocoa-producing countries, Ghana and Côte 
d’Ivoire, are retained separately rather than averaged. Roughly 4,500 grid cells 
(combined across both countries) fall inside Ghana or Côte d’Ivoire. By keeping data 
at the grid-cell level for these key regions, the analysis can account for localised 
weather variability within the countries that collectively produce most of the world’s 
cocoa. This granular regional scale targets the areas of highest interest (given Ghana 
and Côte d’Ivoire’s dominant production share). It allows exploration of weather 
effects that might differ across various parts of these countries. 
4. Cocoa-farming-area scale: A targeted mask of cocoa cultivation areas in West 
Africa is used to select ERA5 grid cells most relevant to cocoa agriculture. This mask 
is derived from a high-resolution (10 m) land-cover dataset provided by the EU’s 
Africa Knowledge Platform, which maps cocoa plantations in Côte d’Ivoire and 
Ghana using 2019 satellite imagery (European Commission JRC, 2020). All ERA5 
grid cells whose centre falls within the identified cocoa farm polygons are extracted. 
This yields approximately 2,214 grid cells within Côte d’Ivoire’s cocoa farming zones 
and 2,248 within Ghana’s, comprising the finest spatial tier of weather data. Instead of 
averaging over an entire country, this scale captures the daily temperature and 
rainfall specifically over the cocoa-growing districts, where weather fluctuations 
directly affect cocoa trees. 
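The effect of equal cell weighting at the global scale can be illustrated with a synthetic field. The sketch below computes the equal-weight mean described in item 1 and, purely for comparison, a mean weighted by the cosine of latitude, which is a common proxy for grid-cell area. The temperature field is invented for illustration, and the area-weighted variant is not part of the thesis's method.

```python
import numpy as np

# Hypothetical 0.25-degree global grid with a synthetic temperature field (Kelvin)
# that is warm at the equator and cold at the poles.
lats = np.arange(-90.0, 90.25, 0.25)   # 721 latitudes
lons = np.arange(0.0, 360.0, 0.25)     # 1440 longitudes
cos_lat = np.cos(np.deg2rad(lats))[:, None] * np.ones(lons.size)
field = 250.0 + 50.0 * cos_lat

# Equal weight per grid cell, as described for the global scale.
equal_weight_mean = field.mean()

# Comparison only: weight each cell by cos(latitude), proportional to its area.
area_weighted_mean = np.average(field, weights=cos_lat)
```

On this synthetic field the equal-weight mean is lower than the area-weighted mean, because the many small polar cells pull the average toward colder values.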
Geographic Data (Polygons) 
The highest level of geographical detail in the dataset comes from the cocoa farm polygon 
data used to define the above cocoa-farming-area mask. These polygon shapefiles, obtained 
through the EU’s Africa Knowledge Platform, delineate individual cocoa-farming parcels 
across Ghana and Côte d’Ivoire. The polygons were generated by classifying satellite 
imagery (Sentinel-1 radar and Sentinel-2 optical data from 2019) to identify land cover as 
cocoa plantations, and then vectorising those areas into map polygons (European 
Commission JRC, 2020). The resulting dataset provides a detailed representation of where 
cocoa cultivation occurs on the ground. 
By overlaying the ERA5 grid on these farm boundaries, each cocoa farm region is matched to 
one or more nearby ERA5 grid points. This geospatial linking ensures that the weather 
measurements correspond closely to cocoa-growing locations. In practical terms, the analysis 
captures temperature and precipitation exactly in the areas where cocoa is produced, rather 
than using broader regional or national averages that might include non-cocoa land. This 
granular approach facilitates localised environmental metrics for the cocoa sector. It enhances 
the relevance of the weather data for price forecasting, assuming that conditions in the cocoa 
farms (e.g. a drought in a key cocoa belt) ultimately influence supply expectations and thus 
prices. Including this detailed geographic data, alongside broader aggregates, allows the 
thesis to examine whether localised weather shocks in core production zones have discernible 
effects on global cocoa prices, compared to more diffuse global or country-level weather 
signals. 
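The centre-in-polygon selection described for the cocoa-farming-area scale can be sketched with a standard ray-casting test. The polygon coordinates and the small candidate grid below are hypothetical, and a real workflow would more likely rely on a geospatial library than on a hand-written test.

```python
def point_in_polygon(x: float, y: float, polygon: list[tuple[float, float]]) -> bool:
    """Ray-casting test: True if (x, y) lies inside the polygon (vertices in order)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical cocoa-area polygon in (longitude, latitude) degrees.
cocoa_area = [(-3.1, 6.1), (-2.4, 6.1), (-2.4, 6.6), (-3.1, 6.6)]

# Candidate grid-cell centres on a 0.25-degree grid around the polygon.
centres = [(-3.25 + 0.25 * i, 6.0 + 0.25 * j) for j in range(4) for i in range(5)]

# Keep the cells whose centre falls inside the polygon.
selected = [c for c in centres if point_in_polygon(c[0], c[1], cocoa_area)]
```

Only the six centres lying strictly inside the rectangle are retained; cells that merely touch the polygon with a corner are excluded under this rule.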
 
Methodology 
This section outlines the approach and methods used to investigate whether historical cocoa price 
data and weather indicators, specifically temperature and precipitation, can effectively predict 
future global cocoa prices. Given the complexity of the techniques utilised in this research, 
detailed explanations of technical terms, such as Artificial Neural Networks (ANNs), 
Recurrent Neural Networks (RNNs), and Long Short-Term Memory (LSTM) networks, along 
with comprehensive model-specific details, are provided in Appendix A. This ensures clarity 
and readability in the main text while keeping technical details accessible for those who 
require deeper insights. 
Cocoa price forecasting inherently involves sequential data, as today's price tends to 
influence tomorrow's price. Recognising and accurately modelling this sequential nature 
requires specialised approaches. Among such methods, Long Short-Term Memory (LSTM) 
networks have gained prominence due to their effectiveness in handling sequential datasets 
such as financial time series, climate data, and agricultural commodity prices (Hochreiter & 
Schmidhuber, 1997; Fischer & Krauss, 2018). Unlike traditional models that may not 
effectively capture long-range patterns, LSTM networks are specifically designed to address 
this shortcoming by efficiently utilising historical data from extended periods (Qin et al., 
2017). These capabilities make them particularly well-suited to predicting agricultural 
commodity prices influenced by climatic conditions, as observed in cocoa markets 
(Olofintuyi, Olajubu, & Olanike, 2023). 
LSTM networks function somewhat analogously to human memory: they can selectively 
remember or forget information from the data they encounter over time (Gers, Schmidhuber, 
& Cummins, 2000). This ability to retain relevant historical information, such as past price 
trends and weather events, and disregard less significant data allows LSTMs to identify and 
model complex relationships within data spanning weeks or months (Siami-Namini, 
Tavakoli, & Siami-Namin, 2018). Consequently, this network architecture is particularly 
suitable for addressing the research question of this thesis: assessing whether historical cocoa 
price trends combined with weather conditions enhance the accuracy of future cocoa price 
predictions. 
 
The forecasting model developed for this research employs a specialised neural network 
architecture comprising several interconnected layers. These layers allow the network to learn 
complex temporal patterns and relationships in cocoa price data influenced by weather 
conditions. The core of the model is built around two stacked LSTM layers. The stacking of 
LSTM layers enhances the network's ability to capture intricate relationships by allowing the 
first LSTM layer to identify fundamental sequential patterns, which are subsequently refined by 
the second layer (Fischer & Krauss, 2018). Such a design effectively addresses the complexity 
of cocoa price forecasting, where interactions between climatic factors and market dynamics 
can span several months (Ly, Traore, & Dia, 2021). This multi-layered approach has been widely 
recognised in financial and agricultural forecasting literature as superior for capturing 
long-range dependencies compared to simpler, single-layer network architectures (Namini & 
Namin, 2018). 
Figure 3- By Glosser.ca - Own work, Derivative of File:Artificial neural network.svg, CC BY-SA 3.0, 
https://commons.wikimedia.org/w/index.php?curid=24913461 
Dropout layers are included between and following the LSTM layers to further improve 
the model's robustness. Dropout is a technique that randomly deactivates a fraction of 
neurons, together with their connections, during training (Srivastava, Hinton, Krizhevsky, Sutskever, & 
Salakhutdinov, 2014). By randomly removing these units, dropout reduces the 
likelihood of the network memorising specific patterns found only in the training data, a 
common issue known as overfitting (Goodfellow, Bengio, & Courville, 2016). This method 
ensures the model can generalise well and reliably predict cocoa prices under varying 
conditions not previously encountered during training. The final component of the model 
architecture is the dense (fully connected) output layer. This layer translates the sequential 
patterns identified by the LSTM layers into specific numeric predictions, such as the 
predicted cocoa futures price for the following day. The output layer thus provides a tangible, 
interpretable forecast that can be directly compared against observed market data, enabling 
practical economic interpretation (Brownlee, 2017). 
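The layer stack described above can be illustrated with a plain NumPy forward pass. This is a sketch under stated assumptions, not the thesis's implementation: the layer width (50 units), the dropout rate (0.2), the three input features, the gate ordering, and the random untrained weights are all hypothetical, and the model-specific details are given in Appendix A.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_lstm(n_in, n_hidden):
    """Random (untrained) weights for one LSTM layer; the four gates are stacked."""
    return {
        "W": rng.normal(0.0, 0.1, (4 * n_hidden, n_in)),
        "U": rng.normal(0.0, 0.1, (4 * n_hidden, n_hidden)),
        "b": np.zeros(4 * n_hidden),
    }

def lstm_layer(x_seq, p):
    """Run one LSTM layer over a (timesteps, n_in) sequence; return all hidden states."""
    n_hidden = p["U"].shape[1]
    h, c = np.zeros(n_hidden), np.zeros(n_hidden)
    outputs = []
    for x_t in x_seq:
        z = p["W"] @ x_t + p["U"] @ h + p["b"]
        i, f, o, g = np.split(z, 4)
        i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
        c = f * c + i * g      # forget part of the old memory, add the new candidate
        h = o * np.tanh(c)     # expose a gated view of the memory
        outputs.append(h)
    return np.stack(outputs)

def dropout(x, rate, training):
    """Inverted dropout: zero a random fraction of units during training only."""
    if not training or rate == 0.0:
        return x
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

def forward(x_seq, layer1, layer2, w_out, b_out, rate=0.2, training=False):
    h1 = dropout(lstm_layer(x_seq, layer1), rate, training)   # full sequence to layer 2
    h2 = dropout(lstm_layer(h1, layer2)[-1], rate, training)  # last time step only
    return float(w_out @ h2 + b_out)                          # dense output: one value

# Assumed sizes: 60 time steps, 3 features (price, temperature, precipitation), 50 units.
layer1, layer2 = init_lstm(3, 50), init_lstm(50, 50)
w_out, b_out = rng.normal(0.0, 0.1, 50), 0.0
window = rng.random((60, 3))
prediction = forward(window, layer1, layer2, w_out, b_out)
```

The first layer passes its full sequence of hidden states to the second layer, while only the second layer's final hidden state reaches the dense output, which matches the description of one numeric forecast per input window.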
To produce accurate predictions, the model requires clearly and consistently structured data. 
Data preparation involved carefully aligning historical cocoa futures prices with 
corresponding daily weather measurements (temperature and precipitation). Price data were 
sourced from ICE Futures U.S. market quotations, and weather variables were obtained from 
the ERA5 global reanalysis dataset, both widely recognised for their reliability and extensive 
historical coverage (Hersbach et al., 2023; International Cocoa Organisation [ICCO], 2024). 
A rolling window approach was utilised to structure the data into sequential input-output 
pairs appropriate for LSTM forecasting. Specifically, each prediction the model makes relies 
upon the previous 60 days of data. This means the model uses historical cocoa prices and 
corresponding weather conditions from the prior two months to predict the price on the 
subsequent day. The 60-day look-back window was selected based on preliminary 
experiments, which indicated that this window length effectively captured short-term market 
dynamics without unnecessary complexity (Brownlee, 2017). Robustness checks involving 
alternative window lengths are discussed later to validate this choice further. 
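The rolling-window construction can be sketched as follows. The function name, the column order, and the toy data are assumptions for illustration; only the 60-day look-back and the next-day price target come from the description above.

```python
import numpy as np

def make_windows(data: np.ndarray, lookback: int = 60, target_col: int = 0):
    """Turn a (days, features) array into (samples, lookback, features) inputs
    and next-day targets taken from `target_col` (assumed to be the price)."""
    X, y = [], []
    for end in range(lookback, len(data)):
        X.append(data[end - lookback:end])   # the previous `lookback` days
        y.append(data[end, target_col])      # the price on the following day
    return np.array(X), np.array(y)

# Toy data: 100 days, 3 features (price, temperature, precipitation).
data = np.arange(300, dtype=float).reshape(100, 3)
X, y = make_windows(data, lookback=60)
```

With 100 days and a 60-day window the sketch yields 40 input-output pairs, and each target is the price on the day immediately after its window ends.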
Before feeding data into the LSTM network, all input variables, including cocoa prices and 
weather variables, were normalised using the Min-Max scaling method. This process ensures 
that all input data fall within a common numerical range (from 0 to 1), thus preventing any single 
variable, such as cocoa price measured in thousands of dollars, from disproportionately 
influencing the model’s training relative to smaller-scale weather data (Goodfellow, Bengio, 
& Courville, 2016). Such normalisation enhances the stability and efficiency of the training 
process by promoting balanced contributions from each variable, facilitating faster 
convergence of the network parameters (Brownlee, 2017). The prepared dataset was then 
chronologically partitioned into distinct training and testing subsets. The initial 80% of the 
observations, covering earlier periods, were reserved exclusively for training the model, 
while the remaining 20% were set aside for testing the model’s predictive accuracy on 
previously unseen future data points. This temporal separation ensures a realistic assessment 
of the model’s predictive capabilities, reflecting genuine forecasting conditions where future 
data is unknown during model training (Hyndman & Athanasopoulos, 2018). 
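A minimal sketch of the scaling and splitting steps is given below. It makes one assumption that the text does not state: the scaling parameters are computed from the training portion only and then reused for the test portion, a common precaution against leaking information from the test period. The function names and toy data are hypothetical.

```python
import numpy as np

def chronological_split(data: np.ndarray, train_frac: float = 0.8):
    """Split a time-ordered array into an earlier training part and a later test part."""
    cut = int(len(data) * train_frac)
    return data[:cut], data[cut:]

def fit_min_max(train: np.ndarray):
    """Return the per-column minimum and range computed from the training rows only."""
    col_min = train.min(axis=0)
    col_range = train.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0   # avoid division by zero for constant columns
    return col_min, col_range

def apply_min_max(data, col_min, col_range):
    return (data - col_min) / col_range

# Toy data: a price in dollars per tonne and a temperature in Kelvin.
data = np.column_stack([np.linspace(1000.0, 3000.0, 10),
                        np.linspace(295.0, 304.0, 10)])
train, test = chronological_split(data)
col_min, col_range = fit_min_max(train)
train_scaled = apply_min_max(train, col_min, col_range)
test_scaled = apply_min_max(test, col_min, col_range)   # may exceed 1 if prices rise
```

Under this assumption, test values can fall outside the 0 to 1 range when the test period contains prices above the training maximum, as would be the case for a record price spike late in the sample.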
The predictive accuracy of the LSTM model was assessed using a series of well-established 
forecasting metrics, each offering unique insights into the model’s forecasting capabilities. 
The key metrics employed in this research were Root Mean Squared Error (RMSE), Mean 
Absolute Error (MAE), and the coefficient of determination (R²). These metrics are standard 
in time-series forecasting literature due to their intuitive interpretations and relevance to 
economic analyses (Hyndman & Athanasopoulos, 2018). These performance metrics were 
applied to a baseline model, trained solely on historical cocoa prices, and to a set of 
augmented models that include weather data alongside previous price information. 
Comparing these two types of models enabled an explicit assessment of whether including 
weather variables improved forecasting performance. If the augmented models with weather 
inputs significantly reduced RMSE and MAE or enhanced the R² compared to the baseline, it 
would indicate meaningful predictive contributions from weather information. 
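For reference, the metrics named above can be computed as in the following sketch, which uses their standard definitions. The function name and the four toy observations are illustrative only.

```python
import numpy as np

def forecast_metrics(y_true, y_pred):
    """Standard error metrics for a set of forecasts."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    ss_res = np.sum(err ** 2)                       # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation around the mean
    return {
        "MSE": float(mse),
        "RMSE": float(np.sqrt(mse)),
        "MAE": float(np.mean(np.abs(err))),
        "R2": float(1.0 - ss_res / ss_tot),
    }

m = forecast_metrics([100.0, 200.0, 300.0, 400.0], [110.0, 190.0, 310.0, 390.0])
```

RMSE and MAE are expressed in the units of the price itself, whereas R² is unit-free and measures forecast error relative to the variance of the observed series, so a strongly trending price series can produce a high R² even when absolute errors are sizeable.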
A series of robustness tests were conducted to ensure the reliability and generalisability of the 
LSTM forecasting model. These tests aimed to verify that the model’s predictions were not 
sensitive to arbitrary methodological choices, such as the exact historical window length used 
for predictions or the spatial resolution of the weather data. First, alternative look-back 
window lengths were tested. Beyond the original 60-day window, models with shorter (30-
day and 45-day) and longer (75-day, 90-day, and 120-day) historical windows were trained 
and compared. This analysis helps confirm whether the 60-day window length captured 
optimal predictive information or whether other window lengths yielded significantly 
different forecasting accuracy (Brownlee, 2017). The results indicated that while the 60-day 
window provided the best balance between accuracy and complexity, predictions remained 
stable across different windows, suggesting that the model’s forecasting performance was 
robust to moderate variations in historical input duration. Second, robustness checks involved 
assessing the impact of aggregating weather data at different geographical scales. Weather 
information was tested at various resolutions, including global averages, country-level 
aggregates, and highly localised (farm-level) measurements within major cocoa-producing 
regions such as Ghana and Côte d’Ivoire (Wibaux, Normand, Vezy, Durand, & Lauri, 2024). 
These tests determined whether the predictive accuracy varied significantly based on spatial 
granularity (Chatzopoulos, Pérez Domínguez, & Zampieri, 2020). Third, permutation 
importance analyses were conducted to quantify the individual predictive contributions of 
historical cocoa prices, temperature, and precipitation variables. Each feature was randomly 
shuffled in these tests, disrupting its relationship with the true price outcomes. If shuffling a 
feature significantly increased forecast error, it implied the model depended heavily on that 
feature, demonstrating its importance for accurate forecasting (Breiman, 2001; Molnar, 
2020). These robustness checks provide confidence in the validity and reliability of the 
forecasting approach, confirming that the predictive relationships identified in the primary 
analysis are neither sensitive to minor changes in methodology nor solely driven by specific 
arbitrary choices.  
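The permutation test can be sketched as follows. The text does not specify how the shuffling was implemented, so the sketch assumes that each feature's entire look-back window is permuted across samples; `toy_predict` is a hypothetical stand-in for the trained network, and the data are synthetic.

```python
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Average increase in RMSE when each feature is shuffled across samples.

    X has shape (samples, lookback, features); `predict` maps X to predictions.
    """
    rng = np.random.default_rng(seed)
    baseline = rmse(y, predict(X))
    importances = []
    for j in range(X.shape[2]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffle whole windows of feature j across samples, breaking its link to y.
            X_perm[:, :, j] = X[rng.permutation(len(X)), :, j]
            increases.append(rmse(y, predict(X_perm)) - baseline)
        importances.append(float(np.mean(increases)))
    return importances

# Stand-in for a trained model (hypothetical): predicts from the last price only.
def toy_predict(X):
    return X[:, -1, 0]

rng = np.random.default_rng(1)
X = rng.random((200, 60, 3))                    # price, temperature, precipitation
y = X[:, -1, 0] + 0.01 * rng.normal(size=200)
scores = permutation_importance(toy_predict, X, y)
```

Because the stand-in model ignores the two weather features, shuffling them leaves the error unchanged, while shuffling the price feature raises it sharply. This is the pattern the test is designed to reveal when a model relies almost entirely on one input.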
Despite the strengths and robust analytical framework of the LSTM forecasting model 
and its extensions, several methodological considerations and inherent limitations 
must be acknowledged. Recognising these factors is crucial for correctly interpreting the 
forecasting results and understanding their practical applicability and reliability. First, the 
predictive accuracy of neural network-based models, including LSTMs, depends heavily on 
the quantity and quality of data available (Goodfellow, Bengio, & Courville, 2016). Although 
this study utilised extensive historical price data from the ICE Futures market and reliable 
weather data from the ERA5 dataset (Hersbach et al., 2023; ICCO, 2024), specific relevant 
granular data, such as production-level yield data and comprehensive local economic 
indicators, were unavailable. The absence of these detailed datasets introduces the potential 
for omitted variable bias, implying some crucial factors influencing cocoa prices might not 
have been fully accounted for, potentially limiting the model’s real-world predictive power 
(Tothmihaly, 2018; Molnar, 2020). Second, although neural networks, particularly LSTMs, possess 
powerful capabilities to model complex nonlinear relationships, their internal processes 
remain challenging to interpret due to their "black box" nature (Molnar, 2020). The difficulty 
in explaining how the model arrives at specific predictions can limit its practical acceptance 
in economic decision-making contexts, where transparency and interpretability are highly 
valued (Siami-Namini, Tavakoli, & Siami-Namin, 2018). Third, computational constraints 
presented a significant practical limitation in this research. While the analysis demonstrated 
promising predictive capabilities using localised farm-level data within key production areas, 
attempts to expand this detailed analysis to multiple cocoa-producing regions simultaneously 
exceeded available computational resources for this level of granularity. This limited the 
scope of the analysis and prevented comprehensive testing of potentially more powerful 
predictive configurations involving extensive localised data across multiple production areas 
(Brownlee, 2017). 
Results 
This chapter presents and analyses the empirical results of implementing various Long Short-
Term Memory (LSTM) models trained to forecast cocoa futures prices. The analysis is 
structured to systematically compare a baseline predictive model, relying exclusively on 
historical cocoa price data, with several augmented models that additionally incorporate 
climatic variables, specifically temperature and precipitation (see Table 1 for an overview of 
all models). Each model's predictive performance is quantified using the standard 
performance metrics. Robustness checks are then included to ensure the reliability and 
validity of the findings. The results are contextualised to highlight the economic significance 
of incorporating weather data into commodity price forecasting, addressing the theoretical 
assumption that climatic factors substantially impact cocoa market dynamics. 
The primary objective of this study was to evaluate whether incorporating weather variables, 
specifically temperature and precipitation, improves predictive accuracy for cocoa futures 
prices compared to a baseline model relying solely on historical price data. This investigation 
was grounded in the theoretical expectation that climatic conditions significantly affect cocoa 
yields and market prices, given the market’s structural characteristics dominated by a small 
number of multinational corporations and vulnerable smallholder farmers. I first implemented 
a standard univariate LSTM neural network to establish a baseline for predicting cocoa 
futures prices, relying solely on historical prices as inputs. The dataset comprises daily 
settlement prices of cocoa futures contracts from January 1980 to February 2025. Data were 
chronologically split into training (80%) and testing sets (20%) to preserve temporal 
dependencies. Specifically, the first 80% of the observations formed the training set, while 
the remaining 20% constituted the testing set. Figure 4, which plots monthly average prices, 
shows that the all-time high of December 2024 falls within the final 20% of the data. 
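The chronological split described above can be sketched as follows. This is a minimal illustration, not the code used in the thesis: the function name `chronological_split` and the stand-in series are assumptions introduced here.

```python
import numpy as np

def chronological_split(values, train_fraction=0.8):
    """Split a time-ordered series into train and test parts without shuffling."""
    values = np.asarray(values, dtype=float)
    cut = int(len(values) * train_fraction)
    # Everything before the cut is training data; everything after is held out.
    return values[:cut], values[cut:]

# Hypothetical stand-in for the price series (not the thesis data).
prices = np.arange(100.0, 200.0)
train, test = chronological_split(prices)
print(len(train), len(test))  # 80 20
```

Because no shuffling takes place, every test observation lies later in time than every training observation, which is why the most recent price peak ends up in the test set.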
 
Figure 4 
Table 1 lists all the models trained, including the Base model against which the inclusion of 
weather variables was compared. All subsequent models include one of the weather data sets 
specified in the variables section. The only models that do not include past prices in their 
training data are number 12, “Only Global Temp”, and number 13, “Only Global Temp & 
Precipitation”. A table with a more detailed overview of variable inclusion for the models is 
located in Appendix B. 
#   Model                                                  MSE          RMSE      MAE       R²*
1   Base model*                                            48512.988    220.257   92.317    0.998
2   Global Temp                                            68145.273    261.046   114.669   0.983
3   Global Temp & Precipitation                            56718.905    238.157   104.133   0.986
4   The 52 countries’ Temp                                 176119.893   419.667   166.944   0.957
5   The 52’s Temp & Precipitation                          582472.609   763.199   307.891   0.858
6   Côte d’Ivoire’s & Ghana’s total Temp data              1707077.556  1306.552  499.517   0.585
7   IC’s & G’s farm area: Temp                             1462084.261  1209.167  452.496   0.644
8   Farm area: Temp & its optimal Range                    966032.998   982.869   389.126   0.765
9   Only IC’s farm area: Temp & Precipitation              644050.656   802.528   307.683   0.843
10  IC & G’s farm area: Precipitation                      418815.955   647.160   237.078   0.898
11  IC & G’s farm area: Precipitation & its Optimal Range  6364638.522  2522.823  1527.357  -0.548
12  Only Global Temp                                       4001419.667  2000.355  1088.016  0.027
13  Only Global Temp & Precipitation                       5740023.581  2395.835  1397.295  -0.396
Table 1 
 
The Base model and its counterpart 
 
The base model (number 1), which used only past prices as training data for predicting cocoa 
futures prices in dollars, achieved high predictive accuracy, as shown in Table 1. Its input 
was a sequence of the previous 60 days (look-back period = 60), meaning that each forecasted 
value depended solely on price movements within that window. The network consisted of two 
stacked LSTM layers containing 32 and 16 hidden units, respectively, followed by a dense 
linear output layer to generate one-month-ahead predictions. Model training employed the 
Adam optimiser with Mean Squared Error (MSE) loss, complemented by early stopping to 
mitigate potential overfitting. The same structure was used for every subsequent model, with 
weather variables added to the training data to compare how each version of the weather data 
affected the model's predictive accuracy. 
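The look-back structure can be illustrated with a short sketch of how a price series might be cut into overlapping input windows and targets. The helper `make_windows` and its `horizon` parameter are assumptions made for illustration; the thesis does not show its own windowing code, so the forecast horizon is left as an argument here.

```python
import numpy as np

def make_windows(series, look_back=60, horizon=1):
    """Build LSTM-style inputs of shape (samples, look_back, 1) and matching targets.

    Each target is the value `horizon` steps after the end of its window.
    """
    series = np.asarray(series, dtype=float)
    n = len(series) - look_back - horizon + 1
    if n <= 0:
        raise ValueError("series is too short for this look_back and horizon")
    X = np.stack([series[i:i + look_back] for i in range(n)])
    first_target = look_back + horizon - 1
    y = series[first_target:first_target + n]
    return X[..., np.newaxis], y

# Hypothetical stand-in series (not the thesis data).
series = np.arange(100.0)
X, y = make_windows(series, look_back=60)
print(X.shape, y.shape)  # (40, 60, 1) (40,)
```

With a look-back of 60, the first window covers observations 0 to 59 and its target is observation 60. Adding weather variables would widen the last axis from one feature to several while leaving the windowing logic unchanged.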
 
Figure 5 
Figure 5 plots the mean-squared error (MSE) for the training and validation sets over 
50 epochs. Training loss drops sharply in the first epoch (one full pass through the training data) from 
roughly 2.5 × 10⁻⁴ to 7 × 10⁻⁵ and then plateaus, indicating the optimiser has made most 
weight adjustments early. Validation loss falls from about 6 × 10⁻⁴ to 3 × 10⁻⁴ during the first 
10 epochs but fluctuates thereafter, hinting at mild overfitting or high variance rather than 
continued improvement. 
The predictive accuracy of the Base model on the test set is summarised by the evaluation 
metrics presented in Table 1 alongside those of the other models. The base model achieved a 
Mean Squared Error (MSE) of approximately 48,513, a Root Mean Squared Error (RMSE) of 
about 220 USD per tonne, a Mean Absolute Error (MAE) of around 92 USD per tonne, and an 
R² of 0.998, highlighting its strong predictive power derived exclusively from past price 
information. 
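For reference, the four evaluation metrics reported in Table 1 can be computed as in the sketch below. The function name `regression_metrics` and the toy arrays are assumptions for illustration only; the definitions themselves are the standard ones.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R² for two equally long arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {
        "MSE": mse,
        "RMSE": float(np.sqrt(mse)),
        "MAE": float(np.mean(np.abs(err))),
        # R² falls below zero when the forecast is worse than the constant mean.
        "R2": 1.0 - ss_res / ss_tot,
    }

# Toy values (not thesis data): a close forecast and a reversed one.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
good = regression_metrics(y_true, y_true + 0.1)
bad = regression_metrics(y_true, y_true[::-1])
print(good["R2"] > 0.99, bad["R2"])  # True -3.0
```

The reversed forecast shows how R² can be negative, which is the situation reported for models 11 and 13 in Table 1.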
 
Figure 6 
Figure 6 plots the predicted versus actual cocoa futures prices over the test period, 
demonstrating that the model effectively captures medium-to-long-term price dynamics. 
Despite the high accuracy, noticeable deviations appear around periods of pronounced price 
volatility and sharp market shifts, suggesting opportunities for improvement by integrating 
additional explanatory variables. The established baseline thus provides a firm reference for 
comparison. The subsequent models are extended by incorporating climatic data, total 
monthly precipitation and 2-meter above-ground temperature, to determine whether 
integrating weather conditions contributes additional predictive power beyond historical 
prices alone. 
Figure 7 combines the same two types of graph as Figures 5 and 6, here for the model trained 
on temperature alone, number 12, “Only Global Temp”. This model and number 13, “Only 
Global Temp & Precipitation”, were constructed to test how well weather data alone predicts 
the global price of cocoa, that is, the US cocoa futures market price (see Table 1). 
 
Figure 7 
The first graph in Figure 7 illustrates the learning curves for this temperature-only model 
(number 12). Although the training loss was low, the validation loss remained significantly 
higher, stabilising at approximately 0.04. This pronounced gap between training and 
validation loss suggests the model struggled to capture the underlying price dynamics from 
weather data alone. The same weakness appears in the right-hand graph in Figure 7, where the 
predicted price failed to capture the nuances of market pricing. It completely failed to register 
that the market escalated to an all-time high, peaking at the end of 2024, as evidenced in the 
price graph. The second weather-only model (number 13), which used both temperature and 
precipitation data, performed even worse, with an R² of -0.39582, as shown in Table 1. A 
negative R² means that the model predicts worse than a constant forecast at the mean of the 
test data. These models demonstrate that an LSTM with these specifications could not predict 
the world price of cocoa using weather data alone. This is why all other models (except the 
base model) incorporate both past prices and some form of weather variable as training data. 
The Models of Special Interest 
#  Model                                      MSE         RMSE     MAE      R²
1  Base                                       48512.988   220.257  92.317   0.998
3  Global Temp & Precipitation                56718.905   238.157  104.133  0.986
5  The 52’s Temp & Precipitation              582472.609  763.199  307.891  0.858
9  Only IC’s farm area: Temp & Precipitation  644050.656  802.528  307.683  0.843
Table 2 
Table 2 highlights the four models I find most informative since they cover all three levels of 
geographical specificity at which weather data could affect cocoa economically. I include the 
base model (number 1) because its predictive power is notably high: with an R² of 0.998 at a 
one-day horizon, the previous day’s cocoa-futures price explains almost all of the next day’s 
variation. Such strong autocorrelation is typical when both (i) forecast windows are short and 
(ii) market participants rapidly incorporate new information into prices (Box & Jenkins, 
1970). 
The baseline model’s root-mean-square error is approximately $220 per tonne. Compared 
with the price range indicated in the historical chart (approximately $3,000 to $12,000 per 
tonne), this translates to an average error of 1.8% to 7.3% of the prevailing price. In practical 
terms, the model typically remains within single-digit percentage points on ordinary days. 
However, it can still miss the occasional double-digit jumps that have characterised the recent 
rally. 
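The percentage range quoted above follows directly from dividing the RMSE by the two ends of the price range. A minimal check, using the Table 1 RMSE and the approximate price bounds stated in the text (the variable names are illustrative):

```python
rmse = 220.257                              # baseline RMSE from Table 1, USD per tonne
price_low, price_high = 3_000.0, 12_000.0   # approximate price range given in the text

share_at_high = rmse / price_high * 100     # error relative to the highest price
share_at_low = rmse / price_low * 100       # error relative to the lowest price
print(f"{share_at_high:.1f}% to {share_at_low:.1f}%")  # 1.8% to 7.3%
```

The same absolute error is therefore proportionally largest when prices are low and smallest near the recent peak.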
The second model of special interest is number 3: “Global Temp & Precipitation.” It 
performed nearly as well as the base model in terms of predictive power (R² = 0.986207). 
Including global weather data therefore left the model's performance approximately 
unchanged. This implies that global weather changes, together with past prices, can serve as a 
basis for price prediction about as effectively as past prices alone. This suggestion, based on 
predictive power, aligns with what Bilal and Känzig (2024) propose in their paper: that global 
climate change has a greater impact on economic outcomes than local weather events as 
shocks to the economy. A large majority of global cocoa output comes from countries whose 
capitals are situated within approximately 1,000 km of the Equator. For instance, Côte 
d’Ivoire, Ghana, Indonesia, Nigeria, and Ecuador collectively produce roughly 80% of the 
world's cocoa (Ritchie et al., 
2023; ICCO, 2023, 2024). These equatorial, agriculture-dependent economies are among the 
most climate-vulnerable, as rising temperatures and erratic rainfall directly threaten cocoa 
physiology and yields (Schroth et al., 2016; Wibaux et al., 2024; Carr & Lockwood, 2011). 
This suggests that there should be close ties between the overall risk to economic output, as 
stated by Bilal and Känzig (2024), and the supply-side performance of the cocoa market. 
Then there is model number 5, “The 52’s Temp & Precipitation”. The 52 are those 
cocoa-producing countries listed by Our World in Data for which weather data were 
available. This model uses the arithmetic mean of temperature and precipitation from each of 
these 52 countries, which, like the countries discussed for the previous model, lie close to the 
Equator. It is a natural continuation of the previous model, since these 52 are among the 
countries most negatively affected by climate change (Chatzopoulos et al., 2020). The 
weather data from this list of countries should have some predictive power for the economic 
performance of the supply side of cocoa, reflected to some extent in the world price of cocoa. 
The predictive power of the fifth model could be interpreted as a reflection of this agricultural 
origin of the commodity, with an R² of 0.86 (see Table 1 or Table 2). This suggests that 
approximately 86% of the variance in price may be explained using the previous prices and 
the weather data for the countries that produce nearly all the cocoa in the world. 
The last of the most notable models is number 9, “Only IC’s farm area: Temp & 
Precipitation.” This model uses all available weather data collected within the geographical 
region that produces cocoa in Côte d’Ivoire. Since this country alone produces 45% of all 
cocoa in the world, an R² of 0.84 is not an entirely unlikely reflection of the economic reality 
of the cocoa market. Notably, this is not much different from the predictive power of model 
number 5, which suggests that the localised weather data carry almost as much predictive 
power as the per-country averages. Since Côte d’Ivoire is only one of the countries with the 
highest cocoa production, I am led to believe that there would be even higher predictive 
power in utilising the most localised weather data in all the top-producing countries in the 
world simultaneously. I was unable to test this, as including all available weather data from 
the farming regions of Ghana (the world’s second-largest producer of cocoa) caused my 
working environment in Google Colab to crash. 
 
Robustness tests 
The first set of robustness tests was conducted on the baseline model (see Table 1) to evaluate 
its reliability. These tests involved extensive hyperparameter tuning, focusing primarily on 
determining the optimal look-back window, batch size, and number of epochs. The look-back 
window, which represents the number of past days used for predicting future prices, was 
systematically tested across various durations: 30, 45, 60, 75, 90, and 120 days. 
 
Figure 8 
The results shown in Figure 8 identify the optimal window as 60 days, achieving the lowest 
RMSE of 195.03 (see Appendix A for the rationale behind different look-back windows). In 
comparison, shorter windows of 30 and 45 days produced higher RMSE values of 231.85 and 
199.81, respectively, while longer windows, such as 75, 90, and 120 days, yielded RMSE 
values of 219.03, 196.17, and 221.76, respectively. 
 
 
Figure 9 
Figure 9 shows the other robustness test on the base model: tuning experiments that evaluated 
the effects of different combinations of batch size and number of epochs. A batch size of 32 
combined with 30 epochs emerged as the most effective setting, delivering the lowest RMSE 
of 186.79. Alternative configurations revealed significant variability in performance. For 
instance, a batch size of 16 resulted in progressively worse outcomes with increasing epochs, 
yielding RMSE values of 229.41 (30 epochs), 291.87 (50 epochs), and a notably high RMSE 
of 525.65 (100 epochs), indicative of severe overfitting. Similarly, increasing the batch size to 
64 resulted in less optimal RMSE values, ranging from 191.48 (30 epochs) to 218.33 (50 
epochs), with slight improvement at 100 epochs (RMSE 192.31), though still inferior to the 
optimal configuration. 
The subsequent set of robustness tests was conducted on the models of particular interest (see 
Table 2) that incorporated weather variables in the training data. These models 
were evaluated using permutation importance tests. Permutation importance is a post-hoc 
model interpretability technique that assesses how each input feature affects a predictive 
model’s performance. The fundamental concept is to randomly permute (shuffle) the values 
of one feature in the model’s validation data, thereby breaking its association with the true 
outcome, and then observe the change in the model’s error rate. If the model’s error increases 
significantly after permuting a feature, it suggests that the model was heavily reliant on that 
feature (Breiman, 2001; Fisher, Rudin, & Dominici, 2019); conversely, if shuffling a 
feature’s values has minimal effect, that feature is likely not essential to the model’s output. 
This procedure is frequently employed as a robustness check. For instance, in an LSTM 
model predicting an economic indicator from historical price, temperature, and precipitation data, 
permuting each input sequentially demonstrates how much the prediction accuracy declines, 
thus confirming the variable’s influence. Permutation importance provides a straightforward, 
model-agnostic method to interpret even intricate deep learning models, yet it has significant 
limitations. It indicates how strongly a feature impacts prediction error without disclosing the 
direction of the feature’s effect (i.e., it is an “undirected” importance measure). Furthermore, 
if predictors are highly correlated, the test can understate a variable’s importance, as the 
model may retrieve similar information from a correlated feature when one is shuffled. 
Despite these limitations, permutation tests remain a simple and effective tool for evaluating 
variable relevance as part of model robustness analysis in deep learning (Molnar, 2020). 
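The procedure can be sketched for windowed inputs of shape (samples, time steps, features). This is a minimal illustration under stated assumptions: the function `permutation_importance`, the choice to shuffle whole windows of one feature across samples, and the toy `predict` function standing in for a trained network are all introduced here and are not taken from the thesis code.

```python
import numpy as np

def rmse(y_true, y_pred):
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Return the baseline RMSE and the mean change in RMSE per permuted feature."""
    rng = np.random.default_rng(seed)
    baseline = rmse(y, predict(X))
    deltas = np.zeros(X.shape[2])
    for j in range(X.shape[2]):
        scores = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Reassign feature j's windows to random samples, breaking its link to y.
            X_perm[:, :, j] = X[rng.permutation(X.shape[0]), :, j]
            scores.append(rmse(y, predict(X_perm)))
        deltas[j] = np.mean(scores) - baseline
    return baseline, deltas

# Toy data (not thesis data): the target depends only on feature 0.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10, 3))
y = X[:, -1, 0]

def predict(inputs):
    """Stand-in for a trained model that reads only feature 0."""
    return inputs[:, -1, 0]

baseline, deltas = permutation_importance(predict, X, y)
print(baseline, deltas)
```

In this toy setting only the first feature shows a positive ΔRMSE, while the two unused features show exactly zero, which mirrors the pattern of one dominant input and several negligible ones.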
The first of these permutation importance tests was carried out on model number 3, “Global 
Temp & Precipitation”. The test yielded a baseline root-mean-squared error (RMSE) of 
228.43, as shown in Figure 10. 
  
Figure 10 
When shuffling the values of the various variables included in that model, it was found that 
shuffling the historical temperature data resulted in an RMSE of 228.30, a change of -0.13 so 
small that it is indistinguishable from ordinary sampling noise; thus, temperature provides 
virtually no incremental predictive information. When shuffling the historical precipitation 
data, the resulting RMSE was 228.08, a change of -0.3431, which suggests that the network 
had been exploiting spurious correlations in the precipitation series. When those correlations 
are broken, out-of-sample accuracy improves slightly, signalling a near-zero or mildly 
negative importance weight. When shuffling the price history data, the resulting RMSE was 
2,143.6164, representing a change of +1,915.1910, demonstrating that almost all forecast 
value derives from the autoregressive structure embedded in past prices. Because 
precipitation and temperature are deemed neither highly collinear with, nor able to substitute 
for, the price series, the magnitude of this jump can be interpreted as a robust upper bound on 
their explanatory power (Molnar, 2020). The results indicate that the purported “Global Temp 
& Precipitation” model is driven mainly by prices: weather variables contribute no material 
signal and may even introduce noise. 
The permutation importance was then tested on model number 5, “The 52’s Temp & 
Precipitation”.  
 
Figure 11 
 
Figure 11 presents the results of permutation importance for model 5 (“The 52’s Temperature 
& Precipitation”). When permuting the price history data, the out-of-sample root mean 
squared error (RMSE) increases by roughly 1,200 units. In contrast, shuffling any individual 
weather channel affects the error by only a few units.  
 
Figure 12 consolidates the average weather data per country into two categories: rain and 
temperature. It confirms that, on average, both categories yield a mean ΔRMSE statistically 
indistinguishable from zero, whereas the price category alone sustains the complete 
1,200-point deterioration. 
 
Figure 12  
For permutation importance tests, a large positive ΔRMSE indicates that the model’s 
predictive distribution is highly sensitive to the information contained in that variable 
(Breiman, 2001; Fisher, Rudin, & Dominici, 2019). The results imply that nearly all 
predictive power originates from the autoregressive structure in historical prices; the 
temperature and precipitation data from the 52 countries contribute no significant signal. 
Several weather channels show slightly negative importances, suggesting that the LSTM 
exploits noise or spurious correlations in those inputs, which are eliminated when the series 
are permuted (Molnar, 2020). Since the weather variables are not considered strongly 
collinear with past prices, the standard caveat that permutation tests can understate 
importance under multicollinearity (Altmann et al., 2010) is unlikely to alter this conclusion. 
These results demonstrate that the purported model number 5, “The 52’s Temp & 
Precipitation”, is effectively a price-only model for this dataset and architecture. 
Lastly, a permutation importance test was conducted on model number 9, “Only IC’s farm 
area: Temp & Precipitation”, which revealed contrasting influences of historical price data 
and weather conditions (temperature and precipitation) within the cocoa-producing regions of 
Côte d’Ivoire. Specifically, shuffling past cocoa price data results in a total increase in mean 
squared error (ΔMSE) of 3.433. In contrast, permuting weather variables collectively leads to 
a higher total ΔMSE of 38.646; however, the average contribution per weather variable is 
minimal, at approximately 0.077. 
These results reveal important nuances. While combined weather variables seem influential, 
their average individual impact is extremely low, suggesting that the overall significance 
arises primarily from the large number of weather variables rather than the strong predictive 
power of any single channel. Conversely, the price variable alone accounts for a considerable 
individual impact (3.433), indicating a substantial predictive dependence on historical price 
dynamics. 
Consequently, the model primarily utilises historical price information as its main predictive 
feature, while weather variables provide minimal explanatory power individually. Their 
combined significance largely reflects quantity rather than quality in predictive terms. This 
finding indicates that the predictive performance of this LSTM model is driven mainly by 
market dynamics already encapsulated in historical prices, while weather conditions 
contribute only minor incremental information. 
Discussion 
This thesis addresses the research question: "Can future global cocoa prices be effectively 
predicted by combining historical cocoa price data with weather indicators, specifically 
temperature and precipitation?" The findings derived from the LSTM models provide 
intriguing insights, albeit with notable methodological limitations. 
The baseline LSTM model, which relied exclusively on historical prices, achieved notable 
predictive power (R² ≈ 0.998). However, this strong result highlights an important inference: 
historical cocoa futures prices inherently incorporate substantial information about market 
dynamics and expectations, leaving little room for incremental predictive gains from weather 
data. This suggests either high market efficiency or pronounced speculative activity, where 
market participants swiftly adjust prices to reflect all available information. 
The inclusion of weather variables yielded mixed outcomes. Global weather data (Model 3, 
"Global Temp & Precipitation") achieved high predictive accuracy (R² ≈ 0.986) but did not 
surpass the baseline model. Conversely, models utilising more localised climate data 
performed comparatively poorly. Model 5, using temperature and precipitation averages 
across 52 countries, experienced a notable reduction in predictive accuracy (R² ≈ 0.858) and 
an approximately 3.5-fold increase in RMSE compared to the baseline. These results suggest 
that aggregating climate data across multiple regions introduced noise rather than valuable 
predictive insights. Meanwhile, Model 9, focusing specifically on cocoa-producing regions in 
Côte d’Ivoire, performed similarly (R² ≈ 0.843) and likewise did not match the baseline 
model’s performance. This implies that localised climatic effects, even if they influence cocoa 
production, are insufficient for enhancing short-term price forecasting beyond historical 
prices alone. 
These outcomes contradict expectations derived from existing literature, which typically 
posits climatic variables as valuable predictors of agricultural commodity prices. Two 
interpretations arise from the negligible predictive improvement. First, weather impacts might 
already be embedded in historical prices through market anticipation, reducing the 
incremental value of contemporaneous climate data. Second, the immediate, daily price 
impact of weather fluctuations may be too subtle or complex for models to capture 
accurately, especially given data granularity limitations. 
Robustness checks further emphasised methodological challenges. Permutation importance 
tests demonstrated that historical prices strongly dominated predictive accuracy, as 
scrambling price data significantly degraded model performance. Conversely, permuting 
weather variables had minimal effects, underscoring their limited explanatory power 
irrespective of the geographic level of the weather data. This indicates that variations in 
predictive accuracy across models primarily reflect differences in data noise levels rather than 
the true economic impact of climatic factors. This outcome highlights critical issues 
regarding feature selection and potential overfitting when incorporating complex, high-
density datasets into predictive models. 
If regarded as reliable, these results have significant implications for understanding the cocoa 
market structure and informational efficiency. The dominance of historical price momentum 
as a predictive signal indicates a highly efficient market, rapidly absorbing available 
information into price adjustments. Consequently, additional climate information appears 
redundant or insufficiently detailed to significantly enhance short-term forecasts. While 
aligning with the efficient market hypothesis, which suggests limited opportunities to exploit 
publicly available climatic data, the model’s complete disregard for weather impacts likely 
indicates methodological shortcomings rather than inherent market realities. For example, a 
sensitivity analysis (see Figure 11) identified only a slight weather-related price variation in 
Venezuela, which starkly conflicts with the established view that extreme weather events in 
West Africa caused the recent substantial spike in the price of chocolate. 
A plausible explanation for this discrepancy is the complexity of the relationship between 
weather events and global cocoa prices, which a standard LSTM model architecture might 
inadequately capture. Omitted-variable bias, particularly the absence of cocoa yield data, 
likely contributes significantly to the observed discrepancy in results. Yield data might 
represent a critical missing link in understanding how weather variations translate into global 
price fluctuations. 
A potential approach using this additional yield data for future research might be structuring 
two interconnected LSTM models analogous to an instrumental variable (IV) regression. 
Following the framework of DeepIV as introduced by Hartford et al. (2017), one model 
would first estimate the direct relationship between weather conditions and cocoa yields, 
effectively isolating weather-induced supply shocks. Subsequently, a second-stage model 
would forecast cocoa prices based on the predicted yields from the first stage, capturing the 
indirect economic impact of weather on prices more clearly. Such a structured approach 
could enhance the interpretability and accuracy of forecasts by explicitly modelling the causal 
pathway: weather → cocoa yield → cocoa price. This might have a greater chance of 
showing the real economic mechanisms underpinning the futures market dynamics of cocoa. 
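The two-stage logic can be illustrated with a deliberately simplified linear analogue. The sketch below is not DeepIV and not an LSTM: ordinary two-stage least squares stands in for the two networks, all data are synthetic, and every coefficient and variable name (`weather`, `demand_shock`, `true_effect`) is an assumption chosen only to show why routing weather through yield can matter.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Synthetic data, purely illustrative (not thesis data).
weather = rng.normal(size=n)           # instrument, e.g. a rainfall anomaly
demand_shock = rng.normal(size=n)      # unobserved factor moving yield and price
crop_yield = weather + 0.8 * demand_shock + rng.normal(scale=0.5, size=n)
true_effect = -2.0                     # assumed effect of yield on price
price = true_effect * crop_yield + 3.0 * demand_shock + rng.normal(scale=0.5, size=n)

def ols(x, y):
    """Return (intercept, slope) of a least-squares line of y on x."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Stage 1: isolate the part of yield that is explained by weather.
a0, a1 = ols(weather, crop_yield)
yield_hat = a0 + a1 * weather

# Stage 2: regress price on the weather-driven part of yield.
_, iv_slope = ols(yield_hat, price)

# Naive comparison: regress price directly on observed yield.
_, naive_slope = ols(crop_yield, price)
print(round(float(iv_slope), 2), round(float(naive_slope), 2))
```

In this synthetic setting the two-stage estimate lands close to the assumed effect, while the direct regression is biased by the unobserved factor. A DeepIV-style model would replace both linear stages with flexible networks, at the cost of needing yield data that this thesis did not have.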
References 
Abu, I.-O., Szantoi, Z., Brink, A., Robuchon, M., & Thiel, M. (2020). Cocoa map for Côte 
d’Ivoire and Ghana [Data set]. PANGAEA. https://doi.org/10.1594/PANGAEA.917473 
Ahmed, N. K., Atiya, A. F., El Gayar, N., & El-Shishiny, H. (2010). An empirical 
comparison of machine-learning models for time-series forecasting. Econometric Reviews, 
29(5–6), 594–621. https://doi.org/10.1080/07474938.2010.481556 
Altmann, A., Toloşi, L., Sander, O., & Lengauer, T. (2010). Permutation importance: A 
corrected feature importance measure. Bioinformatics, 26(10), 1340–
1347. https://doi.org/10.1093/bioinformatics/btq134 
Anderson, W., Seager, R., Baethgen, W., & Cane, M. (2018). Trans-Pacific ENSO 
teleconnections pose a correlated risk to agriculture. Agricultural and Forest Meteorology, 
262, 298–309. https://doi.org/10.1016/j.agrformet.2018.07.023 
Bilal, A., & Känzig, D. R. (2024). The macroeconomic impact of climate change: Global vs. 
local temperature (NBER Working Paper No. 32450). National Bureau of Economic 
Research. https://doi.org/10.3386/w32450 
Beg, M. S., Ahmad, S., Jan, K., & Bashir, K. (2017). Status, supply chain and processing of 
cocoa: A review. Trends in Food Science & Technology, 66, 108–
116. https://doi.org/10.1016/j.tifs.2017.06.007 
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–
32. https://doi.org/10.1023/A:1010933404324 
Box, G. E. P., & Jenkins, G. M. (1970). Time series analysis: Forecasting and control. San 
Francisco, CA: Holden-Day. 
Brownlee, J. (2017). Long short-term memory networks with Python: Develop sequence 
prediction models with deep learning. Machine Learning Mastery. 
Carr, M. K. V., & Lockwood, G. (2011). The water relations and irrigation requirements of 
cocoa (Theobroma cacao L.): A review. Experimental Agriculture, 47(4), 653–
676. https://doi.org/10.1017/S0014479711000421 
Chatzopoulos, T., Pérez Domínguez, I., & Zampieri, M. (2020). Climate extremes and 
agricultural commodity markets: A global economic analysis of regionally mediated 
impacts. Climate Risk Management, 27, Article 
100193. https://doi.org/10.1016/j.wace.2019.100193 
Fischer, T., & Krauss, C. (2018). Deep learning with long short-term memory networks for 
financial market predictions. European Journal of Operational Research, 270(2), 654–
669. https://doi.org/10.1016/j.ejor.2017.11.054 
Fisher, A., Rudin, C., & Dominici, F. (2019). All models are wrong, but many are useful: 
Learning a variable’s importance by simultaneously studying an entire class of prediction 
models. Journal of Machine Learning Research, 20(177), 1–
81. https://jmlr.org/papers/v20/18-760.html  
Gers, F. A., Schmidhuber, J., & Cummins, F. (2000). Learning to forget: Continual prediction 
with LSTM. Neural Computation, 12(10), 2451–
2471. https://doi.org/10.1162/089976600300015015 
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. Cambridge, MA: MIT 
Press. 
Hartford, J., Lewis, G., Leyton-Brown, K., & Taddy, M. (2017). Deep IV: A flexible 
approach for counterfactual prediction. In D. Precup & Y. W. Teh (Eds.), Proceedings of the 
34th International Conference on Machine Learning (Vol. 70, pp. 1414–1423). 
PMLR. https://proceedings.mlr.press/v70/hartford17a.html 
Haykin, S. (1999). Neural networks: A comprehensive foundation (2nd ed.). Upper Saddle 
River, NJ: Prentice Hall. 
Hersbach, H., Bell, B., Berrisford, P., Biavati, G., Horányi, A., Muñoz Sabater, J., … 
Thépaut, J.-N. (2023). ERA5 hourly data on single levels from 1940 to the present [Data set]. 
Copernicus Climate Change Service (C3S) Climate Data 
Store. https://doi.org/10.24381/cds.adbb2d47 (Accessed May 4, 2025) 
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 
9(8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735 
Hsiang, S., Meng, K., & Cane, M. (2011). Civil conflicts are associated with the global 
climate. Nature, 476, 438–441. https://doi.org/10.1038/nature10311 
Hyndman, R. J., & Athanasopoulos, G. (2018). Forecasting: Principles and practice (2nd 
ed.). OTexts. Retrieved May 4, 2025, from https://otexts.com/fpp2/ 
IBM Cloud Education. (2020, June 26). What is 
overfitting? https://www.ibm.com/cloud/learn/overfitting (Retrieved May 4, 2025) 
Iizumi, T., Luo, J. J., Challinor, A. J., Sakurai, G., Yokozawa, M., Sakuma, H., … Yamagata, 
T. (2014). Impacts of El Niño Southern Oscillation on the global yields of major 
crops. Nature Communications, 5, 3712. https://doi.org/10.1038/ncomms4712 
International Cocoa Organisation. (2023). Quarterly bulletin of cocoa statistics: November 
2023. Abidjan, Côte d’Ivoire: ICCO. 
International Cocoa Organisation. (2024, November 29). November 2024 quarterly bulletin 
of cocoa statistics. Abidjan, Côte d’Ivoire: ICCO. Retrieved May 4, 2025, 
from https://www.icco.org/november-2024-quarterly-bulletin-of-cocoa-statistics/ 
Kamu, A., Ahmed, A., & Yusoff, R. (2010). Forecasting cocoa-bean prices using univariate 
time-series models. Journal of Arts, Science & Commerce, 1(1), 71–80. 
Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimisation. In Proceedings 
of the 3rd International Conference on Learning Representations (ICLR 
2015). https://doi.org/10.48550/arXiv.1412.6980 
Kozul-Wright, A. (2025, April 21). Bitter truth: Why has chocolate become so expensive? Al 
Jazeera. Retrieved May 4, 2025, from https://www.aljazeera.com/news/2025/4/21/bitter-
easter-truth-why-has-chocolate-become-so-expensive 
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521, 436–
444. https://doi.org/10.1038/nature14539 
Letta, M., Montalbano, P., & Tol, R. S. J. (2022). Weather shocks, traders’ expectations, and 
food prices. American Journal of Agricultural Economics, 104(3), 1100–1119. 
https://doi.org/10.1111/ajae.12258 
Mienye, I. D., Swart, T. G., & Obaido, G. (2024). Recurrent neural networks: A 
comprehensive review of architectures, variants, and applications. Information, 15(9), 
517. https://doi.org/10.3390/info15090517 
Molnar, C. (2020). Interpretable machine learning: A guide for making black box models 
explainable (2nd ed.). Independently Published. Retrieved May 8, 2025, 
from https://christophm.github.io/interpretable-ml-book/ 
Namin, S. I., & Namin, A. S. (2018). Forecasting economic and financial time series: 
ARIMA vs. LSTM. Journal of Business Research, 90, 468–
472. https://doi.org/10.1016/j.jbusres.2018.05.001 
Olofintuyi, S. S., Olajubu, E. A., & Olanike, D. (2023). An ensemble deep-learning approach 
for predicting cocoa yield. Heliyon, 9, e15245. https://doi.org/10.1016/j.heliyon.2023.e15245 
Ouyang, H., Wei, X., & Wu, Q. (2019). Agricultural commodity futures prices prediction via 
long- and short-term time-series network. Journal of Applied Economics, 22(1), 468–
483. https://doi.org/10.1080/15140326.2019.1668664 
Prechelt, L. (2012). Early stopping — but when? In G. Montavon, G. B. Orr, & K.-R. Müller 
(Eds.), Neural networks: Tricks of the trade (2nd ed., Lecture Notes in Computer Science, 
Vol. 7700, pp. 53–67). Springer. https://doi.org/10.1007/978-3-642-35289-8_5 
Qin, Y., Song, D., Chen, H., Cheng, W., Jiang, G., & Cottrell, G. W. (2017). A dual-stage 
attention-based recurrent neural network for time-series prediction. In Proceedings of the 
Twenty-Sixth International Joint Conference on Artificial Intelligence (pp. 2627–
2633). https://doi.org/10.24963/ijcai.2017/366 
Quartey-Papafio, T. K., Javed, S. A., & Liu, S. (2020). Forecasting cocoa production of six 
major producers through ARIMA and grey models. Grey Systems: Theory and Application, 
10(3), 421–438. https://doi.org/10.1108/GS-04-2020-0050 
Ly, R., Traore, F., & Dia, K. (2021). Forecasting commodity prices using long short-term 
memory neural networks. arXiv preprint arXiv:2101.03087. 
Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-
propagating errors. Nature, 323(6088), 533–536. https://doi.org/10.1038/323533a0 
Ritchie, H., Rosado, P., & Roser, M. (2023). Cocoa bean production [Data set]. Our World in 
Data. https://ourworldindata.org/grapher/cocoa-bean-production 
Sari, M., Duran, S., Kutlu, H., et al. (2024). Various optimized machine learning techniques 
to predict agricultural commodity prices. Neural Computing and Applications, 36, 11439–
11459. https://doi.org/10.1007/s00521-024-09679-x 
Schroth, G., Läderach, P., Martinez-Valle, A. I., & Bunn, C. (2016). From site-level to 
regional adaptation planning for tropical commodities: Cocoa in West Africa. Science of the 
Total Environment, 556, 231–241. https://doi.org/10.1007/s11027-016-9707-y 
Siami-Namini, S., Tavakoli, N., & Siami-Namin, A. (2018). A comparison of ARIMA and 
LSTM in forecasting time series. In Proceedings of the 17th IEEE International Conference 
on Machine Learning and Applications (pp. 1394–
1401). https://doi.org/10.1109/ICMLA.2018.00227 
Smeeton, G. (2024, March 21). Easter chocolate prices soar as climate change and El Niño 
bite. Energy & Climate Intelligence Unit. Retrieved May 4, 2025, 
from https://eciu.net/media/press-releases/2024/easter-chocolate-prices-soar-as-climate-
change-and-el-ni%C3%B1o-bite 
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). 
Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine 
Learning Research, 15, 1929–1958. http://jmlr.org/papers/v15/srivastava14a.html 
Tothmihaly, A. (2018). How low is the price elasticity in the global cocoa market? African 
Journal of Agricultural and Resource Economics, 13(3), 209–
223. https://doi.org/10.22004/ag.econ.284986 
Thukral, N., & Tan, F. (2024, December 31). Cocoa tops global commodities rally for 2nd 
year; steel ingredients struggle on China demand. Reuters. Retrieved May 4, 2025, 
from https://www.reuters.com/markets/commodities/cocoa-tops-global-commodities-rally-
2nd-year-steel-ingredients-struggle-china-2024-12-31/ 
Ubilava, D. (2018). The role of El Niño–Southern Oscillation in commodity-price movement 
and predictability. American Journal of Agricultural Economics, 100(1), 239–
263. https://doi.org/10.1093/ajae/aax060 
United Nations Conference on Trade and Development. (2024, April 2). Chocolate price 
hikes: A bittersweet reason to care about climate change. UNCTAD. Retrieved May 4, 2025, 
from https://unctad.org/news/chocolate-price-hikes-bittersweet-reason-care-about-climate-
change 
Voora, V., Bermúdez, S., & Larrea, C. (2020). Global market report: Cocoa (Sustainable 
Commodities Marketplace Series). International Institute for Sustainable 
Development. https://www.iisd.org/system/files/publications/ssi-global-market-report-
cocoa.pdf 
Wibaux, T., Normand, F., Vezy, R., Durand, J. B., & Lauri, P. É. (2024). Do seasonal 
flowering and fruiting patterns of cacao only depend on climatic factors? The case study of 
mixed genotype populations in Côte d’Ivoire. Scientia Horticulturae, 337, 
113529. https://doi.org/10.1016/j.scienta.2024.113529 
Zelingher, R., & Makowski, D. (2024). Investigating and forecasting the impact of crop-
production shocks on global commodity prices. Environmental Research Letters, 19, 
014026. https://doi.org/10.1088/1748-9326/ad0dda 
Zhang, Z. (2016). A gentle introduction to artificial neural networks. Annals of Translational 
Medicine, 4(19), 370. 
Appendix A  
This appendix provides detailed explanations of key technical concepts and methodologies 
referenced in the main body of this thesis. Its purpose is to enhance readability by isolating 
technical details from the primary discussion, allowing readers interested in deeper technical 
understanding to consult these sections directly. Specifically, this appendix includes 
comprehensive overviews of Artificial Neural Networks (ANNs), Recurrent Neural Networks 
(RNNs), and Long Short-Term Memory (LSTM) networks, which underpin the forecasting 
models utilised in this research. Additionally, it clarifies critical procedures such as dropout 
regularisation, permutation importance analysis, supervised learning processes, and data 
structuring approaches like the look-back window technique. The explanations here are 
designed to support readers in appreciating the methodological robustness and analytical 
choices adopted in the thesis. 
Artificial Neural Networks (ANNs) 
Artificial Neural Networks (ANNs) are computing systems inspired by the human brain’s 
network of neurons. In an ANN, numerous simple processing units (called neurons or nodes) 
are interconnected in layers and collectively learn to transform input data into useful outputs. 
Each neuron receives input signals (numbers), multiplies each by an adjustable weight 
(reflecting the importance of that input), sums them up, and then applies an activation 
function to produce an output signal. This structure allows ANNs to learn complex patterns in 
data by adjusting the weights during training so that the network outputs correct or desirable 
results (Rumelhart, Hinton, & Williams, 1986). An ANN “mimics” how a brain solves 
problems: it takes in information, processes it through many connected units, and outputs a 
prediction or decision based on what it has learned (Zhang, 2016). 
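The computation performed by a single neuron can be sketched in a few lines of Python. This is an illustrative sketch rather than code from the thesis: the sigmoid activation, the bias term, and all numerical values are assumptions chosen for the example.

```python
import math

def sigmoid(z):
    """Logistic activation: squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

# Hypothetical input signals and weights (not taken from the thesis data).
inputs = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, -0.2]
output = neuron_output(inputs, weights, bias=0.1)
```

Training adjusts `weights` and `bias` so that `output` moves closer to the desired value for each training example.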
One key advantage of ANNs is their ability to model non-linear relationships between inputs 
and outputs. Unlike a simple linear regression, which assumes a straight-line relationship, 
neural networks can capture more complicated patterns (Haykin, 1999; Zhang, 2016). They 
achieve this through multiple layers of neurons (often called hidden layers) that progressively 
extract higher-level features from the raw input. ANNs learn from examples in a training 
process, typically using a back-propagation algorithm to gradually adjust the weights so that 
the predictions improve over time (Rumelhart et al., 1986). With enough data and appropriate 
design, ANNs can approximate very complex functions and have been applied successfully in 
many fields, from image recognition to economic forecasting, where they often outperform 
traditional linear models by “learning” the underlying structure directly from data (LeCun, 
Bengio, & Hinton, 2015). 
Recurrent Neural Networks (RNNs) 
Recurrent Neural Networks (RNNs) are specialised neural networks designed to 
handle sequential data and capture temporal dynamics. Unlike a standard feedforward ANN, 
where inputs and outputs are independent, an RNN introduces connections that form directed 
cycles, allowing information to persist or be retained from one step to the next. In practical 
terms, an RNN maintains an internal hidden state updated at each time step based on the new 
input and the previous hidden state. This gives RNNs a form of memory, enabling them to 
use information from earlier in the sequence to inform later outputs. For example, if I am 
using an RNN to predict economic time series data, the network’s hidden state at time t could 
carry forward summarised information about preceding days, allowing the model 
to remember context and dependencies over time (Mienye, Swart, & Obaido, 2024). 
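The hidden-state update described above can be illustrated with a deliberately small sketch. It assumes a scalar input and a scalar hidden state with a tanh activation; the weights `w_x`, `w_h` and the input sequence are hypothetical values, not parameters from the thesis models.

```python
import math

def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One recurrent update: the new state mixes the current input with the previous state."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)

def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0):
    """Process a sequence step by step and return the hidden state after each step."""
    h = 0.0  # initial hidden state: nothing has been observed yet
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states

# A single impulse at the first step, followed by zeros.
states = run_rnn([1.0, 0.0, 0.0, 0.0])
```

Although only the first input is non-zero, every later state in `states` remains positive, which shows the memory effect. The values also shrink at each step, a small-scale view of how the influence of an early input fades in a basic RNN.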
Through this recurrent structure, RNNs are well-suited for data where order matters, such as 
time series records, sentences in language, or any sequence where earlier elements influence 
later ones. However, basic RNNs have difficulty learning long-range dependencies. Over 
long sequences, they may struggle to retain information from far back because the influence 
of a given input tends to diminish as it is propagated through many time steps, a problem 
known as the “vanishing gradient” in training (Goodfellow, Bengio, & Courville, 2016). In 
other words, a simple RNN might “forget” important long-term information when it reaches 
later steps. This limitation led to the development of more advanced recurrent architectures, 
like LSTMs, which are specifically designed to better handle long-term context. 
Long Short-Term Memory Networks (LSTMs) 
Long Short-Term Memory (LSTM) networks are an improved form of RNN that addresses 
the short-term memory issue of traditional RNNs. Introduced by Hochreiter and 
Schmidhuber in 1997, LSTMs were designed to overcome the vanishing gradient 
problem and maintain long-term dependencies in sequence data (Hochreiter & Schmidhuber, 
1997). The key innovation of an LSTM is the use of internal gating mechanisms that regulate 
the flow of information over time. Each LSTM unit (often called an LSTM cell) contains 
three primary gates: an input gate, a forget gate, and an output gate. These gates act as valves 
that open or close to decide how much new information to write, how much old information 
to forget, and how much of the current cell’s information to output to the next time step. By 
dynamically controlling these flows, an LSTM can selectively remember important 
information and forget irrelevant data as sequences evolve (Gers, Schmidhuber, & Cummins, 
2000; Mienye et al., 2024). 
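The interaction of the three gates can be made concrete with a single LSTM step. The sketch below uses the standard formulation with a forget gate, reduced to scalar inputs and states for readability; the parameter dictionary `params` and its values are hypothetical and chosen so that the forget gate is almost fully open and the input gate almost closed.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM update for scalar input and state; p maps a gate name to (w_x, w_h, b)."""
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x_t + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # share of the old cell state to keep
    i = sigmoid(pre("input"))         # share of the candidate to write
    o = sigmoid(pre("output"))        # share of the cell state to expose
    g = math.tanh(pre("candidate"))   # proposed new content
    c_t = f * c_prev + i * g          # updated cell state (long-term memory)
    h_t = o * math.tanh(c_t)          # updated hidden state (output)
    return h_t, c_t

# Hypothetical parameters: keep almost everything, write almost nothing.
params = {
    "forget": (0.0, 0.0, 5.0),
    "input": (0.0, 0.0, -5.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 1.0, 0.0),
}
h_t, c_t = lstm_step(x_t=0.3, h_prev=0.0, c_prev=1.0, p=params)
```

With these settings the cell state `c_t` stays close to its previous value of 1.0, which is the mechanism that lets an LSTM carry information across many time steps.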
This gated cell design allows LSTMs to preserve information longer than standard RNNs. For 
example, in an economic time series, an LSTM might learn to retain the effect of a price 
shock or a seasonal pattern that occurred many days ago and use it to inform current 
predictions. In contrast, a basic RNN might have already lost track of that influence. The 
internal cell state in an LSTM acts like a conveyor belt carrying pertinent information along, 
unchanged unless explicitly modified by the gates. This architecture enables LSTMs to 
effectively capture long-term trends and context in sequential data while mitigating the risk 
of old information “fading away”. In summary, an LSTM is a powerful sequence model that 
extends the memory of neural networks, making it highly suitable for forecasting tasks such as 
the one in this thesis, where both recent and more distant historical data can be crucial for 
predicting future values. 
Dropout Regularisation 
Dropout is a regularisation technique used to prevent neural networks from overfitting, which 
occurs when a model learns the training data too specifically and fails to generalise to new 
data. The core idea of dropout is surprisingly simple: during training, a random fraction of 
the neurons in the network is dropped on each pass (iteration) so that they temporarily do 
nothing. In practice, this means that for each training update, every neuron (apart from the 
output neurons) has a certain probability (the dropout rate, e.g. 20%) of being ignored: its 
output is set to zero, and its connections are not updated. This might sound counterintuitive, but 
removing parts of the model deliberately has a powerful effect. By forcing the network to 
train with different subsets of neurons each time, dropout prevents any single neuron or small 
set of neurons from becoming overly specialised to the training data (Srivastava et al., 2014). 
Instead, the network must learn redundant, more robust features that are useful in conjunction 
with many different subsets of other neurons. 
At prediction time (after training), dropout is turned off, and all neurons are used but with 
their learned weights scaled appropriately to account for the averaging effect of dropout 
training. The result is similar to ensembling many different neural network configurations 
together. In effect, dropout makes the network act like an ensemble of numerous smaller 
networks that vote on the outcome, improving generalisation (Srivastava et al., 2014). 
Empirically, dropout regularisation often leads to a model performing better on validation 
and test data. In this thesis, I applied dropout (e.g. dropping 20% of neurons) in the LSTM 
layers to reduce overfitting risk, thereby improving the model’s ability to generalise patterns 
from historical data to unseen future data (Srivastava et al., 2014). 
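A minimal sketch of the mechanism is given below. It uses the "inverted" variant of dropout, in which the surviving activations are rescaled during training instead of rescaling the weights at prediction time; the two variants are equivalent in expectation, and the choice here is an assumption made for brevity. The function name `dropout`, the seed, and the activation values are hypothetical.

```python
import random

def dropout(activations, rate, training, rng):
    """Inverted dropout: during training, zero each activation with probability
    `rate` and rescale the survivors so the expected value is unchanged."""
    if not training or rate == 0.0:
        return list(activations)  # at prediction time every neuron is used
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)  # fixed seed so the example is reproducible
hidden = [1.0] * 10     # hypothetical outputs of ten hidden neurons

train_out = dropout(hidden, rate=0.2, training=True, rng=rng)
test_out = dropout(hidden, rate=0.2, training=False, rng=rng)
```

During training each value in `train_out` is either 0.0 (dropped) or 1.25 (kept and rescaled by 1/0.8), whereas `test_out` is identical to `hidden`.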
Look-back Window (Time Step) 
A look-back window (also called a sliding window or time-step window) refers to the number of past 
time periods used as input features to forecast the next value in a sequence. This concept 
transforms time-series data into a supervised learning format using historical observations to 
predict future ones. For example, if I choose a look-back window of 60 days, the model will 
at each step consider the past 60 days of data (e.g. past 60 daily prices, and possibly other 
variables over those days) to predict the price on the 61st day. This rolling window approach 
creates structured input-output pairs from sequential data: the inputs are the values from the 
previous 60 days, and the output is the value at the next day. Moving this window along the 
series daily, I generate many training examples that teach the model how past patterns relate 
to future outcomes (Brownlee, 2017). 
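The windowing step can be sketched as follows. The function name `make_windows` and the toy price series are hypothetical, and a window of 3 is used instead of 60 purely to keep the example readable.

```python
def make_windows(series, look_back):
    """Turn a sequence into (inputs, target) pairs using a sliding window."""
    inputs, targets = [], []
    for end in range(look_back, len(series)):
        inputs.append(series[end - look_back:end])  # the previous `look_back` values
        targets.append(series[end])                 # the value to be predicted
    return inputs, targets

# Hypothetical daily prices: 100.0, 101.0, ..., 109.0
prices = [float(p) for p in range(100, 110)]
X, y = make_windows(prices, look_back=3)
```

Here the ten observations yield seven training examples: the first pairs the inputs `[100.0, 101.0, 102.0]` with the target `103.0`, and the last pairs `[106.0, 107.0, 108.0]` with `109.0`.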
The look-back window is an important design parameter in time-series neural networks 
because it defines the scope of historical information given to the model. A window that is 
too short might miss important longer-term trends or cycles; on the other hand, a window that 
is too long could introduce unnecessary noise or complexity and make learning harder. In 
traditional time-series analysis, this idea corresponds to using lagged values as predictors. For 
instance, an autoregressive model of order 60 uses the previous 60 observations to forecast 
the next one (Box & Jenkins, 1970). Similarly, the thesis’s neural network approach 
explicitly sets 60 prior days as the input length, allowing the LSTM model to capture recent 
momentum and potentially seasonal effects within roughly two months of historical data. The 
term “time step” in this context often refers to each discrete time interval (each day is one 
time step), and an LSTM with 60 time steps of look-back means it processes sequences that 
are 60 steps long. Choosing an appropriate look-back window is typically done through 
experimentation or prior knowledge, balancing the need for sufficient context against the risk 
of diluting relevant information. 
Supervised Learning 
Supervised learning is the most common training paradigm in machine learning, in which a 
model learns from examples that include both the input data and the desired output. In 
supervised learning, I provide the algorithm with a labelled dataset, a set of training examples 
where each example consists of input features and a known correct output (target). The 
learning process involves the model making predictions on the inputs and adjusting its 
internal parameters (weights) to reduce the error between its predictions and the true outputs. 
Essentially, the model is “supervised” by the feedback from these known answers, which 
guide it to improve over time. The ultimate goal is for the trained model to accurately predict 
the outputs for new, unseen inputs by generalising the patterns it learned from the training 
data. 
Forecasting cocoa prices is a supervised learning problem: the input might be a sequence of 
past prices (and possibly other variables like weather), and the known output is the actual 
price on the next day. By showing the network many examples of such input-output pairs, it 
can learn the relationship between historical trends and future prices. The use of supervised 
learning implies there is a clear objective signal to learn from for each day in the training set; 
the model is told what the correct next-day price was. Standard algorithms for supervised 
learning include neural networks (like the LSTM used here), decision trees, linear regression, etc., all 
of which seek to minimise a loss function that quantifies the prediction error. Over time, the 
model parameters are tuned (often via gradient descent optimisation) to produce outputs as 
close as possible to the true values (Goodfellow et al., 2016). Once training is complete, if the 
model has learned well, it should be able to take a new sequence of recent data 
and predict the next price with useful accuracy. Supervised learning contrasts with 
unsupervised learning (where no labelled outputs are given) and reinforcement learning 
(where feedback comes from rewards), which are not used in this thesis. Here, everything is 
framed as supervised learning with historical observations and known outcomes. 
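The loop of predicting, measuring the error, and adjusting the parameters can be illustrated with the simplest possible supervised model. The sketch below fits a straight line by gradient descent on the mean squared error; it is a stand-in for the general principle, not the LSTM training procedure used in the thesis, and the data, learning rate, and number of epochs are assumptions.

```python
def fit_linear(xs, ys, learning_rate=0.05, epochs=1000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction errors under the current parameters.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Move the parameters a small step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Labelled examples generated by the known rule y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_linear(xs, ys)
```

Because the labels follow an exact linear rule, the fitted `w` and `b` converge towards 2 and 1; with real price data the error would shrink but not vanish.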
Early Stopping 
Early stopping is a practical technique used during model training to prevent overfitting. The 
idea is to stop training the model before it starts overfitting the training data. In a typical 
training process, as the model learns, its performance on the training set consistently 
improves. However, performance on a separate validation set (data not used for training, 
intended to simulate new/unseen data) may stop improving after a certain point. It can even 
begin to deteriorate as the model starts to memorise noise in the training data. Early stopping 
monitors the model’s performance on the validation set. It halts the training process when 
improvement has levelled off, essentially when additional training no longer leads to better 
generalisation. In practice, one might set a rule: “If the validation loss has not decreased for, 
say, five consecutive epochs (passes through the data), then stop training.” At that point, the 
model parameters from the epoch with the best validation performance are retained. This 
way, the model is frozen at the optimal point before overfitting sets in (Prechelt, 2012; 
Goodfellow et al., 2016). 
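The stopping rule can be sketched independently of any particular model by applying it to a list of validation losses, one per epoch. The function name, the loss values, and the patience of three epochs are hypothetical; the patience setting actually used in the thesis is not assumed here.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch) for a sequence of per-epoch validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    waited = 0  # epochs elapsed since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch  # stop; keep the parameters from best_epoch
    return len(val_losses) - 1, best_epoch  # ran out of epochs without triggering

# Hypothetical validation losses: improving at first, then deteriorating.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.51, 0.53, 0.56, 0.60]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=3)
```

In this example the lowest validation loss occurs at epoch 3 (counting from zero), training halts at epoch 6 after three epochs without improvement, and the parameters from epoch 3 would be the ones retained.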
In simpler terms, early stopping acts as an automatic brake on the training process. Rather 
than training for a fixed number of epochs (iterations), the algorithm tests the model as it 
trains to see when it has had enough. Training is stopped early when further training yields no 
benefit on validation data. This saves computation time and often results in a model that 
performs better on test data. In this thesis, I employed early stopping by monitoring the 
validation loss during training. I halted the training once the validation loss stopped 
decreasing (indicating the model might start overfitting if I continued). Early stopping is one 
of several regularisation strategies (like dropout, discussed above) that help achieve a model 
that generalises well. It leverages the idea that over-training is counter-productive, and that 
there is an optimal point in training after which the model begins to learn spurious details of 
the training set (IBM Cloud Education, 2020). By using early stopping, I aim to capture the 
underlying signal in the data without capturing the noise. 
Permutation Importance 
Permutation importance is a technique for measuring the importance of input features in a 
trained machine learning model, and it provides an intuitive way to interpret complex models 
like neural networks. The basic procedure is as follows: for a given feature (input variable), I 
randomly shuffle or permute its values across all observations in the dataset, breaking any 
real relationship between that feature and the target. I then run the data through the model 
again and observe how much the model’s error (for example, the prediction mean squared 
error) increases due to this shuffling. If the model’s performance drops significantly 
(the error increases substantially), this indicates that the shuffled feature was important to the 
model’s predictions because the model struggles when its true information is destroyed. 
Conversely, if permuting a feature causes little to no change in error, that feature likely was 
not very important to the model’s decision-making. In short, a feature’s permutation 
importance score can be defined by the magnitude of increase in prediction error when that 
feature’s values are randomised: larger increase = more important feature (Breiman, 2001; 
Fisher, Rudin, & Dominici, 2019). 
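The procedure can be sketched with a toy model in place of the LSTM. Everything named below is hypothetical: `toy_model` is a stand-in that depends only on the first feature, the data are synthetic, and a single shuffle is used where in practice the shuffling would typically be repeated and averaged.

```python
import random

def mse(model, rows, targets):
    """Mean squared error of the model's predictions."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(model, rows, targets, feature, rng):
    """Increase in MSE after shuffling one feature column across observations."""
    baseline = mse(model, rows, targets)
    column = [r[feature] for r in rows]
    rng.shuffle(column)  # break the link between this feature and the target
    shuffled = [r[:feature] + [v] + r[feature + 1:] for r, v in zip(rows, column)]
    return mse(model, shuffled, targets) - baseline

def toy_model(row):
    """Stand-in model: uses feature 0 (think 'price'), ignores feature 1 ('weather')."""
    return 2.0 * row[0]

rng = random.Random(0)  # fixed seed so the example is reproducible
rows = [[float(i), float((i * 7) % 10)] for i in range(10)]
targets = [2.0 * r[0] for r in rows]

imp_price = permutation_importance(toy_model, rows, targets, 0, rng)
imp_weather = permutation_importance(toy_model, rows, targets, 1, rng)
```

Shuffling the feature the model relies on raises the error (`imp_price` is positive), while shuffling the ignored feature leaves it unchanged (`imp_weather` is zero).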
One advantage of permutation importance is that it is model-agnostic and easy to compute. It 
does not require examining the model's weights or structure; it merely observes how 
predictions change when inputs are perturbed. The method therefore applies to any predictive 
model, whether a neural network or a random forest, and it is particularly useful for complex 
“black-box” models where traditional interpretation proves challenging (Molnar, 2020). By 
applying permutation importance, researchers can rank which predictors (e.g., past prices, 
temperature, rainfall, etc.) influence the model’s forecasting accuracy most. In this thesis, I 
used permutation importance to ascertain which factors the LSTM model relied upon most for 
its predictions. The results indicated that shuffling the price history significantly 
degraded performance (indicating the high importance of past prices). In contrast, shuffling 
weather variables had minimal effect, suggesting they were relatively unimportant in the 
model’s predictions, a finding consistent with Y’s evaluation.  
It is worth noting a caveat: permutation importance assumes that features are independent 
regarding their contribution to the model. If two features are correlated or provide redundant 
information, permuting one may not lead to a significant performance drop because the other 
correlated feature still offers similar information. In such cases, permutation importance 
might underestimate the importance of those features or distribute the importance across 
them. This is a known limitation; for example, if several weather variables are highly 
correlated, shuffling one at a time might not reveal a significant effect since the model can 
rely on the others. In other words, under multicollinearity (features moving together), the 
permutation test can understate a feature’s true importance (Altmann et al., 2010). Despite 
this limitation, permutation importance remains a popular and straightforward tool for 
interpreting models. It provides a clear, quantitative insight into which inputs the neural 
network considers most relevant, thereby adding transparency to the forecasting model’s 
behaviour. Each importance score derived from this method is expressed directly as a change 
in the model’s predictive performance, making it an intuitive metric that helps economists and 
stakeholders grasp which drivers genuinely contribute to the price predictions. 
Appendix B 
This appendix gives an overview of the variables included in each of the 13 LSTM models. 
 
A Historical Daily Cocoa Futures Prices 
B Global Daily Average Temperature 
C Global Daily Average Precipitation 
D The 52 Cacao Farming Countries’ Respective Daily Average Temperature 
E The 52 Cacao Farming Countries’ Respective Daily Average Precipitation 
F Côte d’Ivoire’s All Available ERA5 Grid-cell Temperature Data 
G Côte d’Ivoire’s All Available ERA5 Grid-cell Precipitation Data 
I Ghana’s All Available ERA5 Grid-cell Temperature Data 
J Ghana’s All Available ERA5 Grid-cell Precipitation Data 
K Côte d’Ivoire’s Cacao Farm Area’s All Available ERA5 Grid-cell Temperature Data 
L Côte d’Ivoire’s Cacao Farm Area’s All Available ERA5 Grid-cell Precipitation Data 
M Ghana’s Cacao Farm Area’s All Available ERA5 Grid-cell Temperature Data 
N Ghana’s Cacao Farm Area’s All Available ERA5 Grid-cell Precipitation Data 
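(X = variable included, - = variable not included)
#   Model                                                  A  B  C  D  E  F  G  I  J  K  L  M  N
1   Base model*                                            X  -  -  -  -  -  -  -  -  -  -  -  -
2   Global Temp                                            X  X  -  -  -  -  -  -  -  -  -  -  -
3   Global Temp & Precipitation                            X  X  X  -  -  -  -  -  -  -  -  -  -
4   The 52 countries’ Temp                                 X  -  -  X  -  -  -  -  -  -  -  -  -
5   The 52’s Temp & Precipitation                          X  -  -  X  X  -  -  -  -  -  -  -  -
6   Côte d’Ivoire’s & Ghana’s total Temp data              X  -  -  -  -  X  -  X  -  -  -  -  -
7   IC’s & G’s farm area: Temp                             X  -  -  -  -  -  -  -  -  X  -  X  -
8   Farm area: Temp & its optimal range                    X  -  -  -  -  -  -  -  -  X  -  X  -
9   Only IC’s farm area: Temp & Precipitation              X  -  -  -  -  -  -  -  -  X  X  -  -
10  IC & G’s farm area: Precipitation                      X  -  -  -  -  -  -  -  -  -  X  -  X
11  IC & G’s farm area: Precipitation & its optimal range  X  -  -  -  -  -  -  -  -  -  X  -  X
12  Only Global Temp                                       -  X  -  -  -  -  -  -  -  -  -  -  -
13  Only Global Temp & Precipitation                       -  X  X  -  -  -  -  -  -  -  -  -  -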
 
 