Deep Learning Cocoa Price Prediction with Weather Data
William Bergander
Supervisor: Nicklas Nordfors
Master’s thesis in Economics, 30 hec
Spring 2025
Graduate School, School of Business, Economics and Law, University of Gothenburg, Sweden

Abstract

Climate-induced volatility in global cocoa markets poses significant challenges to producers and stakeholders, notably evidenced by the severe price surge in late 2024 following adverse weather events. This thesis investigates whether integrating weather variables, specifically temperature and precipitation, with historical cocoa futures prices can enhance predictive accuracy using Long Short-Term Memory (LSTM) neural networks. Leveraging daily price data from the ICE Futures U.S. exchange (1980–2025) and comprehensive meteorological data from the ERA5 dataset across key cocoa-producing regions, multiple LSTM models, including global and localised scales, were developed and evaluated. This deep learning approach to cocoa price prediction, incorporating meteorological inputs, addresses a gap in existing forecasting literature. Contrary to prior studies and expectations, models incorporating detailed weather indicators did not improve forecasting accuracy over a baseline model relying solely on historical prices, which achieved a notably high predictive performance (R² ≈ 0.998). A global-scale model using average climate indicators matched the baseline model's predictive power (R² ≈ 0.986), while more localised models underperformed significantly. Robustness tests, including permutation importance analyses, confirmed that historical price data predominantly drove each model's predictive power, with weather variables providing minimal incremental value. This lack of improvement suggests that weather impacts may already be priced into market trends due to market efficiency, or that they were too complex to capture with the naive LSTM model used.
These results highlight methodological limitations, particularly the absence of cocoa yield data, which likely restricted the ability to capture the indirect economic impacts of weather conditions accurately. Future research incorporating cocoa yield data within a structured causal framework, such as the DeepIV method, has the potential to model more precisely the economic transmission from climate impacts to global cocoa market pricing.

Table of Contents

Introduction
Literature Review
Theory
Data
    Price
    Weather Variables
    Geographic Data (Polygons)
Methodology
Results
    The Models of Special Interest
    Robustness tests
Discussion
References
Appendix A: Terminology
    Artificial Neural Networks (ANNs)
    Recurrent Neural Networks (RNNs)
    Long Short-Term Memory Networks (LSTMs)
    Dropout Regularisation
    Look-back Window (Time Step)
    Supervised Learning
    Early Stopping
    Permutation Importance
Appendix B: Model Variable Inclusion

Introduction

Global cocoa prices have been highly volatile in recent years due to climate-related supply shocks. For example, in late 2024, cocoa futures prices nearly tripled, reaching a record of approximately $12,900 per tonne after a series of extreme weather events severely impacted harvests in West Africa (Smeeton, 2024; Thukral & Tan, 2024).
The repeated occurrences of drought and heavy rainfall led to multi-year supply shortages, constraining global stocks and causing prices to surge by around 136% from mid-2022 to early 2024 (Kozul-Wright, 2025). The adverse climate conditions in a region where Côte d’Ivoire and Ghana produce over half of the world’s cocoa supply (Ritchie, Rosado, & Roser, 2023) highlight the susceptibility of cocoa markets to climate shocks. Experts in climate science warn that extreme weather events are likely to continue causing sudden increases in food commodity prices, including cocoa, as climate change exacerbates weather instability (Bilal & Känzig, 2024).

Figure 1 – The Historical Price of Cocoa Futures on the New York Stock Exchange, Aggregated Monthly for aesthetic purposes

This price volatility has tangible consequences. Sharp swings in cocoa prices threaten the livelihoods of millions of smallholder farmers and complicate planning for chocolate manufacturers and traders (Voora, Bermúdez, & Larrea, 2020). Accordingly, accurate forecasting of agricultural commodity prices is crucial for stakeholders across the supply chain to manage risks and make informed decisions (Pandit et al., 2024). Traditional price-forecasting methods rely solely on historical price patterns, potentially neglecting external factors such as weather. Given cocoa’s dependence on climate-sensitive tropical agriculture, weather conditions like rainfall, temperature, and drought significantly impact cocoa yields and supply levels (Carr & Lockwood, 2011; Schroth, Läderach, Martinez-Valle, & Bunn, 2016), thus influencing price movements. Market participants often anticipate these impacts, adjusting price expectations even before crop losses materialise (Letta, Montalbano, & Tol, 2022). This suggests that timely meteorological data could contain predictive signals for price formation that are not captured by historical prices alone.
These observations lead directly to the formal research question of this study: “Can future global cocoa prices be effectively predicted by combining historical cocoa price data with weather indicators, specifically temperature and precipitation?” This question seeks to understand whether integrating climate variables can enhance the accuracy and reliability of price forecasting models. Given the potential economic benefits and improved market stability that accurate forecasting could bring, this research aims to assess the added value of incorporating weather data into cocoa price predictions. Intuitively, as weather shocks influence supply fluctuations that are quickly factored into market prices, one might expect that including such information would enhance forecast accuracy.

This thesis empirically investigates that premise using Long Short-Term Memory (LSTM) neural networks, a deep learning architecture suited to time-series forecasting. The choice is motivated both by the prominence and societal impact of AI in the form of Large Language Models (LLMs) in recent years and by the LSTM's ability to capture complex temporal relationships in time-series data (see the Methodology section). The analysis utilises a comprehensive dataset spanning from 1980 to early 2025, including daily cocoa futures prices and corresponding weather observations. The weather variables, temperature and precipitation, are derived from the ERA5 climate reanalysis dataset, which provides extensive global meteorological data (Hersbach et al., 2023). Several LSTM model variants are constructed for comparison: a baseline model that uses only lagged cocoa prices as inputs, and a set of augmented models that incorporate temperature and precipitation indicators in addition to past prices.
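The difference between the baseline and augmented input specifications described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the thesis's actual pipeline: the helper `make_windows`, the 30-day look-back length, and the synthetic price and weather series are all hypothetical stand-ins, and any scaling or train/test splitting is omitted.

```python
import numpy as np


def make_windows(features, target, look_back):
    """Build rolling look-back windows for supervised sequence learning.

    features: array of shape (n_days, n_features)
    target:   array of shape (n_days,), the series to be predicted
    Returns X of shape (n_samples, look_back, n_features) and y of shape
    (n_samples,), where y[i] is the target on the day right after window X[i].
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    n_samples = len(target) - look_back
    X = np.stack([features[i:i + look_back] for i in range(n_samples)])
    y = target[look_back:]
    return X, y


# Synthetic stand-in series (assumption: NOT the ICE or ERA5 data).
rng = np.random.default_rng(0)
n_days = 200
price = 2000.0 + np.cumsum(rng.normal(0.0, 10.0, n_days))
temperature = 26.0 + rng.normal(0.0, 1.0, n_days)
precipitation = rng.gamma(2.0, 2.0, n_days)

LOOK_BACK = 30  # hypothetical window length, chosen only for illustration

# Baseline specification: lagged prices only (one feature per time step).
X_base, y_base = make_windows(price[:, None], price, LOOK_BACK)

# Augmented specification: lagged prices plus temperature and precipitation.
augmented = np.column_stack([price, temperature, precipitation])
X_aug, y_aug = make_windows(augmented, price, LOOK_BACK)

print(X_base.shape, X_aug.shape)
```

Both specifications share the same prediction target and the same number of samples; only the width of each time step differs, which is what allows a like-for-like comparison of forecasting accuracy.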
These weather-augmented models differ in their geographical scope of climate data, ranging from local weather measures in key cocoa-growing regions to broader regional or global average climate indices. The thesis evaluates each model’s out-of-sample forecasting performance using standard accuracy metrics (root mean squared error, mean absolute error, and R²), thereby assessing whether the inclusion of weather features yields any predictive improvement over the price-only baseline.

Contrary to expectations, the main finding is that including explicit weather variables did not produce better forecasts in this context. The simplest LSTM model that used only historical prices achieved the best predictive accuracy, as indicated by the highest R² (see Table 1 in the Results section). Most weather-enriched models performed worse than this price-only baseline, suggesting that the added climate information often introduced noise or complexity that the model could not readily translate into better price predictions. Only one augmented model, using global average temperature and precipitation series, matched the baseline model’s predictive power as measured by R² (see Table 1 in the Results section). In all other cases, the baseline outperformed the more complex specifications. These results indicate that a straightforward inclusion of daily weather variables provides little to no forecasting benefit under the current modelling setup. In short, naive incorporation of climate data did not improve cocoa price predictions, hinting that more advanced model structures and better data handling and inclusion may be needed to capture the complex economic relationships between weather and cocoa markets.

Finally, the structure of the thesis is as follows: It begins with a Literature Review on commodity price forecasting and its connection to weather. The Theory section examines how weather affects agricultural markets. The Data section outlines the cocoa price and ERA5 weather datasets.
The Methodology section describes the LSTM model and compares the baseline to weather-augmented models. The Results section presents forecasting outcomes and robustness checks. The Discussion interprets findings, considers limitations and market efficiency, and suggests future research. Appendix A provides additional technical details, and Appendix B provides an overview of model variable inclusion.

Literature Review

This chapter reviews the existing literature on agricultural commodity price forecasting, emphasising the influence of weather variables on market volatility and predictive modelling methods. It begins by outlining the established relationship between climatic variability and commodity price fluctuations, highlighting cocoa's particular sensitivity due to its concentrated tropical production areas. Subsequently, the discussion examines traditional econometric forecasting approaches and underscores their limitations in capturing complex market dynamics influenced by weather. The review then transitions to recent advances in forecasting methodologies, focusing on the superior capabilities of deep learning approaches, particularly Long Short-Term Memory (LSTM) networks. It identifies a critical gap in current research: integrating detailed meteorological data with advanced deep learning models specifically for global cocoa price prediction. This thesis addresses this gap by combining high-resolution weather data with LSTM models, offering potential improvements in forecasting accuracy and significant practical implications for market stakeholders.

Accurate forecasts of agricultural commodity prices are crucial for producers, consumers, and policymakers to manage risks in volatile markets (Pandit et al., 2024). A growing body of research has examined how weather variability influences commodity prices, consistently finding that climatic fluctuations correlate with market volatility (Letta, Montalbano, & Tol, 2022; Ubilava, 2018).
Weather extremes such as droughts, excessive rainfall, and temperature anomalies can directly reduce crop yields, thereby tightening supply and driving up prices (Chatzopoulos, Pérez Domínguez, & Zampieri, 2020). For example, large-scale climate oscillations like the El Niño–Southern Oscillation (ENSO) are known to disrupt agricultural production globally, leading to price spikes across multiple crops (Iizumi et al., 2014; Anderson, Seager, Baethgen, & Cane, 2018). Importantly, commodity markets tend to preempt these impacts by quickly adjusting price expectations when adverse weather is anticipated. In other words, traders often bid prices up before production losses materialise (Letta et al., 2022). Cocoa is a commodity especially sensitive to weather fluctuations due to its concentrated tropical growing regions. West Africa alone accounts for over half of the world’s cocoa output, so abnormal weather in this area can significantly influence global supply and prices (ICCO, 2023). Variations in rainfall and temperature affect cocoa tree health and yields; for instance, insufficient rainfall can stress trees while excessive moisture fosters fungal diseases, which reduce harvests (Schroth, Läderach, Martinez-Valle, & Bunn, 2016). Recent events underscore this vulnerability: in 2023, West African cocoa farms experienced a severe drought followed by heavy rains that exacerbated pest and disease outbreaks, contributing to a steep rise in global cocoa prices (UNCTAD, 2024). Broader climate phenomena have also been linked to cocoa market dynamics. Ubilava (2018) found that ENSO-related weather patterns significantly affect agricultural price movements, highlighting the value of climate indicators in forecasting models for this crop. Historically, however, most commodity price forecasting studies have relied on traditional econometric models that do not explicitly include weather variables. 
Time-series approaches like Autoregressive Integrated Moving Average (ARIMA) and Vector Auto-Regression (VAR) have been widely used for simplicity. However, they often struggle to capture agricultural markets’ nonlinear and dynamic behaviour (Ahmed, Atiya, El Gayar, & El-Shishiny, 2010). In the case of cocoa, Kamu, Ahmed, and Yusoff (2010) demonstrated that a univariate ARIMA model based solely on past prices could achieve moderate predictive accuracy. However, it could not account for external drivers such as climatic anomalies or other supply shocks. Similarly, Quartey-Papafio, Javed, and Liu (2020) observed a continued reliance on ARIMA-type methods in cocoa market analysis. Their study, which forecasted cocoa production for the six largest producer countries, showed that while ARIMA models were commonly employed, more advanced techniques (like grey prediction models) yielded superior accuracy. This persistent use of conventional models underscores a gap in the literature: the need for forecasting approaches that integrate exogenous factors, most notably weather, rather than relying only on historical price patterns.

Researchers have increasingly turned to machine learning techniques for commodity price forecasting in recent years, focusing on deep learning models capable of modelling complex temporal patterns. Long Short-Term Memory (LSTM) neural networks, a class of recurrent neural networks, are especially well-suited to time-series data due to their ability to capture long-range dependencies and nonlinear relationships (Namin & Namin, 2018). LSTM-based models have outperformed traditional statistical methods in various commodity forecasting applications, demonstrating higher accuracy and reliability (Sari, Duran, & Kutlu, 2024). One advantage of LSTMs is their flexibility in handling multivariate inputs, which can incorporate multiple features beyond just past prices (Ly, Traore, & Dia, 2021).
This capability is particularly relevant for agriculture, where combining price data with weather or other external variables could enhance predictions. For example, Olofintuyi, Olajubu, and Olanike (2023) report that an LSTM model significantly outperformed a standard recurrent neural network in predicting cocoa yields, underscoring the value of deep learning in capturing the influence of weather and growth cycles on agricultural output. Likewise, Ouyang et al. (2019) found that LSTM models provided markedly more accurate agricultural commodity futures price forecasts than ARIMA and VAR benchmarks. They attributed this improvement to LSTM’s robustness in handling noisy, non-stationary time series data, characteristics typical of commodity markets influenced by irregular weather events.

Parallel streams of research in economics and climate science further reinforce the importance of integrating weather data into commodity price forecasting. Hsiang, Meng, and Cane (2011) famously linked global climate variability to rises in civil conflicts, illustrating how severe weather anomalies can disrupt societies and, by extension, economic stability. In a more directly related analysis, Bilal and Känzig (2024) showed that global temperature fluctuations predict extreme weather events more strongly than localised temperature measures do. This suggests that broad climate indices (such as global temperature anomalies or oceanic oscillation indicators) carry valuable information and could improve the robustness of price prediction models. Additionally, Zelingher and Makowski (2024) demonstrated that incorporating unexpected crop production shocks, often driven by adverse weather, substantially enhances the accuracy of global commodity price forecasts. Their study found that cocoa prices exhibit particularly high volatility in response to weather-induced production shortfalls, even more than staples like maize or soybeans.
These findings underscore that a forecasting approach attuned to climate variability, especially for climate-sensitive commodities like cocoa, could yield significant benefits in anticipating price movements.

Despite the evident influence of weather on cocoa production and the advances in predictive modelling, there remains a notable gap in the literature: very few studies have combined detailed weather variables with modern deep learning methods for cocoa price forecasting. In other words, prior research has tended to examine climate impacts on cocoa markets and to apply sophisticated forecasting techniques separately, without fully leveraging their intersection. This thesis addresses that gap by developing an LSTM-based forecasting model for global cocoa prices that explicitly integrates rich meteorological information alongside historical price data. In particular, the model incorporates precipitation and temperature variables from major cocoa-growing regions (as well as relevant global climate indicators) and past price trends to predict future price movements. The present work seeks to contribute to the literature by uniting high-resolution weather data with a deep learning framework for price prediction. It explores a previously underexplored approach to global cocoa price prediction that aims to improve forecasting accuracy.

Theory

This chapter establishes the theoretical foundation linking climate variability to cocoa market dynamics. It examines how weather conditions influence cocoa production and, in turn, how these production shocks are transmitted to global price movements. I also consider the role of market expectations in anticipating climate impacts, which can amplify price volatility. By focusing on these linkages, the chapter provides the basis for why incorporating both historical price trends and weather indicators may improve the prediction of future cocoa prices.
The overarching goal is to connect agricultural and economic insights to the study’s forecasting question. In particular, I outline how temperature and precipitation, two key weather variables, affect cocoa yields, and how yield fluctuations driven by weather ultimately shape market prices. The discussion then turns to how commodity markets respond to expected supply changes, often adjusting prices before harvest outcomes are realised. Finally, the chapter summarises these insights and explains how they motivate the inclusion of weather variables alongside past prices in the empirical model.

Cocoa production is susceptible to climatic conditions, especially rainfall and temperature, which regulate critical biological processes of the cocoa tree. Rainfall patterns largely govern the tree’s phenology, including flowering and pod development. For instance, the onset of the rainy season typically triggers mass flowering events that later translate into higher yields (Wibaux et al., 2024). Cocoa trees thrive with roughly 1,500–2,500 mm of rainfall annually, well-distributed throughout the year, and even short deviations can significantly affect output. Excessive rainfall can waterlog soils and promote fungal diseases that reduce pollination and pod set, whereas prolonged drought stresses the trees, curtailing pod development and lowering both the quantity and quality of beans (Carr & Lockwood, 2011; Wibaux et al., 2024). These climate-driven impacts on yields are especially pronounced in West Africa, a region that produces about two-thirds of the world’s cocoa, making its output particularly vulnerable to swings in precipitation patterns (Voora et al., 2020).

Temperature is another crucial factor for cocoa physiology. Extended periods of excessive heat can disrupt photosynthesis, cause flowers and small pods to abort, and lead to a condition known as Cherelle wilt, where young pods die off (Carr & Lockwood, 2011).
On the other extreme, temperatures that are too cool can slow the growth and development of pods. High temperatures often become most devastating when coupled with drought, as water stress and heat weaken the trees and increase their susceptibility to pests and diseases. Hot and dry conditions have been linked to cocoa swollen shoot virus outbreaks and black pod disease, which can decimate yields (Beg et al., 2017). In sum, deviations in temperature and precipitation from the narrow optimal conditions for cocoa can sharply reduce agricultural output. This biological sensitivity directly links weather variability and potential supply shocks in the cocoa market.

Understanding cocoa’s response to weather is essential for anticipating production swings. Farmers adopt various adaptation strategies, such as irrigating during dry spells or planting shade trees to cool plantations. However, these measures are not always feasible or sufficient across the diverse smallholdings that dominate cocoa farming (Carr & Lockwood, 2011). Thus, in practice, cocoa yields remain strongly driven by weather conditions. Empirical observations support this: for example, extreme weather in West Africa in 2023 (a severe drought followed by heavy rains) caused sharp yield losses due to drought stress and fungal outbreaks, contributing to a steep rise in global cocoa prices (UNCTAD, 2024). This illustrates how local climate anomalies can have outsized effects on worldwide supply. Overall, cocoa’s biological vulnerability to temperature and rainfall underscores that these variables are logical candidates to include when modelling cocoa price movements. Adverse weather often means smaller harvests, which can tighten supply and put upward pressure on prices.

Climate-induced yield fluctuations in key producing regions feed directly into global cocoa price dynamics via basic supply and demand forces.
When weather extremes significantly reduce cocoa output, as with a severe drought or storm-related crop failure, the immediate effect is a negative supply shock. Given that Ivory Coast and Ghana alone account for over half of the world's cocoa production (Ritchie, Rosado, & Roser, 2023), a local harvest shortfall in West Africa can substantially shrink global supply. Even minor supply disruptions can trigger disproportionately large price swings in commodity markets with relatively inelastic demand. Cocoa demand has been estimated to be highly price-inelastic in the short run (with an elasticity of 0.06), meaning consumers and industry cannot easily substitute or reduce cocoa use when prices rise (Tothmihaly, 2018). As a result, a production drop of just a few percentage points can translate into a much larger percentage increase in price. This mechanism explains why weather-driven crop failures often coincide with sharp price spikes. For instance, drought conditions that damage a growing season’s crop can send futures prices soaring as buyers scramble for limited beans, amplifying volatility across the entire chocolate supply chain (Tothmihaly, 2018). In late 2024, such dynamics were on full display when a series of climate shocks constrained West African output and cocoa futures prices nearly tripled to record highs. In summary, adverse weather events in critical cocoa-growing areas transmit to the world market by tightening supply and pushing prices upward.

Global commodity exchanges play a pivotal role in this transmission by quickly incorporating current and expected production information. Cocoa is primarily traded in futures markets, such as the Intercontinental Exchange (ICE) in New York and ICE Europe (formerly NYSE Liffe) in London, which serve as reference points for international prices (Gilbert, 2016). These markets react immediately to news or forecasts of crop outcomes.
When reports emerge of poor rainfall or extreme heat threatening the upcoming harvest, traders on these exchanges bid up futures contracts in anticipation of future shortages. This way, local weather developments are rapidly reflected in global cocoa prices. Speculative trading behaviour can exacerbate the speed and scale of price response. For example, if reports indicate an El Niño or other climate anomaly likely to depress West African yields, speculative buying can drive prices higher well before any cocoa pod is lost (Anderson, Seager, Baethgen, & Cane, 2018; Iizumi et al., 2014). These futures market dynamics ensure that yield shocks are transmitted internationally: a supply shortfall in one part of the world causes prices to rise everywhere, distributing the economic impact of regional climate events across all market participants.

Another factor intensifying the price impact of weather shocks is the lack of buffers in the supply chain. Cocoa stocks (inventories) are often limited relative to annual consumption, so reserves cannot always offset a poor harvest. Likewise, the concentration of production in a few countries means diversification is low: if Ghana and Côte d’Ivoire experience adverse weather, there are few alternative sources to prevent a global deficit. The result is that weather-related supply contractions tend to cause notable jumps in price levels (Chatzopoulos, Pérez Domínguez, & Zampieri, 2020). Put differently, fundamental economic theory predicts that a leftward shift of the supply curve for an inelastic-demand commodity will lead to a steep rise in equilibrium price. Cocoa’s market behaviour in recent episodes of drought and heavy rainfall shocks is consistent with this prediction, reinforcing that weather is a critical driver of price fluctuations.

In agricultural markets like cocoa, prices react not only to realised production outcomes but also to expectations of future supply and demand.
Market participants continuously form expectations about upcoming harvest sizes, often using weather information as an early signal. If traders expect an ongoing drought or an approaching storm to cut cocoa yields drastically, they will incorporate that expectation into today’s pricing by bidding up futures contracts or holding back stocks. This anticipatory behaviour means prices can rise (or fall) well before the harvest report. Letta, Montalbano, and Tol (2022) provide evidence of how strongly expectations influence agricultural market prices: they estimate that roughly 85% of the eventual price impact of a drought shock is reflected in prices before the actual crop loss occurs. In other words, the market preemptively prices most of the shock when credible information about a drought becomes available. Such behaviour illustrates the efficient processing of new information in commodity markets and contributes to pronounced price volatility. When many traders act on forecasted weather events, prices can swing rapidly based on anticipated scenarios that may or may not fully materialise.

Speculative trading and the herd behaviour of market actors further amplify this volatility. If early reports or meteorological forecasts predict unfavourable conditions (such as a delayed rainy season or an extreme heat wave), speculators might aggressively buy cocoa futures, causing price jumps that feed on themselves. This can overshoot fair value if the weather threat later abates, leading to corrections that decrease prices. Thus, weather-driven expectations can potentially create seesaw price patterns, as markets oscillate between bullish and bearish outlooks with each new forecast. The limited ability of most cocoa farmers to hedge against these fluctuations due to factors like lack of market access or financial tools means that much of the weather risk is borne out in spot prices (Tothmihaly, 2018).
The combination of fundamental supply uncertainty and speculative anticipation makes cocoa one of the more volatile agricultural commodities (Ubilava, 2018). Notably, even broadly known climate phenomena like the El Niño Southern Oscillation (ENSO) introduce volatility: when an El Niño event is predicted, markets often react strongly because historically ENSO has brought drier conditions to some cocoa regions, affecting yields (Ubilava, 2018; Anderson et al., 2018). In summary, the expectation of future weather impacts is a significant driver of short-term price volatility in cocoa markets, as prices adjust to current supply-demand imbalances and the market’s collective forecasts of upcoming production.

Crucially, the fact that prices respond to anticipated weather implies that there may be predictive information in climate indicators that could be harnessed systematically. If traders are watching rainfall and temperature data to form price expectations, an empirical model that includes those same weather variables might capture some of the early signals that pure price-trend models miss. However, it is also possible that markets, being forward-looking, already embed most readily available weather information into current prices. This tension between the potential value of weather data and the market’s efficiency in pricing it is an underlying theme of this thesis’s forecasting approach. The theoretical insight remains: weather anomalies drive actual supply changes and speculative dynamics, making them a key piece of the cocoa price puzzle and a candidate for inclusion in predictive modelling.

It is essential to distinguish between the immediate effects of local weather shocks and the gradual influence of global climate change on the cocoa market. Short-term weather extremes such as droughts, floods, or heat waves in key growing regions can abruptly curtail cocoa yields, triggering supply shortages and price spikes.
These localised disruptions in production and trade create short-run volatility in cocoa prices. By contrast, long-term climate change (e.g. rising average temperatures and shifting rainfall patterns) operates more slowly but steadily, altering the baseline growing conditions for cocoa. Over time, a warming climate and changing precipitation regimes can depress agricultural productivity and shift suitable cultivation zones, affecting cocoa output trends and necessitating costly adaptations by producers. Recent research underscores the significant economic consequences of climate variability, reinforcing its relevance for climate-sensitive commodities like cocoa. For instance, Bilal and Känzig (2024) demonstrate that broad climatic disruptions such as an unexpected increase in global average temperature can substantially reduce overall economic output, underscoring that climate change is a serious macroeconomic risk rather than a negligible background trend. Equally important, their findings reveal that warm, low-income regions, including the tropical countries where most cocoa is grown, suffer the most severe economic losses from such climate shocks. This suggests that the cocoa sector, concentrated in these vulnerable areas, faces acute risks from abrupt weather events and gradual climate shifts. Therefore, in analysing weather-driven cocoa price dynamics, it is crucial to account for the dual nature of these climate influences: on the one hand, localised weather shocks cause sharp but temporary supply disruptions and price spikes, while on the other hand, long-term climate changes gradually exert persistent pressure on cocoa production. Recognising this distinction is essential for understanding and forecasting cocoa market behaviour, thereby setting the stage for examining how specific weather variations translate into yield changes and price movements. 
The theoretical considerations above underscore a tight linkage between climate factors and cocoa price behaviour. Biologically, temperature and precipitation patterns directly influence cocoa yields, meaning that unusual weather can substantially alter the quantity of cocoa beans produced. Economically, these yield changes feed into global prices: a weather-induced supply shortfall leads to higher prices, especially given the commodity’s inelastic demand and concentrated production base. Moreover, commodity markets anticipate these effects. Traders use weather information to forecast future supply, often driving prices ahead of observed production changes. This convergence of evidence suggests that weather variables carry meaningful information about future price movements, above and beyond what is contained in past prices alone (Letta et al., 2022; Ubilava, 2018). A forecasting model may therefore be improved by incorporating such climate indicators alongside historical price data.

The insights from this theoretical framework directly motivate the empirical strategy in the following chapters. Specifically, they support the hypothesis that combining historical cocoa prices with key weather metrics (temperature and rainfall) will yield more accurate predictions of future prices than relying on price history alone. By including weather variables in the model, the thesis aims to capture the early warning signals of supply shifts that pure time-series price models might overlook. At the same time, integrating past prices accounts for established trends, seasonality, and any prior information the market has already absorbed. The next part of this thesis will translate these concepts into an empirical forecasting approach, testing whether leveraging both sets of information, climate indicators and price momentum, can improve the accuracy of global cocoa price predictions.
This bridge from theory to practice is formalised to evaluate the research question and provide evidence on the value of climate-informed price forecasting for stakeholders in the cocoa market.

Data

This chapter describes the datasets used in the empirical analysis of this thesis, detailing their sources, characteristics, and how they are structured to support forecasting of global cocoa prices. First, the selection of the timeframe and frequency of the data is justified, with particular emphasis on the daily data frequency, which allows capturing short-term volatility and abrupt price movements critical for the chosen forecasting methodology. The chapter then outlines the economic dataset, consisting of daily cocoa futures prices from the ICE Futures U.S. exchange, highlighting how the continuous front-month price series was constructed. Following this, the weather data are described, derived from the ERA5 reanalysis dataset and structured into multiple spatial resolutions, ranging from global averages to highly localised cocoa-farming areas in West Africa. Lastly, the geographic data detailing cocoa farm boundaries is presented, which allows precise alignment of weather data to the areas directly affecting cocoa production.

The analysis in this thesis spans the period from 1 January 1980 to 28 February 2025, dictated by the availability of reliable cocoa futures price data for that interval. While this sample does not include the notable cocoa price boom of the 1970s (due to a lack of consistent data from that earlier era), it does encompass the historically significant late-2024 price spike. Including this recent extreme event is valuable for analysing the interplay between unusual weather shocks and market dynamics. In summary, the chosen timeframe balances a long historical window with the inclusion of critical events, thereby supporting a robust examination of weather–price relationships.
A daily frequency is adopted for all variables, as it offers the highest meaningful information density for this topic. Daily observations capture short-term volatility in cocoa prices and weather, which is essential for identifying immediate market responses to environmental conditions. This high-frequency approach also provides a larger dataset for training the LSTM neural network model, enabling it to detect fine-grained time patterns that would potentially be lost at coarser (e.g. monthly) frequencies. In other words, using daily data allows the model to learn from day-to-day fluctuations and abrupt shocks, rather than only long-term trends.

The cocoa price dataset reflects trading activity on the ICE Futures U.S. exchange (New York). Consequently, weekends and U.S. exchange holidays, when the cocoa futures market is closed, are omitted from both the price and the weather time series. This ensures that the two datasets remain perfectly aligned in time, avoiding any artificial gaps or asynchronous observations. Aside from these intentional gaps and the geographic coverage limitations discussed below, the compiled dataset contains no missing values.

Figure 2: Ritchie, H., Rosado, P., & Roser, M. (2023). Cocoa bean production, 1961–2022 [Chart]. Our World in Data. https://ourworldindata.org/grapher/cocoa-bean-production, used under a Creative Commons license.

Using cocoa production data from Our World in Data (OWID) (Ritchie et al., 2023), an initial sample of 62 cocoa-producing countries was identified. However, due to incomplete ERA5 weather data coverage for some regions (e.g. small or data-sparse producers), 10 countries were excluded from the analysis. These exclusions are justified because the dropped countries contribute only marginally to global cocoa output. By concentrating on the remaining 52 countries with complete data, the sample preserves all major cocoa-growing regions and thus the actors with substantial market influence.
In practice, the excluded countries have no ERA5 grid data to assign for temperature/precipitation, and their absence does not materially affect the global or regional analyses, given their minor production volumes.

The ERA5 weather dataset and the ICE Futures U.S. price series were chosen for their extensive historical coverage and recognised reliability. These sources are well-regarded in academic and industry research, making them appropriate for a rigorous analysis. Additionally, industry literature (e.g. reports by the International Cocoa Organisation) affirms that the cocoa futures markets in New York and London are the primary venues determining the world market price of cocoa. Focusing on the New York futures prices (in the absence of London data) is therefore a reasonable representation of global price dynamics. One notable limitation of the dataset is the absence of the early-1970s cocoa price surge; if detailed weather and price data from that period were available, including them might have revealed additional structural relationships and potentially enhanced the neural network’s predictive performance. Nonetheless, the chosen data span (1980–2025) is the longest period for which consistent, high-quality data could be obtained, and it captures a wide range of market conditions, including the recent unprecedented shock.

Price

The economic variable of interest is the benchmark cocoa futures price traded on ICE Futures U.S. (commodity code CC). Each standard cocoa futures contract represents 10 metric tonnes of cocoa beans, and prices are quoted in nominal US dollars per metric tonne. The dataset consists of the official daily settlement prices published at the close of each trading day, covering the whole sample period from January 1980 through February 2025. To construct a continuous price series over this 45-year horizon, the price of the nearest delivery month (front-month) contract is recorded for each trading day.
As contracts expire and trading rolls to the next maturity, the series always follows the front-month contract. This rolling procedure yields a single uninterrupted daily price timeline spanning multiple decades and contract cycles. All prices are kept nominal (no adjustment for inflation or currency changes over time). The rationale for using unadjusted prices is to evaluate the raw market responsiveness to weather variations, i.e. how the actual prices that traders observe (and react to) move with weather events. Deflating prices to constant dollars might remove long-run trends or dampen extreme values that are, in fact, relevant to understanding market reactions. Similarly, no outlier filtering or smoothing is applied. By preserving these features, the analysis allows the LSTM model to learn from the full range of historical price behaviour, including volatility clustering and rare shock events.

Weather Variables

The weather variables are derived from the ERA5 reanalysis dataset produced by the European Centre for Medium-Range Weather Forecasts (ECMWF). ERA5 offers a comprehensive, globally gridded record of atmospheric conditions at a spatial resolution of 0.25° × 0.25° in latitude and longitude. For this study, two types of daily meteorological data were extracted to represent weather conditions: 2-meter air temperature and total precipitation. Temperature values are expressed in Kelvin (K) and precipitation in meters of water equivalent per day (m/day), which are the native units of the ERA5 product. These scientific units are retained throughout the analysis to maintain precision and consistency with the source data. ERA5 data are available hourly and were aggregated to daily values to match the price frequency. To ensure synchronisation with the cocoa market data, the weather series was filtered to the exact trading days used in the price dataset; every weather observation in the dataset corresponds to a day on which price data is also present.
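The aggregation and alignment steps described above can be illustrated with a minimal sketch. It assumes that hourly temperature is averaged over the day and hourly precipitation is summed; the text does not state which aggregation was used, so both choices are assumptions. The function names and the toy readings are hypothetical.

```python
from collections import defaultdict
from datetime import date, datetime


def hourly_to_daily(hourly, how):
    """Collapse (timestamp, value) pairs into one value per calendar day.

    how="mean" averages the hourly values (assumed here for temperature);
    how="sum" adds them up (assumed here for precipitation).
    """
    buckets = defaultdict(list)
    for stamp, value in hourly:
        buckets[stamp.date()].append(value)
    if how == "mean":
        return {day: sum(v) / len(v) for day, v in buckets.items()}
    if how == "sum":
        return {day: sum(v) for day, v in buckets.items()}
    raise ValueError("how must be 'mean' or 'sum'")


def align_to_trading_days(daily_weather, trading_days):
    """Keep only the weather observations that fall on a day with a price."""
    return {day: daily_weather[day] for day in trading_days if day in daily_weather}


# Hypothetical toy data: Friday 3 Jan, Saturday 4 Jan and Monday 6 Jan 2025,
# with two temperature readings (Kelvin) per day instead of 24.
hourly_temp = [
    (datetime(2025, 1, 3, 0), 299.0), (datetime(2025, 1, 3, 12), 303.0),
    (datetime(2025, 1, 4, 0), 298.0), (datetime(2025, 1, 4, 12), 302.0),
    (datetime(2025, 1, 6, 0), 300.0), (datetime(2025, 1, 6, 12), 304.0),
]
trading_days = [date(2025, 1, 3), date(2025, 1, 6)]  # the Saturday has no price

daily_temp = hourly_to_daily(hourly_temp, how="mean")
aligned_temp = align_to_trading_days(daily_temp, trading_days)
```

In this toy case the Saturday observation is dropped, so the weather series has exactly one row per trading day, mirroring the synchronisation described in the text.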
To capture the impact of weather at different spatial scales, the daily ERA5 variables were aggregated or selected across four hierarchical geographic scopes ranging from global averages to local farm-level conditions:

1. Global scale: An arithmetic mean of the weather variable is computed over all ERA5 grid cells worldwide (land and ocean). This provides a single global climate time series (for temperature and precipitation) to indicate broad-scale climate patterns. The calculation is straightforward and reproducible, though giving equal weight to each grid cell introduces a slight bias toward high-latitude regions: on a regular latitude-longitude grid, cells cover less surface area towards the poles, so the colder and drier high latitudes contribute more grid cells, and therefore more weight, than their area warrants.

2. Country scale: Country-level averages are computed for each of the 52 cocoa-producing countries in the dataset. Each country’s ERA5 grid points within its national boundaries are averaged to yield a daily temperature series and a daily precipitation series specific to that country. (As noted, 10 minor producing countries with insufficient ERA5 coverage were excluded to ensure each included country has complete data.) This results in a panel of country-specific weather indicators, covering all major cocoa origin countries identified via the OWID production data.

3. Regional grid scale (Ghana & Côte d’Ivoire): At a finer resolution, all individual ERA5 grid cells within the top two cocoa-producing countries, Ghana and Côte d’Ivoire, are retained separately rather than averaged. Roughly 4,500 grid cells (combined across both countries) fall inside Ghana or Côte d’Ivoire. By keeping data at the grid-cell level for these key regions, the analysis can account for localised weather variability within the countries that collectively produce most of the world’s cocoa. This granular regional scale targets the areas of highest interest (given Ghana and Côte d’Ivoire’s dominant production share) and allows exploration of weather effects that might differ across various parts of these countries.

4. Cocoa-farming-area scale: A targeted mask of cocoa cultivation areas in West Africa is used to select ERA5 grid cells most relevant to cocoa agriculture. This mask is derived from a high-resolution (10 m) land-cover dataset provided by the EU’s Africa Knowledge Platform, which maps cocoa plantations in Côte d’Ivoire and Ghana using 2019 satellite imagery (European Commission JRC, 2020). All ERA5 grid cells whose centre falls within the identified cocoa farm polygons are extracted. This yields approximately 2,214 grid cells within Côte d’Ivoire’s cocoa farming zones and 2,248 within Ghana’s, comprising the finest spatial tier of weather data. Instead of averaging over an entire country, this scale captures the daily temperature and rainfall specifically over the cocoa-growing districts, where weather fluctuations directly affect cocoa trees.

Geographic Data (Polygons)

The highest level of geographical detail in the dataset comes from the cocoa farm polygon data used to define the above cocoa-farming-area mask. These polygon shapefiles, obtained through the EU’s Africa Knowledge Platform, delineate individual cocoa-farming parcels across Ghana and Côte d’Ivoire. The polygons were generated by classifying satellite imagery (Sentinel-1 radar and Sentinel-2 optical data from 2019) to identify land cover as cocoa plantations, and then vectorising those areas into map polygons (European Commission JRC, 2020). The resulting dataset provides a detailed representation of where cocoa cultivation occurs on the ground. By overlaying the ERA5 grid on these farm boundaries, each cocoa farm region is matched to one or more nearby ERA5 grid points. This geospatial linking ensures that the weather measurements correspond closely to cocoa-growing locations.
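The cell-selection rule described above (keep an ERA5 grid cell when its centre falls inside a cocoa polygon) can be sketched with a standard ray-casting point-in-polygon test. This is an illustration under stated assumptions, not the procedure actually used in the thesis: the square polygon, the bounding box, and all function names are hypothetical, and real shapefiles would normally be handled with a GIS library.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: is (lon, lat) inside the polygon?

    `polygon` is a list of (lon, lat) vertices; the last vertex is
    implicitly joined back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def cell_centres(lon_min, lon_max, lat_min, lat_max, step=0.25):
    """Centres of a regular grid over a bounding box (0.25 degrees, as in ERA5)."""
    n_lon = int(round((lon_max - lon_min) / step)) + 1
    n_lat = int(round((lat_max - lat_min) / step)) + 1
    return [(lon_min + i * step, lat_min + j * step)
            for i in range(n_lon) for j in range(n_lat)]


def select_cells(centres, polygons):
    """Keep the grid-cell centres that fall inside at least one polygon."""
    return [c for c in centres
            if any(point_in_polygon(c[0], c[1], p) for p in polygons)]


# Hypothetical square "farm area" in degrees; not a real cocoa polygon.
farm = [(-5.1, 6.1), (-4.4, 6.1), (-4.4, 6.6), (-5.1, 6.6)]
centres = cell_centres(-6.0, -4.0, 6.0, 7.0)
selected = select_cells(centres, [farm])
```

With this toy polygon, six of the 45 grid-cell centres in the bounding box are retained; the weather series of only those cells would then enter the farm-area models.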
In practical terms, the analysis captures temperature and precipitation exactly in the areas where cocoa is produced, rather than using broader regional or national averages that might include non-cocoa land. This granular approach facilitates localised environmental metrics for the cocoa sector. It enhances the relevance of the weather data for price forecasting, assuming that conditions in the cocoa farms (e.g. a drought in a key cocoa belt) ultimately influence supply expectations and thus prices. Including this detailed geographic data, alongside broader aggregates, allows the thesis to examine whether localised weather shocks in core production zones have discernible effects on global cocoa prices, compared to more diffuse global or country-level weather signals.

Methodology

This section outlines the approach and methods to investigate whether historical cocoa price data and weather indicators, specifically temperature and precipitation, can effectively predict future global cocoa prices. Given the complexity of the techniques utilised in this research, detailed explanations of technical terms, such as Artificial Neural Networks (ANNs), Recurrent Neural Networks (RNNs), and Long Short-Term Memory (LSTM) networks, along with comprehensive model-specific details, are provided in Appendix A. This ensures clarity and readability in the main text while keeping technical details accessible for those who require deeper insights.

Cocoa price forecasting inherently involves sequential data, as today's price tends to influence tomorrow's price. Recognising and accurately modelling this sequential nature requires specialised approaches. Among such methods, Long Short-Term Memory (LSTM) networks have gained prominence due to their effectiveness in handling sequential datasets such as financial time series, climate data, and agricultural commodity prices (Hochreiter & Schmidhuber, 1997; Fischer & Krauss, 2018).
Unlike traditional models that may not effectively capture long-range patterns, LSTM networks are specifically designed to address this shortcoming by efficiently utilising historical data from extended periods (Qin et al., 2017). These capabilities make them particularly well-suited to predicting agricultural commodity prices influenced by climatic conditions, as observed in cocoa markets (Olofintuyi, Olajubu, & Olanike, 2023). LSTM networks function somewhat analogously to human memory: they can selectively remember or forget information from the data they encounter over time (Gers, Schmidhuber, & Cummins, 2000). This ability to retain relevant historical information, such as past price trends and weather events, and disregard less significant data allows LSTMs to identify and model complex relationships within data spanning weeks or months (Siami-Namini, Tavakoli, & Siami-Namin, 2018). Consequently, this network architecture is particularly suitable for addressing the research question of this thesis: assessing whether historical cocoa price trends combined with weather conditions enhance the accuracy of future cocoa price predictions.

The forecasting model developed for this research employs a specialised neural network architecture comprising several interconnected layers. These layers allow the network to learn complex temporal patterns and relationships in cocoa price data influenced by weather conditions. The core of the model is built around two stacked LSTM layers. The stacking of LSTM layers enhances the network's ability to capture intricate relationships by allowing the first LSTM layer to identify fundamental sequential patterns, which are subsequently refined by the second layer (Fischer & Krauss, 2018). Such a design effectively addresses the complexity of cocoa price forecasting, where interactions between climatic factors and market dynamics can span several months (Ly, Traore, & Dia, 2021).
This multi-layered approach has been widely recognised in financial and agricultural forecasting literature as superior for capturing long-range dependencies compared to simpler, single-layer network architectures (Namini & Namin, 2018).

Figure 3: By Glosser.ca - Own work, Derivative of File:Artificial neural network.svg, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=24913461

Dropout layers are strategically included between and following the LSTM layers to improve the model's robustness further. Dropout is a technique that randomly deactivates a fraction of connections between neurons during training (Srivastava, Hinton, Krizhevsky, Sutskever, & Salakhutdinov, 2014). By periodically removing these connections, dropout reduces the likelihood of the network memorising specific patterns found only in the training data, a common issue known as overfitting (Goodfellow, Bengio, & Courville, 2016). This method ensures the model can generalise well and reliably predict cocoa prices under varying conditions not previously encountered during training.

The final component of the model architecture is the dense (fully connected) output layer. This layer translates the sequential patterns identified by the LSTM layers into specific numeric predictions, such as the predicted cocoa futures price for the following day. The output layer thus provides a tangible, interpretable forecast that can be directly compared against observed market data, enabling practical economic interpretation (Brownlee, 2017).

The model requires data structured clearly and consistently to develop accurate predictions. Data preparation involved carefully aligning historical cocoa futures prices with corresponding daily weather measurements (temperature and precipitation). Price data were sourced from ICE Futures U.S.
market quotations, and weather variables were obtained from the ERA5 global reanalysis dataset, both widely recognised for their reliability and extensive historical coverage (Hersbach et al., 2023; International Cocoa Organisation [ICCO], 2024).

A rolling window approach was utilised to structure the data into sequential input-output pairs appropriate for LSTM forecasting. Specifically, each prediction the model makes relies upon the previous 60 days of data. This means the model uses historical cocoa prices and corresponding weather conditions from the preceding 60 daily observations to predict the price on the subsequent day. The 60-day look-back window was selected based on preliminary experiments, which indicated that this window length effectively captured short-term market dynamics without unnecessary complexity (Brownlee, 2017). Robustness checks involving alternative window lengths are discussed later to validate this choice further.

Before feeding data into the LSTM network, all input variables, including cocoa prices and weather variables, were normalised using the Min-Max scaling method. This process ensures that all input data fall within a common numerical range (from 0 to 1), thus preventing any single variable, such as the cocoa price measured in thousands of dollars, from disproportionately influencing the model’s training relative to smaller-scale weather data (Goodfellow, Bengio, & Courville, 2016). Such normalisation enhances the stability and efficiency of the training process by promoting balanced contributions from each variable, facilitating faster convergence of the network parameters (Brownlee, 2017).

The prepared dataset was then chronologically partitioned into distinct training and testing subsets. The initial 80% of the observations, covering earlier periods, were reserved exclusively for training the model, while the remaining 20% were set aside for testing the model’s predictive accuracy on previously unseen future data points.
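The three preparation steps above (Min-Max scaling, 60-day rolling windows, and the chronological 80/20 split) can be sketched as follows. This is a minimal illustration on a hypothetical toy series, not the thesis's own pipeline. One labelled assumption: the scaling bounds are fitted on the training portion only, which the text does not specify, so scaled test values can fall outside the 0 to 1 range. All function names are illustrative.

```python
def min_max_fit(rows):
    """Per-column minimum and maximum of a list of feature rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def min_max_apply(rows, lows, highs):
    """Scale each column with previously fitted bounds (0 to 1 on the fit data)."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def make_windows(rows, lookback, target_col=0):
    """Pair each block of `lookback` consecutive rows with the next row's target."""
    inputs, targets = [], []
    for end in range(lookback, len(rows)):
        inputs.append(rows[end - lookback:end])
        targets.append(rows[end][target_col])
    return inputs, targets


# Hypothetical toy series: columns are [price, temperature_K, precipitation_m].
rows = [[2000.0 + 10 * t, 298.0 + (t % 5), 0.001 * (t % 3)] for t in range(100)]

split = int(len(rows) * 0.8)               # first 80% of days for training
lows, highs = min_max_fit(rows[:split])    # assumption: bounds from training data only
scaled = min_max_apply(rows, lows, highs)

lookback = 60
X_train, y_train = make_windows(scaled[:split], lookback)
# Test windows may look back into the training period, but their targets do not.
X_test, y_test = make_windows(scaled[split - lookback:], lookback)
```

Each input has shape 60 time steps by 3 features, and each target is the scaled price on the following day. Because the split is chronological, no test target precedes any training target.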
This temporal separation ensures a realistic assessment of the model’s predictive capabilities, reflecting genuine forecasting conditions where future data is unknown during model training (Hyndman & Athanasopoulos, 2018).

The predictive accuracy of the LSTM model was assessed using a series of well-established forecasting metrics, each offering unique insights into the model’s forecasting capabilities. The key metrics employed in this research were Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and the coefficient of determination (R²). These metrics are standard in time-series forecasting literature due to their intuitive interpretations and relevance to economic analyses (Hyndman & Athanasopoulos, 2018). These performance metrics were applied to a baseline model, trained solely on historical cocoa prices, and then to a set of complete models that include weather data alongside previous price information. Comparing these two types of models enabled an explicit assessment of whether including weather variables improved forecasting performance. If the complete models with weather inputs significantly reduced RMSE and MAE or enhanced the R² compared to the baseline, it would indicate meaningful predictive contributions from weather information.

A series of robustness tests were conducted to ensure the reliability and generalisability of the LSTM forecasting model. These tests aimed to verify that the model’s predictions were not sensitive to arbitrary methodological choices, such as the exact historical window length used for predictions or the spatial resolution of the weather data. First, alternative look-back window lengths were tested. Beyond the original 60-day window, models with shorter (30-day and 45-day) and longer (75-day, 90-day, and 120-day) historical windows were trained and compared.
This analysis helps confirm whether the 60-day window length captured optimal predictive information or whether other window lengths yielded significantly different forecasting accuracy (Brownlee, 2017). The results indicated that while the 60-day window provided the best balance between accuracy and complexity, predictions remained stable across different windows, suggesting that the model’s forecasting performance was robust to moderate variations in historical input duration.

Second, robustness checks involved assessing the impact of aggregating weather data at different geographical scales. Weather information was tested at various resolutions, including global averages, country-level aggregates, and highly localised (farm-level) measurements within major cocoa-producing regions such as Ghana and Côte d’Ivoire (Wibaux, Normand, Vezy, Durand, & Lauri, 2024). These tests determined whether the predictive accuracy varied significantly based on spatial granularity (Chatzopoulos, Pérez Domínguez, & Zampieri, 2020).

Third, permutation importance analyses were conducted to quantify the individual predictive contributions of historical cocoa prices, temperature, and precipitation variables. Each feature was randomly shuffled in these tests, disrupting its relationship with the true price outcomes. If shuffling a feature significantly increased forecast error, it implied the model depended heavily on that feature, demonstrating its importance for accurate forecasting (Breiman, 2001; Molnar, 2020).

These robustness checks provide confidence in the validity and reliability of the forecasting approach, confirming that the predictive relationships identified in the primary analysis are neither sensitive to minor changes in methodology nor solely driven by specific arbitrary choices.
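The permutation importance test described above can be illustrated with a small sketch. Two things here are assumptions rather than details from the thesis: each feature's whole time path is shuffled across samples (the text does not say how shuffling was applied to sequence inputs), and a simple persistence rule stands in for the trained LSTM. The data are simulated and all names are hypothetical.

```python
import math
import random


def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def permutation_importance(predict, X, y, n_features, n_repeats=10, seed=0):
    """Average increase in RMSE when one feature is shuffled across samples.

    `X` is a list of windows; each window is a list of time steps, and each
    time step is a list of feature values. Shuffling feature j swaps that
    feature's whole time path between samples and leaves the others intact.
    """
    rng = random.Random(seed)
    baseline = rmse(y, [predict(w) for w in X])
    importances = []
    for j in range(n_features):
        increases = []
        for _ in range(n_repeats):
            order = list(range(len(X)))
            rng.shuffle(order)
            shuffled = [
                [step[:j] + [X[src][t][j]] + step[j + 1:]
                 for t, step in enumerate(window)]
                for window, src in zip(X, order)
            ]
            increases.append(rmse(y, [predict(w) for w in shuffled]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return baseline, importances


# Hypothetical stand-in for a trained network: tomorrow's price equals the
# last observed price (feature 0); the weather feature (1) is ignored.
def persistence_model(window):
    return window[-1][0]


rng = random.Random(42)
prices = [100.0]
for _ in range(300):
    prices.append(prices[-1] + rng.gauss(0, 1))   # simulated random-walk price
weather = [rng.gauss(25, 3) for _ in prices]      # simulated, unrelated to price

lookback = 5
X = [[[prices[t], weather[t]] for t in range(end - lookback, end)]
     for end in range(lookback, len(prices))]
y = [prices[end] for end in range(lookback, len(prices))]

baseline, importances = permutation_importance(persistence_model, X, y, n_features=2)
```

In this toy setting, shuffling the price feature raises the error while shuffling the weather feature leaves it unchanged, which is the pattern one would expect from a model that relies predominantly on past prices.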
Despite the strengths and robust analytical framework the LSTM forecasting model provides and its proposed extensions, several methodological considerations and inherent limitations must be acknowledged. Recognising these factors is crucial for correctly interpreting the forecasting results and understanding their practical applicability and reliability. First, the predictive accuracy of neural network-based models, including LSTMs, depends heavily on the quantity and quality of data available (Goodfellow, Bengio, & Courville, 2016). Although this study utilised extensive historical price data from the ICE Futures market and reliable weather data from the ERA5 dataset (Hersbach et al., 2023; ICCO, 2024), specific relevant granular data, such as production-level yield data and comprehensive local economic indicators, were unavailable. The absence of these detailed datasets introduces the potential for omitted variable bias, implying some crucial factors influencing cocoa prices might not have been fully accounted for, potentially limiting the model’s real-world predictive power (Tothmihaly, 2018; Molnar, 2020). Although neural networks, particularly LSTMs, possess powerful capabilities to model complex nonlinear relationships, their internal processes remain challenging to interpret due to their "black box" nature (Molnar, 2020). The difficulty in explaining how the model arrives at specific predictions can limit its practical acceptance in economic decision-making contexts, where transparency and interpretability are highly valued (Siami-Namini, Tavakoli, & Siami-Namin, 2018). Computational constraints presented a significant practical limitation in this research. While the analysis demonstrated promising predictive capabilities using localised farm-level data within key production areas, attempts to expand this detailed analysis to multiple cocoa-producing regions simultaneously exceeded available computational resources for this level of granularity. 
This limited the scope of the analysis and prevented comprehensive testing of potentially more powerful predictive configurations involving extensive localised data across multiple production areas (Brownlee, 2017).

Results

This chapter presents and analyses the empirical results of implementing various Long Short-Term Memory (LSTM) models trained to forecast cocoa futures prices. The analysis is structured to systematically compare a baseline predictive model, relying exclusively on historical cocoa price data, with several augmented models that additionally incorporate climatic variables, specifically temperature and precipitation (see Table 1 for an overview of all models). Each model's predictive performance is quantified using the standard performance metrics. Robustness checks are then included to ensure the reliability and validity of the findings. The results are contextualised to highlight the economic significance of incorporating weather data into commodity price forecasting, addressing the theoretical assumption that climatic factors substantially impact cocoa market dynamics.

The primary objective of this study was to evaluate whether incorporating weather variables, specifically temperature and precipitation, improves predictive accuracy for cocoa futures prices compared to a baseline model relying solely on historical price data. This investigation was grounded in the theoretical expectation that climatic conditions significantly affect cocoa yields and market prices, given the market’s structural characteristics dominated by a small number of multinational corporations and vulnerable smallholder farmers. I first implemented a standard univariate LSTM neural network to establish a baseline for predicting cocoa futures prices, relying solely on historical prices as inputs. The dataset comprises daily settlement prices of cocoa futures contracts from January 1980 to February 2025.
Data were chronologically split into training (80%) and testing sets (20%) to preserve temporal dependencies. Specifically, the first 80% of the observations formed the training set, while the remaining 20% constituted the testing set of the price data. The graph showing a monthly average version of the price illustrates how the all-time high price of December 2024 is included in the last 20% of the data.

Figure 4

Table 1 lists all the models trained, including the Base model against which the inclusion of weather variables was compared. The table shows that all subsequent models include one of the weather data sets specified in the variables section. The only models that do not include past prices in their training data are number 12, “Only Global Temp”, and number 13, “Only Global Temp & Precipitation”. A table with a more detailed overview of variable inclusion for the models is located in Appendix B.

#    Model                                                    MSE            RMSE        MAE         R²*
1    Base model*                                              48512.988      220.257     92.317      0.998
2    Global Temp                                              68145.273      261.046     114.669     0.983
3    Global Temp & Precipitation                              56718.905      238.157     104.133     0.986
4    The 52 countries’ Temp                                   176119.893     419.667     166.944     0.957
5    The 52’s Temp & Precipitation                            582472.609     763.199     307.891     0.858
6    Côte d’Ivoire’s & Ghana’s total Temp data                1707077.556    1306.552    499.517     0.585
7    IC’s & G’s farm area: Temp                               1462084.261    1209.167    452.496     0.644
8    Farm area: Temp & its optimal Range                      966032.998     982.869     389.126     0.765
9    Only IC’s farm area: Temp & Precipitation                644050.656     802.528     307.683     0.843
10   IC & G’s farm area: Precipitation                        418815.955     647.160     237.078     0.898
11   IC & G’s farm area: Precipitation & its Optimal Range    6364638.522    2522.823    1527.357    -0.548
12   Only Global Temp                                         4001419.667    2000.355    1088.016    0.027
13   Only Global Temp & Precipitation                         5740023.581    2395.835    1397.295    -0.396

Table 1

The Base model and its counterpart

The base model (number 1), which only included the past prices as training data for
predicting cocoa futures prices in dollars, had predictive accuracy, as seen in the main table. The structure of the base model was a sequence of the previous 60 days (look-back period = 60), meaning that each forecasted value depended solely on price movements within the preceding four months. The network consisted of two stacked LSTM layers containing 32 and 16 hidden units, respectively, followed by a dense linear output layer to generate one- month-ahead predictions. Model training employed the Adam optimiser with Mean Squared Error (MSE) loss, complemented by early stopping to mitigate potential overfitting. This structure was also used for every subsequent model, with weather variables added to the training data to compare how each additional version of the weather data affected the model's predictive accuracy. 21 Figure 5 Figure 5 plots the mean-squared error (MSE) for the training and validation sets over 50 epochs. Training loss drops sharply in the first epoch (passthrough of the entire data) from roughly 2.5 × 10⁻⁴ to 7 × 10⁻⁵ and then plateaus, indicating the optimiser has made most weight adjustments early. Validation loss falls from about 6 × 10⁻⁴ to 3 × 10⁻⁴ during the first 10 epochs but fluctuates thereafter, hinting at mild overfitting or high variance rather than continued improvement. The predictive accuracy of the Base model on the test set is summarised by evaluation metrics presented in Table 1 alongside other notable models. The base model achieved a Mean Squared Error (MSE) of approximately 48513, Root Mean Squared Error (RMSE) of about 220 USD per tonne, Mean Absolute Error (MAE) of around 92 USD per tonne, and an R² of 0.988, highlighting its strong predictive power derived exclusively from past price information. 22 Figure 6 Figure 6 plots the predicted versus actual cocoa futures prices over the test period, demonstrating that the model effectively captures medium-to-long-term price dynamics. 
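The data preparation described for the base model (fixed-length look-back windows and a chronological 80/20 split) can be sketched as follows. This is a minimal illustration, not the thesis code: the function names, the `horizon` parameter, and the synthetic series are assumptions made for the example.

```python
import numpy as np

def make_windows(series, look_back=60, horizon=1):
    """Build (samples, look_back) inputs and targets `horizon` steps after each window.

    `horizon` is a hypothetical parameter for this sketch; the forecast step used
    in the thesis is not reproduced here.
    """
    series = np.asarray(series, dtype=float)
    n = len(series) - look_back - horizon + 1
    if n <= 0:
        raise ValueError("series is too short for this look_back and horizon")
    X = np.stack([series[i:i + look_back] for i in range(n)])
    first_target = look_back + horizon - 1
    y = series[first_target:first_target + n]
    return X, y

def chronological_split(X, y, train_frac=0.8):
    """Keep time order: the earliest windows train, the latest windows test."""
    cut = int(len(X) * train_frac)
    return X[:cut], y[:cut], X[cut:], y[cut:]

# Synthetic stand-in for a price series (not cocoa data).
prices = np.arange(100, dtype=float)
X, y = make_windows(prices, look_back=60)
X_train, y_train, X_test, y_test = chronological_split(X, y)
print(X.shape, X_train.shape, X_test.shape)
```

No shuffling is applied before the split, so every test target lies later in time than every training target, which is the property the chronological split is meant to preserve.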
Despite the high accuracy, noticeable deviations appear around periods of pronounced price volatility and sharp market shifts, suggesting opportunities for improvement by integrating additional explanatory variables. The established baseline thus provides a firm reference for comparison. The subsequent models extend it by incorporating climatic data (total monthly precipitation and 2-metre above-ground temperature) to determine whether weather conditions contribute additional predictive power beyond historical prices alone.

Figure 7 combines the two types of graph shown in Figures 5 and 6: learning curves and predicted versus actual prices. The graphs in Figure 7 show the performance of the model with only temperature as training data, number 12, “Only Global Temp”. This model and model number 13, “Only Global Temp & Precipitation”, were constructed to test the performance of weather data alone for predicting the global price of cocoa, that is, the US cocoa futures market price (see Table 1).

Figure 7

The first graph in Figure 7 illustrates the learning curves for the temperature-only model (number 12). Although the training loss was low, the validation loss remained significantly higher, stabilising at approximately 0.04. This pronounced gap between training and validation loss suggests the model struggled to capture the underlying price dynamics from weather data alone. The same weakness appears in the right-hand graph in Figure 7, where the predicted price failed to capture the nuances of market pricing. It completely failed to register that the market escalated to an all-time high, peaking at the end of 2024, as evidenced in the price graph. An even poorer performance was observed in the second of the weather-only models (number 13), which had an R² of -0.39582, as shown in Table 1.
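Because the table reports negative R² values for some models, it may help to spell out how the four metrics are computed. The sketch below assumes the conventional definition R² = 1 - SS_res / SS_tot evaluated on the test targets; whether the thesis used exactly this convention is an assumption, and the function name and toy numbers are illustrative only.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R² for two equally long arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {
        "MSE": mse,
        "RMSE": float(np.sqrt(mse)),
        "MAE": float(np.mean(np.abs(err))),
        "R2": 1.0 - ss_res / ss_tot,
    }

actual = np.array([1.0, 2.0, 3.0, 4.0])                       # toy values
close = regression_metrics(actual, actual + 0.1)              # small constant error
mean_only = regression_metrics(actual, np.full(4, actual.mean()))
far_off = regression_metrics(actual, np.full(4, 10.0))        # worse than the mean
print(close["R2"], mean_only["R2"], far_off["R2"])
```

Under this definition, a forecast that always predicts the test-set mean scores an R² of zero, so a negative R² means the model did worse on the test period than that constant forecast would have.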
Model 13 also utilised only weather variables to predict cocoa prices, in the form of temperature and precipitation data. These two models demonstrated that an LSTM model with these specifications could not predict the world price of cocoa using weather data alone. This is why all other models (except the base model) incorporate both past prices and some form of weather variable as training data.

The Models of Special Interest

| # | Model | MSE | RMSE (USD/tonne) | MAE (USD/tonne) | R² |
|---|---|---|---|---|---|
| 1 | Base | 48512.988 | 220.257 | 92.317 | 0.998 |
| 3 | Global Temp & Precipitation | 56718.905 | 238.157 | 104.133 | 0.986 |
| 5 | The 52’s Temp & Precipitation | 582472.609 | 763.199 | 307.891 | 0.858 |
| 9 | Only IC’s farm area: Temp & Precipitation | 644050.656 | 802.528 | 307.683 | 0.843 |

Table 2

Table 2 highlights the four models I find most informative, since together they cover all three levels of geographical specificity at which weather data could affect cocoa economically. I include the base model (number 1) because its predictive power is notably high: with an R² of 0.998 at a one-day horizon, yesterday’s cocoa-futures price explains almost all of tomorrow’s variation. Such strong autocorrelation is typical when both (i) forecast windows are short and (ii) market participants rapidly incorporate new information into prices (Box & Jenkins, 1970).

The baseline model’s root-mean-square error is approximately $220 per tonne. Compared with the price range indicated in the historical chart (approximately $3,000 to $12,000 per tonne), this translates to a typical error of 1.8% to 7.3% of the prevailing price. In practical terms, the model typically remains within single-digit percentage points on ordinary days. However, it can still miss the occasional double-digit jumps that have characterised the recent rally.

The second model of special interest is number 3, “Global Temp & Precipitation”. It performed nearly as well as the base model in terms of predictive power, with R² = 0.986207.
This result suggests that including global weather data leaves the model's performance approximately unchanged. It implies that global weather changes, together with past prices, can serve as a basis for price prediction about as effectively as past prices alone. This suggestion, based on predictive power, aligns with what Bilal and Känzig (2024) propose in their paper: that global climate change has a greater impact on economic outcomes than local weather events as shocks to the economy. A large majority of global cocoa output comes from countries whose capitals are situated within approximately 1,000 km of the Equator. For instance, Côte d’Ivoire, Ghana, Indonesia, Nigeria, and Ecuador collectively produce roughly 80% of the world's cocoa (Ritchie et al., 2023; ICCO, 2023, 2024). These equatorial, agriculture-dependent economies are among the most climate-vulnerable, as rising temperatures and erratic rainfall directly threaten cocoa physiology and yields (Schroth et al., 2016; Wibaux et al., 2024; Carr & Lockwood, 2011). This suggests that there should be close ties between the overall risk to economic output, as stated by Bilal and Känzig (2024), and the supply-side performance of the cocoa market.

Then there is model number 5, “The 52’s Temp & Precipitation”. The 52 are those countries, out of the full list of cocoa-producing countries presented by Our World in Data, for which weather data were available. This model uses the arithmetic mean of temperature and precipitation from each of these 52 countries, which, like the major producers discussed above, are located close to the Equator. It is a natural continuation from the previous model, since these 52 are among the countries most negatively affected by climate change (Chatzopoulos et al., 2020).
The weather data from this list of countries should have some predictive power for the economic performance of the supply side of cocoa, reflected to some extent in the world price of cocoa. The predictive power of the fifth model, with an R² of 0.86 (see Table 1 or Table 2), could be interpreted as a reflection of this agricultural origin of the commodity. It suggests that approximately 86% of the variation in price may be predicted using the previous prices and the weather data for the countries that produce nearly all the cocoa in the world.

The last of the most notable models is number 9, “Only IC’s farm area: Temp & Precipitation”. This model uses all available weather data collected within the geographical region that produces cocoa in Côte d’Ivoire. Since this country alone produces 45% of all cocoa in the world, an R² of 0.84 is not an entirely unlikely reflection of the economic reality of the cocoa market. It is notable, though, that this is not much different from the predictive power of model number 5, which suggests that the local version of the weather data carries almost as much predictive power as the per-country averages. Since this is only one of the countries with the highest cocoa production, I am led to believe that there would be even higher predictive power in utilising the most localised weather data in all the top-producing countries simultaneously. I was unable to test this, as including all available weather data from the farming regions of Ghana (the world’s second-largest producer of cocoa) caused my working environment in Google Colab to crash.

Robustness tests

The first set of robustness tests was conducted on the baseline model (see Table 1) to evaluate its reliability. These tests involved extensive hyperparameter tuning, focusing primarily on determining the optimal look-back window, batch size, and number of epochs.
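A look-back sweep of this kind can be sketched as below. To keep the example small and dependency-free, a least-squares linear autoregression stands in for the LSTM and the series is a synthetic random walk; both are assumptions of the sketch, as are the function names. Holding the test period fixed across candidates is a design choice made here so that the RMSE values are comparable, not something confirmed by the thesis.

```python
import numpy as np

def rmse_for_look_back(series, look_back, test_start):
    """Fit a linear autoregression on targets before `test_start`, score RMSE after it."""
    n = len(series) - look_back
    X = np.stack([series[i:i + look_back] for i in range(n)])
    y = series[look_back:]
    A = np.hstack([X, np.ones((n, 1))])       # add an intercept column
    cut = test_start - look_back              # windows whose target precedes test_start
    coef, *_ = np.linalg.lstsq(A[:cut], y[:cut], rcond=None)
    pred = A[cut:] @ coef
    return float(np.sqrt(np.mean((y[cut:] - pred) ** 2)))

rng = np.random.default_rng(0)
series = 1000.0 + np.cumsum(rng.normal(0.0, 10.0, size=1500))  # synthetic random walk
test_start = int(len(series) * 0.8)           # same test targets for every candidate

candidates = [30, 45, 60, 75, 90, 120]        # window lengths examined in the thesis
scores = {lb: rmse_for_look_back(series, lb, test_start) for lb in candidates}
best = min(scores, key=scores.get)
print(best, round(scores[best], 2))
```

The same loop structure extends to batch size and number of epochs by iterating over pairs of values instead of a single list.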
The look-back window, which represents the number of past days used for predicting future prices, was systematically tested across various durations: 30, 45, 60, 75, 90, and 120 days.

Figure 8

The results shown in Figure 8 identify the optimal window as 60 days, achieving the lowest RMSE of 195.03 (see Appendix A for the rationale behind different look-back windows). In comparison, shorter windows of 30 and 45 days produced higher RMSE values of 231.85 and 199.81, respectively, while longer windows of 75, 90, and 120 days yielded RMSE values of 219.03, 196.17, and 221.76, respectively.

Figure 9

Figure 9 shows the other robustness test on the base model: tuning experiments that evaluated different combinations of batch size and number of epochs. A batch size of 32 combined with 30 epochs emerged as the most effective setting, delivering the lowest RMSE of 186.79. Alternative configurations revealed significant variability in performance. For instance, a batch size of 16 resulted in progressively worse outcomes with increasing epochs, yielding RMSE values of 229.41 (30 epochs), 291.87 (50 epochs), and a notably high 525.65 (100 epochs), consistent with severe overfitting. Similarly, increasing the batch size to 64 gave higher RMSE values of 191.48 (30 epochs), 218.33 (50 epochs), and 192.31 (100 epochs), all inferior to the optimal configuration.

The subsequent set of robustness tests was conducted on the models of special interest (see Table 2) that incorporated weather variables in the training data. These models were evaluated using permutation importance tests. Permutation importance is a post-hoc model interpretability technique that assesses how each input feature affects a predictive model’s performance.
The fundamental concept is to randomly permute (shuffle) the values of one feature in the model’s validation data, thereby breaking its association with the true outcome, and then observe the change in the model’s error. If the model’s error increases significantly after permuting a feature, it suggests that the model was heavily reliant on that feature (Breiman, 2001; Fisher, Rudin, & Dominici, 2019); conversely, if shuffling a feature’s values has minimal effect, that feature is likely not essential to the model’s output. This procedure is frequently employed as a robustness check. For instance, in an LSTM model predicting an economic indicator from historical price, temperature, and precipitation data, permuting each input in turn shows how much the prediction accuracy declines, and thus how much the model relies on that variable.

Permutation importance provides a straightforward, model-agnostic method to interpret even intricate deep learning models, yet it has significant limitations. It indicates how strongly a feature impacts prediction error without disclosing the direction of the feature’s effect (i.e., it is an “undirected” importance measure). Furthermore, if predictors are highly correlated, the test can understate a variable’s importance, as the model may retrieve similar information from a correlated feature when one is shuffled. Despite these limitations, permutation tests remain a simple and effective tool for evaluating variable relevance as part of model robustness analysis in deep learning (Molnar, 2020).

The first of these permutation importance tests was carried out on model number 3, “Global Temp & Precipitation”.
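The procedure just described can be sketched in a few lines. The version below is a generic illustration, not the thesis implementation: it assumes inputs shaped (samples, time steps, features), shuffles one feature across samples while keeping the order inside each window, and averages the change in RMSE over several repeats. The function names and the toy predictor are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Return the baseline RMSE and the mean change in RMSE per permuted feature."""
    rng = np.random.default_rng(seed)
    baseline = rmse(y, predict(X))
    deltas = np.zeros(X.shape[-1])
    for j in range(X.shape[-1]):
        for _ in range(n_repeats):
            perm = rng.permutation(len(X))
            Xp = X.copy()
            Xp[..., j] = X[perm][..., j]      # break the link between feature j and y
            deltas[j] += rmse(y, predict(Xp)) - baseline
    return baseline, deltas / n_repeats

# Toy data: feature 0 plays the role of price, features 1 and 2 of weather.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5, 3))
y = X[:, -1, 0] + 0.1 * rng.normal(size=200)

def predict(batch):
    """Stand-in for a trained model that only looks at the last value of feature 0."""
    return batch[:, -1, 0]

baseline, deltas = permutation_importance(predict, X, y)
print(round(baseline, 3), np.round(deltas, 3))
```

In this toy setting only the first feature shows a positive change, because the stand-in predictor ignores the other two. With a trained network, features the model barely uses typically show changes close to zero, and these can be slightly negative purely through sampling noise.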
The test yielded a baseline root-mean-squared error (RMSE) of 228.43, as shown in Figure 10.

Figure 10

Shuffling the historical temperature data resulted in an RMSE of 228.30, a change of -0.13, which is so small that it is indistinguishable from ordinary sampling noise; thus, temperature provides virtually no incremental predictive information. Shuffling the historical precipitation data resulted in an RMSE of 228.08, a change of -0.3431, which may indicate that the network had been exploiting spurious correlations in the precipitation series. When those correlations are broken, out-of-sample accuracy improves slightly, signalling a near-zero or mildly negative importance weight. Shuffling the price history data resulted in an RMSE of 2,143.6164, a change of +1,915.1910, demonstrating that almost all of the predictive value derives from the autoregressive structure embedded in past prices. Because precipitation and temperature are not deemed to be highly collinear with, or able to substitute for, the price series, the magnitude of this jump can be interpreted as a robust upper bound on their explanatory power (Molnar, 2020). The results indicate that the purported “Global Temp & Precipitation” model is in practice driven by prices alone; the weather variables contribute no material signal and may even introduce noise.

The permutation importance was then tested on model number 5, “The 52’s Temp & Precipitation”.

Figure 11

Figure 11 presents the results of permutation importance for model 5 (“The 52’s Temp & Precipitation”). When permuting the price history data, the out-of-sample root mean squared error (RMSE) increases by roughly 1,200 units. In contrast, shuffling any individual weather channel affects the error by only a few units. Figure 12 consolidates the average weather data per country into two categories: rain and temperature.
It confirms that, on average, both categories yield a mean ΔRMSE statistically indistinguishable from zero, whereas the price category alone accounts for the complete 1,200-point deterioration.

Figure 12

For permutation importance tests, a large positive ΔRMSE indicates that the model’s predictive distribution is highly sensitive to the information contained in that variable (Breiman, 2001; Fisher, Rudin, & Dominici, 2019). The results imply that nearly all predictive power originates from the autoregressive structure in historical prices; the temperature and precipitation data from the 52 countries contribute no significant signal. Several weather channels show slightly negative importances, suggesting that the LSTM exploits noise or spurious correlations in those inputs, which are eliminated when the series are permuted (Molnar, 2020). Since the weather variables are not considered strongly collinear with past prices, the standard caveat that permutation tests can understate importance under multicollinearity (Altmann et al., 2010) is unlikely to alter this conclusion. These results demonstrate that the purported model number 5, “The 52’s Temp & Precipitation”, is effectively a price-only model for this dataset and architecture.

Lastly, a permutation importance test was conducted on model number 9, “Only IC’s farm area: Temp & Precipitation”, indicating contrasting influences of historical price data and weather conditions (temperature and precipitation) within the cocoa-producing regions of Côte d’Ivoire. Specifically, shuffling past cocoa price data results in a total increase in mean squared error (ΔMSE) of 3.433. In contrast, permuting weather variables collectively leads to a higher total ΔMSE of 38.646; however, the average contribution per weather variable is minimal, at approximately 0.077. These results reveal important nuances.
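The relation between the total and the per-variable figures for model 9 can be checked with simple arithmetic. The three inputs below are the values quoted in the text; the implied number of weather channels is an inference from those rounded figures, not a number reported in this section, so it should be read as approximate.

```python
# Figures quoted in the text for model 9.
total_weather_delta = 38.646   # total ΔMSE from permuting all weather variables
mean_weather_delta = 0.077     # average ΔMSE per weather variable
price_delta = 3.433            # ΔMSE from permuting past prices

# Inferred quantities (approximate, since the inputs are rounded).
implied_channels = total_weather_delta / mean_weather_delta
price_vs_mean_weather = price_delta / mean_weather_delta

print(f"implied number of weather channels: about {implied_channels:.0f}")
print(f"price impact relative to one average weather channel: about {price_vs_mean_weather:.0f}x")
```

On these figures, the weather total is spread over roughly five hundred channels, while the single price series has an impact on the order of forty-five times that of an average weather channel.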
While the combined weather variables seem influential, their average individual impact is extremely low, suggesting that the overall significance arises primarily from the large number of weather variables rather than from the strong predictive power of any single channel. Conversely, the price variable alone accounts for a considerable individual impact (3.433), indicating a substantial predictive dependence on historical price dynamics. Consequently, the model primarily utilises historical price information as its main predictive feature, while weather variables provide minimal explanatory power individually. Their combined significance largely reflects quantity rather than quality in predictive terms. This finding indicates that price-driven market dynamics, as encapsulated in historical prices, mainly drive the predictive performance of this LSTM model, while weather conditions contribute only minor incremental information.

Discussion

This thesis addresses the research question: "Can future global cocoa prices be effectively predicted by combining historical cocoa price data with weather indicators, specifically temperature and precipitation?" The findings derived from the LSTM models provide intriguing insights, albeit with notable methodological limitations.

The baseline LSTM model, which relied exclusively on historical prices, achieved notable predictive power (R² ≈ 0.998). However, this strong result highlights an important inference: historical cocoa futures prices inherently incorporate substantial information about market dynamics and expectations, leaving little room for incremental predictive gains from weather data. This suggests either high market efficiency or pronounced speculative activity, where market participants swiftly adjust prices to reflect all available information.

The inclusion of weather variables yielded mixed outcomes.
Global weather data (Model 3, "Global Temp & Precipitation") achieved high predictive accuracy (R² ≈ 0.986) but did not surpass the baseline model. Conversely, models utilising more localised climate data performed comparatively poorly. Model 5, using temperature and precipitation averages across 52 countries, showed a notable reduction in predictive accuracy (R² ≈ 0.858) and a roughly 3.5-fold increase in RMSE compared to the baseline. These results suggest that aggregating climate data across multiple regions introduced noise rather than valuable predictive insights. Meanwhile, Model 9, focusing specifically on cocoa-producing regions in Côte d’Ivoire, performed comparably to Model 5 (R² ≈ 0.843) and likewise did not match the baseline model’s performance. This implies that localised climatic effects, although they may influence cocoa production, are insufficient for enhancing short-term price forecasting beyond historical prices alone.

These outcomes contradict expectations derived from existing literature, which typically posits climatic variables as valuable predictors of agricultural commodity prices. Two interpretations arise from the negligible predictive improvement. First, weather impacts might already be embedded in historical prices through market anticipation, reducing the incremental value of contemporaneous climate data. Second, the immediate, daily price impact of weather fluctuations may be too subtle or complex for the models to capture accurately, especially given data granularity limitations.

Robustness checks further emphasised methodological challenges. Permutation importance tests demonstrated that historical prices strongly dominated predictive accuracy, as scrambling price data significantly degraded model performance. Conversely, permuting weather variables had minimal effects, underscoring their limited explanatory power irrespective of the geographic level of the weather data.
This indicates that variations in predictive accuracy across models primarily reflect differences in data noise levels rather than the true economic impact of climatic factors. This outcome highlights critical issues regarding feature selection and potential overfitting when incorporating complex, high-density datasets into predictive models.

If regarded as reliable, these results have significant implications for understanding the cocoa market structure and informational efficiency. The dominance of historical price momentum as a predictive signal indicates a highly efficient market, rapidly absorbing available information into price adjustments. Consequently, additional climate information appears redundant or insufficiently detailed to significantly enhance short-term forecasts.

While this aligns with the efficient market hypothesis, which suggests limited opportunities to exploit publicly available climatic data, the model’s complete disregard for weather impacts likely indicates methodological shortcomings rather than inherent market realities. For example, a sensitivity analysis (see Figure 11) identified only a slight weather-related price variation in Venezuela, which starkly conflicts with the established view that extreme weather events in West Africa caused the recent substantial spike in chocolate prices. A plausible explanation for this discrepancy is the complexity of the relationship between weather events and global cocoa prices, which a standard LSTM model architecture might inadequately capture.

An omitted variable bias, particularly the absence of cocoa yield data, likely contributes significantly to the observed discrepancy in results. Yield data might represent a critical missing link in understanding how weather variations translate into global price fluctuations.
A potential approach using this additional yield data for future research might be structuring two interconnected LSTM models analogous to an instrumental variable (IV) regression. Following the framework of DeepIV as introduced by Hartford et al. (2017), one model would first estimate the direct relationship between weather conditions and cocoa yields, effectively isolating weather-induced supply shocks. Subsequently, a second-stage model would forecast cocoa prices based on the predicted yields from the first stage, capturing the indirect economic impact of weather on prices more clearly. Such a structured approach could enhance the interpretability and accuracy of forecasts by explicitly modelling the causal pathway: weather → cocoa yield → cocoa price. This might have a greater chance of showing the real economic mechanisms underpinning the futures market dynamics of cocoa.

References

Abu, I.-O., Szantoi, Z., Brink, A., Robuchon, M., & Thiel, M. (2020). Cocoa map for Côte d’Ivoire and Ghana [Data set]. PANGAEA. https://doi.org/10.1594/PANGAEA.917473

Ahmed, N. K., Atiya, A. F., El Gayar, N., & El-Shishiny, H. (2010). An empirical comparison of machine-learning models for time-series forecasting. Econometric Reviews, 29(5–6), 594–621. https://doi.org/10.1080/07474938.2010.481556

Altmann, A., Toloşi, L., Sander, O., & Lengauer, T. (2010). Permutation importance: A corrected feature importance measure. Bioinformatics, 26(10), 1340–1347. https://doi.org/10.1093/bioinformatics/btq134

Anderson, W., Seager, R., Baethgen, W., & Cane, M. (2018). Trans-Pacific ENSO teleconnections pose a correlated risk to agriculture. Agricultural and Forest Meteorology, 262, 298–309. https://doi.org/10.1016/j.agrformet.2018.07.023

Bilal, A., & Känzig, D. R. (2024). The macroeconomic impact of climate change: Global vs. local temperature (NBER Working Paper No. 32450). National Bureau of Economic Research. https://doi.org/10.3386/w32450

Beg, M. S., Ahmad, S., Jan, K., & Bashir, K.
(2017). Status, supply chain and processing of cocoa: A review. Trends in Food Science & Technology, 66, 108–116. https://doi.org/10.1016/j.tifs.2017.06.007

Box, G. E. P., & Jenkins, G. M. (1970). Time series analysis: Forecasting and control. San Francisco, CA: Holden-Day.

Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32. https://doi.org/10.1023/A:1010933404324

Brownlee, J. (2017). Long short-term memory networks with Python: Develop sequence prediction models with deep learning. Machine Learning Mastery.

Carr, M. K. V., & Lockwood, G. (2011). The water relations and irrigation requirements of cocoa (Theobroma cacao L.): A review. Experimental Agriculture, 47(4), 653–676. https://doi.org/10.1017/S0014479711000421

Chatzopoulos, T., Pérez Domínguez, I., & Zampieri, M. (2020). Climate extremes and agricultural commodity markets: A global economic analysis of regionally mediated impacts. Climate Risk Management, 27, Article 100193. https://doi.org/10.1016/j.wace.2019.100193

Fischer, T., & Krauss, C. (2018). Deep learning with long short-term memory networks for financial market predictions. European Journal of Operational Research, 270(2), 654–669. https://doi.org/10.1016/j.ejor.2017.11.054

Fisher, A., Rudin, C., & Dominici, F. (2019). All models are wrong, but many are useful: Learning a variable’s importance by simultaneously studying an entire class of prediction models. Journal of Machine Learning Research, 20(177), 1–81. https://jmlr.org/papers/v20/18-760.html

Gers, F. A., Schmidhuber, J., & Cummins, F. (2000). Learning to forget: Continual prediction with LSTM. Neural Computation, 12(10), 2451–2471. https://doi.org/10.1162/089976600300015015

Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. Cambridge, MA: MIT Press.

Hartford, J., Lewis, G., Leyton-Brown, K., & Taddy, M. (2017). Deep IV: A flexible approach for counterfactual prediction. In D. Precup & Y. W.
Teh (Eds.), Proceedings of the 34th International Conference on Machine Learning (Vol. 70, pp. 1414–1423). PMLR. https://proceedings.mlr.press/v70/hartford17a.html

Haykin, S. (1999). Neural networks: A comprehensive foundation (2nd ed.). Upper Saddle River, NJ: Prentice Hall.

Hersbach, H., Bell, B., Berrisford, P., Biavati, G., Horányi, A., Muñoz Sabater, J., … Thépaut, J.-N. (2023). ERA5 hourly data on single levels from 1940 to the present [Data set]. Copernicus Climate Change Service (C3S) Climate Data Store. https://doi.org/10.24381/cds.adbb2d47 (Accessed May 4, 2025)

Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735

Hsiang, S., Meng, K., & Cane, M. (2011). Civil conflicts are associated with the global climate. Nature, 476, 438–441. https://doi.org/10.1038/nature10311

Hyndman, R. J., & Athanasopoulos, G. (2018). Forecasting: Principles and practice (2nd ed.). OTexts. Retrieved May 4, 2025, from https://otexts.com/fpp2/

IBM Cloud Education. (2020, June 26). What is overfitting? https://www.ibm.com/cloud/learn/overfitting (Retrieved May 4, 2025)

Iizumi, T., Luo, J. J., Challinor, A. J., Sakurai, G., Yokozawa, M., Sakuma, H., … Yamagata, T. (2014). Impacts of El Niño Southern Oscillation on the global yields of major crops. Nature Communications, 5, 3712. https://doi.org/10.1038/ncomms4712

International Cocoa Organisation. (2023). Quarterly bulletin of cocoa statistics: November 2023. Abidjan, Côte d’Ivoire: ICCO.

International Cocoa Organisation. (2024, November 29). November 2024 quarterly bulletin of cocoa statistics. Abidjan, Côte d’Ivoire: ICCO. Retrieved May 4, 2025, from https://www.icco.org/november-2024-quarterly-bulletin-of-cocoa-statistics/

Kamu, A., Ahmed, A., & Yusoff, R. (2010). Forecasting cocoa-bean prices using univariate time-series models. Journal of Arts, Science & Commerce, 1(1), 71–80.

Kingma, D. P., & Ba, J. (2015).
Adam: A method for stochastic optimisation. In Proceedings of the 3rd International Conference on Learning Representations (ICLR 2015). https://doi.org/10.48550/arXiv.1412.6980

Kozul-Wright, A. (2025, April 21). Bitter truth: Why has chocolate become so expensive? Al Jazeera. Retrieved May 4, 2025, from https://www.aljazeera.com/news/2025/4/21/bitter-easter-truth-why-has-chocolate-become-so-expensive

LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521, 436–444. https://doi.org/10.1038/nature14539

Letta, M., Montalbano, P., & Tol, R. S. J. (2022). Weather shocks, traders’ expectations, and food prices. American Journal of Agricultural Economics, 104(3), 1100–1119. https://doi.org/10.1111/ajae.12258

Mienye, I. D., Swart, T. G., & Obaido, G. (2024). Recurrent neural networks: A comprehensive review of architectures, variants, and applications. Information, 15(9), 517. https://doi.org/10.3390/info15090517

Molnar, C. (2020). Interpretable machine learning: A guide for making black box models explainable (2nd ed.). Independently Published. Retrieved May 8, 2025, from https://christophm.github.io/interpretable-ml-book/

Namin, S. I., & Namin, A. S. (2018). Forecasting economic and financial time series: ARIMA vs. LSTM. Journal of Business Research, 90, 468–472. https://doi.org/10.1016/j.jbusres.2018.05.001

Olofintuyi, S. S., Olajubu, E. A., & Olanike, D. (2023). An ensemble deep-learning approach for predicting cocoa yield. Heliyon, 9, e15245. https://doi.org/10.1016/j.heliyon.2023.e15245

Ouyang, H., Wei, X., & Wu, Q. (2019). Agricultural commodity futures prices prediction via long- and short-term time-series network. Journal of Applied Economics, 22(1), 468–483. https://doi.org/10.1080/15140326.2019.1668664

Prechelt, L. (2012). Early stopping - but when? In G. Montavon, G. B. Orr, & K.-R. Müller (Eds.), Neural networks: Tricks of the trade (2nd ed., Lecture Notes in Computer Science, Vol. 7700, pp. 53–67). Springer.
https://doi.org/10.1007/978-3-642-35289-8_5

Qin, Y., Song, D., Chen, H., Cheng, W., Jiang, G., & Cottrell, G. W. (2017). A dual-stage attention-based recurrent neural network for time-series prediction. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (pp. 2627–2633). https://doi.org/10.24963/ijcai.2017/366

Quartey-Papafio, T. K., Javed, S. A., & Liu, S. (2020). Forecasting cocoa production of six major producers through ARIMA and grey models. Grey Systems: Theory and Application, 10(3), 421–438. https://doi.org/10.1108/GS-04-2020-0050

Ly, R., Traore, F., & Dia, K. (2021). Forecasting commodity prices using long short-term memory neural networks. arXiv preprint arXiv:2101.03087 (revised January 2021).

Ritchie, H., Rosado, P., & Roser, M. (2023). Cocoa bean production [Data set]. Our World in Data. https://ourworldindata.org/grapher/cocoa-bean-production

Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323(6088), 533–536. https://doi.org/10.1038/323533a0

Sari, M., Duran, S., Kutlu, H., et al. (2024). Various optimized machine learning techniques to predict agricultural commodity prices. Neural Computing and Applications, 36, 11439–11459. https://doi.org/10.1007/s00521-024-09679-x

Schroth, G., Läderach, P., Martinez-Valle, A. I., & Bunn, C. (2016). From site-level to regional adaptation planning for tropical commodities: Cocoa in West Africa. Science of the Total Environment, 556, 231–241. https://doi.org/10.1007/s11027-016-9707-y

Siami-Namini, S., Tavakoli, N., & Siami-Namin, A. (2018). A comparison of ARIMA and LSTM in forecasting time series. In Proceedings of the 17th IEEE International Conference on Machine Learning and Applications (pp. 1394–1401). https://doi.org/10.1109/ICMLA.2018.00227

Smeeton, G. (2024, March 21). Easter chocolate prices soar as climate change and El Niño bite. Energy & Climate Intelligence Unit.
Retrieved May 4, 2025, from https://eciu.net/media/press-releases/2024/easter-chocolate-prices-soar-as-climate-change-and-el-ni%C3%B1o-bite

Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15, 1929–1958. http://jmlr.org/papers/v15/srivastava14a.html

Tothmihaly, A. (2018). How low is the price elasticity in the global cocoa market? African Journal of Agricultural and Resource Economics, 13(3), 209–223. https://doi.org/10.22004/ag.econ.284986

Thukral, N., & Tan, F. (2024, December 31). Cocoa tops global commodities rally for 2nd year; steel ingredients struggle on China demand. Reuters. Retrieved May 4, 2025, from https://www.reuters.com/markets/commodities/cocoa-tops-global-commodities-rally-2nd-year-steel-ingredients-struggle-china-2024-12-31/

Ubilava, D. (2018). The role of El Niño–Southern Oscillation in commodity-price movement and predictability. American Journal of Agricultural Economics, 100(1), 239–263. https://doi.org/10.1093/ajae/aax060

United Nations Conference on Trade and Development. (2024, April 2). Chocolate price hikes: A bittersweet reason to care about climate change. UNCTAD. Retrieved May 4, 2025, from https://unctad.org/news/chocolate-price-hikes-bittersweet-reason-care-about-climate-change

Voora, V., Bermúdez, S., & Larrea, C. (2020). Global market report: Cocoa (Sustainable Commodities Marketplace Series). International Institute for Sustainable Development. https://www.iisd.org/system/files/publications/ssi-global-market-report-cocoa.pdf

Wibaux, T., Normand, F., Vezy, R., Durand, J. B., & Lauri, P. É. (2024). Do seasonal flowering and fruiting patterns of cacao only depend on climatic factors? The case study of mixed genotype populations in Côte d’Ivoire. Scientia Horticulturae, 337, 113529. https://doi.org/10.1016/j.scienta.2024.113529

Zelingher, R., & Makowski, D. (2024).
Investigating and forecasting the impact of crop-production shocks on global commodity prices. Environmental Research Letters, 19, 014026. https://doi.org/10.1088/1748-9326/ad0dda

Zhang, Z. (2016). A gentle introduction to artificial neural networks. Annals of Translational Medicine, 4(19), 370.

Appendix A

This appendix provides detailed explanations of key technical concepts and methodologies referenced in the main body of this thesis. Its purpose is to enhance readability by isolating technical details from the primary discussion, allowing readers interested in deeper technical understanding to consult these sections directly. Specifically, this appendix includes comprehensive overviews of Artificial Neural Networks (ANNs), Recurrent Neural Networks (RNNs), and Long Short-Term Memory (LSTM) networks, which underpin the forecasting models utilised in this research. Additionally, it clarifies critical procedures such as dropout regularisation, permutation importance analysis, supervised learning processes, and data structuring approaches like the look-back window technique. The explanations here are designed to support readers in appreciating the methodological robustness and analytical choices adopted in the thesis.

Artificial Neural Networks (ANNs)

Artificial Neural Networks (ANNs) are computing systems inspired by the human brain’s network of neurons. In an ANN, numerous simple processing units (called neurons or nodes) are interconnected in layers and collectively learn to transform input data into useful outputs. Each neuron receives input signals (numbers), multiplies each by an adjustable weight (reflecting the importance of that input), sums them up, and then applies an activation function to produce an output signal. This structure allows ANNs to learn complex patterns in data by adjusting the weights during training so that the network outputs correct or desirable results (Rumelhart, Hinton, & Williams, 1986).
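The computation performed by a single neuron, as described above, can be sketched in a few lines of Python. This is only an illustration: the sigmoid activation, the bias term, and the example numbers are assumptions chosen for clarity and are not taken from the thesis models.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic activation: squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs: list[float], weights: list[float], bias: float) -> float:
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(weighted_sum)

# Hypothetical example: three input signals with hand-picked weights.
inputs = [0.5, -1.0, 2.0]
weights = [0.8, 0.2, -0.5]
output = neuron_output(inputs, weights, bias=0.1)
```

Training a network amounts to adjusting the values in `weights` (and `bias`) for every neuron so that the final outputs move closer to the desired results.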
An ANN “mimics” how a brain solves problems: it takes in information, processes it through many connected units, and outputs a prediction or decision based on what it has learned (Zhang, 2016).

One key advantage of ANNs is their ability to model non-linear relationships between inputs and outputs. Unlike a simple linear regression, which assumes a straight-line relationship, neural networks can capture more complicated patterns (Haykin, 1999; Zhang, 2016). They achieve this through multiple layers of neurons (often called hidden layers) that progressively extract higher-level features from the raw input. ANNs learn from examples in a training process, typically using a back-propagation algorithm to gradually adjust the weights so that the predictions improve over time (Rumelhart et al., 1986). With enough data and appropriate design, ANNs can approximate very complex functions and have been applied successfully in many fields, from image recognition to economic forecasting, where they often outperform traditional linear models by “learning” the underlying structure directly from data (LeCun, Bengio, & Hinton, 2015).

Recurrent Neural Networks (RNNs)

Recurrent Neural Networks (RNNs) are specialised neural networks designed to handle sequential data and capture temporal dynamics. Unlike a standard feedforward ANN, which treats each input independently of the others, an RNN introduces connections that form directed cycles, allowing information to persist from one step to the next. In practical terms, an RNN maintains an internal hidden state that is updated at each time step based on the new input and the previous hidden state. This gives RNNs a form of memory, enabling them to use information from earlier in the sequence to inform later outputs.
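The hidden-state update can be illustrated with a minimal single-unit sketch. The tanh activation and the weight values `w_x`, `w_h`, and `b` are illustrative assumptions; a real RNN uses weight matrices learned during training.

```python
import math

def rnn_step(x_t: float, h_prev: float, w_x: float, w_h: float, b: float) -> float:
    """One recurrent update: combine the new input with the previous hidden state."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)

def run_rnn(sequence: list[float], w_x: float = 0.5, w_h: float = 0.9, b: float = 0.0) -> list[float]:
    """Process a sequence left to right and return the hidden state at every step."""
    h = 0.0  # initial hidden state: nothing remembered yet
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states

# A single early input followed by zeros: later hidden states still carry a
# trace of that first input, and the trace shrinks a little at every step.
states = run_rnn([1.0, 0.0, 0.0, 0.0])
```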
For example, if I am using an RNN to predict economic time series data, the network’s hidden state at time t could carry forward summarised information about preceding days, allowing the model to remember context and dependencies over time (Mienye, Swart, & Obaido, 2024). Through this recurrent structure, RNNs are well-suited for data where order matters, such as time series records, sentences in language, or any sequence where earlier elements influence later ones. However, basic RNNs have difficulty learning long-range dependencies. Over long sequences, they may struggle to retain information from far back because the influence of a given input tends to diminish as it is propagated through many time steps, a problem known as the “vanishing gradient” in training (Goodfellow, Bengio, & Courville, 2016). In other words, a simple RNN might “forget” important long-term information by the time it reaches later steps. This limitation led to the development of more advanced recurrent architectures, like LSTMs, which are specifically designed to handle long-term context better.

Long Short-Term Memory Networks (LSTMs)

Long Short-Term Memory (LSTM) networks are an improved form of RNN that addresses the short-term memory issue of traditional RNNs. Introduced by Hochreiter and Schmidhuber in 1997, LSTMs were designed to overcome the vanishing gradient problem and maintain long-term dependencies in sequence data (Hochreiter & Schmidhuber, 1997). The key innovation of an LSTM is the use of internal gating mechanisms that regulate the flow of information over time. Each LSTM unit (often called an LSTM cell) contains three primary gates: an input gate, a forget gate, and an output gate. These gates act as valves that open or close to decide how much new information to write, how much old information to forget, and how much of the current cell’s information to output to the next time step.
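For readers who prefer notation, the three gates can be written in the commonly used textbook form below. The symbols are notational assumptions introduced here, not taken from the thesis: x_t is the input at time t, h_t the hidden state, c_t the cell state, W, U, and b are learned weights and biases, σ is the logistic function, and ⊙ denotes element-wise multiplication.

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state update)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state passed on)}
\end{aligned}
```

When the forget gate f_t is close to 1 and the input gate i_t is close to 0, the cell state c_t is carried forward almost unchanged, which is how information can survive across many time steps.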
By dynamically controlling these flows, an LSTM can selectively remember important information and forget irrelevant data as sequences evolve (Gers, Schmidhuber, & Cummins, 2000; Mienye et al., 2024).

This gated cell design allows LSTMs to preserve information longer than standard RNNs. For example, in an economic time series, an LSTM might learn to retain the effect of a price shock or a seasonal pattern that occurred many days ago and use it to inform current predictions. In contrast, a basic RNN might have already lost track of that influence. The internal cell state in an LSTM acts like a conveyor belt carrying pertinent information along, unchanged unless explicitly modified by the gates. This architecture enables LSTMs to effectively capture long-term trends and context in sequential data while mitigating the risk of old information “fading away”. In summary, an LSTM is a powerful sequence model that extends the memory of neural networks, making it highly suitable for forecasting tasks like the one in this thesis, where both recent and more distant historical data can be crucial for predicting future values.

Dropout Regularisation

Dropout is a regularisation technique used to prevent neural networks from overfitting, which occurs when a model learns the training data too specifically and fails to generalise to new data. The core idea of dropout is surprisingly simple: during training, a randomly chosen fraction of the neurons in the network is dropped on each pass (iteration) so that they temporarily do nothing. In practice, this means that for each training update, every neuron (apart from the output neurons) has a certain probability (the dropout rate, e.g. 20%) of being ignored: its output is set to zero, and its connections are not updated. This might sound counterintuitive, but removing parts of the model deliberately has a powerful effect.
By forcing the network to train with different subsets of neurons each time, dropout prevents any single neuron or small set of neurons from becoming overly specialised to the training data (Srivastava et al., 2014). Instead, the network must learn redundant, more robust features that are useful in conjunction with many different subsets of other neurons. At prediction time (after training), dropout is turned off, and all neurons are used, but with their learned weights scaled appropriately to account for the averaging effect of dropout training. The result is similar to ensembling many different neural network configurations together. In effect, dropout makes the network act like an ensemble of numerous smaller networks that vote on the outcome, improving generalisation (Srivastava et al., 2014). Empirically, dropout regularisation often leads to a model performing better on validation and test data. In this thesis, I applied dropout (e.g. dropping 20% of neurons) in the LSTM layers to reduce overfitting risk, thereby improving the model’s ability to generalise patterns from historical data to unseen future data (Srivastava et al., 2014).

Look-back Window (Time Step)

A look-back window (also called a sliding window or time-step window) refers to the number of past time periods used as input features to forecast the next value in a sequence. This concept transforms time-series data into a supervised learning format, using historical observations to predict future ones. For example, if I choose a look-back window of 60 days, the model will at each step consider the past 60 days of data (e.g. the past 60 daily prices, and possibly other variables over those days) to predict the price on the 61st day. This rolling window approach creates structured input-output pairs from sequential data: the inputs are the values from the previous 60 days, and the output is the value on the next day.
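The construction of these input-output pairs can be sketched as follows. The function name `make_windows` and the toy series are hypothetical, and a window of 3 stands in for the 60 days used in the thesis so that the result is easy to inspect.

```python
def make_windows(series: list[float], look_back: int) -> tuple[list[list[float]], list[float]]:
    """Turn a series into (inputs, targets): each input holds `look_back`
    consecutive values, and its target is the value that follows them."""
    inputs, targets = [], []
    for start in range(len(series) - look_back):
        inputs.append(series[start:start + look_back])
        targets.append(series[start + look_back])
    return inputs, targets

# Hypothetical toy price series, not real cocoa data.
prices = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
X, y = make_windows(prices, look_back=3)
# X[0] is [10.0, 11.0, 12.0] and its target y[0] is 13.0
```

A series of n observations yields n minus `look_back` training examples, so a longer window gives the model more context per example but fewer examples overall.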
Moving this window along the series one day at a time, I generate many training examples that teach the model how past patterns relate to future outcomes (Brownlee, 2017).

The look-back window is an important design parameter in time-series neural networks because it defines the scope of historical information given to the model. A window that is too short might miss important longer-term trends or cycles; a window that is too long could introduce unnecessary noise or complexity and make learning harder. In traditional time-series analysis, this idea corresponds to using lagged values as predictors. For instance, an autoregressive model of order 60 uses the previous 60 observations to forecast the next one (Box & Jenkins, 1970). Similarly, the neural network approach in this thesis explicitly sets 60 prior days as the input length, allowing the LSTM model to capture recent momentum and potentially seasonal effects within roughly two months of historical data. The term “time step” in this context refers to each discrete time interval (each day is one time step), so an LSTM with a look-back of 60 time steps processes sequences that are 60 steps long. Choosing an appropriate look-back window is typically done through experimentation or prior knowledge, balancing the need for sufficient context against the risk of diluting relevant information.

Supervised Learning

Supervised learning is the most common training paradigm in machine learning, in which a model learns from examples that include both the input data and the desired output. In supervised learning, I provide the algorithm with a labelled dataset: a set of training examples where each example consists of input features and a known correct output (target). The learning process involves the model making predictions on the inputs and adjusting its internal parameters (weights) to reduce the error between its predictions and the true outputs.
Essentially, the model is “supervised” by the feedback from these known answers, which guide it to improve over time. The ultimate goal is for the trained model to accurately predict the outputs for new, unseen inputs by generalising the patterns it learned from the training data.

Forecasting cocoa prices is a supervised learning problem: the input might be a sequence of past prices (and possibly other variables like weather), and the known output is the actual price on the next day. By being shown many examples of such input-output pairs, the network can learn the relationship between historical trends and future prices. The use of supervised learning implies there is a clear objective signal to learn from for each day in the training set; the model is told what the correct next-day price was. Standard algorithms for supervised learning include neural networks (like the LSTM used here), decision trees, and linear regression, all of which seek to minimise a loss function that quantifies the prediction error. Over time, the model parameters are tuned (often via gradient descent optimisation) to produce outputs as close as possible to the true values (Goodfellow et al., 2016). Once training is complete, if the model has learned well, it should be able to take a new sequence of recent data and predict the next price with useful accuracy. Supervised learning contrasts with unsupervised learning (where no labelled outputs are given) and reinforcement learning (where feedback comes from rewards), neither of which is used in this thesis. Here, everything is framed as supervised learning with historical observations and known outcomes.

Early Stopping

Early stopping is a practical technique used during model training to prevent overfitting. The idea is to stop training the model before it starts overfitting the training data. In a typical training process, as the model learns, its performance on the training set consistently improves.
However, performance on a separate validation set (data not used for training, intended to simulate new/unseen data) may stop improving after a certain point. It can even begin to deteriorate as the model starts to memorise noise in the training data. Early stopping monitors the model’s performance on the validation set. It halts the training process when improvement has levelled off, essentially when additional training no longer leads to better generalisation. In practice, one might set a rule: “If the validation loss has not decreased for, say, five consecutive epochs (passes through the data), then stop training.” At that point, the model parameters from the epoch with the best validation performance are retained. This way, the model is frozen at the optimal point before overfitting sets in (Prechelt, 2012; Goodfellow et al., 2016). In simpler terms, early stopping acts as an automatic brake on the training process. Rather than training for a fixed number of epochs (iterations), the algorithm tests the model as it trains to see when it has had enough. Training is stopped early when further training yields no benefit on validation data. This saves computation time and often results in a model that performs better on test data. In this thesis, I employed early stopping by monitoring the validation loss during training. I halted the training once the validation loss stopped decreasing (indicating the model might start overfitting if I continued). Early stopping is one of several regularisation strategies (like dropout, discussed above) that help achieve a model that generalises well. It leverages the idea that over-training is counter-productive, and that there is an optimal point in training after which the model begins to learn spurious details of the training set (IBM Cloud Education, 2020). By using early stopping, I aim to capture the underlying signal in the data without capturing the noise. 
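The patience rule described above can be sketched in plain Python. To keep the example self-contained, a precomputed list of validation losses stands in for an actual training loop; the function name, the loss values, and the patience of five epochs are illustrative assumptions rather than the settings used in the thesis.

```python
def early_stopping_epoch(val_losses, patience=5):
    """Walk through per-epoch validation losses and stop once the loss has
    failed to improve for `patience` consecutive epochs.
    Returns (best_epoch, best_loss, stopped_epoch), with epochs counted from 0."""
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    stopped_epoch = len(val_losses) - 1  # default: every epoch was run
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # In real training, the model weights would be saved at this point.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                stopped_epoch = epoch
                break
    return best_epoch, best_loss, stopped_epoch

# Hypothetical validation curve: improves, then drifts upward as overfitting sets in.
losses = [0.90, 0.60, 0.45, 0.40, 0.42, 0.41, 0.43, 0.44, 0.46, 0.50]
best_epoch, best_loss, stopped_epoch = early_stopping_epoch(losses, patience=5)
```

In this toy run the best validation loss occurs at epoch 3, training halts at epoch 8 after five epochs without improvement, and the weights saved at epoch 3 are the ones that would be kept.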
Permutation Importance

Permutation importance is a technique for measuring the importance of input features in a trained machine learning model, and it provides an intuitive way to interpret complex models like neural networks. The basic procedure is as follows: for a given feature (input variable), I randomly shuffle (permute) its values across all observations in the dataset, breaking any real relationship between that feature and the target. I then run the data through the model again and observe how much the model’s error (for example, the mean squared prediction error) increases due to this shuffling. If the model’s performance drops significantly (the error increases a lot), the shuffled feature was important to the model’s predictions, because the model struggles when its true information is destroyed. Conversely, if permuting a feature causes little to no change in error, that feature was likely not very important to the model’s decision-making. In short, a feature’s permutation importance score is defined by the magnitude of the increase in prediction error when that feature’s values are randomised: the larger the increase, the more important the feature (Breiman, 2001; Fisher, Rudin, & Dominici, 2019).

One advantage of permutation importance is that it is model-agnostic and easy to compute. It does not require examining the model's weights or structure; it merely observes how predictions change when inputs are perturbed. It can therefore be applied to any predictive model, whether a neural network or a random forest, and it is particularly useful for complex “black-box” models where traditional interpretation proves challenging (Molnar, 2020). By applying permutation importance, researchers can rank which predictors (e.g., past prices, temperature, rainfall) influence the model’s forecasting accuracy most. In this thesis, I utilised permutation importance to ascertain which factors the LSTM model relied upon most for its predictions.
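The shuffling procedure can be sketched as follows. This is not the thesis implementation: `toy_model` is a hypothetical stand-in for a trained model, the data are synthetic, and the feature labels in the comments are only illustrative.

```python
import random

def mse(y_true, y_pred):
    """Mean squared error between two equally long lists."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, rows, targets, n_repeats=20, seed=0):
    """For each feature column, shuffle it across rows and report the average
    increase in MSE relative to the unshuffled baseline."""
    rng = random.Random(seed)
    baseline = mse(targets, [predict(r) for r in rows])
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the target
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mse(targets, [predict(r) for r in shuffled]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Hypothetical stand-in for a trained model: leans heavily on feature 0 ("price"),
# barely on feature 1 ("temperature"), and ignores feature 2 ("rainfall").
def toy_model(row):
    return 2.0 * row[0] + 0.1 * row[1]

data_rng = random.Random(42)
rows = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
targets = [toy_model(r) for r in rows]  # noise-free targets, so the baseline error is 0

scores = permutation_importance(toy_model, rows, targets)
```

Averaging over several shuffles (`n_repeats`) reduces the randomness of any single permutation. For a sequence model, the same idea would be applied by shuffling one variable's entire input series across samples while leaving the other variables in place.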
The results, for instance, indicated that shuffling the price history significantly degraded performance (indicating the high importance of past prices). In contrast, shuffling weather variables had minimal effect, suggesting they were relatively unimportant in the model’s predictions, a finding consistent with Y’s evaluation.

It is worth noting a caveat: permutation importance implicitly assumes that features contribute to the model independently of one another. If two features are correlated or provide redundant information, permuting one may not lead to a significant performance drop because the other correlated feature still offers similar information. In such cases, permutation importance might underestimate the importance of those features or distribute the importance across them. This is a known limitation; for example, if several weather variables are highly correlated, shuffling one at a time might not reveal a significant effect since the model can rely on the others. In other words, under multicollinearity (features moving together), the permutation test can understate a feature’s true importance (Altmann et al., 2010). Despite this limitation, permutation importance remains a popular and straightforward tool for interpreting models. It provides a clear, quantitative insight into which inputs the neural network considers most relevant, thereby adding transparency to the forecasting model’s behaviour. Each importance score derived from this method is expressed directly in terms of the model’s predictive performance, which makes it an intuitive metric for economists and stakeholders seeking to grasp which drivers genuinely contribute to the price predictions.

Appendix B

This appendix gives an overview of the variables included in the 13 different LSTM models.
A Historical Daily Cocoa Futures Prices
B Global Daily Average Temperature
C Global Daily Average Precipitation
D The 52 Cacao Farming Countries’ Respective Daily Average Temperature
E The 52 Cacao Farming Countries’ Respective Daily Average Precipitation
F Côte d’Ivoire’s All Available ERA5 Grid-cell Temperature Data
G Côte d’Ivoire’s All Available ERA5 Grid-cell Precipitation Data
I Ghana’s All Available ERA5 Grid-cell Temperature Data
J Ghana’s All Available ERA5 Grid-cell Precipitation Data
K Côte d’Ivoire’s Cacao Farm Area’s All Available ERA5 Grid-cell Temperature Data
L Côte d’Ivoire’s Cacao Farm Area’s All Available ERA5 Grid-cell Precipitation Data
M Ghana’s Cacao Farm Area’s All Available ERA5 Grid-cell Temperature Data
N Ghana’s Cacao Farm Area’s All Available ERA5 Grid-cell Precipitation Data

# Model A B C D E F G I J K L M N
1 Base model* X
2 Global Temp X X
3 Global Temp & Precipitation X X X
4 The 52 countries’ Temp X X
5 The 52’s Temp & Precipitation X X X
6 Côte d’Ivoire’s & Ghana’s total Temp data X X X
7 IC’s & G’s farm area: Temp X X X
8 Farm area: Temp & its optimal Range X X X
9 Only IC’s farm area: Temp & Precipitation X X X
10 IC & G’s farm area: Precipitation X X X
11 IC & G’s farm area: Precipitation & its Optimal Range X X X
12 Only Global Temp X
13 Only Global Temp & Precipitation X X