Publications

Song, XP; Li, HJ; Potapov, P; Hansen, MC (2022). Annual 30 m soybean yield mapping in Brazil using long-term satellite observations, climate data and machine learning. Agricultural and Forest Meteorology, 326, 109186.

Abstract
Long-term spatially explicit information on crop yield is essential for understanding food security in a changing climate. Here we present a study that combines twenty years of Landsat and MODIS data, climate and weather records, municipality-level crop yield statistics, random forests and linear regression models for mapping crop yield in a multi-temporal, multi-scale modeling framework. The study was conducted for soybean in Brazil. Using a recently developed 30 m resolution, annual (2001-2019) soybean classification map product, we aggregated multi-temporal phenological metrics derived from Landsat and MODIS data over soybean pixels to the municipality scale. We combined phenological metrics with topographic features, long-term climate data, in-season weather data and soil variables as inputs to machine learning models. We trained a multi-year random forests model using yield statistics as reference and subsequently applied linear regression to adjust the biases in the direct output of the random forests model. This model combination achieved the best performance with a root-mean-square error (RMSE) of 344 kg/ha (12% relative to long-term mean yield) and an r² of 0.69, on the basis of 20% withheld test data. The RMSE of the leave-one-year-out model assessment ranged from 259 kg/ha to 816 kg/ha. To eliminate the artifacts caused by the coarse-resolution climate and weather data, we developed multiple models with different categories of input variables. Employing the per-pixel uncertainty estimates of different models, the final soybean yield maps were produced through per-pixel model composition. We applied the models trained on 2001-2019 data to 2020 data and produced a soybean yield map for 2020, demonstrating the predictive capability of trained machine learning models for operational yield mapping in future years.
Our research showed that combining satellite, climate and weather data and machine learning could effectively map crop yield at high resolution, providing critical information to understand yield growth, anomaly and food security.
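
Illustrative sketch (not the authors' code): the abstract mentions three numerical steps that are easy to show on toy numbers: a linear adjustment of model output against reference yields, the RMSE and relative RMSE metrics, and a per-pixel composition of several models based on their uncertainty estimates. The Python below is a minimal sketch under stated assumptions. All function names and data values are hypothetical. The adjustment is assumed to be an ordinary least squares fit of reference yield on raw model output, and the composition rule is assumed to be "keep the estimate with the lowest uncertainty"; the abstract does not specify either detail.

```python
import math


def fit_linear_adjustment(predicted, reference):
    """Least squares fit of reference = slope * predicted + intercept.

    Assumption: the bias adjustment regresses reference yield on raw
    model output; the abstract does not state the exact formulation.
    """
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, reference))
    slope = cov / var_p
    intercept = mean_r - slope * mean_p
    return slope, intercept


def rmse(predicted, reference):
    """Root-mean-square error between two equally long sequences."""
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(reference))


def relative_rmse(predicted, reference):
    """RMSE expressed as a fraction of the mean reference value."""
    return rmse(predicted, reference) / (sum(reference) / len(reference))


def compose_per_pixel(estimates, uncertainties):
    """Per pixel, keep the estimate of the model with the lowest uncertainty.

    Both arguments are lists with one inner list per model, each inner
    list holding one value per pixel. Ties go to the first model.
    Assumption: minimum uncertainty is the selection rule.
    """
    composed = []
    for pixel_est, pixel_unc in zip(zip(*estimates), zip(*uncertainties)):
        best = min(range(len(pixel_est)), key=lambda i: pixel_unc[i])
        composed.append(pixel_est[best])
    return composed


# Hypothetical municipality-level yields in kg/ha (synthetic, not from the paper).
raw_output = [2600.0, 2800.0, 3000.0, 3200.0]   # compressed toward the mean
reference = [2200.0, 2600.0, 3000.0, 3400.0]    # yield statistics

slope, intercept = fit_linear_adjustment(raw_output, reference)
adjusted = [slope * p + intercept for p in raw_output]

rmse_before = rmse(raw_output, reference)
rmse_after = rmse(adjusted, reference)
rel_before = relative_rmse(raw_output, reference)

# Two hypothetical models, three pixels each.
model_estimates = [[3000.0, 3100.0, 2900.0], [3050.0, 2800.0, 2950.0]]
model_uncertainties = [[100.0, 300.0, 200.0], [150.0, 120.0, 200.0]]
composite = compose_per_pixel(model_estimates, model_uncertainties)
```

On this toy data the raw output is exactly a linear distortion of the reference, so the adjustment removes the error completely. Real model output would leave a residual after adjustment.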

DOI:
10.1016/j.agrformet.2022.109186

ISSN:
1873-2240