Publications

Cao, J; Zhang, Z; Tao, FL; Zhang, LL; Luo, YC; Zhang, J; Han, JC; Xie, J (2021). Integrating Multi-Source Data for Rice Yield Prediction across China using Machine Learning and Deep Learning Approaches. AGRICULTURAL AND FOREST METEOROLOGY, 297, 108275.

Abstract
Timely and reliable yield prediction at a large scale is essential for managing climate risk and ensuring food security, especially under climate change and increasingly frequent extreme climate events. In this study, we integrated publicly available data (i.e., satellite vegetation indexes, meteorological indexes, and soil properties) within the Google Earth Engine (GEE) platform and developed one Least Absolute Shrinkage and Selection Operator (LASSO) regression model, one machine learning model (Random Forest, RF), and one deep learning model (Long Short-Term Memory networks, LSTM) to predict rice yield at the county level across China. For satellite data, we compared contiguous solar-induced chlorophyll fluorescence (SIF), a newly emerging satellite retrieval, with a traditional vegetation index (enhanced vegetation index, EVI). The results showed that the LSTM (R² ranging from 0.77 to 0.87, RMSE from 298.11 to 724 kg/ha) and RF (R² ranging from 0.76 to 0.82, RMSE from 366 to 723.3 kg/ha) models outperformed LASSO (R² ranging from 0.33 to 0.42, RMSE from 633.46 to 1231.39 kg/ha) in yield prediction, and that LSTM performed better than RF. In addition, ESI (the combination of EVI and SIF) slightly improved model performance compared with using EVI or SIF alone as the satellite input, primarily because satellite-based SIF captures extra information on drought and heat stress. Furthermore, we explored the potential for timely rice yield prediction and concluded that the optimal prediction could be achieved approximately two months before maturity for single rice and one month before maturity for double rice. Our findings demonstrate a scalable, simple, and inexpensive method for timely rice yield prediction over a large area using publicly available multi-source data, which can potentially be applied to areas with sparse observations and, more broadly, to crop yield estimation worldwide.
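
The abstract compares models by R² and RMSE. As a reading aid, the sketch below shows how these two metrics are commonly computed for a set of observed and predicted yields. It is only an illustration under stated assumptions: the function names and the yield values are hypothetical, and R² is taken here as the coefficient of determination (1 minus the ratio of residual to total sum of squares), which may differ from the exact definition used in the paper.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs (e.g. kg/ha)."""
    n = len(observed)
    squared_errors = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared_errors / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical county-level yields in kg/ha, not data from the paper.
observed = [6000.0, 6500.0, 7000.0, 7500.0]
predicted = [6100.0, 6400.0, 7200.0, 7300.0]

print(f"RMSE = {rmse(observed, predicted):.2f} kg/ha")
print(f"R^2  = {r_squared(observed, predicted):.2f}")
```

A lower RMSE and a higher R² both indicate closer agreement between predicted and observed yields, which is the sense in which the abstract ranks LSTM and RF above LASSO.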

DOI:
10.1016/j.agrformet.2020.108275

ISSN:
0168-1923