Publications

Raza, A; Shahid, MA; Zaman, M; Miao, YX; Huang, YB; Safdar, M; Maqbool, S; Muhammad, NE (2025). Improving Wheat Yield Prediction with Multi-Source Remote Sensing Data and Machine Learning in Arid Regions. REMOTE SENSING, 17(5), 774.

Abstract
Wheat (Triticum aestivum L.) is one of the world's primary food crops, and timely and accurate yield prediction is essential for ensuring food security. Remote sensing data, climate data, and their combination are increasingly used to estimate yields, but the optimal indices and time window for wheat yield prediction in arid regions remain unclear. This study was conducted to (1) assess the performance of widely recognized remote sensing indices in predicting wheat yield at different growth stages, (2) evaluate the predictive accuracy of different machine learning models for yield prediction, (3) determine the appropriate growth period for wheat yield prediction in arid regions, and (4) evaluate the impact of climate parameters on model accuracy. The vegetation indices used in this study, selected for their proven effectiveness, were the Normalized Difference Vegetation Index (NDVI), the Enhanced Vegetation Index (EVI), and the Atmospherically Resistant Vegetation Index (ARVI). Four machine learning models, viz. Decision Trees (DTs), Random Forest (RF), Gradient Boosting (GB), and Bagging Trees (BTs), were evaluated for their predictive accuracy for wheat yield in the arid region. The whole wheat growth period was divided into three time windows: tillering to grain filling (December 15-March), stem elongation to grain filling (January 15-March), and heading to grain filling (February-March 15). The models were developed and evaluated in the Google Earth Engine (GEE), combining climate and remote sensing data. The results showed that the RF model with ARVI could accurately predict wheat yield at the grain filling and the maturity stages in arid regions with an R² > 0.75 and a yield error of less than 10%. The grain filling stage was identified as the optimal prediction window for wheat yield in arid regions. While RF with ARVI delivered the best results, GB with EVI showed slightly lower precision but still outperformed the other models. It is concluded that combining multisource data and machine learning models is a promising approach for wheat yield prediction in arid regions.
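
For readers unfamiliar with the three vegetation indices named in the abstract, the following is a minimal sketch of their commonly published formulas, computed from surface reflectance in the near-infrared, red, and blue bands. It is not taken from the paper: the function names, the default coefficients (the widely used EVI constants and an ARVI gamma of 1), and the sample reflectance values are illustrative assumptions, and the abstract does not state which sensor bands or coefficients the authors used.

```python
# Illustrative sketch only; names, defaults, and sample values are assumptions,
# not the authors' implementation.

def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)


def evi(nir, red, blue, gain=2.5, c1=6.0, c2=7.5, soil=1.0):
    """Enhanced Vegetation Index with the commonly used coefficients."""
    return gain * (nir - red) / (nir + c1 * red - c2 * blue + soil)


def arvi(nir, red, blue, gamma=1.0):
    """Atmospherically Resistant Vegetation Index.

    The red band is replaced by a red-blue combination that partly
    compensates for atmospheric scattering; gamma=1 is a common default.
    """
    red_blue = red - gamma * (blue - red)
    return (nir - red_blue) / (nir + red_blue)


if __name__ == "__main__":
    # Hypothetical reflectances for a dense green canopy (unitless, 0 to 1).
    nir, red, blue = 0.45, 0.08, 0.04
    print("NDVI:", round(ndvi(nir, red), 3))
    print("EVI: ", round(evi(nir, red, blue), 3))
    print("ARVI:", round(arvi(nir, red, blue), 3))
```

All three indices increase with canopy greenness. EVI and ARVI additionally use the blue band, which is why they tend to be less sensitive to atmospheric effects than NDVI.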

DOI:
10.3390/rs17050774

ISSN:
2072-4292