Publications

Nihar, A; Patel, NR; Danodia, A (2022). Machine-Learning-Based Regional Yield Forecasting for Sugarcane Crop in Uttar Pradesh, India. JOURNAL OF THE INDIAN SOCIETY OF REMOTE SENSING, 50(8), 1519-1530.

Abstract
Sugarcane (Saccharum officinarum) is a major cash crop in India that needs careful monitoring, as it contributes significantly to the national exchequer and provides employment to over a million people, mainly through sugar and renewable bioenergy production. The objective of this study is to predict the regional sugarcane crop yield for the Uttar Pradesh (UP) province using analysis-ready moderate-resolution satellite images. Four machine learning regression algorithms, namely support vector regression (SVR), gradient boosting regression (GBR), eXtreme gradient boosting regression (XGB), and random forest regression (RF), were used to train and predict the district-wise sugarcane yield in UP. Standard MODIS data products such as leaf area index (LAI), fraction of photosynthetically active radiation (FPAR), normalized difference vegetation index (NDVI), enhanced vegetation index (EVI), evapotranspiration (ET), potential evapotranspiration (PET), latent heat flux (LE), and gross primary productivity (GPP) were obtained for a period of eighteen years, and their monthly averages were extracted and used as features in the models. The models were trained using eighty percent of the observations, with the annual district-wise sugarcane yield as the response variable. Iterative feature selection was done based on correlation factor and feature importance to reduce the dimensionality of the data from around a hundred features to twenty-four. R² was used as the evaluation metric to choose the best predictive model. The models achieved moderate accuracy and were used to estimate the sugarcane crop yield for the year 2019. The highest R² of 0.66, with an RMSE of 7.15 t/ha, was obtained using the GBR algorithm with seven variables and 24 features. It was closely followed by the XGB model, with an R² of 0.65 and an RMSE of 7.20 t/ha. The FPAR-based features contributed the most to the model, followed by the LAI and NDVI features.
The simple methodology relies on ready-to-use satellite products and has potential for operational use. The model can be improved by incorporating more satellite-derived parameters and an accurate crop mask to avoid spectral interference from other cropland.
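
The two evaluation metrics named in the abstract, R² and RMSE, together with a correlation-based feature filter, can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the authors' code: the function names, the feature names, and the toy numbers are hypothetical, and ranking features by their absolute correlation with yield is one assumed reading of "correlation factor" (the abstract does not specify the exact procedure, which also used feature importance).

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the response (e.g. t/ha)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def select_by_correlation(features, target, k):
    """Keep the k features with the largest |correlation| to the target.

    `features` maps a (hypothetical) feature name to its list of values.
    Assumption: this stands in for the correlation step only; the
    importance-based step described in the abstract is not modelled here.
    """
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    return ranked[:k]


# Toy data (made up for illustration): district yields in t/ha.
observed = [60.0, 70.0, 80.0, 90.0]
predicted = [62.0, 68.0, 83.0, 87.0]
toy_features = {
    "fpar_aug": [0.5, 0.6, 0.7, 0.8],  # hypothetical monthly FPAR mean
    "noise": [0.3, 0.1, 0.4, 0.2],     # hypothetical uninformative feature
}

print(r_squared(observed, predicted))
print(rmse(observed, predicted))
print(select_by_correlation(toy_features, observed, k=1))
```

In practice, the same metrics are available in common machine learning libraries; the hand-written versions here are only meant to make the definitions explicit.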

DOI:
10.1007/s12524-022-01549-0

ISSN:
0974-3006