Xue, ZH; Zhang, YJ; Zhang, L; Li, H (2022). Ensemble Learning Embedded With Gaussian Process Regression for Soil Moisture Estimation: A Case Study of the Continental US. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 60, 4508817.

Abstract
Soil moisture (SM) is critical in maintaining the balance of Earth's environment and climate system. Existing machine learning-based SM estimation methods mainly rely on a single model, which may be unstable and may lack adaptability when transferred to other sites. In addition, in situ observations are usually scarce, which degrades the generalization performance of a single model. To overcome these issues, we design two novel ensemble learning models based on Gaussian process regression (GPR) for SM estimation. One is bagging embedded with GPR (BAGGPR), a parallel algorithm that randomly selects multiple data subsets to train an ensemble of GPR models. The other is gradient boosting (BOOST) embedded with GPR (GBGPR), a sequential algorithm that iteratively learns the residual between the previous prediction and the true value. In GBGPR, we also propose an improved Huber loss function by applying square loss on different residuals. The proposed methods greatly enhance the stability, adaptability, and generalization performance of a single GPR model. Experiments are conducted in the continental U.S. (CONUS) between April 2015 and March 2016, where BAGGPR and GBGPR are tested to estimate SM based on 11 multisource remote sensing features. The results demonstrate that the proposed methods outperform other state-of-the-art models, including thirteen single models and three ensemble models, with R values of 0.8958 and 0.9047 and root mean square error (RMSE) values of 0.0513 cm³/cm³ and 0.0490 cm³/cm³ for BAGGPR and GBGPR, respectively. Moreover, the proposed methods capture spatial dynamics well, and they show higher consistency with the in situ measurements, better generalization performance with respect to training data, and higher adaptability to different in situ networks.
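The two ensemble strategies named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: `TinyGPR` is a hypothetical fixed-hyperparameter GPR that returns only the posterior mean, the boosting step uses the standard Huber loss gradient rather than the paper's improved variant (whose details are not given here), and the data are synthetic stand-ins for the 11 remote sensing features. All function names and parameter values are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale):
    """Squared-exponential kernel between the rows of a and b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

class TinyGPR:
    """Hypothetical minimal GPR: posterior mean only, fixed hyperparameters."""

    def __init__(self, length_scale=0.5, noise=0.1):
        self.length_scale, self.noise = length_scale, noise

    def fit(self, X, y):
        self.X_train, self.y_mean = X, y.mean()
        K = rbf_kernel(X, X, self.length_scale) + self.noise * np.eye(len(X))
        self.alpha = np.linalg.solve(K, y - self.y_mean)
        return self

    def predict(self, X):
        return self.y_mean + rbf_kernel(X, self.X_train, self.length_scale) @ self.alpha

def fit_bagged_gpr(X, y, n_models=10, seed=0):
    """Bagging: train each GPR on a bootstrap resample of the data."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        idx = rng.choice(len(X), size=len(X), replace=True)
        models.append(TinyGPR().fit(X[idx], y[idx]))
    return models

def predict_bagged(models, X):
    """Average the predictions of the independently trained models."""
    return np.mean([m.predict(X) for m in models], axis=0)

def huber_negative_gradient(residual, delta):
    """Standard Huber loss: residual inside +/-delta, clipped outside."""
    return np.clip(residual, -delta, delta)

def fit_boosted_gpr(X, y, n_rounds=20, learning_rate=0.5, delta=0.1):
    """Boosting: each stage fits a GPR to the current (Huber) residuals."""
    base = float(np.median(y))
    pred = np.full(len(y), base)
    stages = []
    for _ in range(n_rounds):
        target = huber_negative_gradient(y - pred, delta)
        stage = TinyGPR().fit(X, target)
        pred = pred + learning_rate * stage.predict(X)
        stages.append(stage)
    return base, stages, learning_rate

def predict_boosted(base, stages, learning_rate, X):
    """Sum the shrunken stage predictions on top of the initial constant."""
    return base + learning_rate * sum(s.predict(X) for s in stages)

def rmse(pred, truth):
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

# Synthetic one-feature data in a plausible SM range (assumption, not real data).
rng = np.random.default_rng(42)
X = rng.uniform(0.0, 2.0, size=(60, 1))
y = 0.25 + 0.15 * np.sin(3.0 * X[:, 0]) + rng.normal(0.0, 0.01, size=60)

rmse_baseline = rmse(np.full(len(y), y.mean()), y)  # constant-mean predictor
rmse_bag = rmse(predict_bagged(fit_bagged_gpr(X, y), X), y)
rmse_boost = rmse(predict_boosted(*fit_boosted_gpr(X, y), X), y)
print(rmse_baseline, rmse_bag, rmse_boost)
```

The bagged models are independent and could be trained in parallel, whereas each boosting stage depends on the predictions of the stages before it. The errors printed here are training errors on toy data and say nothing about the accuracy figures reported in the abstract.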

DOI:
10.1109/TGRS.2022.3166777

ISSN:
1558-0644