Noi, PT; Degener, J; Kappas, M (2017). Comparison of Multiple Linear Regression, Cubist Regression, and Random Forest Algorithms to Estimate Daily Air Surface Temperature from Dynamic Combinations of MODIS LST Data. REMOTE SENSING, 9(5), 398.
Abstract
Recently, several methods have been introduced and applied to estimate daily air surface temperature (Ta) using MODIS land surface temperature data (MODIS LST). Among these methods, the most commonly used is statistical modeling, and the most widely applied algorithms are linear/multiple linear regression models (LM). Only a handful of studies use machine learning algorithms such as random forest (RF) or cubist regression (CB). In particular, no study has compared different combinations of the four MODIS LST datasets, with or without auxiliary data, using different algorithms such as multiple linear regression, random forest, and cubist regression for daily Ta-max, Ta-min, and Ta-mean estimation. Our study examines these combinations of the four MODIS LST datasets and shows that different combinations and different algorithms yield different Ta estimation accuracies. Additional analysis of daily data from three climate stations in the mountainous area of northwest Vietnam over five years (2009 to 2013), with four MODIS LST datasets (AQUA daytime, AQUA nighttime, TERRA daytime, and TERRA nighttime) and two auxiliary datasets (elevation and Julian day), shows that CB and LM should be applied when MODIS LST data are used alone. If MODIS LST is used together with auxiliary data, especially in mountainous areas, CB or RF is highly recommended. This study showed that very high accuracy of Ta estimation (R² > 0.93/0.80/0.89 and RMSE ≈ 1.5/2.0/1.6 °C for Ta-max, Ta-min, and Ta-mean, respectively) can be achieved with a simple combination of the four LST datasets, elevation, and Julian day using a suitable algorithm.
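The abstract reports accuracy as R² and RMSE. As a minimal sketch of how these two scores can be computed for a set of Ta estimates, the Python below assumes R² is the coefficient of determination (1 - SS_res / SS_tot); the abstract does not say which definition was used, and some studies report the squared correlation instead. The function names and the sample temperatures are hypothetical and are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and estimated values (same units, e.g. degrees C)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Made-up daily Ta-max values in degrees C, for illustration only.
station_ta_max = [20.0, 22.0, 24.0, 26.0]
estimated_ta_max = [21.0, 21.0, 25.0, 25.0]

print("RMSE:", rmse(station_ta_max, estimated_ta_max))
print("R^2:", r_squared(station_ta_max, estimated_ta_max))
```

In a comparison like the one described, the same two scores would be computed for each algorithm and each combination of input datasets, ideally on station days held out from model fitting.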
DOI: 10.3390/rs9050398
ISSN: 2072-4292