Publications

Ahmadi, B; Gholamalifard, M; Ghasempouri, SM; Kutser, T (2025). Comparative analysis of k-nearest neighbors distance metrics for retrieving coastal water quality based on concurrent in situ and satellite observations. MARINE POLLUTION BULLETIN, 214, 117816.

Abstract
It is time-consuming and expensive to monitor extensive areas of coastal waters with sufficient frequency using in situ (ship-based) methods. Satellite remote sensing is much more cost-effective. Satellites can detect Optically Active Constituents (OACs) in water. Therefore, it is crucial to know the concentrations of OACs in the study area in order to develop and validate remote sensing methods suitable for assessing water quality in this region. The Pars Special Economic Energy Zone (PSEEZ), a major hub of natural gas extraction in the Persian Gulf, has undergone rapid industrial expansion since 1998, intensifying environmental pressures and necessitating high-resolution, frequent water quality assessments. However, a structured, long-term monitoring framework is absent despite the significance of this region. To develop satellite-based remote sensing methods for this region, we measured several OACs (chlorophyll-a, coloured dissolved organic matter (CDOM) and turbidity) and tested the performance of Landsat 8, Sentinel-2, and Sentinel-3 in retrieving them. We tested the k-Nearest Neighbors (k-NN) machine learning algorithm. The choice of distance metric had a significant influence on the accuracy of OAC retrieval. In turbidity retrieval, the Euclidean Distance (ED) enhanced the regression slope to 0.90 (a 55.17 % improvement over Fuzzy Mahalanobis Distance (FD)) and reduced the RMSLE to 0.51, corresponding to an approximate 160 % enhancement in precision. For CDOM, RMSLE values for ED and FD were 0.39 and 0.48, respectively, indicating an 18.75 % improvement favoring ED. Furthermore, bias analysis revealed deviations of 1-6 % compared to reference data, with the lowest values observed for Mahalanobis Distance (MD) with MSI and FD with OLCI.
In chlorophyll-a retrieval, the choice of distance metric directly impacted the accuracy of the OLCI sensor, inducing bidirectional bias (both overestimation and underestimation) that varied depending on the selected metric. These results underscore the critical importance of optimizing distance metric selection to enhance prediction accuracy and mitigate systematic errors in remote sensing applications. Furthermore, k-NN substantially outperformed the other algorithms evaluated within the study area, achieving significantly higher accuracy metrics and thereby establishing it as the optimal framework for satellite-based water quality monitoring in environmentally sensitive regions like PSEEZ.
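
The comparison described in the abstract can be illustrated with a minimal sketch: a k-NN regressor whose distance function is swappable, scored with RMSLE. This is not the authors' implementation. The function names, the two-feature restriction in the Mahalanobis helper, the toy data, and the log1p form of RMSLE are all assumptions made for illustration, and the fuzzy Mahalanobis variant is not reproduced here.

```python
import math


def euclidean(a, b):
    """Euclidean distance (ED) between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def make_mahalanobis(samples):
    """Return a Mahalanobis distance (MD) function for 2-feature samples.

    Assumption: exactly two features, so the 2x2 sample covariance
    matrix can be inverted in closed form without a linear algebra library.
    """
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mx) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - my) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / (n - 1)
    det = sxx * syy - sxy ** 2
    # Entries of the inverse covariance matrix.
    ixx, iyy, ixy = syy / det, sxx / det, -sxy / det

    def dist(a, b):
        dx, dy = a[0] - b[0], a[1] - b[1]
        return math.sqrt(ixx * dx * dx + 2 * ixy * dx * dy + iyy * dy * dy)

    return dist


def knn_predict(train_x, train_y, query, k, dist):
    """Predict by averaging the targets of the k nearest training samples."""
    ranked = sorted(zip(train_x, train_y), key=lambda p: dist(p[0], query))
    return sum(y for _, y in ranked[:k]) / k


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error (log1p form, assumed here)."""
    sq = [(math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(sq) / len(sq))


def relative_improvement(worse, better):
    """Percentage reduction of an error measure relative to the worse value."""
    return 100 * (worse - better) / worse


# Toy data (hypothetical): two band reflectances -> a water quality value.
train_x = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0), (2.0, 1.0)]
train_y = [1.0, 2.0, 3.0, 4.0]
mahalanobis = make_mahalanobis(train_x)

for name, dist in (("ED", euclidean), ("MD", mahalanobis)):
    preds = [knn_predict(train_x, train_y, q, 2, dist) for q in train_x]
    print(name, round(rmsle(train_y, preds), 3))

# The CDOM figure in the abstract follows from the two RMSLE values:
print(relative_improvement(0.48, 0.39))  # approximately 18.75
```

The last line shows how the 18.75 % figure quoted for CDOM follows from the reported RMSLE values of 0.48 (FD) and 0.39 (ED), taking the improvement as relative to the FD value.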

DOI:
10.1016/j.marpolbul.2025.117816

ISSN:
1879-3363