Publications

Palacios, HJG; Pantoja, GAH; Navarro, AAM; Puetaman, IMA; Toledo, RAJ (2016). Comparativa entre CRISP-DM y SEMMA para la limpieza de datos en productos MODIS en un estudio de cambio de cobertura y uso del suelo [Comparative between CRISP-DM and SEMMA for data cleaning of MODIS products in a study of land use and land cover change]. 2016 IEEE 11TH COLOMBIAN COMPUTING CONFERENCE (CCC).

Abstract
Studies of land use and land cover change use vegetation indices to determine whether the coverage of an area has improved or deteriorated. However, the validity and reliability of such a study depend on the quality of the data it uses. To ensure data quality, applying a data mining methodology is suggested, but for studies of this kind it is difficult to identify which methodology to implement. Given this situation, a comparison between two very popular data mining methodologies is necessary. For the case study, the CRISP-DM and SEMMA methodologies were applied, thoroughly following each stage, general task, specific task, and activity according to the official documentation. The work began with understanding the problem of the case study and proposing data mining goals, continued with data understanding, and ended with the cleaning process and the construction of the data repository, as detailed in this article. R and Python scripts were used in all cases to optimize the download, reprojection, transformation, cleaning, and storage of the MODIS products.

DOI:

ISSN:
2378-8216