Evaluation and Comparison of Different Data Mining Models for Identifying Areas at Risk of Gully Erosion: A case study of Mian Ab Watershed in Khuzestan Province

Document Type: Full-length article

Authors

Department of Physical Geography, Faculty of Geography Sciences and Planning, University of Isfahan, Isfahan, Iran

10.22059/jphgr.2025.387445.1007862

Abstract

Gully erosion is the formation and expansion of erosional channels in soil by concentrated water flow. Channels are generally termed gullies once they become too large to be leveled by conventional farming operations. This study compares the CART and SVM models for identifying areas at high risk of gully erosion and determines the parameters that contribute most to gully erosion in the Mian-Ab watershed of Shushtar County, Khuzestan Province. First, the locations of existing gullies were recorded using satellite images, Google Earth, and the Global Positioning System (GPS). The independent variables influencing gully erosion were then prepared and assigned values, and modeling was carried out in R to delineate gully-prone areas. In total, 3,000 gully erosion records were collected, with 70% used for training and 30% for testing the models. The SVM model, with an R² of 0.846, was more accurate than the CART model (R² of 0.791). According to the gully erosion risk maps, the very low risk class covered the largest share of the watershed (approximately 59%), followed by the low, moderate, high, and very high risk classes, covering 12.27%, 10.29%, 9.23%, and 8.49% of the watershed area, respectively. Land use, vegetation cover, and soil texture played the most significant roles in the occurrence and expansion of gully erosion.
Extended Abstract
Introduction
Soil erosion and sediment production are significant constraints on the use of water and soil resources. Gully erosion is becoming one of the most significant forms of erosion worldwide and has received considerable attention from researchers in recent decades. Various studies have examined how gully erosion occurs and develops in different climates. In many regions, a substantial share of the sediment generated in watersheds is attributed to gully erosion. Around 125 million of Iran's 165 million hectares of land are susceptible to water erosion. Soil erosion leads to soil degradation and the abandonment of farmland, causing irreparable damage. Developing appropriate strategies for preventing and mitigating gully erosion requires a thorough understanding of its dynamics and controlling factors. With the development of machine learning models and their successful performance in various scientific fields, many researchers have used them for hazard mapping and erosion risk prediction, reporting accurate results. This study evaluates the effectiveness of two machine learning algorithms, SVM and CART, in mapping the risk of gully erosion.
 
Methodology
This study investigates susceptibility to gully erosion in the Mian-Ab watershed of Shushtar County and uses machine learning methods to predict it. In the first step, a map of gully locations was prepared using satellite images, aerial photographs, and field visits. Subsequently, the following environmental parameters influencing gully erosion occurrence were examined: elevation, slope, slope aspect, soil texture, Stream Power Index (SPI), Topographic Wetness Index (TWI), vegetation cover (NDVI), lithology, distance from rivers, Terrain Ruggedness Index (TRI), distance from roads, soil erodibility index (K), rainfall erosivity index (R), and drainage density. In the next step, 70% of the gullies were randomly selected as training data, and the remaining 30% were used as validation data. Finally, the map of gully locations was entered into the SVM and CART models as the dependent variable, with the environmental layers serving as independent variables, to model the occurrence of gully erosion.
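Three of the predictors named above (TWI, SPI, and NDVI) are derived indices rather than directly measured layers. The abstract does not state the formulas used, so the following is a minimal sketch assuming the commonly used definitions: TWI = ln(a / tan β), SPI = a · tan β, and NDVI = (NIR − Red) / (NIR + Red), where a is the specific catchment area and β the local slope. The function names, the `eps` guard for flat cells, and the sample cell values are all hypothetical.

```python
import math


def twi(specific_catchment_area, slope_rad, eps=1e-6):
    """Topographic Wetness Index, ln(a / tan(beta)).

    eps is an assumed guard that keeps flat cells (tan(beta) = 0) finite.
    """
    return math.log(specific_catchment_area / max(math.tan(slope_rad), eps))


def spi(specific_catchment_area, slope_rad):
    """Stream Power Index, a * tan(beta)."""
    return specific_catchment_area * math.tan(slope_rad)


def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


# Hypothetical raster cell, for illustration only (not data from the study).
cell = {"a": 250.0, "slope_deg": 8.0, "nir": 0.42, "red": 0.18}
beta = math.radians(cell["slope_deg"])
print("TWI :", round(twi(cell["a"], beta), 3))
print("SPI :", round(spi(cell["a"], beta), 3))
print("NDVI:", round(ndvi(cell["nir"], cell["red"]), 3))
```

In practice these indices are computed per raster cell from a DEM and satellite bands in GIS software; the scalar functions here only illustrate the arithmetic.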
 
Results and discussion
In this study, landforms, elevation, slope, slope aspect and length, vegetation cover, soil texture, distance from roads, land use, lithology, soil erodibility, topographic wetness, stream power, drainage density, rainfall erosivity, and distance from rivers were selected and examined as factors influencing gully erosion. Erosion points served as the dependent variable. These points were collected through field surveys: the exact locations of the gullies were recorded using handheld GPS and then reviewed and corrected in Google Earth. In total, 3,000 gully erosion points were collected, representing the spatial distribution of this phenomenon in the area. Most of the affected points lie in the southern and eastern parts of the watershed.
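The random 70/30 partition of the 3,000 points can be sketched as follows. This is an illustrative Python version only; the study performed its modeling in R, and the function name, the fixed seed, and the use of integer point identifiers are assumptions.

```python
import random


def split_points(points, train_fraction=0.7, seed=42):
    """Shuffle the points reproducibly and cut them into training and test sets."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)  # seeded so the split can be repeated
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical identifiers standing in for the 3,000 recorded gully points.
point_ids = range(3000)
train, test = split_points(point_ids)
print(len(train), len(test))  # 2100 900
```

Fixing the seed is a design choice that makes the partition repeatable, so that both models are trained and tested on the same subsets.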
Next, to obtain a potential gully erosion map for the watershed, layers of the studied indicators were prepared. With the independent and dependent variables in place, a gully erosion risk zoning map was created using the CART model in R. The correlation coefficient between the values predicted by the CART model and the observed values was 0.889, and the R² for this model was 0.791, which is considered an acceptable level of determination for gully erosion models.
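For readers unfamiliar with CART, the core step is a search for the threshold on a predictor that best separates the response. The sketch below shows that step for a single predictor using the squared-error criterion of regression trees. It is not the authors' R implementation: whether the study fitted a regression or a classification tree is not stated, and the NDVI values and gully labels are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, cost) minimizing the summed SSE of the two child nodes."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate identical predictor values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [target for _, target in pairs[:i]]
        right = [target for _, target in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best


# Hypothetical sample: sparse vegetation (low NDVI) coincides with gullies (1).
ndvi_values = [0.05, 0.10, 0.15, 0.40, 0.50, 0.60]
gully_present = [1, 1, 1, 0, 0, 0]
threshold, cost = best_split(ndvi_values, gully_present)
print(threshold, cost)
```

A full tree applies this search over every predictor at every node and recurses on the resulting subsets until a stopping rule is met.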
According to the zoning map produced by the CART model, areas of very high risk are concentrated mainly in the eastern and southeastern parts of the basin, which coincide with sloping and foothill lands. Areas of high risk form a band adjacent to these regions. In contrast, areas of moderate risk are located mainly in the center of the basin and near stream networks.
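The share of the watershed falling in each risk class follows from counting classified raster cells. The sketch below assumes susceptibility scores scaled to [0, 1] and five equal-width classes; the abstract does not state the classification scheme actually used, and the sample scores are hypothetical.

```python
from collections import Counter

CLASSES = ["very low", "low", "moderate", "high", "very high"]


def classify(score):
    """Map a score in [0, 1] to one of five equal-width classes (assumed scheme)."""
    index = min(int(score * 5), 4)  # a score of exactly 1.0 stays in the top class
    return CLASSES[index]


def area_shares(scores):
    """Percentage of cells per class, assuming all cells have equal area."""
    if not scores:
        raise ValueError("at least one cell score is required")
    counts = Counter(classify(s) for s in scores)
    return {name: 100.0 * counts[name] / len(scores) for name in CLASSES}


# Hypothetical cell scores, not values from the study's risk maps.
scores = [0.05, 0.10, 0.15, 0.30, 0.50, 0.70, 0.90, 0.12, 0.08, 0.02]
for name, share in area_shares(scores).items():
    print(f"{name}: {share:.1f}%")
```

Other schemes, such as natural breaks or quantiles, would change the class boundaries and therefore the reported area shares.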
The SVM model performed strongly in predicting and assessing erosion. The correlation coefficient between observed and predicted values was 0.92, indicating a strong correlation, and R² was 0.846. These statistics suggest that the SVM model successfully captured the complex patterns of gully erosion in the study area. Sensitivity analysis indicated that soil texture and NDVI were the most important factors in the SVM model. According to the SVM zoning map, the "high risk" and "very high risk" classes are mainly concentrated in the eastern sections, close to tributaries and on relatively steeper slopes. This distribution may reflect the high density of waterways combined with the other factors influencing gully erosion.
Both the SVM and CART models performed well and predicted gully erosion risk with reasonable accuracy; however, the SVM model performed better (R² of 0.846 versus 0.791).
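The reported correlation coefficients and R² values can be cross-checked against each other. The sketch below assumes that R² here is the squared Pearson correlation between observed and predicted values, which the abstract does not state explicitly. The `pearson_r` helper is included only to show how r would be computed from paired values.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation between paired observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    return cov / (sd_o * sd_p)


# (r, R²) pairs as reported in the text for each model.
reported = {"CART": (0.889, 0.791), "SVM": (0.92, 0.846)}
for model, (r, r_squared) in reported.items():
    print(model, "r^2 =", round(r * r, 4), "reported R^2 =", r_squared)
```

Under that assumption the squared coefficients come out at about 0.790 for CART and 0.846 for SVM, agreeing with the reported R² values to within rounding.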
 
Conclusion
The present study showed that machine learning models such as SVM and CART can play an important role in identifying and mapping the risk of gully erosion in the Mian-Ab watershed. A precise understanding of the factors affecting gully erosion, together with appropriate management measures, can significantly reduce the damage caused by this phenomenon and help preserve the region's natural resources. This research contributes to scientific knowledge of gully erosion and of the application of data mining models in natural resource management, and it can provide a foundation for future studies in this area.
 
Funding
This research received no funding support.
 
Authors’ Contribution
The authors contributed equally to the conceptualization and writing of the article. All authors approved the content of the manuscript and agreed on all aspects of the work.
 
Conflict of Interest
The authors declare no conflict of interest.
 
Acknowledgments
We are grateful to all the scientific consultants of this paper.

