Prediction model for archaeological site locations: Comparison of Maximum Entropy and Weights of Evidence in a case study of North Khorasan Province

Document Type: Full-length article

Authors

Department of RS and GIS, Faculty of Geography, University of Tehran, Tehran, Iran

Abstract

The protection and identification of archaeological sites have become increasingly important because of threats from urbanization, agricultural expansion, and infrastructure development. Traditional field surveys are effective but often time-consuming and costly. Spatial predictive modeling, which combines environmental variables with known site data, provides an efficient tool for identifying areas with high archaeological potential. This study compares two predictive modeling methods, Weights of Evidence (WoE) and Maximum Entropy (MaxEnt), for identifying archaeological sites in North Khorasan Province, Iran. The region has gained archaeological significance following the recent discovery of an Achaemenid palace near Ashkhaneh. A dataset of 980 archaeological sites was randomly divided into training (70%) and validation (30%) sets. Six environmental variables (elevation, slope, distance from rivers, vegetation cover, proximity to settlements, and geology) were processed in a GIS environment at a 30-meter resolution. The WoE model was implemented using a Bayesian approach with chi-square testing of conditional independence, while MaxEnt was applied without requiring variable independence. MaxEnt outperformed WoE, with training and validation success rates of 88.65% and 88.20%, respectively, compared to 83.26% and 83.00% for WoE. Proximity to settlements and proximity to rivers were the most influential predictors. Because it is flexible and does not assume conditional independence among variables, MaxEnt is more suitable for complex environments, whereas WoE remains effective for smaller-scale projects because of its simplicity. Future research should incorporate hybrid models and high-resolution datasets to improve prediction accuracy.
 
Extended Abstract
Introduction
In recent decades, the protection and identification of archaeological sites have become increasingly critical because of the threats posed by urbanization, agricultural expansion, and infrastructure development. Conventional archaeological field surveys, while effective, are typically time-consuming, labor-intensive, and costly. Unintentional damage to undiscovered sites during construction or land conversion is a further concern. To address these challenges, predictive spatial modeling has emerged as a powerful tool in archaeological research. By integrating environmental factors and the known locations of archaeological sites into statistical and geospatial models, researchers can generate archaeological potential maps that estimate the likelihood of undiscovered settlements. These maps are invaluable for guiding targeted surveys, preserving cultural heritage, and allocating research resources more effectively.
This study aims to compare two prominent predictive modeling approaches, Weights of Evidence (WoE) and Maximum Entropy (MaxEnt), in their ability to identify areas with high potential for archaeological settlements. The research focuses on North Khorasan Province in northeastern Iran, a region of emerging archaeological significance following the discovery of a large Achaemenid structure near the city of Ashkhaneh. Given the limited number of previously documented sites in the area, especially from the Achaemenid period, the region offers a compelling test case for evaluating the utility of predictive modeling in uncovering hidden archaeological landscapes.
 
Methodology
The research used a dataset of 980 known archaeological sites in North Khorasan, randomly divided into training (70%) and validation (30%) subsets. Six environmental variables (elevation, slope, distance from rivers, distance from vegetation, distance from villages, and geology) were selected based on their hypothesized influence on ancient human settlement patterns. All environmental data were resampled to a 30-meter spatial resolution and processed using GIS tools.
For the WoE model, a Bayesian approach was employed, which required an initial chi-square test of independence between each pair of variables. Based on the test results, six combinations of conditionally independent variables were selected to create potential maps. Each map was generated by overlaying the weighted raster layers corresponding to the chosen variables.
In contrast, the MaxEnt model does not require conditional independence among variables. It can use all selected environmental factors simultaneously, applying the principle of maximum entropy to estimate the most uniform probability distribution consistent with the input data. This flexibility makes MaxEnt particularly suitable for complex ecological or archaeological contexts where variable interdependence is likely. To ensure consistency between the two modeling approaches, the same variable combinations used in the WoE model were applied in MaxEnt. For both models, success rates were calculated as the proportion of known site locations falling within high-probability areas on the generated maps.
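The WoE step described above can be illustrated with a minimal Python sketch. It assumes each evidence layer has been reduced to a binary class (inside or outside the class) and uses the standard weights-of-evidence definitions, W+ = ln[P(B|D) / P(B|not D)] and W- = ln[P(not B|D) / P(not B|not D)], together with an uncorrected Pearson chi-square statistic for a 2x2 table. The function names, the cell counts, and the choice of a 0.05 significance level are illustrative assumptions, not values or procedures taken from the study.

```python
import math

def woe_weights(n_b_d, n_b_nd, n_nb_d, n_nb_nd):
    """Return (W+, W-, contrast) for one binary evidence class.

    n_b_d   : cells inside the class that contain a site
    n_b_nd  : cells inside the class without a site
    n_nb_d  : cells outside the class that contain a site
    n_nb_nd : cells outside the class without a site
    """
    n_d = n_b_d + n_nb_d        # all cells with a site
    n_nd = n_b_nd + n_nb_nd     # all cells without a site
    w_plus = math.log((n_b_d / n_d) / (n_b_nd / n_nd))
    w_minus = math.log((n_nb_d / n_d) / (n_nb_nd / n_nd))
    return w_plus, w_minus, w_plus - w_minus

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Here the table is assumed to cross two binary layers at the site cells:
    a = both present, b = first only, c = second only, d = neither.
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Critical value of chi-square with 1 degree of freedom at the 0.05 level.
CHI2_CRITICAL_1DOF_05 = 3.841

# Hypothetical counts: 100 site cells, 10,000 non-site cells.
w_plus, w_minus, contrast = woe_weights(60, 2000, 40, 8000)
print(f"W+ = {w_plus:.3f}, W- = {w_minus:.3f}, contrast = {contrast:.3f}")

# Hypothetical pair of layers: a statistic above the critical value suggests
# the pair violates conditional independence and should not be combined.
stat = chi_square_2x2(30, 20, 20, 30)
print("reject independence" if stat > CHI2_CRITICAL_1DOF_05 else "keep pair")
```

A positive W+ means sites are over-represented inside the class, and the contrast summarizes how strongly the layer separates site cells from non-site cells. In a full workflow the weights of the selected layers would be summed cell by cell to produce the potential map.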
 
Results and discussion
The analysis revealed that MaxEnt consistently outperformed WoE in predictive accuracy. In the best-performing scenario, MaxEnt achieved a training success rate of 88.65% and a validation success rate of 88.20%, while WoE’s top result was 83.26% for training and 83.00% for validation. This difference of roughly five percentage points, although not drastic, suggests that MaxEnt may be more reliable in identifying areas with high archaeological potential, particularly in data-rich or complex environments. Variable contribution analysis in MaxEnt showed that proximity to villages and proximity to rivers were the most significant predictors of site locations. In some combinations, these two variables accounted for over 90% of the model's predictive power. This result aligns with archaeological theory, which emphasizes access to water, arable land, and inter-settlement connectivity as key factors influencing ancient settlement decisions. Slope and elevation were moderately influential, while geological factors played a relatively minor role.
Visual inspection of the predictive maps further supported the superiority of MaxEnt. The maps it produced featured clearer transitions between high- and low-potential areas, as well as more precise delineation of low-probability zones. This level of detail is essential for guiding field surveys and minimizing effort in areas with low archaeological potential. While WoE also produced reasonable results, its performance was limited by the requirement for variable independence and by its comparatively coarse representation of subtle environmental gradients. From a practical standpoint, WoE has the advantage of simplicity. It requires less computational power and is easier to interpret, making it suitable for smaller-scale projects or regions with limited data availability. However, the methodological constraints inherent to WoE, particularly the assumption of conditional independence, can reduce model flexibility and accuracy when dealing with complex real-world environments.
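The success rates discussed in this section follow the definition given in the Methodology: the proportion of known sites that fall within high-probability areas of a potential map. The Python sketch below shows one way the 70/30 split and that proportion could be computed. The function names, the random seed, the example scores, and the cut-off of 0.5 for "high probability" are assumptions made for illustration; the study does not state the threshold it used.

```python
import random

def split_sites(sites, train_fraction=0.7, seed=0):
    """Randomly split a list of sites into training and validation subsets."""
    shuffled = list(sites)
    random.Random(seed).shuffle(shuffled)   # seeded for a reproducible split
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def success_rate(site_scores, threshold):
    """Percentage of sites whose map value is at or above the threshold."""
    hits = sum(1 for score in site_scores if score >= threshold)
    return 100.0 * hits / len(site_scores)

# 980 site identifiers, as in the study; the IDs themselves are placeholders.
train_ids, validation_ids = split_sites(range(980))
print(len(train_ids), len(validation_ids))

# Hypothetical map values sampled at four validation sites.
example_scores = [0.9, 0.8, 0.4, 0.7]
print(success_rate(example_scores, threshold=0.5))
```

Computing the rate separately for the training and validation subsets, as the study does, indicates whether a map generalizes to sites that were not used to build it.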
 
 
Conclusion
This comparative study demonstrates that MaxEnt offers a more robust and flexible framework for predictive archaeological modeling in North Khorasan. Its ability to incorporate multiple environmental variables without imposing statistical independence assumptions, along with its transparent variable contribution metrics, makes it especially advantageous for exploratory research and regional planning. Moreover, the superior predictive accuracy of MaxEnt suggests that it can play a critical role in heritage management, conservation planning, and the efficient allocation of archaeological survey resources. Despite these advantages, WoE should not be overlooked. Its lower computational requirements and intuitive Bayesian foundation still make it a valuable tool, particularly in resource-constrained settings. Future research should consider hybrid approaches that combine the strengths of both methods, for instance, using WoE to narrow down variable sets before applying MaxEnt for final predictions. Further studies should also incorporate high-resolution satellite imagery, socio-cultural and historical datasets, and alternative modeling techniques such as machine learning and fuzzy logic. Validation of models in different regions and landscape types will be essential for assessing the transferability and robustness of the predictive frameworks established in this study.
 
Funding
This research received no funding support.
 
Authors’ Contribution
All of the authors approved the content of the manuscript and agreed on all aspects of the work.
 
Conflict of Interest
The authors declare no conflict of interest.
 
Acknowledgments
We are grateful to all the scientific consultants of this paper.

