Hybrid Machine Learning Algorithm Detects Nearly 10k Archaeological Tumuli in Galicia

Image: detected tumuli in Galicia (Spain): (a) point distribution; (b) heat map. Author: Iban Berganzo.

Archaeological tumuli are one of the most common types of archaeological sites and can be found across the globe. This is perhaps why many studies have attempted to develop methods for their automated detection. Their characteristic tumular shape has been the primary feature for their identification in the field and in LiDAR-based topographic data, which usually takes the form of Digital Terrain Models (DTMs). The simple shape of mounds or tumuli is ideal for their detection using deep learning approaches. Deep learning detectors usually require large quantities of training data (on the order of thousands of examples) to produce significant results. However, the homogeneously semi-hemispherical shape of tumuli allows usable detectors to be trained with far less training data, considerably reducing both the effort required to obtain it and the computational resources necessary to train a convolutional neural network (CNN) detector. This type of feature, however, presents an important drawback. Their common, simple, and regular shape is similar to that of many non-archaeological features, and studies implementing methods for mound detection in LiDAR-derived DTMs and other high-resolution datasets are therefore characterised by a very large number of false positives (objects incorrectly identified as mounds).

Tumuli of the megalithic concentrations in A Serra do Barbanza. Picture: Miguel Carrero-Pazos
Tumuli of Touro Morto (Oia). Picture: Miguel Carrero-Pazos.

We recently published a blog post about our research on the automated detection of tumular burials or barrows in Galicia. Now an early access version of the published research is out as an open access paper in Remote Sensing, one of the top journals in the discipline.

In our initial research we located almost 9000 tumuli. However, not all of these were actual tumuli, as the automated detection results also included false positives. After initial data validation was performed in collaboration with our colleagues Dr. Miguel Carrero (University College London / Santiago de Compostela University, GEPN-AAT), Dr. João Fonte (University of Exeter) and Dr. Benito Vilas (University of Vigo), we realised that of our ca. 9000 detected objects only ca. 7600 corresponded to real archaeological mounds. Although this was an excellent result, with a percentage of false positives well below that reported by similar studies, we thought we could improve the detection rate while decreasing the number of false positives.

During the summer, GIAP members Iban Berganzo and Hector Orengo, in collaboration with Dr. Felipe Lumbreras from the Computer Vision Center (CVC), developed a new approach to reduce the number of false positives while increasing the detection rate. After analysing the nature of the detected false positives, we developed a hybrid approach that mixes classical machine learning and deep learning. The objective was to obtain a more precise definition of archaeological tumuli in which not just the shape but also the multispectral characteristics of the objects are taken into account when looking for tumuli.
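
One way to picture a hybrid pipeline of this kind is as a two-stage filter: a shape-based detector proposes candidates, and a per-pixel classification derived from multispectral imagery discards candidates that fall on implausible ground. The Python sketch below illustrates only that general idea. The `Detection` type, the `filter_by_land_cover` function, the score threshold and the toy mask are all hypothetical and are not taken from the published code.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Detection:
    """A candidate mound proposed by a shape-based detector (hypothetical type)."""
    row: int      # pixel row of the candidate centre
    col: int      # pixel column of the candidate centre
    score: float  # detector confidence in [0, 1]


def filter_by_land_cover(
    detections: Sequence[Detection],
    plausible_mask: Sequence[Sequence[bool]],
    min_score: float = 0.5,
) -> List[Detection]:
    """Keep candidates that are confident enough and lie on plausible ground.

    plausible_mask[r][c] is True where an (assumed) multispectral classifier
    considers the pixel compatible with a tumulus, and False where it
    labels the pixel as unsuitable.
    """
    kept = []
    for det in detections:
        # Candidates outside the mask extent are dropped rather than guessed at.
        inside = (
            0 <= det.row < len(plausible_mask)
            and 0 <= det.col < len(plausible_mask[det.row])
        )
        if inside and det.score >= min_score and plausible_mask[det.row][det.col]:
            kept.append(det)
    return kept


# Toy 3x3 mask: the right-hand column is assumed to be unsuitable ground.
mask = [
    [True, True, False],
    [True, True, False],
    [True, True, False],
]
candidates = [
    Detection(0, 0, 0.9),  # confident and on plausible ground: kept
    Detection(1, 2, 0.8),  # confident but on unsuitable ground: dropped
    Detection(2, 1, 0.3),  # plausible ground but low confidence: dropped
]
print(filter_by_land_cover(candidates, mask))
```

In this toy run only the first candidate survives. The design point is that the two stages look at different evidence (shape versus spectral response), so a false positive has to fool both to get through.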

Topographic data based on LiDAR. Image: Miguel Carrero-Pazos

The results that this new approach has produced are nothing less than spectacular:

  • The area covered is the largest (to the best of our knowledge) in which archaeological DL approaches have ever been applied: almost 30,000 km².
  • 10,527 objects have been detected, of which approximately 9,422 correspond to archaeological tumuli (after careful visual validation with high resolution imagery and pending ground validation). That is, 89.5% of the detected objects are true positives.
  • We have only employed open source data in this research. However, the use of higher resolution data, in particular higher resolution satellite imagery instead of the Sentinel-2 (10 m/px) images employed, would radically decrease the number of false positives, reaching a success rate above 97%.
  • Code, sources and results (including validation) are freely available, and the code is designed to be used in freely accessible cloud computing platforms (Google Colaboratory and Earth Engine), so the lack of computational resources will not pose a problem for its application to other study areas (even very large ones).
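
The percentages quoted in this post are simply the ratio of validated tumuli to detected objects. As a quick check, the short Python snippet below recomputes them from the figures given here. The function name `precision` is our own label for that ratio, and the counts for the initial research are the approximate "ca." values, so the first figure is only indicative.

```python
def precision(true_positives: int, detections: int) -> float:
    """Share of detected objects that were validated as real tumuli."""
    if detections <= 0:
        raise ValueError("detections must be positive")
    return true_positives / detections


# Initial research: ca. 7600 real mounds out of ca. 9000 detected objects.
initial = precision(7600, 9000)
# Hybrid approach: approximately 9,422 tumuli out of 10,527 detected objects.
hybrid = precision(9422, 10527)

print(f"initial: {initial:.1%}")  # initial: 84.4%
print(f"hybrid: {hybrid:.1%}")    # hybrid: 89.5%
```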

Our approach provides a way forward for the detection of tumuli while avoiding the inclusion of most false positives. The algorithm can be applied in areas of the world where topographic data of sufficient resolution are available. Provided with specific training data, this hybrid approach can also be used to detect other types of features where large numbers of false positives are an issue.

Read the full article here (Open Access): https://www.mdpi.com/2072-4292/13/20/4181/htm


This research has received funding from multiple sources, which we would like to acknowledge here: Iban Berganzo’s PhD is funded with an Ayuda a Equipos de Investigación Científica of the Fundación BBVA for the Project DIASur and the R+D project “Translands” (PGC2018-093734-B-I00) of the Spanish Ministry of Science, Innovation and Universities. Hector A. Orengo is a Ramón y Cajal Fellow (RYC-2016-19637). Felipe Lumbreras’s work is supported in part by the Spanish Ministry of Science, Innovation and Universities project BOSSS TIN2017-89723-P. Miguel Carrero and João Fonte are Marie Skłodowska-Curie Fellows (Grant Agreements 886793 and 794048 respectively). Some of the GPUs used in these experiments were donated by the Nvidia Hardware Grant Programme.
