The thesis concentrates on computational methods pertaining to ancient ostraca - ink on clay inscriptions, written in Hebrew. These texts originate from the biblical kingdoms of Israel and Judah, and are dated to the late First Temple period (8th – early 6th centuries BCE). The ostraca are almost the sole remaining epigraphic evidence from the First Temple period and are therefore important for archaeological, historical, linguistic, and religious studies of this era. This “noisy” material offers a fertile ground for the development of various “robust” image analysis, image processing, computer vision and machine learning methods, dealing with the challenging domain of ancient documents’ analysis. The common procedures of modern epigraphers involve manual and labor-intensive steps, facing the risk of unintentionally mixing documentation with interpretation. Therefore, the main goal of this study is establishing a computerized paleographic framework for handling First Temple period epigraphic material. The major research questions addressed in this thesis are: quality evaluation of manual facsimiles; quality evaluation of ostraca images; automatic binarization of the documents and its subsequent refinement; quality evaluation of binarizations on global and local levels; identification of different writers between inscriptions (two distinct methods are proposed); image segmentation (with improvements over the classical Chan-Vese algorithm); and letters’ shape prior estimation. The developed methods were tested on real-world archaeological and modern data and their results are found to be favorable.
Kiriath-jearim: The Shmunis Family Excavations, 2025
This chapter supplies basic DNA information regarding two individuals from the Iron II burial cave at Kiriath-jearim. Genetically these two individuals are one male (I14918) and one female (I14919). This data builds upon previously published materials (Shaus et al. 2023).
The maritime Phoenician civilization from the Levant transformed the entire Mediterranean during the first millennium BCE. However, the extent of human movement between the Levantine Phoenician homeland and Phoenician–Punic settlements in the central and western Mediterranean has been unclear in the absence of comprehensive ancient DNA studies. Here, we generated genome-wide data for 210 individuals, including 196 from 14 sites traditionally identified as Phoenician and Punic in the Levant, North Africa, Iberia, Sicily, Sardinia and Ibiza, and an early Iron Age individual from Algeria. Levantine Phoenicians made little genetic contribution to Punic settlements in the central and western Mediterranean between the sixth and second centuries BCE, despite abundant archaeological evidence of cultural, historical, linguistic and religious links. Instead, these inheritors of Levantine Phoenician culture derived most of their ancestry from a genetic profile similar to that of Sicily and the Aegean. Much of the remaining ancestry originated from North Africa, reflecting the growing influence of Carthage. However, this was a minority contributor of ancestry in all of the sampled sites, including in Carthage itself. Different Punic sites across the central and western Mediterranean show similar patterns of high genetic diversity. We also detect genetic relationships across the Mediterranean, reflecting shared demographic processes that shaped the Punic world.
New Studies in the Archaeology of Jerusalem and its Region: Collected Papers, Vol. XVI, 2023
In December 2018, the looting of an Iron II rock-cut burial cave in the village of Abu Ghosh was spotted, subsequently leading to a salvage excavation carried out by the Antiquities Theft Prevention Unit and the Jerusalem Regional Office of the Israel Antiquities Authority. Among the finds from the extensively disturbed tomb were the remains of at least ten individuals, two of which were analyzed by ancient DNA methods.
Contrast is not uniquely defined in the literature. There is a need for a contrast measure that scales linearly and monotonically with the optical scattering depth of a translucent scattering layer that covers an object. Here, we address this issue by proposing an image contrast metric, which we call the Haziness contrast metric. In its essence, the Haziness contrast compares normalized histograms of multiple blocks of the image, a pair at a time. Subsequently, we test several prominent contrast metrics in the literature, as well as the new one, by using milk as a scattering medium in front of an object to simulate a decline in image contrast. Compared with other metrics in the literature, the Haziness contrast metric is monotonic and close to linear for increasing density of the scattering material. The Haziness contrast has a wider dynamic range, and it correctly predicts the order of scattering depth for all the channels in the RGB image. Utilization of the metric to assess the performance of dehazing algorithms is also suggested.
Ancient texts are unique evidence providing a glimpse into the thoughts, day-to-day life, and culture of people of long-gone eras. Paleography, the study of writing, aims at documenting the inscriptions, transliterating the texts, reconstructing their historical context, and studying the evolution of writing itself. The digital revolution gave rise to computational paleography, introducing new tools of data acquisition, image processing, and machine learning. Herein, we will provide an introduction to the emerging field of computational paleography through the lens of ancient Hebrew inscriptions, dating from the Iron Age through the Middle Ages. The years that passed since their composition had a great effect on their preservation level, including blurs, stains, and erosions; moreover, some documents tend to fade in the years after their discovery. Therefore, it is of paramount importance to promptly document ancient inscriptions using the most suitable imaging techniques, such as visible, infra-red, or multispectral imaging. Image analysis and processing techniques, such as binarizations, letter segmentation, and letters' prior estimation are valuable in their own right or may serve as a stage for subsequent tasks. We will also discuss automatic handwriting analysis and writers' identification, which could shed light on the historical background of the inscriptions.
The Materiality of Greek and Roman Curse Tablets: Technological Advances, 2022
Many issues faced by paleographers and philologists in their study of the materiality of the objects at hand might provide obstacles that can literally make or break our ability to interact with a given text. The essays in this book show how new technologies are significantly helping in the tasks of deciphering, understanding, and restoring ancient texts written on different materials. Philological editions of ancient texts, and articles in which ancient artifacts are studied, sometimes require facsimiles of the discussed finds: tablets, gemstones, and papyri. The facsimiles are especially important for certain objects when a normal photograph cannot fully capture or elucidate the writing (e.g., texts written on metal lamellae). In these and other cases, as we explore below, the production of facsimiles provides a great tool in the advancement of interacting with and understanding texts. In this chapter we examine some possible methods of producing facsimiles of ancient objects, specifically those that have been studied within the projects led by Christopher A. Faraone and Sofía Torallas Tovar at the University of Chicago. These projects focus on Greco-Egyptian magical formularies and curse tablets written in Greek and Latin. Here we make an initial assessment of the material particularities of individual fragments and then describe different methods that can be used to produce black-and-white facsimiles of these artifacts. Finally, we explore the possibility of using automatic binarization algorithms and analyze the results obtained across different materials.
Arad is a well preserved desert fort on the southern frontier of the biblical kingdom of Judah. Excavation of the site yielded over 100 Hebrew ostraca (ink inscriptions on potsherds) dated to ca. 600 BCE, the eve of Nebuchadnezzar's destruction of Jerusalem. Due to the site's isolation, small size and texts that were written in a short time span, the Arad corpus holds important keys to understanding dissemination of literacy in Judah. Here we present the handwriting analysis of 18 Arad inscriptions, including more than 150 pair-wise assessments of writer's identity. The examination was performed by two new algorithmic handwriting analysis methods and independently by a professional forensic document examiner. To the best of our knowledge, no such large-scale pair-wise assessment of ancient documents by a forensic expert has previously been published. Comparison of forensic examination with algorithmic analysis is also unique. Our study demonstrates substantial agreement between the results of these independent methods of investigation. Remarkably, the forensic examination reveals a high probability of at least 12 writers within the analyzed corpus. This is a major increment over the previously published algorithmic estimations, which revealed 4-7 writers for the same assemblage. The high literacy rate detected within the small Arad stronghold, estimated (using broadly-accepted paleo-demographic coefficients) to have accommodated 20-30 soldiers, demonstrates widespread literacy in the late 7th century BCE Judahite military and administration apparatuses, with the ability to compose biblical texts during this period a possible by-product.
Our research team enjoyed the privilege of collaborating with Benjamin Sass over a period of several years. We are happy to dedicate this article to him and wish to express our gratitude for what has been both a prodigious and enjoyable experience. The purpose of our joint endeavor has been the introduction of modern techniques from computer science and physics to the realm of Iron Age epigraphy. One of the most important issues addressed during our cooperation was the topic of facsimile creation. Facsimile creation is a necessary preliminary step in the process of deciphering and analyzing ancient inscriptions. Several manual facsimile construction techniques are currently in use: drawing upon collation of the artifact; outlining on transparent paper overlaid on a photograph of the inscription; and computer-aided depiction via software such as Adobe Photoshop, Adobe Illustrator, Gimp or Inkscape (see Summary section below for software web links). Despite their importance for the field of epigraphy, little attention has thus far been devoted to the methodology of facsimile creation (though see the recent comprehensive treatment by Parker and Rollston 2016). Recent decades have seen rapid development and consolidation of various computerized image processing algorithms. Among the most basic and popular tasks in this field is the creation of a black-and-white version of a given image, denoted as image binarization (see Fig. 1a-b). Often, such a binarized image is used as a first step for further image processing missions, such as Optical Character Recognition (OCR), text digitization and text analysis tasks. An algorithmic creation of binarizations can therefore be seen as another method of facsimile creation.
Furthermore, a relatively new sub-domain of image processing, Historical Imaging and Processing (HIP), specializes in handling antique documents of different types, periods and origins. Accordingly, binarization algorithms stemming from HIP are even more suitable for archaeological purposes.
We deal with the general issue of handling statistical data in archaeology for the purpose of deducing sound, justified conclusions. The employment of various quantitative and statistical methods in archaeological practice has existed from its beginning as a systematic discipline in the 19th century (Drower 1995). Since this early period, the focus of archaeological research has developed and shifted several times. The last phase in this process, especially common in recent decades, is the proliferation of collaboration with various branches of the exact and natural sciences. Many new avenues of inquiry have been inaugurated, and a wealth of information has become available to archaeologists. In our view, the plethora of newly obtained data requires a careful reexamination of existing statistical approaches and a restatement of the desired focus of some archaeological investigations. We are delighted to dedicate this article to Israel Finkelstein, our teacher, adviser, colleague, an...
Proceedings of the 4th International Workshop on Historical Document Imaging and Processing, 2017
The problem of finding a prototype for typewritten or handwritten characters belongs to a family of "shape prior" estimation problems. In epigraphic research, such priors are derived manually, and constitute the building blocks of "paleographic tables". Suggestions for automatic solutions to the estimation problem are rare in both the Computer Vision and the OCR/Handwriting Text Recognition communities. We review some of the existing approaches, and propose a new robust scheme, suitable for the challenges of degraded historical documents. This fast and easy to implement method is employed for ancient Hebrew inscriptions dated to the First Temple period.
In this paper we claim that during the First Temple period, no organized or fixed system of liquid volume measurements existed in Judah. The biblical bath, which has been understood to be the basic measurement of the system, was not a measurement at all but a well-known vessel, the Judahite storage jar, also known as the lmlk jar. The nēḇel and the kaḏ were two other vessels that had other uses. The lōḡ, hîn, and ʿiśśārôn, which are usually termed "measurements" and considered part of the system of liquid volume measurements, were actually vessels that were part of the official Temple cult during the Second Temple period and were never part of the First Temple economy and administration. The basis of the assumed Judahite First Temple period liquid volume measurement system is the bath, which is comprised of six hîn or 72 lōḡ (cf.
Confusion_Matrices: joined results of steps A and B. Inspected_Docs: the inscriptions of the Inspected Corpus, along with all the relevant data (e.g., letter quantities). Obs_N_Seps: the observed number of separations in the Inspected Corpus. MC_ITER: number of Monte Carlo (MC) simulations, in this case 100,000.
The textual evidence from ancient Judah is mainly limited to ostraca, ink-on-clay inscriptions. Their facsimiles (binary depictions) are indispensable for further analysis. Previous attempts at mechanizing the creation of facsimiles have been problematic. Here, we present a proof of concept of objective binary image acquisition, via Raman mapping. Our method is based on a new peak detection transform, handling the challenging fluorescence of the clay, and circumventing preparatory ink composition analysis. A sequence of binary mappings (signifying the peaks) is created for each wavelength; their legibility reflects the prominence of Raman lines. Applied to a biblical-period ostracon, the method exhibits high statistical significance.
Three Hebrew ostraca, found near Khirbet Zanu’ (Ḥorvat Zanoaḥ) and published by Milevski and Naveh in 2005, were re-imaged using a high-end multispectral imaging technique. The re-imaging yielded dozens of changed or added characters and resulted in renewed, larger and improved readings, which are published here. In addition, we interpret the texts of the ostraca and place them in the context of the economy and administration of Judah in the seventh century BCE.
Bulletin of the American Schools of Oriental Research, 2017
Arad Ostracon 16 is part of the Elyashiv Archive, dated to ca. 600 b.c. It was published as bearing an inscription on the recto only. New multispectral images of the ostracon have enabled us to reveal a hitherto invisible inscription on the verso, as well as additional letters, words, and complete lines on the recto. We present here the new images and offer our new reading and reinterpretation of the ostracon.
The authors present a new method of writer identification, employing the full power of multiple experiments, which yields a statistically significant result. Each individual binarized and segmented character is represented as a histogram of 512 binary pixel patterns (3 × 3 black-and-white patches). In the process of comparing two given inscriptions under a "single author" assumption, the algorithm performs a Kolmogorov-Smirnov test for each letter and each patch. The resulting p-values are combined using Fisher's method, producing a single p-value. Experiments on both Modern and Ancient Hebrew data sets demonstrate the excellent performance and robustness of this approach.
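The two bookends of the pipeline described above (the 512-bin patch histogram and the Fisher combination) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `patch_histogram` and `fisher_combine` are hypothetical, the per-letter Kolmogorov-Smirnov step is omitted, and the combination assumes independent p-values that are strictly greater than zero.

```python
import math

def patch_histogram(image):
    """Count the 512 possible 3x3 binary patterns in a 0/1 image (list of rows).

    Hypothetical helper: each 3x3 patch is read row by row into a 9-bit code.
    """
    hist = [0] * 512
    rows, cols = len(image), len(image[0])
    for r in range(rows - 2):
        for c in range(cols - 2):
            code = 0
            for dr in range(3):
                for dc in range(3):
                    code = (code << 1) | image[r + dr][c + dc]
            hist[code] += 1
    return hist

def fisher_combine(p_values):
    """Fisher's method: combine k independent p-values (each > 0) into one.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null hypothesis. For an even number of
    degrees of freedom the survival function has the closed form
    exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X/2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total
```

Combining a single p-value returns that p-value unchanged, which is a quick sanity check on the closed form; several moderately small p-values combine into a smaller one.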
2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), 2016
This article discusses the quality assessment of binary images. The customary ground-truth-based methodology used in the literature is shown to be problematic due to its subjective nature. Several previously suggested alternatives are surveyed and are also found to be inadequate in certain scenarios. A new approach, quantifying the adherence of a binarization to its document image, is proposed and tested using six different measures of accuracy. The measures are evaluated experimentally based on datasets from DIBCO and H-DIBCO competitions, with respect to different kinds of binarization degradations.
This paper suggests a new quality measure of an image, pertaining to its contrast. Several contrast measures exist in the current research. However, due to the abundance of Image Processing software solutions, the perceived (or measured) image contrast can be misleading, as the contrast may be significantly enhanced by applying grayscale transformations. Therefore, the real challenge, which was not dealt with in the previous literature, is measuring the contrast of an image taking into account all possible grayscale transformations, leading to the best "potential" contrast. Hence, we suggest an alternative "Potential Contrast" measure, based on sampled populations of foreground and background pixels (e.g. scribbles or saliency-based criteria). An exact and efficient implementation of this measure is found analytically. The new methodology is tested and is shown to be invariant to invertible grayscale transformations.
Chan-Vese is an important and well-established segmentation method. However, it tends to be challenging to implement, including issues such as initialization problems and establishing the values of several free parameters. The paper presents a detailed analysis of the Chan-Vese framework. It establishes a relation between the Otsu binarization method and the fidelity terms of the Chan-Vese energy functional, allowing for intelligent initialization of the scheme. An alternative, fast, and parameter-free morphological segmentation technique is also suggested. Our experiments indicate the soundness of the proposed algorithm.
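One way to see why such a measure can have an exact closed form is the following Python sketch. It rests on an assumption that is not stated in the abstract: contrast is taken to be the mean background value minus the mean foreground value after a grayscale transformation g that maps each gray level into [0, 1]. Under that assumption the objective is linear in g, so the best g is binary (1 wherever the background is relatively more frequent, 0 elsewhere), and the optimum is the sum of the positive differences between the two normalized histograms. The name `potential_contrast` and the [0, 1] scaling are illustrative choices; the paper's own definition may differ.

```python
from collections import Counter

def potential_contrast(foreground, background):
    """Best achievable (mean background - mean foreground) over all maps g: level -> [0, 1].

    Assumed definition, for illustration only. `foreground` and `background`
    are non-empty lists of sampled gray levels.
    """
    fg, bg = Counter(foreground), Counter(background)
    n_fg, n_bg = len(foreground), len(background)
    levels = set(fg) | set(bg)
    # The optimal g sends a level to 1 exactly where the background is more frequent.
    return sum(max(0.0, bg[v] / n_bg - fg[v] / n_fg) for v in levels)
```

Because the result depends only on how the two populations are distributed over distinct gray levels, relabeling the levels with any invertible transformation leaves it unchanged, which matches the invariance property reported in the abstract.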
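The link between Otsu's method and the Chan-Vese fidelity terms can be illustrated with a small Python sketch. It assumes the simplest setting, which may not be the exact formulation analyzed in the paper: equal fidelity weights, no contour-length term, and regions restricted to those produced by a global threshold. In that setting the Chan-Vese fidelity is the total squared deviation of each region from its own mean, which is N times Otsu's within-class variance, so the threshold minimizing one also minimizes the other. The function names are hypothetical.

```python
def cv_fidelity(pixels, t):
    """Chan-Vese fidelity for the split `pixel <= t` versus `pixel > t`.

    Assumes equal weights and no length term: the sum over both regions of
    squared deviations from the region mean.
    """
    low = [p for p in pixels if p <= t]
    high = [p for p in pixels if p > t]
    total = 0.0
    for group in (low, high):
        if group:
            mean = sum(group) / len(group)
            total += sum((p - mean) ** 2 for p in group)
    return total

def otsu_threshold(pixels):
    """Brute-force Otsu: the threshold with minimal within-class scatter."""
    # Drop the largest level so that both classes are always non-empty.
    candidates = sorted(set(pixels))[:-1]
    if not candidates:
        raise ValueError("need at least two distinct gray levels")
    return min(candidates, key=lambda t: cv_fidelity(pixels, t))
```

A threshold found this way yields two region means that could serve as starting values for an iterative Chan-Vese scheme; the brute-force search is quadratic in the worst case and is meant for clarity, not speed.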
Thesis by Arie Shaus
Papers by Arie Shaus