4.1.3.1 Overview of Analysis and Visualization for Brill’s Textual History of the Bible volume 3D, Science and Technology

Todd R. Hanneken, St. Mary’s University

Table of Contents

1. Introduction

2. Material Analysis

3. Codicology and Papyrology

4. Palaeography

5. Visualizing Relationships Between Copies

1. Introduction

Science and technology assist the study of the textual history of the Bible from the most tangible aspects of the inks and writing surfaces of manuscripts, to the scribal cultures that produced them, to the most abstract visualizations of big data on the relationships among many manuscripts. The study of textual history has always been concerned with questions such as when a copy was produced, geographic provenance, the style and number of scribal hands, the size and structure of a codex or scroll, the intended and actual use of a copy, and the relationship between a copy and other copies based on similar textual readings and scribal features. What is new is the ability to measure what cannot be measured by human perception, such as the chemical composition of inks and genetic markers of animal skin. Also new is the ability to collect, analyze, and visualize big data that previously might (at best) have been intuited by an elite few with access to large numbers of manuscripts. The potential benefits range from reconstructing damaged or missing letters to identifying networks of transmission and use of texts across communities. The analytic techniques described in this section often build on the data captured with advanced @4.1.1 Imaging Technologies and contribute to more effective @4.1.2 Conservation.

2. Material Analysis

Copies of biblical texts have chemical properties. Ink can be made in different ways, which can correlate with time and location of production. Organic materials, such as parchment, papyrus, and carbon-based ink, have additional properties that allow us to determine when the materials were alive. Materials from plants and animals also have genetic information stored in their DNA. This genetic information can tell us whether two scrolls were produced from the same flock of animals. Other material properties may be secondary to the scribal production, but important for understanding the history of the object or identifying forgeries. Forgeries that used ancient parchment have been detected through chemical analysis of the modern table salt used to simulate the accretion of salts near the Dead Sea. While the authentic Dead Sea Scrolls have in common that they were in a salty environment for almost two millennia, the otherwise great diversity of their material composition casts light on questions of who produced them.

3. Codicology and Papyrology

Even if one’s research interests lead one to think of scribes primarily as copyists of texts, it is beneficial to recognize them also as producers of cultural artifacts. Those cultural artifacts, such as scrolls and codices, reflect interpretations and judgments about the texts in relationship to communities and other texts. Some technologies make it more difficult to study and appreciate the manuscript as a cultural artifact that is more than a text container. Conservation technologies sometimes call for disbinding codices to preserve individual bifolia. Imaging technologies make it easier to see letters on a single page, but often neglect or obscure the experience of holding and turning the pages of a large codex. In some cases, the artifact is preserved mostly intact and the challenge is to communicate its materiality through digital media. In other cases, the artifact is already damaged, disbound, spread across collections, or partially lost. Reconstruction of a codex relies not only on textual content, but also on observations of features such as the hair or flesh side of parchment, quire numbers, and exact measurements of the writing surface and layout of text.

An additional set of questions and methodologies follows when the object of study is not an individual codex, but a large number of codices (or other artifacts). How do scribal features such as parchment preparation, layout, and binding (not to mention palaeography, below) correlate with chronology and geography? Which communities of scribes were influencing each other? Do those observations support or challenge studies from other disciplines about relationships between religious communities? The questions become especially interesting at sites such as St. Catherine’s Monastery in Sinai, which attracted pilgrims and their books from great distances. Many scribal artifacts are dispersed today in collections far from their original production and use. Standards for digital encoding of measurements and observations about codices (and other artifacts) facilitate large-scale “mining” of data. Big data allows the confirmation or disconfirmation of scholarly intuition and can suggest patterns never before observed.
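The kind of data “mining” described above can be sketched in a few lines. The records, field names, and values below are invented for illustration; a real project would follow an established encoding standard rather than this ad hoc structure.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical codicological records; the fields and values are
# illustrative, not drawn from any particular encoding standard.
@dataclass
class CodexRecord:
    shelfmark: str
    leaves_per_quire: int
    region: str  # attributed region of production

records = [
    CodexRecord("Codex A", 8, "Palestine"),
    CodexRecord("Codex B", 10, "Constantinople"),
    CodexRecord("Codex C", 8, "Palestine"),
]

# Group by a scribal feature to look for correlations with geography.
by_quire = defaultdict(list)
for r in records:
    by_quire[r.leaves_per_quire].append(r.region)

print(dict(by_quire))  # {8: ['Palestine', 'Palestine'], 10: ['Constantinople']}
```

With hundreds or thousands of records, the same grouping either confirms a scholar’s intuition about a correlation or exposes a pattern never before observed.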

4. Palaeography

See volume 3C for detailed language-by-language discussion of the current state of palaeographic research. This volume addresses specifically the new sciences and technologies that assist the study and digital encoding of letters. At the nexus of thinking about scribes as producers of cultural artifacts and thinking about scribes as copyists of abstract texts lies thinking about scribes as producers of letters. Technologies assist us with challenges that range from reading or reconstructing an individual letter, to identifying the scribal practice behind the production of the letter, to making millions or billions of letters machine readable, any one of which may be human readable.

At one extreme there may be nothing left of a letter at all except context. Scholars of the Dead Sea Scrolls have often relied on intuition to reconstruct text based on surrounding preserved text and the size of the gap. That intuition did not always take into account exact measurements or the differences in character width between modern type and ancient handwriting. Photoshop makes it possible to visualize exactly how a proposed reconstruction might have looked and whether it fits the space (@photoshop). In the case of spectral imaging of unreadable palimpsests, success most often means making the text possible to read, not easy to read. A human reader would want to compare the visible strokes to charts of letters by the same hand, a process facilitated by IIIF (@IIIF). Neural networks and machine learning require all the more: training sets of thousands of examples. Some scholarly interests are concerned not with reading the letter but with analyzing how it was made. That information can indicate the time and location of the copy, as well as the relative identity of scribes. Even within a single community of scribes, one scribe may exhibit distinct tendencies in the shape of a letter and distinct tendencies of innovation when duplicating a source text.
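Once character widths are measured from the manuscript itself, the space-fitting check described above reduces to simple arithmetic. The per-letter widths, tolerance, and proposed reconstruction below are invented for illustration:

```python
# Toy version of the "does the reconstruction fit the gap?" check.
# The widths (in mm) are made up for a hypothetical scribal hand;
# real work would measure them from the manuscript, as with the
# Photoshop method described above.
letter_widths = {"א": 3.2, "ל": 2.1, "ה": 3.0, "י": 1.4, "ם": 3.1}
space_width = 1.5

def reconstruction_width(text: str) -> float:
    """Estimated width of a proposed reconstruction, letters plus spaces."""
    return sum(space_width if c == " " else letter_widths[c] for c in text)

def fits(text: str, gap_mm: float, tolerance: float = 1.0) -> bool:
    """Does the proposed text fit the measured lacuna, within tolerance?"""
    return abs(reconstruction_width(text) - gap_mm) <= tolerance

proposal = "אלהים"
print(reconstruction_width(proposal))   # about 12.8
print(fits(proposal, gap_mm=13.0))      # True
print(fits(proposal, gap_mm=20.0))      # False: too short for the gap
```

The visualization in Photoshop adds what arithmetic cannot: the shapes of the letters themselves in the hand of the same scribe.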

In other situations, a manuscript may be perfectly readable but the size and number of manuscripts make it far from trivial to read or transcribe the text into a machine-readable format. Even for print publications, which have highly standardized letter shapes and sizes, the recognition of characters from visual appearance faces real challenges (@OCR). Software often looks to context to determine a “correct” reading by comparison with an “expected” reading. 1 Digital technologies can make such judgments much faster than humans. They may even make those judgments as well as humans. When we teach a computer to see what it expects to see, we introduce a problem characteristic of human perception. It is important to recognize such methodological limitations when they cannot be eliminated. 2


1 For example, in modern printed type “learn” and “leam” may look very similar, but if we know the language is English then the former is much more likely. For a case such as “modern” or “modem” other judgments from context could be helpful. For handwritten texts the problems are substantially more complicated. Scribes may vary character width based on target line width, change text direction, and so forth. In some medieval Latin scripts the characters “m,” “n,” and “i” differ only in number of strokes, so when those characters run together context and probability are the only differentiators.
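The “leam”/“learn” case in the note above can be sketched as a toy corrector. The confusion table and word frequencies are invented for illustration; real OCR systems use statistical language models rather than a hand-built dictionary like this one.

```python
# Context-based correction sketch: when a raw reading is suspect, try
# substituting visually confusable character groups and prefer the
# candidate with the higher corpus frequency. Frequencies are invented.
confusions = {"rn": "m", "m": "rn"}   # "leam" <-> "learn" style confusions
frequency = {"learn": 1000, "modern": 800, "modem": 50}

def candidates(word: str) -> set[str]:
    """All words obtainable by one confusable substitution."""
    out = {word}
    for src, dst in confusions.items():
        for i in range(len(word)):
            if word.startswith(src, i):
                out.add(word[:i] + dst + word[i + len(src):])
    return out

def correct(word: str) -> str:
    """Pick the most frequent known candidate; keep the input if none known."""
    known = [c for c in candidates(word) if c in frequency]
    return max(known, key=frequency.get) if known else word

print(correct("leam"))    # "learn"
print(correct("modem"))   # "modern" here, because frequency alone decides
```

The last line illustrates the problem named in the note: frequency alone would always “correct” a genuine “modem” to “modern,” which is why broader context is needed, and why teaching a computer to see what it expects to see reintroduces a characteristically human bias.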

2 Machine transcription may also benefit from encoding standards such as TEI (@TEI) that allow encoding the probability of a certain reading, the probability of an alternative reading, and whether “corrections” or “normalizations” based on context reject what is visible on the page.
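The kind of encoding described in the note can be sketched with the Python standard library. The choice, sic, corr, and unclear elements and the cert attribute are genuine TEI constructs, but the readings and certainty values below are invented for illustration:

```python
import xml.etree.ElementTree as ET

# Encode a word where "leam" is visible on the page (with medium
# certainty) but context suggests the correction "learn".
w = ET.Element("w")                       # a word token
choice = ET.SubElement(w, "choice")       # pairs the visible and corrected readings
sic = ET.SubElement(choice, "sic")        # what is visible on the page
unclear = ET.SubElement(sic, "unclear", cert="medium")
unclear.text = "leam"
corr = ET.SubElement(choice, "corr", cert="high")  # contextual correction
corr.text = "learn"

print(ET.tostring(w, encoding="unicode"))
```

An encoding like this preserves both what the page shows and what the editor judges, with the degree of confidence in each, so that later analysis need not treat a normalized reading as if it were an observation.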

5. Visualizing Relationships Between Copies

Some of the most dramatic innovations from science and technology have been in the new ways to answer old questions. The field of textual history has traditionally been dominated by concern with the abstract texts in the manuscripts as witnesses to an even more abstract earliest reconstructible source (Ausgangstext). In the past, text families and stemmata of variants have relied largely on intuition and partial sampling, not necessarily on complete collation of all the data. 3 If the texts of witnesses can be encoded to be machine readable, a complete collation becomes possible. Such analysis might verify or challenge the intuition of the pre-digital scholar. It might allow a visualization that more precisely represents the relationships.


3 Biblical studies is often the exception to generalizations about the digital humanities because “big data” projects such as lexicons, concordances, and apparatuses were prepared (at great labor) using pre-digital technologies. When the number of exemplar manuscripts reaches into the hundreds or thousands, a complete collation of all variants surpasses the “big data” capabilities of humans with index cards.

More importantly, complete collation makes it possible to ask new questions. In the past, scholars may have followed their intuition and representative sampling in the interest of moving toward an earliest reconstructible text. That concern inevitably leads to dismissing manuscript families as further from the source. A different kind of question, requiring different methodologies, is the question of the development of texts in communities. As it becomes increasingly clear that scribal transmission is much more complicated than source-scribe-copy, the scale of big data reaches thoroughly into the domain of computational methodologies. For example, a scribe may be influenced not only by one source seen or heard, but by everything the scribe has heard before or thinks should sound right. If scribes traveled great distances (as to pilgrimage sites), the range of possible comparison and influence may go well beyond what a human would expect or readily perceive. This is not to say that machine-readable big data instantly answers all questions. The human is still responsible for defining criteria, implementing methodologies, and interpreting visualizations. Besides the ability to process large amounts of data in a short time, computational methodologies may have the benefit of requiring scholars to be more explicit about their assumptions.
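At its simplest, a complete collation reduces to counting agreements between witnesses at aligned variation units. The witness sigla and readings below are invented for illustration; the resulting agreement scores are the kind of data that feeds visualizations of relationships between copies.

```python
from itertools import combinations

# Readings of three hypothetical witnesses at five aligned variation
# units. Real collation would first align the witnesses unit by unit.
witnesses = {
    "A": ["a", "b", "c", "d", "e"],
    "B": ["a", "b", "x", "d", "e"],
    "C": ["a", "y", "x", "d", "z"],
}

def agreement(w1: str, w2: str) -> float:
    """Fraction of variation units at which two witnesses agree."""
    pairs = list(zip(witnesses[w1], witnesses[w2]))
    return sum(r1 == r2 for r1, r2 in pairs) / len(pairs)

for w1, w2 in combinations(witnesses, 2):
    print(w1, w2, agreement(w1, w2))
# A-B agree at 4 of 5 units; A-C at 2 of 5; B-C at 3 of 5
```

Because every unit of every witness is counted, nothing depends on a scholar’s choice of representative sample; the human contribution shifts to defining what counts as a unit, what counts as agreement, and how to interpret the resulting picture.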


Todd R. Hanneken, “4.1.3.1 Overview of Analysis and Reconstruction for Brill’s Textual History of the Bible volume 3D, Science and Technology.” San Antonio, Texas: St. Mary’s University, October 20, 2021.