The Digital Orationes Project: Interfacing a Restoration Manuscript

July 17, 2013, 15:30 | Centennial Room, Nebraska Union

Funded by the Academy of Finland (2011-2014), the Digital Orationes Project is an interdisciplinary initiative intended to bring an important unpublished Early Modern manuscript into the scholarly arena. The manuscript, preserved as Lit. MS E41 in the archive of Canterbury Cathedral, was collected, and in part composed, by George Lovejoy (c. 1675), Headmaster of the King’s School, Canterbury, after the English Civil War. The texts within it represent one of the most substantial unpublished sources of English School Drama from the period. As well as containing a previously unnoticed adaptation of a pre-war play by a major author (James Shirley), this large volume, comprising 656 folio pages and running to some 230,000 words, includes a number of short plays and dramatized orations written in English, Latin and Greek by the scholars and staff of the King’s School. (Amid much else, these works celebrate the Restoration of Charles II to power, re-enact the Gunpowder Plot, discuss a wide range of topical issues, and provide a wealth of information about the role of drama in Early Modern schooling.)

The overall aim of the project has been to create a state-of-the-art digital archive which makes the texts in the manuscript available to a wider audience while offering new affordances to its scholarly users. In this, we have been responding to, and actively critiquing, best practices within the field of digital editions so succinctly summarized by Pierazzo (2011). The final digital interface for the manuscript will enable the searching of its handwritten text by means of visual recognition of the letter forms, as well as the more usual text-based functions relating to the transcribed, translated, and edited manuscript, and will simultaneously allow access to a number of higher scholarly functions. Accordingly, the present paper will focus on ongoing developments concerning three elements in the (otherwise substantially developed) package: a) an image calibration tool (i.e. a tool to flatten pages, scale them consistently and link them with the transcript); b) an image search prototype (which can search for graphic features in the manuscript using visual cues); and c) the blueprint for the final Orationes user interface. We are presenting these features before they have been packaged into a single polished, user-friendly entity, and at the penultimate stage of a larger project involving an international team from four universities (Oulu, Åbo Akademi and Helsinki in Finland, and Austin, Texas), in order to make new research available at the same time as we solicit the feedback which will make our final interface as helpful as possible to the DH community.

In its basic form, the Orationes manuscript is represented by a rich digital edition that uses high-resolution scanned images of the manuscript pages and TEI-compliant transcriptions created by domain experts. As demonstrated in our JADH-2012 presentation (Opas-Hänninen et al. 2012), our modus operandi has been to approach the work from two opposite directions. First, the team created an uncompromising TEI-XML version of the manuscript (a process which has not only involved rigorous textual transcription but also entailed the translation of the Latin and Greek portions of the material and the identification of features of interest for use in a rich visual interface). Second, it has been working on the production of a reusable software package which, without sacrificing functionality or source integrity, can be used to generate digital editions from similar XML and image source materials in a straightforward manner. In order to combine these two goals successfully, the team has navigated the requisite TEI guidelines (tweaking them where necessary as developments within, rather than departures from, the system), constructed an automatic linking mechanism between image and text, and has been creating efficient search mechanisms for TEI data, not to mention an interface that is both intuitive and generic.
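As a rough illustration of what linking image and text can look like in TEI, the following minimal Python sketch reads `zone` coordinates from a `facsimile` section and pairs them with transcription elements that point at those zones through the `facs` attribute. The sample document, the identifiers (`f1r_l1`, etc.), the use of `seg` for lines, and the function name `link_text_to_zones` are assumptions made for the example; the project's actual encoding and linking mechanism are not described here and may differ.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical miniature TEI document: two image zones and two text
# segments that reference them. Not taken from the Orationes edition.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <facsimile>
    <surface xml:id="f1r">
      <zone xml:id="f1r_l1" ulx="120" uly="200" lrx="1480" lry="260"/>
      <zone xml:id="f1r_l2" ulx="118" uly="262" lrx="1475" lry="322"/>
    </surface>
  </facsimile>
  <text><body>
    <ab>
      <seg facs="#f1r_l1">first transcribed line</seg>
      <seg facs="#f1r_l2">second transcribed line</seg>
    </ab>
  </body></text>
</TEI>"""


def link_text_to_zones(tei_xml):
    """Return (text, (ulx, uly, lrx, lry)) pairs for elements with @facs."""
    root = ET.fromstring(tei_xml)

    # Index every zone by its xml:id so that "#id" pointers can be resolved.
    zones = {}
    for zone in root.iter(f"{{{TEI_NS}}}zone"):
        zone_id = zone.get(f"{{{XML_NS}}}id")
        box = tuple(int(zone.get(k)) for k in ("ulx", "uly", "lrx", "lry"))
        zones[zone_id] = box

    # Resolve each local @facs pointer to the bounding box it names.
    links = []
    for element in root.iter():
        facs = element.get("facs")
        if facs and facs.startswith("#"):
            box = zones.get(facs[1:])
            if box is not None:
                text = "".join(element.itertext()).strip()
                links.append((text, box))
    return links


if __name__ == "__main__":
    for text, box in link_text_to_zones(SAMPLE):
        print(box, text)
```

With such a mapping in hand, a viewer can highlight the image region that corresponds to a search hit in the transcription, or the reverse.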

The high-resolution scanned images of the manuscript were produced by experts at the Canterbury Cathedral Archives. When we began work on image searching, i.e. the recognition of graphical forms such as letters and punctuation, it quickly became evident that the images would first need to be flattened and scaled consistently, because of slight warping on the inner edges of the pages. Preprocessing will simplify the actual pattern recognition search process, because there will be less variation to account for and thus better results should be achieved; it is also beneficial for creating a clean GUI. Thus we set out to develop a process for calibrating the images, which we think will be very useful for other projects working with similar manuscripts that cannot be physically flattened when they are scanned.
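The scaling half of such a calibration can be sketched in a few lines of Python. The sketch assumes that each page image contains a reference length whose physical size is known (for example a ruler in the scan), which is an assumption of this example and not a detail given above; the function names and the nearest-neighbour resampling are likewise illustrative. Flattening the warp near the gutter is a separate, harder step and is not attempted here.

```python
def scale_factor(ref_px, ref_mm, target_px_per_mm):
    """Factor that brings a page to a shared pixels-per-millimetre value.

    ref_px: measured length of the reference object in pixels.
    ref_mm: known physical length of the same object in millimetres.
    """
    if ref_px <= 0 or ref_mm <= 0 or target_px_per_mm <= 0:
        raise ValueError("lengths and target resolution must be positive")
    current_px_per_mm = ref_px / ref_mm
    return target_px_per_mm / current_px_per_mm


def resample_nearest(image, factor):
    """Rescale a 2D list of pixel values with nearest-neighbour sampling."""
    src_h, src_w = len(image), len(image[0])
    dst_h = max(1, round(src_h * factor))
    dst_w = max(1, round(src_w * factor))
    out = []
    for y in range(dst_h):
        # Map each output row/column back to the nearest source index.
        src_row = image[min(src_h - 1, int(y / factor))]
        out.append([src_row[min(src_w - 1, int(x / factor))]
                    for x in range(dst_w)])
    return out


if __name__ == "__main__":
    # A 20 mm reference spans 200 px, so the page is at 10 px/mm;
    # bringing it to 5 px/mm means shrinking by a factor of 0.5.
    factor = scale_factor(ref_px=200, ref_mm=20, target_px_per_mm=5)
    page = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
    print(factor, resample_nearest(page, factor))
```

Once every page shares the same resolution, a letter form of a given physical size occupies roughly the same number of pixels throughout, which is the kind of reduced variation that makes later pattern matching simpler.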

Since 2012, the Digital Orationes interface has supported full searches of the transcribed text (along with translations of the relevant Latin or Greek passages), links to the apparatus and editorial notes, and a layered comparison (at varying magnifications) between the written text and the transcription. Our present endeavours move considerably closer to an engagement with the manuscript as an artefact and to the sort of operations which a professional palaeographer or historical linguist might require of it. In particular, by developing an optical recognition facility which is able to identify and search out letters and other graphical forms in the manuscript (such as varieties of dashes and other idiosyncrasies of punctuation), the 2013 prototype indicates how our edition might contribute to raising the bar for palaeographers, helping them to search, line up, or compare visual features across the manuscript in an easy, intuitive way. Looking ahead, the presenters of the 2013 poster will also be prepared to discuss the wider perspective: how, by adapting a tool developed and presented for other purposes (Juuso et al. 2011), our final interface will also be able to serve historians of language and literature by keying into etymological dictionaries of English, Latin or Greek, identifying new words, and colour-coding lexical items according to their first date of occurrence in the historical corpus.
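To give a concrete sense of what searching for a graphical form by visual cues can involve, the sketch below slides a small template (say, a cropped letter or dash) over a page image and reports the positions where the normalized cross-correlation exceeds a threshold. This is a generic, textbook technique chosen for illustration; which recognition method the Orationes prototype uses is not stated here, and the function names, the threshold and the toy image are all assumptions of the example.

```python
import math


def ncc(patch, template):
    """Normalized cross-correlation of two equally sized 2D lists."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mean_p = sum(p) / len(p)
    mean_t = sum(t) / len(t)
    num = sum((a - mean_p) * (b - mean_t) for a, b in zip(p, t))
    den = math.sqrt(sum((a - mean_p) ** 2 for a in p)
                    * sum((b - mean_t) ** 2 for b in t))
    # A flat patch or template has no variance; treat it as no match.
    return num / den if den else 0.0


def find_matches(image, template, threshold=0.9):
    """Return (x, y, score) for every window scoring at least threshold."""
    th, tw = len(template), len(template[0])
    hits = []
    for y in range(len(image) - th + 1):
        for x in range(len(image[0]) - tw + 1):
            patch = [row[x:x + tw] for row in image[y:y + th]]
            score = ncc(patch, template)
            if score >= threshold:
                hits.append((x, y, score))
    return hits


if __name__ == "__main__":
    # Toy binary "page" containing one cross-shaped mark.
    cross = [[0, 1, 0],
             [1, 1, 1],
             [0, 1, 0]]
    page = [[0, 0, 0, 0, 0, 0, 0],
            [0, 0, 1, 0, 0, 0, 0],
            [0, 1, 1, 1, 0, 0, 0],
            [0, 0, 1, 0, 0, 0, 0],
            [0, 0, 0, 0, 0, 0, 0]]
    for x, y, score in find_matches(page, cross):
        print(f"match at x={x}, y={y}, score={score:.3f}")
```

A comparison of this kind only works when the template and the page are at the same scale and free of distortion, which is why consistent calibration of the images matters before any search is attempted.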

References

Pierazzo, E. (2011). A Rationale of Digital Documentary Editions. LLC: The Journal of Digital Scholarship in the Humanities, 26(4): 463-477.
Opas-Hänninen, L. L., I. Juuso, T. Toljamo, A. W. Johnson, and T. Seppänen (2012). The Orationes Project: Bringing a Restoration Manuscript Online. Paper presented at JADH2012, Tokyo, 17 September 2012.
Juuso, I., L. L. Opas-Hänninen, A. W. Johnson, and T. Seppänen (2011). The Time Machine: Capturing Worlds Across Time in Texts. Paper presented at DH2011, Stanford, 18 June 2011.