Innovations in Finding Aids and Digital Archives

July 17, 2013, 15:30 | Centennial Room, Nebraska Union

Despite the ubiquity of digital programs at most research libraries, many archives still document and describe collections in much the same way as they did in the pre-digital era. The development and adoption of the Encoded Archival Description (EAD) XML schema serves as the basis of an infrastructure for dynamic research tools capable of seamlessly connecting archival materials to the digital information ecosystem. However, many institutions implement EAD as little more than a formatting guide for displays designed to resemble, as closely as possible, the digital finding aids’ paper predecessors. If we instead think of EAD as a means of encoding archival description as machine-readable data, we open new possibilities for how our finding aids can be displayed and for exposing them to the ever-growing linked open data environment of the semantic web, ensuring that our content can be found and effectively used by those we hope to serve.

XML documents have several advantages over PDF or Word documents as a vehicle for recording archival description: they “future proof” the finding aid data against changes in display technology; they allow for multiple presentations of the same data; and they allow descriptive data to be harvested by automated agents for purposes other than display. Despite these advantages, EAD is frequently used simply to produce Web documents designed to resemble PDFs of Word documents, often in such a way as to make the original data unavailable for other purposes. This situation is not due to any inherent shortcoming of the standard, but rather to a failure to make full use of its potential.
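As a rough illustration of treating EAD as data rather than as a formatted document, the following Python sketch reads a small finding aid fragment once and derives two different outputs from it: rows for a table display and JSON for harvesting. The sample XML is invented and heavily simplified (no namespace, only two component levels), and the function names are hypothetical; real EAD files are considerably richer.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified EAD fragment with invented content.
SAMPLE_EAD = """
<ead>
  <archdesc level="collection">
    <did><unittitle>Example Theatre Company records</unittitle></did>
    <dsc>
      <c01 level="series">
        <did><unittitle>Correspondence</unittitle><unitdate>1950-1960</unitdate></did>
        <c02 level="file">
          <did><unittitle>Letters to producers</unittitle><unitdate>1952</unitdate></did>
        </c02>
      </c01>
      <c01 level="series">
        <did><unittitle>Scripts</unittitle><unitdate>1948-1971</unitdate></did>
      </c01>
    </dsc>
  </archdesc>
</ead>
"""


def extract_components(ead_xml):
    """Collect level, title and date for each c01/c02 component, in document order."""
    root = ET.fromstring(ead_xml.strip())
    components = []
    for elem in root.iter():
        if elem.tag not in ("c01", "c02"):
            continue
        did = elem.find("did")
        if did is None:
            continue  # skip components without a descriptive identification block
        components.append({
            "level": elem.get("level", ""),
            "title": did.findtext("unittitle", default=""),
            "date": did.findtext("unitdate", default=""),
        })
    return components


def as_table_rows(components):
    """One presentation: tuples suitable for a spreadsheet-style display."""
    return [(c["level"], c["title"], c["date"]) for c in components]


def as_json(components):
    """Another presentation: JSON that an automated agent could harvest."""
    return json.dumps(components, indent=2)


if __name__ == "__main__":
    parsed = extract_components(SAMPLE_EAD)
    for row in as_table_rows(parsed):
        print(" | ".join(row))
    print(as_json(parsed))
```

Both outputs come from the same parsed description, so adding a third presentation would not require touching the source data.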

If we forget, for a moment, our preconceptions about finding aid design and instead ask ourselves what our researchers want to know about our collections, we will likely find that the “right” way to present a finding aid depends very much on the needs and intentions of the user. Imagine, for example, that instead of a text-heavy display we generated an Excel-style table in which the components displayed could be narrowed as the user types a query string in a search box. Alternatively, with the right styles and visual themes, we could use the descriptive data to construct the sort of “featured collection” sites so many donors support. What if switching between views were as easy as clicking an iTunes-style button that flipped from one option to another to best suit the users’ needs and preferences?
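The narrowing behavior of such a table can be sketched in a few lines of Python. This is only an illustration of the filtering logic, assuming the components have already been extracted into simple records; the sample records, the field names and the `narrow` function are hypothetical, and a real interface would run equivalent logic in the browser on each keystroke.

```python
# Hypothetical component records with invented content.
COMPONENTS = [
    {"level": "series", "title": "Correspondence", "date": "1950-1960"},
    {"level": "file", "title": "Letters to producers", "date": "1952"},
    {"level": "series", "title": "Scripts", "date": "1948-1971"},
]


def narrow(components, query):
    """Keep components whose title or date contains every word of the query.

    Matching is case-insensitive. An empty query keeps every component,
    which corresponds to the table before the user has typed anything.
    """
    terms = query.lower().split()
    kept = []
    for component in components:
        haystack = (component["title"] + " " + component["date"]).lower()
        if all(term in haystack for term in terms):
            kept.append(component)
    return kept


if __name__ == "__main__":
    # Simulate a user typing a query one character at a time.
    typed = ""
    for character in "letters":
        typed += character
        titles = [c["title"] for c in narrow(COMPONENTS, typed)]
        print(repr(typed), "->", titles)
```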

In addition to enabling innovative displays, a data-centric approach to EAD facilitates the transformation of archival description into forms that can participate more effectively in the Web environment, particularly in the area of Linked Open Data. The same technologies that enable Google to suggest resources related to searches and Facebook to suggest new friends can be employed to recommend related resources in our own or other institutions that may be of interest to researchers. In addition to bibliographic and archival resources, data from other data sets such as Wikipedia and the Internet Movie Database can be incorporated to provide additional context for our collections and to point readers to more sources of information.

Of course, the structure of our data is only half of the story: in order to facilitate these new uses, the data itself must be rich enough to establish links to other resources. The philosophy of “more product, less process” has allowed us to make more archival materials available for public use despite shrinking resources, but the primary cost of this efficiency has been the depth of the descriptive data produced. Here is another area where we may benefit from allowing our data to participate in the larger Web environment, by taking advantage of the Web’s capacity for enabling interaction between users and providers. Consider a finding aid that, in addition to presenting descriptive information, provides access to digital surrogates of the archival materials, a feature that is increasingly common. Among the features of the new generation of finding aids could be tools that allow researchers to contribute annotations and access terms associated with the collection as they read and become familiar with the materials. These tools could integrate dynamically with open data sources to help the user select standardized identifiers for names, subjects and titles relevant to the collection. This user-contributed metadata could increase the depth and quantity of our descriptive data with very little investment on the part of the institution, providing a clear benefit to all.

There is very little reason why we can’t do most of this right now. Indeed, the New York Public Library is currently experimenting with finding aid interfaces such as those described above as part of several collection-specific projects. In this presentation, Doug Reside (Digital Curator for the Performing Arts) and Trevor Thornton (Senior Applications Developer at NYPL Labs) will demonstrate prototype interfaces for archival collections and reflect on the future of finding aids and digital archives.