This is the start of a ‘white paper’ on eResearch in the Arts and Humanities. Comments are most welcome (I admittedly rely a little too much on Susan Hockey’s wonderful history of Digital Humanities in ‘A Companion to Digital Humanities’).1
…by its very nature, humanities computing has had to embrace “the two cultures”, to bring the rigour and systematic unambiguous procedural methodologies characteristic of the sciences to address problems within the humanities that had hitherto been most often treated in a serendipitous fashion (Susan Hockey)
What are the Digital Humanities?
The disciplines and sub-fields that make up the humanities have a long interdisciplinary relationship with computing. Since the Italian Jesuit priest Father Roberto Busa approached Thomas J. Watson of IBM in 1949 for help in indexing some 11 million words of Medieval Latin, numerous humanities scholars have had productive, if at times challenging, relationships with computing. Some of the early computing tasks set by humanities scholars included verifying the authorship of disputed texts, automating the laborious task of creating concordances of seminal texts, and encoding and defining document structures for digital publication and analysis. Literature and linguistics were the forerunners of computing in the humanities, which later spread to other disciplines depending on their specific needs and questions and on the capabilities of digital technologies.
The term ‘Digital Humanities’ is a banner term that encompasses all the disciplines in the humanities and the meaningful use of computing within them. As a field it is interdisciplinary by nature, and although its definition is hotly disputed, it is generally agreed that ‘humanities computing’ or ‘digital humanities’ is an attitude towards computing that combines theoretical sophistication with applied technical know-how. It is this balance between the needs of the humanities and the needs of applied computing that is the most taxing aspect of the field. Accordingly, the institutional arrangements of the field vary widely, from applied computing centres to full academic departments. Knowledge in the field is communicated through established journals and conferences as well as through a plethora of digital means.
What is eResearch?
The broader eResearch agenda, largely driven by the need to store and re-use the vast amounts of data produced by modern research, provides another set of challenges and opportunities for the humanities. eResearch, commonly referred to as ‘Cyberinfrastructure’ in the US or ‘eScience’ in Europe, is largely an infrastructure movement to support ‘big science’. eResearch may be understood as a response to the need for large scale, interdisciplinary and trans-national collaborations that use important data sets and analytical tools to address some of the most pressing questions facing humankind. The planet’s diminishing energy resources, stressed atmosphere and rising temperatures are problems too large to be dealt with by one discipline, one university or indeed one nation state. Large scale problems require large scale research collaborations and the accompanying infrastructure to support them. Climate data sets, agricultural crop data, emissions measurements, and historical data may be combined, collaborated upon, and communicated in such a way as to create new knowledge and thus new approaches.
On a less monumental scale, eResearch enables researchers to address all sorts of problems associated with the management, citation, location, and communication of data. Although the humanities do not face the same ‘data deluge’ as the sciences, they do produce (and need to manage) data in the form of oral interviews, image databases, text resources, and other varied accounts of the human condition. Humanities data is often laborious and expensive to produce, yet highly reusable in subsequent research contexts.
What is Data?
For the humanities, the term ‘data’ is rarely used to describe the apparatus of the research process, except perhaps in those disciplines that gather data through ‘field work’ in social studies or through empirical archival investigations. However, in the digital domain, where seminal corpora, libraries, literature, and language resources are increasingly in digital form, almost any resource that helps scholars understand the human condition may be understood as ‘data’. Records of the Old Bailey, newspapers, parliamentary papers, and court records are not only digital facsimiles of their originals published online, but are also, to all intents and purposes, ‘data’ that can be holistically analysed, compared and contrasted, and utilised as evidence in much the same way that a scientist uses data. Placing a million books online is a notable exercise in distribution, but the more remarkable attribute of a million books in digital form is that, when viewed as data, they can be queried and analysed to yield knowledge about these books that is beyond the scope of traditional scholarly labour.
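As a minimal sketch of what treating a text as ‘data’ can mean in practice, the following Python example builds a simple keyword-in-context concordance. The sample text, the function name `concordance` and the `width` parameter are illustrative assumptions, not drawn from any particular project or library, and the tokeniser is deliberately crude.

```python
import re


def concordance(text, keyword, width=3):
    """Return keyword-in-context lines for each occurrence of `keyword`.

    `width` is the number of words of context kept on each side.
    """
    # Lowercase the text and split it into word tokens (a crude tokeniser).
    words = re.findall(r"[a-z']+", text.lower())
    target = keyword.lower()
    lines = []
    for i, word in enumerate(words):
        if word == target:
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            lines.append(f"{left} [{word}] {right}".strip())
    return lines


# Hypothetical sample text, standing in for a digitised source.
sample = (
    "The court heard the witness. The witness said the court was full. "
    "No record of the court survives."
)

for line in concordance(sample, "court"):
    print(line)
```

Run over a large digitised collection rather than three sentences, the same idea lets a scholar survey every use of a term in its context, which would be impractical by hand.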
What is architecture?
To take advantage of some of the computing infrastructures being built within the broader eResearch agenda, the ‘computing architecture’ must be built in such a way as to take account of researchers’ working practices. In the humanities, the context of the ‘data’ is important, as it is through context that humanities scholars establish the veracity of a resource and its subsequent meaning. Humanities scholars often require sophisticated anthologies to establish how knowledge ‘came into being’ (and its relationships), so that it can be built upon through monographs and articles. The data must also be citable so that its original location can be verified; this is of similar importance to the repeatability of the scientific method in science. Well-designed humanities architectures combine generic ‘services’ common to humanities practices with tools and services more specific to particular disciplines and research questions.
The challenges and opportunities of eResearch in the Arts and Humanities
Perhaps the greatest benefit of eResearch within the arts and humanities, beyond the many useful services and resources already produced, is that it allows humanities scholars to engage with advanced computing and imagine what is possible. We may not always get this right; it is an interdisciplinary experiment of methods and approaches, of tool development and application, that promises to augment the humanities’ critical, analytical and speculative skills or, if driven by the wrong impulses, to diminish them. eResearch in the arts and humanities is something that the humanities themselves must grasp and lead.
1. Susan Hockey, ‘The History of Humanities Computing’, in A Companion to Digital Humanities, ed. Susan Schreibman, Ray Siemens and John Unsworth. Oxford: Blackwell, 2004.