I am back in Melbourne after attending the Digital Humanities conference at King’s College London and, in my short experience of the event, it was by far the best. I get the feeling that the field is at a pivotal moment in its history and that, without continued institutional support and strong academic leadership, it isn’t going to make the transition easily into the next stage (whatever this next stage may be).  We really need to build the field in Australia in a similar way to the Canadians by offering career options, degrees, and research funding, all within strong academic departments and centres. The field will always have a service function, and this is important, but in Australia we also need to push further into the ‘methodological commons’ and academic research beyond simply delivering someone else’s research from one place to another (the ‘delivery boy’ scenario). I will write about this over coming weeks. I will try not to aggregate so much on this blog and keep that to

‘What they are saying’: Political Issue Analysis System (PIAS): Political Issue Analysis in an age of the ‘data deluge’

(This new seeding project has just been accepted for funding from the Institute for Broadband Enabled Society (IBES) at the University of Melbourne. Led by VeRSI and myself, it is a short project with results available towards the end of the year or early next year).

Summary of Proposal

The Internet is recognised as a vital component of our political information systems.  Although it is extensively used by governments and civil society groups, its effects upon political processes, particularly deliberative political processes, currently remain relatively unknown.  Emerging research suggests that the Internet’s capacity to easily produce information has also led to data overload, undermining its deliberative potential.  With the advent of the National Broadband Network, the ‘data deluge’ promises to intensify, increasing the need for political information, in its various guises, to be delivered in much more meaningful ways.[1] This is especially important for younger audiences, who are increasingly abandoning broadcast media in favour of online political information.[2]

This project is an iterative study and design of an online ‘Political Issues Analysis System’ (PIAS) to help users research and analyse political issues. It will deliver information about important political topics (e.g. environmental issues, socio-economic issues, immigration, government policy etc.) using important data sources within a coherent ‘deliberative’ framework.  It will evaluate the needs of users to comprehend political issues through the application of a number of semantic indexing and data matching tools and design a prototype system.  It will do this in part through five public workshops using the University of Melbourne’s Usability Lab, with each workshop focussing on a particular issue and utilising particular tools and methods.[3] In tandem, it will produce recommendations to assist in the design of a unique software tool that fosters user-driven processes to effectively filter and visualise online political information obtained from government data-sets (partly within the ‘Government 2.0’ policy framework), the media, NGOs, historical data, and other user-generated online sources (blogs, video etc.).

The outputs of the research will be a working prototype as well as a report documenting the research outcomes with a series of recommendations for further research. This project may lead to the first major study of online deliberative processes within Australia, one that would be competitive within the ARC’s Linkage or Discovery scheme. The work will be of benefit to governments, community groups and other major producers of political sites, as well as the users of such sites. The project is within IBES’s Social Infrastructures and Community theme and, in particular, adheres to IBES’s and VeRSI’s shared aspirations ‘to make existing and available data more accessible’. In summary the broad aims of the project are:

  • To explore the evolving applications of online political information tools in an Australian and international context (especially in the analysis of broadband-enabled video and audio)
  • To examine deliberative processes with a number of stakeholder groups using semantic indexing methods and various communication tools at the University’s IDEA Lab
  • To build, test and provide further recommendations for a ‘Political Issues Analysis System’ (PIAS)

Through these processes we address the following research questions:

  • How can we better understand online deliberation in the international and Australian context and what tools need to be developed to assist this?
  • How can we better design deliberative ‘ideas’ using data and online analysis tools that will involve people in a meaningful and inclusive way in consequential goal-orientated political processes?

Approach and Outcomes:

The combination of theoretical groundwork, empirical study, and the design and implementation of the PIAS will make an important contribution to the emerging body of research on the nature of political information on the Internet and, in particular, the use of government data within it. Of chief significance is that the research will make explicit, and open up to critical analysis, the dichotomy between the availability of government and other data sources and effective online deliberative design. By consciously foregrounding information abundance as a condition of the present ‘information revolution’, through a unique fusion of political theory with semantic analysis and clustering tools, new perspectives will emerge and fresh research areas in design will open up.

The approach, then, is both innovative and unique because it combines the theoretical sophistication of Politics and Media Studies with the technical proficiency of Humanities Computing, eDemocracy, and Information Systems to expose important issues of online political information to critique in ways that were previously unavailable.[4] The work will open up theoretical and technological pathways towards a more genuinely identifiable (and sustainable) online political engagement and democratic structuring.

Technology and potential collaborators:

Potential collaborators for this work include the UK’s mySociety, which has developed some of the UK’s most well-known sites, and its local derivative, OpenAustralia. The open source solutions, API, raw data and results will be collaboratively developed and shared with mySociety and OpenAustralia to complete the PIAS. Likewise, solutions developed through the ‘Inquiry into Improving Access to Victorian Public Sector Information and Data’ as well as the Federal ‘e-Government Strategy’ will be investigated and may provide potential collaborators. In essence the PIAS is a ‘parsing’ project: it will parse structured government and other data sets to extract and deliver meaningful political information to a general audience. It will also explore ways to crawl, cluster and analyse unstructured data contained in blogs and other ‘unofficial’ sources including video and audio (perhaps using XProc processing).

The broad samples obtained through the PIAS iterative design workshops and subsequent prototype will provide a unique model to analyse web-based dialogue, agenda setting, and responses to official government positions on important political topics. This work may be up-scaled at a later date to include other collaborators, particularly pollsters, who may be eager to invest in such a system.

[1] One of the first major agencies to coin the term ‘Data Deluge’ was the UK’s JISC (Joint Information Systems Committee): Briefing Paper, Data Deluge: Preparing for the Explosion in Data, 1 November, 2004 (Accessed 14 May, 2010).

[2] See: Clare Kurmond, Readership Decline Continues for Papers, Sydney Morning Herald, Sydney, 14 May, 2010 (Accessed 14 May, 2010).

[3] Interaction Design Evaluation Analysis (IDEA), Department of Information Systems, University of Melbourne (Accessed 14 May, 2010).

[4] Carson, L., ‘Avoiding ghettos of like-minded people: Random selection and organisational collaboration’, in S. Schuman (ed.), Creating a Culture of Collaboration, Jossey-Bass/Wiley, pp. 418-423.

Founders and Survivors: Australian Life Courses in Historical Context, 1803-1920


Project report. Dr Craig Bellamy, VeRSI, June 2010

I recently attended a project workshop for the ARC-funded Founders and Survivors project. Led by Professor Janet McCalman from the University of Melbourne, Associate Professor Hamish Maxwell-Stewart from the University of Tasmania, and an interdisciplinary team of genealogists, demographers, and population health researchers, the project seeks to link the most important records about the convict system in Tasmania to uncover new knowledge about the system and the lives of the people within it.

The project, at a reasonably early stage, presented many of the interim results of digitising and parsing the data about the 72,500 convicts who were transported to Tasmania in the first half of the 19th Century. The convict records in Tasmania are some of the most significant and detailed records of the lives, socio-economic position, bodies, and health of any group in the 19th Century.  The project has the bold ambition of not only linking and analysing the convict records, but also linking other detailed institutional records, such as Australian military records, to gain a rich, intergenerational perspective of the health and lives of Australians.  No other settler society has such intimate details of its founding population.

In one of the earlier presentations, Hamish Maxwell-Stewart explained that the records are being digitised, analysed, and presented according to significant life events. These events include birthplace, upbringing, trial, the voyage to Australia, the convict’s behaviour under sentence, and their cause of death. Many convict records and registers have already been digitised and made available through the State Library of Tasmania and other institutions, but many hours are also being spent painstakingly transcribing muster records, pardon records, departures, absconders, apprehensions, certificates of freedom, and other records that ‘fill the gaps’ to assist in reconstructing the chain of events that make up the lives of the largely working-class people who were transported to Tasmania. There are 456,663 records in the system so far.

Associate Professor John Bass, who is mainly responsible for linking the data, explained to me in a coffee shop in Salamanca Place in Hobart how the records are linked and the decisions that are made in matching, linking, and the eventual historical analysis of the data. John has been involved in record linking projects for many years, primarily in the health sector (to such a degree that he was awarded an Order of Australia for his work). He explained how he searches for a ‘linkage key’ (name, date of birth, etc.) in, say, the records from a particular convict voyage and then matches this to other records of ‘arrival’ or ‘leave of pardon’ or ‘marriage’. It is not a purely scientific endeavour, and the raw data is later used by the historian, who will work this evidence into their broader historical arguments (the data is held in separate databases and the links are stored separately). As Hamish Maxwell-Stewart explained in one of his presentations, matching rates are generally high at above 50%, but some, such as matching ‘arrival’ with ‘death’ or ‘departure’, have been lower. Only about 20% of ‘arrival’ and ‘death’ records have been matched so far, but the samples have produced some remarkable results.
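To make the idea of a ‘linkage key’ a little more concrete, here is a minimal sketch in Python. It is not the project’s actual method or code: the field names, the normalisation rule, and the sample records are all invented for illustration, and real linkage work would need to tolerate spelling variants and uncertain dates. The sketch builds a key from a normalised name and a birth year, matches two record sets on that key, keeps the links as a separate list of ID pairs, and reports the match rate.

```python
import re
from collections import defaultdict


def linkage_key(name, birth_year):
    """Build a crude linkage key: the letters of the name, lower-cased, plus birth year."""
    normalised = re.sub(r"[^a-z]", "", name.lower())
    return f"{normalised}|{birth_year}"


def link_records(source, target):
    """Return (source_id, target_id) pairs whose linkage keys agree exactly."""
    index = defaultdict(list)
    for rec in target:
        index[linkage_key(rec["name"], rec["birth_year"])].append(rec["id"])
    links = []
    for rec in source:
        key = linkage_key(rec["name"], rec["birth_year"])
        for target_id in index.get(key, []):
            links.append((rec["id"], target_id))
    return links


def match_rate(source, links):
    """Share of source records that have at least one link."""
    matched = {source_id for source_id, _ in links}
    return len(matched) / len(source) if source else 0.0


# Invented sample records (hypothetical, not taken from the convict data).
arrivals = [
    {"id": "A1", "name": "John Smith", "birth_year": 1810},
    {"id": "A2", "name": "Mary O'Brien", "birth_year": 1815},
    {"id": "A3", "name": "Thomas Reed", "birth_year": 1808},
]
pardons = [
    {"id": "P1", "name": "JOHN SMITH", "birth_year": 1810},
    {"id": "P2", "name": "Mary OBrien", "birth_year": 1815},
    {"id": "P3", "name": "Thomas Read", "birth_year": 1808},  # spelling variant, will not match
]

links = link_records(arrivals, pardons)
print(links)
print(f"match rate: {match_rate(arrivals, links):.0%}")
```

The unmatched ‘Thomas Read’ shows why exact keys are only a starting point and why the historian’s judgement still matters. Keeping the links as a separate list of ID pairs, rather than merging the records, leaves each source untouched.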

Hamish Maxwell-Stewart presented some interim results from analysis of the surgeons’ sick-lists on the very long, 4-6 month voyage the convict ships took to get to Tasmania. He graphed which diseases were prevalent at which stage of the voyage (scurvy, digestive system, fever etc.) and speculated upon the broader policy arrangements or period of the voyage that may have contributed to the disease.  An argument repeatedly made by many of the historians at the meeting was that, as long as the convict survived the voyage, transportation may have extended their life expectancy, as life in a penal colony in Tasmania may have been healthier than working-class life in 19th Century Britain.  However, Janet McCalman did stress the need to see results from the whole population first so that the sub-studies could be contextualised (and it isn’t good research practice to release results too soon, as later results may contradict earlier results).

In 1834, at the age of 20, my great grandfather, Francis Fitzmaurice, was transported to Tasmania for stealing clothes. After a long history of well-documented recalcitrance in the convict system in Tasmania (being freed, having children, being imprisoned, and being freed again), he died of exposure to the elements on June 10, 1883.  I wonder if this is why I wear such large woolly jackets in the winter.

What is eResearch in the Arts and Humanities?

This is the start of a ‘white paper’ on eResearch in the Arts and Humanities. Comments are most welcome (I do admittedly rely a little too much on Susan Hockey’s wonderful history of Digital Humanities in ‘A Companion to Digital Humanities’).1

…by its very nature, humanities computing has had to embrace “the two cultures”, to bring the rigour and systematic unambiguous procedural methodologies characteristic of the sciences to address problems within the humanities that had hitherto been most often treated in a serendipitous fashion (Susan Hockey)

What are the Digital Humanities?

The disciplines and sub-fields that make up the humanities have a long interdisciplinary relationship with computing. Since the Italian Jesuit priest, Father Roberto Busa, approached Thomas J. Watson of IBM in 1949 to assist him in indexing some 11 million words of Medieval Latin, numerous humanities scholars have had productive, if at times challenging, relationships with computing. Some of the early computing tasks set by humanities scholars included verification of authorship of disputed texts, automating the laborious task of creating concordances on seminal texts, and encoding and defining document structures for digital publication and analyses. Literature and linguistics were the forerunners of computing in the humanities, spreading out to other disciplines at later stages depending on the specific needs and questions of the disciplines and the capabilities of digital technologies.

The term ‘Digital Humanities’ is a banner term that encompasses all the disciplines in the humanities and the meaningful use of computing within them. As a field it is interdisciplinary by nature and although its definition is hotly disputed, it is generally agreed that ‘humanities computing’ or ‘digital humanities’ is an attitude towards computing encompassing theoretical sophistication and an applied technical know-how. It is this balance between the needs of the humanities and the needs of applied computing that is the most taxing aspect of the field. Accordingly the institutional arrangements of the field differ vastly from applied computing centres to full academic departments. The knowledge in the field is communicated through established journals and conferences as well as through a plethora of digital means.

What is eResearch?

The broader eResearch agenda, largely driven by the need to store and re-use the vast amounts of data produced by modern research, provides another set of challenges and opportunities for the humanities. eResearch, commonly referred to as ‘Cyberinfrastructure’ in the US or ‘eScience’ in Europe, is largely an infrastructure movement to support ‘big science’. eResearch may be understood as a response to the pressing need for large scale, interdisciplinary and trans-national collaborations using important data sets and analytical tools to address some of the most pressing questions facing humankind. The planet’s diminishing energy resources, stressed atmosphere and rising temperatures are problems too large to be dealt with by one discipline, one university or indeed one nation state. Large scale problems require large scale research collaborations and the accompanying infrastructure to support them. Climate data sets, agricultural crop data, emissions measurements, and historical data may be combined, collaborated upon, and communicated in such a way as to create new knowledge and thus new approaches.

On a less monumental scale, eResearch enables researchers to address all sorts of problems associated with the management of data, the citation of data, the location of data, and the communication of data. Although the humanities do not have the same set of challenges in terms of ‘the data deluge’ as the sciences, the humanities do produce (and need to manage) data in the form of oral interviews, image databases, text resources, and other varied accounts of the human condition. Humanities data is often laborious and expensive to produce, yet highly reusable in subsequent research contexts.

What is Data?

For the humanities, the term ‘data’ is rarely used to describe the apparatus of the research process, except perhaps in those disciplines that engage in gathering data through ‘field work’ in social studies or empirical archival investigations. However, in the digital domain, where seminal corpuses, libraries, literature, and language resources are increasingly in digital form, almost any resource that helps scholars understand the human condition may be understood as ‘data’. Records of the Old Bailey, newspapers, parliamentary papers, and court records are not only digital facsimiles of their originals published online, but are also, to all intents and purposes, ‘data’ that can be holistically analysed, compared and contrasted, and utilised as evidence in a similar way to how a scientist understands data. Placing a million books online is a notable exercise in distribution, but the more remarkable attribute of a million books in digital form is that, when viewed as data, they may be queried and analysed in ways that produce new knowledge about these books that is beyond the scope of traditional scholarly labour.
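As a small illustration of treating text as data, here is a hedged sketch in Python of a keyword-in-context concordance, one of the oldest humanities computing tasks. The function name, the simple tokenising rule, and the sample sentence are my own assumptions for the example; it is not drawn from any particular project or corpus.

```python
import re


def concordance(text, keyword, width=3):
    """Return keyword-in-context lines with `width` words either side of each hit."""
    # Very simple tokeniser (an assumption): runs of letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, word in enumerate(words):
        if word.lower() == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            lines.append(f"{left} [{word}] {right}".strip())
    return lines


# Invented sample sentence, used only to show the output format.
sample = (
    "The prisoner was tried at the Old Bailey. "
    "The prisoner was transported for seven years."
)
for line in concordance(sample, "prisoner", width=2):
    print(line)
```

Run over a whole corpus rather than one sentence, the same few lines would list every context in which a word appears, something that once took years of manual indexing.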

What is architecture?

To take advantage of some of the computing infrastructures being built within the broader eResearch agenda, the ‘computing architecture’ must be built in such a way as to take account of researchers’ working practices. In the humanities, the context of the ‘data’ is important, as it is through context that humanities scholars establish the veracity of a resource and its subsequent meaning. Humanities scholars often require sophisticated anthologies to establish how knowledge ‘came into being’ (and its relationships), so that it can be built upon through monographs and articles. The data must also be citable so that its original location can be verified, which is of similar importance to repeatability in the scientific method. Well designed humanities architectures are a mix of more generic ‘services’ common to humanities practices and tools and services more specific to disciplines and research questions.

The challenges and opportunities of eResearch in the Arts and Humanities

Perhaps the greatest benefit of eResearch within the arts and humanities, beyond the many useful services and resources already produced, is that it allows humanities scholars to engage with advanced computing and imagine what is possible. We may not always get this right; it is an interdisciplinary experiment of methods and approaches, of tool development and application, which promises to augment the humanities’ critical, analytical and speculative skills or, if driven by the wrong impulses, abate them. eResearch in the arts and humanities is something that the humanities themselves must grasp and lead.

1. Susan Hockey, ‘The History of Humanities Computing’, A Companion to Digital Humanities, ed. Susan Schreibman, Ray Siemens, John Unsworth. Oxford: Blackwell, 2004.

That Camp at Digital Humanities 2010

That Camp is a ‘user generated’ conference focussing upon the tools, methods, and theoretical issues within the Digital Humanities. It originates from the Centre for History and New Media at George Mason University in the US and has been held in a number of other locations. ‘That Camp’ London is to be held immediately before the Digital Humanities conference at King’s College London on 6th and 7th July. Digital Humanities runs from the 7th to 10th July. I hope to see you there (link).


DHO Summer School 28 June – 2 July 2010, Trinity College, Dublin

Registration is now open for the 2010 Summer School. Please see the registration page for further details.

The Digital Humanities Observatory, in conjunction with NINES and the EpiDoc Collaborative, is pleased to offer the DHO Summer School 2010. It will bring together 60 Irish and international humanities scholars undertaking digital projects in diverse areas to explore issues and trends of common interest. Workshops and lectures will offer attendees opportunities to develop their skills, share insights, and discover new opportunities for collaboration and research. Activities focus on the theoretical, technical, administrative, and institutional issues relevant to the needs of digital humanities projects today.
The full summer school package offers participants a choice of four week-long workshop strands, a second day-long workshop, and two lectures, all on innovative topics by leading experts and theorists in digital humanities, with additional options of private consultation time with a digital humanities specialist and evening social activities.
For those unable to attend the full Summer School, it is possible to register for the one-day workshop and/or one or both of the lectures (link).