Open Government Data: possibilities and problems

“Government 2.0 involves direct citizen engagement in conversations about government services and public policy through open access to public sector information and new Internet based technologies. It also encapsulates a way of working that is underpinned by collaboration, openness and engagement”[1]

Background and context

The Political Issues Analysis System (PIAS) project (view original report .pdf), of which this work is a sub-set, sought to investigate how citizens in Melbourne, Australia used the Internet to seek political information about key political issues. It also sought to understand how citizens contacted and interacted with their elected representatives in relation to these issues. Through workshops, case studies, and the development and testing of prototype software, the research uncovered some notable trends in user engagement with important aspects of the formal political process online.

The PIAS project principally focussed upon citizen information use by investigating interaction with party web-sites and the policy documents that they made available. However, the participants in our study largely found that 1) the sites were difficult to use, 2) the information was hard to navigate and compare with other policies, and 3) the written policies were unreliable and unclear. One of our key recommendations from the study was that policies published by political parties should be made available in a ‘machine readable’ form so that they can be automatically aggregated into other systems, enabling citizens to compare the policy positions of the parties. Strict metadata publishing standards and frameworks should also be used so that the aggregated information is of a high standard and can be re-utilised effectively.
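
As a minimal sketch of what a ‘machine readable’ form could make possible, the Python below groups policy records by issue so that party positions sit side by side. The record fields (`party`, `issue`, `summary`), the party names and the function `compare_by_issue` are all hypothetical and used only for illustration; no party in the study published its policies in this form.

```python
import json
from collections import defaultdict

# Hypothetical machine-readable policy records, as a party might publish them.
POLICY_FEED = """
[
  {"party": "Party A", "issue": "Public transport", "summary": "Extend rail services."},
  {"party": "Party B", "issue": "public transport", "summary": "Expand bus routes."},
  {"party": "Party A", "issue": "Water", "summary": "Invest in recycling."}
]
"""


def compare_by_issue(feed_text):
    """Group policy summaries by issue, then by party."""
    comparison = defaultdict(dict)
    for record in json.loads(feed_text):
        # Normalise the issue label so records from different parties line up.
        issue = record["issue"].strip().lower()
        comparison[issue][record["party"]] = record["summary"]
    return dict(comparison)


if __name__ == "__main__":
    for issue, positions in compare_by_issue(POLICY_FEED).items():
        print(issue)
        for party, summary in sorted(positions.items()):
            print(f"  {party}: {summary}")
```

Even in this toy case the issue labels have to be normalised before the records can be compared, which is why the recommendation stresses shared metadata standards.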

This work complements the PIAS project by listing some of the key available projects and services that utilise government data. It also explores in more detail the limited availability of what could be termed ‘democratic data’. For the purposes here, ‘democratic data’ is described as: 1) Hansard: making the workings of government available in new ways, 2) Transparency: newer forms of transparency through ‘data’, and 3) Policy: enhancing and extending the policy-making process through online open consultation.

Why Open Access to government data?

Much of the impetus behind the drive for Open Access to government data stems from a push for greater transparency in the functions of government. However, in the case of Victoria, for instance, much of the data being released within the Gov 2.0 agenda tends to be of an administrative nature and of little democratic potential. Whilst the Parliament of Victoria does make an enormous amount of useful material available to the public through its website, it is not made available in a technically sophisticated, machine readable way that takes full advantage of the potential of the Internet. Bills are only available in .pdf or Word format, and the most important document about the workings of government, Hansard, is also only available as .pdf (although it is possible to do a full-text search of Hansard from 1991 onwards). If these important documents were available in a machine readable form, they could be utilised by application developers in innovative ways.
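
To illustrate the kind of re-use that a machine readable Hansard might allow, here is a small Python sketch that pulls every speech by one member out of a marked-up transcript. The XML layout, the attribute names, the date and the sample speeches are invented for the example; this is not a format that the Parliament of Victoria publishes.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: an invented layout, not a format the Parliament publishes.
HANSARD_XML = """
<hansard date="2011-03-01" house="Legislative Assembly">
  <speech speaker="Member A" topic="Transport">We need more trains.</speech>
  <speech speaker="Member B" topic="Water">Dam levels are rising.</speech>
  <speech speaker="Member A" topic="Water">Recycling should expand.</speech>
</hansard>
"""


def speeches_by_speaker(xml_text, speaker):
    """Return (date, topic, text) for each speech made by the given speaker."""
    root = ET.fromstring(xml_text.strip())
    return [
        (root.get("date"), speech.get("topic"), speech.text.strip())
        for speech in root.iter("speech")
        if speech.get("speaker") == speaker
    ]


if __name__ == "__main__":
    for date, topic, text in speeches_by_speaker(HANSARD_XML, "Member A"):
        print(f"{date} [{topic}] {text}")
```

A query like this is a few lines of code against structured data, but is difficult to do reliably against a .pdf.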

The Open Access movement is a push to make data both machine readable and interoperable so that it may be linked together and leveraged for all sorts of purposes. This may be for new business opportunities, medical research, or new areas of social research. However, doing this is no easy task, as multiple data sources require linking, matching and ‘cleansing’ across diverse and complex systems. The first step in this process is to expose data in a standardised way so that it may be located and machine-read. The Victorian public sector has a policy framework specifically designed to achieve these tasks titled the Victorian Public Sector Action Plan. Two key points are:

  1. Participation: Engaging communities and citizens through using Government 2.0 initiatives to put citizens at the centre and provide opportunities for co-design, co-production and co-delivery.
  2. Transparency: Opening up government through making government more open and transparent through the release of public sector data and information.[2]

Making data available in this way can only help to “deepen democratic processes” and promote a strong and healthy democracy (although this is often an aspiration rather than an actuality).[3] Accordingly, there is a promising international trend to promote a two-way dialogue between political representatives and the public by combining ‘democratic data’ with citizen-produced data through popular social media platforms.[4] Rather than building a completely new platform (as has been the case with a number of somewhat underutilised government initiatives), some projects take advantage of existing and heavily used social network platforms and provide tools and services to augment their capacity (usually to inform and communicate government policy processes). The large EU-funded WeGov project[5] and other projects in the US and Europe are welcome movements in this direction.[6]

What are Open Educational Resources?

As the name suggests, Open Educational Resources (OER) are freely available resources for learning and teaching, such as documents, videos, syllabi, software, and images. The advantage for educators is that these resources may be deposited, shared and re-used, thus saving time in creating new courses or updating existing ones. Sharing teaching materials also promotes the particular institution or field and provides peer support for others in the same subject area. OERs may be available as individual objects or bundled together as a package. They are most likely ‘open licensed’ through licences such as Creative Commons or GNU and are made available either on the open web or within institutions. The term ‘Open CourseWare’ is also often used.

What types of materials?
The types of materials that are distributed as Open Educational Resources are usually those that have been previously used in a classroom setting, or designed for a purely online or blended learning context. They may be materials for activities or labs, full courses, games, lecture notes, lesson plans, teaching and learning strategies, video recorded lectures, or images and illustrations. The audience for these materials is primarily lecturers, but may also be students or even parents or administrators.
What type of licences?
Open Educational Resources are usually licensed so that they may be easily re-used within a non-commercial educational context (i.e. not re-sold). Many licences allow for ‘re-mixing’, which means that they may be adapted and enhanced to suit differing institutional contexts and student cohorts. Some licences only allow for sharing and re-use and no major revision (so ‘read the fine print’), and many materials are available under the educational copyright regime of the particular country (i.e. ‘educational use of copyrighted material’ provisions). Attribution is always an important consideration: materials taken from OER repositories must be acknowledged so that the original creators of the work are credited.
Where are OER found?
Many OER repositories are available on the open web, such as the OER Commons project or Connexions. The repositories may be run by volunteers or by paid employees on project funding provided by a university or funding agency. Although projects such as OER Commons and Connexions were designed specifically for OER, broader definitions of the term may include projects such as the Internet Archive or even Wikipedia. OER repositories may also exist at a university level, maintained either by the university library or by the team responsible for the university Learning Management System (LMS). Learning Management Systems such as Desire2Learn have inbuilt repositories so that course content may be deposited and shared at a school, faculty, or institutional level (or opened to the broader community).
What are the archival (technical) standards?
When OER materials are placed into a repository, metadata and archival standards need to be associated with them so that they may be easily located, archived and shared in a meaningful way. SCORM (Sharable Content Object Reference Model) is a common way in which objects may be described, zipped up into a package and re-used by different Learning Management Systems (LMS). Succinctly, a SCORM package is a ‘package of lessons’ bundled together so as to be understood by the LMS. What this means for educators is that, when placing OER materials into a repository, the correct ‘metadata’ (data about data) is required about the material, usually entered through a form to demarcate the type of materials and subjects addressed.
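The ‘zipped-up package’ idea can be sketched in a few lines of Python. A SCORM package is a zip file with a manifest named `imsmanifest.xml` at its root; the manifest below is heavily simplified for illustration (a real one declares XML namespaces and further required elements that are omitted here), and the course title, field names and helper functions are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Simplified manifest for illustration only: a real SCORM manifest declares
# XML namespaces and further required elements that are omitted here.
MANIFEST = """<manifest identifier="example-course">
  <metadata>
    <title>Introduction to Open Data</title>
    <subject>Government 2.0</subject>
    <resourcetype>lecture notes</resourcetype>
  </metadata>
  <resources>
    <resource identifier="lesson-1" href="lesson1.html"/>
  </resources>
</manifest>"""


def build_package(manifest_text, files):
    """Zip a manifest and lesson files into an in-memory package."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("imsmanifest.xml", manifest_text)
        for name, content in files.items():
            package.writestr(name, content)
    return buffer.getvalue()


def read_metadata(package_bytes):
    """Return the metadata fields a repository or LMS might index."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read("imsmanifest.xml"))
    return {child.tag: child.text for child in root.find("metadata")}


if __name__ == "__main__":
    package = build_package(MANIFEST, {"lesson1.html": "<p>Lesson one</p>"})
    print(read_metadata(package))
```

The fields an educator types into a deposit form end up in a manifest of this kind, which is what allows a different LMS to locate and describe the material later.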
What are the archival (teaching) standards?
Many OERs are likewise aligned with the teaching standards that may exist in different institutions or jurisdictions. The resources available are often assessed by peers for their utility, quality of explanation, or quality of technical interactivity. The value of this for educators is greater confidence that the resources are of high quality and currency and purposefully meet teaching challenges.

OpenTech ’09

I attended the OpenTech ’09 forum on Saturday, organised by the UK Unix Users Group and friends at the University of London Union (ULU). For those interested in the social and political aspects of computing, this is an excellent forum to discuss new modes of political communication, privacy, advocacy and other issues that arise from the broader computing movement. There was an excellent talk on the two cultures of science/technology and the humanities from Bill Thompson, who compared CP Snow’s pioneering work to present social circumstances. Bill basically argued that technological literacy needs to rise considerably, especially in the political classes, otherwise we are doomed! He argued that many people in senior positions (as well as the broader public) do not understand the ‘power in code’ and this is perhaps why so many large government systems have failed in the UK (I just ordered CP Snow’s book on Amazon for 10 quid).

Another interesting session was from a representative from the Guardian newspaper who discussed their experience of reporting the Ian Tomlinson death at the G20 protests earlier this year. The speaker explained how the video footage was released immediately on the web rather than in the usual slower way through the print edition. Although the analysis of this technique was not well communicated by the speaker, he did make the interesting observation that the Guardian in this instance had used its online distribution power to ‘crowd source’ news rather than simply publish it. They had allowed others to use the video of Tomlinson’s death in blogs, YouTube etc. rather than slowly releasing it through the print edition.

Another speaker from the Guardian talked about the paper’s very bold initiative to make much of its data open to the public. They have RSS feeds, an API system, and a sophisticated tagging system. I found their DataBlog one of the most interesting initiatives, in that many of the facts researched by journalists have been aggregated for later use and opened to the public.

The Guardian’s initiative to crowd source the expense-claim documents of MPs was also discussed, along with the limitations and opportunities of this approach.

Digital copyright: it’s all wrong

A draft treaty proposes draconian measures to protect copyright.

THE forces of reaction are fighting back. As they often do, they are carrying out their planning in secret, in the knowledge that if more people knew of their activities they would not be allowed to get away with it (link)

eArts and eHumanities – eScience technologies and methodologies in Arts and Humanities research

This workshop is being held as part of the Open Grid Forum in Manchester next Monday May 5.

Andreas Aschenbrenner (TextGrid), Stephen Beck (HASS-RG), Tobias Blanke (AHeSSC), Allison Clark (HASS-RG), Stuart Dunn (AHeSSC), Peter Gietz (TextGrid), Mark Hedges (AHDS)

The first session will be a Birds of a Feather session – presenting the work of TextGrid in Germany, the Arts and Humanities e-Science Initiative in the UK, and related projects in the US.
The second session will discuss how to cooperate better on emerging standards and tools for eHumanities and eArts.
In the first session, interested projects present their work, find common interests, discuss desirable service structures for research in the Arts and Humanities, and identify areas where standardisation is needed.

Session contributors include:
– David de Roure (Southampton) will speak about the usage of Semantic Grid technologies in musicology research support
– Andreas Aschenbrenner (University of Goettingen, TextGrid) will introduce the concept of e-Humanities and the new European research infrastructure to support arts and humanities research
– Peter Gietz (TextGrid) will present the work at TextGrid, a virtual workbench to support research in textual studies
– Alex Voss (NCeSS) will present on collaboration support for virtual research communities (link).

Fast Facts Found Online

This article appeared in the Sydney Morning Herald today. There is a small quote from me on the use of Wikipedia for research.

David Adams talks to four Australians who have helped to build the collaborative online giant that is Wikipedia.

NEXT time you’re sitting at the computer – it may even be as you’re reading this – take a look at the Wikipedia entry for “North Warrandyte”. What about the entry for “United Petroleum” or “Australian architectural styles”? Notice anything similar? All three entries were started by Melburnian Nick Carsen. The 20-year-old, who has just finished a drafting course at NMIT and hopes to study architecture next year, is part of the global revolution in the way we now find information.

For many people, the days when checking a fact meant taking a dusty encyclopedia volume off a shelf are gone. Now their first port of call is a collaborative internet site such as Wikipedia that not only provides a constantly expanding and updated resource but allows you to change information or add to the entry.

Founded in 2001 by US internet entrepreneur Jimmy Wales, Wikipedia has become one of the most popular websites in the world.

With entries on everything from the Azerbaijani people to Zeppelin airships, the Wikipedia juggernaut had 1.6 million articles on its English-language site by the start of December. To get an idea of how fast it’s expanding, Wikipedia grew by 30 million words in July alone.

Mr Carsen discovered the site while surfing the web early last year and decided to start contributing after finding gaps in information about Melbourne’s suburbs.

He spends three or four hours each week contributing to whatever subject happens to catch his interest, whether it’s the Nokia 6820 mobile phone (he owns one) or AFL-related subjects. A Collingwood supporter, he is a member of the Wikiproject expounding on all things AFL.

Look at his entry on United Petroleum, for example. Mr Carsen decided to write it after noting that his local servo sold CSR ethanol-enhanced fuel. “I typed it into Wikipedia and there was nothing about it so I figured, ‘OK, I might as well make an article about it’,” he says.

However, while Mr Carsen describes the site as “really the best source of information available to anybody today”, Craig Bellamy, who teaches media and communications at Melbourne University, says while Wikipedia might be a good place to start your research, it’s “not a good place to end it”.

“The term ‘encyclopedia’ doesn’t always sit well with me,” Dr Bellamy says. “Wikipedia is really good for technical stuff, if you’re building a website for example, and it’s really good for popular culture – you know, references to the history of Pacman – but with the sort of scholarly stuff that encyclopedias traditionally included, it’s not as strong in those areas.” (link)