Some examples of structured and unstructured data projects and services (which also overlap). Data is almost always wrong, but sometimes it is useful!
Structured data (pre-defined and machine-readable: a locatable, sometimes relational ‘data model’, usually of real-world objects)
- What is meta-data? (Australian National Data Service) http://www.ands.org.au/
- Library Catalogues (date, author, place, subject, etc)
- Census records (birth, income, employment, place etc.)
- Federal and State Hansard http://www.openaustralia.org/
- Legal records: Old Bailey Online (1674-1913) http://www.oldbaileyonline.org/
- Economic data (GDP, PPI, ASX etc.)
- Facebook Like button (big-data collection!)
- Phone numbers (and the phone book)
- Databases (structuring fields)
- XML-TEI (bringing structure to the text by tagging particular elements, such as versions of the word ‘canal’ in 17th-century Dutch)
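To make the XML-TEI item concrete, here is a minimal sketch of how tagging turns running text into something a machine can query. It is an illustration under stated assumptions, not a real edition: the sentence and its spellings are invented, the `<w>` element with a `lemma` attribute is one plausible TEI-style way to mark a word, and the TEI namespace that real files carry is left out to keep the example short. The helper name `variants_by_lemma` is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical TEI-style fragment. The sentence and the spellings are
# invented for illustration; real TEI files also declare a namespace.
SAMPLE = """<p>
  Het <w lemma="kanaal">canael</w> loopt langs de stad; het
  <w lemma="kanaal">kanael</w> is diep en het
  <w lemma="kanaal">canael</w> is breed.
</p>"""


def variants_by_lemma(xml_text):
    """Group the tagged word forms under the lemma recorded in the markup."""
    root = ET.fromstring(xml_text)
    counts = {}
    for w in root.iter("w"):
        lemma = w.get("lemma")
        if lemma is None:
            continue  # skip words that carry no lemma attribute
        form = (w.text or "").strip().lower()
        counts.setdefault(lemma, Counter())[form] += 1
    return counts


if __name__ == "__main__":
    for lemma, forms in variants_by_lemma(SAMPLE).items():
        print(lemma, dict(forms))
```

Because the editor has already tagged each occurrence, the script does not have to guess which strings count as the same word; that judgement lives in the markup, which is the point of structuring the text.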
Unstructured data (no pre-defined data model, usually text, but there is always some structure)
Techniques for dealing with unstructured data usually involve text analysis (sometimes statistical) to look for patterns (semantic, linguistic, historical ‘dates, numbers, facts’, etc.) to aid search and discovery (not analysis, which involves critical humanities scholars). The patterns can be small-scale (e.g. a single author) or large-scale (e.g. a newspaper corpus), but sometimes so large-scale that the results may lack meaning.
- The Web! (Google’s PageRank algorithm)
- Email (body), web page, word-processed document
- Voyant Tools http://docs.voyant-tools.org/tools/
- TROVE http://trove.nla.gov.au/
- Stylometry (e.g. the work of the Australian John Burrows); the Federalist Papers are a good example
- Factiva http://unimelb.libguides.com/content.php?pid=99524&sid=761325
- British Newspapers (first ‘newspaper’ 1702) http://www.britishnewspaperarchive.co.uk/
- Google NGrams http://books.google.com/ngrams
- Health records (and there is a move to e-records)
- Topsy (Social web-research) http://topsy.com/
- Spying with un-structured ‘big data’ (NSA). Perhaps good for real-time analysis or locating one specific individual.
- SPSS and NVivo (commercial products for un-structured data)
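The statistical text analysis mentioned above can be sketched in a few lines. The example below counts how often a handful of common function words appear in a text and compares two texts by the average gap between those rates. It is a simplified stand-in for stylometric methods, not Burrows's Delta or any published procedure. The word list, the function names and the two sample sentences are all assumptions made for illustration.

```python
import re
from collections import Counter

# A small, arbitrary list of function words (an assumption for this
# sketch; real studies use far longer lists drawn from the corpus).
FUNCTION_WORDS = ["the", "of", "and", "to", "upon", "while"]


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def profile(text, words=FUNCTION_WORDS):
    """Relative frequency of each chosen word in the text."""
    tokens = tokenize(text)
    total = len(tokens)
    if total == 0:
        return {w: 0.0 for w in words}
    counts = Counter(tokens)
    return {w: counts[w] / total for w in words}


def distance(p, q):
    """Mean absolute difference between two profiles (smaller = more alike)."""
    return sum(abs(p[w] - q[w]) for w in p) / len(p)


if __name__ == "__main__":
    # Invented sample sentences, not quotations from any real author.
    text_a = "The power of the union depends upon the consent of the people."
    text_b = "While the states argue, the union waits and the people decide."
    print(distance(profile(text_a), profile(text_b)))
```

A number like this only points to texts worth a closer look. Deciding what a difference in style means is still the interpretive work the notes above assign to humanities scholars.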
I used to teach this guy once (in Media and Comm). He captures the issues on ‘digital death’ pretty well.
Call for Papers, Posters and BoFs.
DIGITAL HUMANITIES AUSTRALASIA 2014: Expanding Horizons
The Australasian Association for Digital Humanities (aaDH) is pleased to announce its second conference, to be held at The University of Western Australia, 18-21 March, 2014.
The aim of DHA 2014 is to advance digital methods, tools and projects within humanities research and develop new critical perspectives. The conference will provide a supportive, interdisciplinary environment to explore and share new and advanced research within the digital humanities.
The conference is sponsored by iVEC@UWA, The University of Western Australia, Edith Cowan University, Perth Convention Bureau, and the Australian Literature Westerly Centre, UWA.
The conference will feature long and short papers, posters and workshops, and informal ‘birds of a feather’ discussions. We invite proposals on all aspects of digital humanities, and especially encourage papers showcasing new research and developments in the field and/or responding to the conference themes.
Proposals may focus on, but need not be limited to:
1. WORKING WITH TEXT such as:
• Critical text editing and electronic editions
• Digitisation, text encoding and analysis
• Text mining in historical scholarship
• Book history, and digitising the book
• Computational stylistics and distant reading
• Digital curation and archives for cultural materials
2. NEW MEDIA and the DIGITAL such as:
• Computational approaches in new media and Internet studies
• The digital in culture, creativity, arts, music, performance
3. METHODS, APPROACHES, USERS such as:
• Crowd-sourcing scholarship in the humanities
• Quantitative methods in humanities research
• Code studies, and code in the humanities
• Mapping and spatial visualisation
• Human Computer Interaction (HCI) in digital humanities research
• Gaming for learning, serious gaming, and game archiving
• Archaeology using digital methods including marine archaeology
4. WORKING WITH DATA
• Modelling humanities data
• Linked Data and the humanities
5. BUILDING the DH COMMUNITY and PRESENCE
• Measuring and valuing research in the digital humanities
• Institutionalisation, interdisciplinarity and collaboration
• Curriculum and pedagogy in the digital humanities
• Virtual research environments in humanities research
6. INDIGENOUS AND CROSS-CULTURAL DIGITAL RESEARCH
• Cross-cultural studies
• International comparisons
Abstracts of no more than 600 words, together with a biography of no more than 100 words, should be submitted to the Program Committee by 14 September 2013. All proposals will be fully refereed.
Proposals should be submitted via the online form at http://www.conftool.net/dha2014/
Please indicate whether you are proposing a poster, a short paper (10 mins + 5 mins questions), a long paper (25 mins + 5 mins questions), or a birds of a feather session (60 mins). Proposals will be assessed in terms of alignment with the conference themes and the quality of research within these or related themes. Presenters will be notified of acceptance of their proposal on 14 October 2013.
1. Poster presentations
Poster presentations may include work-in-progress as well as demonstrations of computer technology, software and digital projects. A separate poster session will take place during one day of the conference, during which time presenters will need to be available to explain their work, share their ideas with other delegates, and answer questions. Presenters are encouraged to provide material and handouts with more detailed information and URLs. Poster guidelines are available on the conference website to help you prepare your poster.
2. Short papers
Short papers are allocated 10 minutes (plus 5 minutes for questions) and are suitable for describing work-in-progress and reporting on shorter experiments and software and tools in early stages of development.
3. Long papers
Long papers are allocated 25 minutes (plus 5 minutes for questions) and are intended for presenting substantial unpublished research and reporting on significant new digital resources or methodologies.
4. Birds of a Feather sessions (BoFs)
BoFs are 60-minute sessions that should be used for guided discussions on one topic. BoFs are informal, open presentations for exploring key community issues and debates within the digital humanities.
Do you have an issue to discuss or are unsure how to progress a topic? For example:
• Digital humanities: what are the risks and rewards? or
• Digital humanities and computer science as an interdisciplinary challenge – where to from here?
Each session will run for 60 minutes. Each speaker will have a short time to present their points for discussion, and the audience should also have an opportunity to comment (we recommend allocating up to 40% of the total time available).
On behalf of the Program Committee
Professor Hugh Craig, The University of Newcastle
Dr Craig Bellamy, The University of Melbourne