Distant reading the past
This is a really interesting thesis underway at King’s. I suspect he is using the Old Bailey records, as some of the findings about the Irish and their likelihood of prison are similar to those of the Founders and Survivors project on class background and transportation to Tasmania.
You wish!
Discounted members places for DHA2014
ADHO, the Alliance of Digital Humanities Organizations (of which aaDH is a member), has a new discounted membership category, which is a good way to join aaDH. It costs about $45 to join, but this does not include a subscription to LLC.
And if you join aaDH, you get a $150 discount on registration for DHA2014, plus a similar discount for the major international DH conference in Switzerland this year.
More details here:
http://aa-dh.org/2014/01/aadh-new-discounted-membership-categories/
5 most important (computing) technologies for the humanities
- Judgement: the ability to examine a situation from a number of different perspectives and offer an independent position on the situation based upon hard-won life experiences
- Empathy: the ability to imagine a world bigger than oneself and also, the other wonderful people in it, who are also engaging with the world in unique and special ways (well, sometimes)
- Synthesis: the ability to put desperate ideas together (woops, I mean disparate) in such a way as to make sense to you and possibly someone else
- Analysis: the ability to critically examine an idea, and how it got into your head, and not just describe the idea from a pseudo-objective standpoint (like the objective god of empirical science…shite the robot has spotted me…run!)
- The critical application of XML to significant and important historical (and other) phenomena to bring depth and perspective to angry robots
Digital Humanities Australasia, 2014
Examples of structured and un-structured data
Here are some examples of structured and unstructured data projects and services (which at times overlap). And remember that data is almost always wrong but sometimes it is useful!
Structured data (pre-defined, machine-readable and locatable; it usually has a relational ‘data model’ and usually describes real-world objects)
- What is meta-data? (Australian National Data Service) http://www.ands.org.au/
- Library Catalogues (date, author, place, subject, etc)
- Census records (birth, income, employment, place etc.)
- Federal and State Hansard http://www.openaustralia.org/
- Legal records: Old Bailey Online (1674-1913) http://www.oldbaileyonline.org/
- Economic data (GDP, PPI, ASX etc.)
- FaceBook like button (big-data collection!)
- Phone numbers (and the phone book)
- Databases (structuring fields)
- XML-TEI (bringing structure to the text by tagging particular elements, like versions of the word ‘canal’ in 17th C Dutch)
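To make the XML-TEI idea a little more concrete, here is a minimal Python sketch of how tagged spelling variants might be pulled back out of a text. The fragment is invented for illustration (the spellings and lemma values are assumptions, not real 17th C Dutch from any edition), and the helper name `spellings_for_lemma` is hypothetical. It uses only the standard library and the TEI `w` element with a `lemma` attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI-style fragment: spellings and lemma values are invented
# purely to illustrate tagging variant forms of one word.
SAMPLE = """
<p xmlns="http://www.tei-c.org/ns/1.0">
  <w lemma="kanaal">canael</w> ...
  <w lemma="kanaal">kanael</w> ...
  <w lemma="haven">haven</w>
</p>
""".strip()

# TEI elements live in this XML namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def spellings_for_lemma(xml_text, lemma):
    """Return every tagged spelling whose lemma attribute matches `lemma`."""
    root = ET.fromstring(xml_text)
    return [
        w.text
        for w in root.iterfind(".//tei:w", TEI_NS)
        if w.get("lemma") == lemma
    ]


if __name__ == "__main__":
    print(spellings_for_lemma(SAMPLE, "kanaal"))
```

Once the variants share a lemma in the markup, a search for one modern form can find all of the historical spellings, which is the point of bringing structure to the text.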
Un-structured data (no pre-defined data model, usually text. But there is always some structure)
The techniques for dealing with unstructured data usually involve text analysis (sometimes statistical) to look for patterns (semantic, linguistic, historical ‘dates, numbers, facts’ etc.) to aid search and discovery (not analysis; that involves critical humanities scholars). The patterns can be small (e.g. a single author) or large-scale (e.g. a newspaper corpus), but sometimes so large-scale that the results may lack meaning.
- The Web! (Google’s PageRank algorithm)
- email (body), web page, word-processed document
- Voyant Tools http://docs.voyant-tools.org/tools/
- TROVE http://trove.nla.gov.au/
- Stylometry (e.g. the work of the Australian John Burrows); the Federalist Papers are a good example
- Factiva http://unimelb.libguides.com/content.php?pid=99524&sid=761325
- British Newspapers (first ‘newspaper’ 1702) http://www.britishnewspaperarchive.co.uk/
- Google NGrams http://books.google.com/ngrams
- Health records (and there is a move to e-records)
- Topsy (Social web-research) http://topsy.com/
- Spying with un-structured ‘big data’ (NSA). Perhaps good for real-time analysis or locating one specific individual.
- SPSS and NVivo (commercial products for un-structured data)
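The pattern-finding described above for unstructured text can be sketched very roughly in Python. This is only an illustration under simple assumptions: the sample sentence is made up, the tokeniser is a crude regular expression, and the function names (`tokenize`, `top_words`, `find_years`) are hypothetical. Real tools such as Voyant do considerably more.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into crude word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_words(text, n=3):
    """Return the n most frequent words with their counts."""
    return Counter(tokenize(text)).most_common(n)


def find_years(text):
    """Pull out four-digit years between 1500 and 2099 (a rough 'dates' pattern)."""
    return [int(y) for y in re.findall(r"\b(1[5-9]\d\d|20\d\d)\b", text)]


# Invented sample text, used only to exercise the functions.
SAMPLE = (
    "The court sat in 1674. The court heard the case again in 1913, "
    "and the court records survive."
)

if __name__ == "__main__":
    print(top_words(SAMPLE))
    print(find_years(SAMPLE))
```

Counts and extracted dates like these help with search and discovery; deciding what they mean is still the scholar's job.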