Examples of structured and unstructured data

Here are some examples of structured and unstructured data projects and services (which at times overlap). Remember that data is almost always wrong, but sometimes it is useful!

Structured data (pre-defined and machine-readable; it is locatable, usually has a relational ‘data model’, and is usually about real-world objects)

  • What is meta-data? (Australian National Data Service) http://www.ands.org.au/
  • Library catalogues (date, author, place, subject, etc.)
  • Census records (birth, income, employment, place, etc.)
  • Federal and State Hansard http://www.openaustralia.org/
  • Legal records: Old Bailey Online (1674-1913) http://www.oldbaileyonline.org/
  • Economic data (GDP, PPI, ASX etc.)
  • Facebook Like button (big-data collection!)
  • Phone numbers (and the phone book)
  • Databases (structuring fields)
  • XML-TEI (bringing structure to the text by tagging particular elements, such as versions of the word ‘canal’ in 17th-century Dutch)
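
To make the last item concrete, here is a minimal sketch of how tagging turns running text into something locatable. It assumes a simplified, TEI-like fragment (no namespace, not schema-valid) in which each tagged word carries a lemma attribute. The sample sentences and the variant spellings are invented placeholders, not attested 17th-century forms, and the function name variants_of is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Simplified, TEI-like markup: <w> marks a word, "lemma" records its
# normalised form. The sentences and spellings are invented placeholders.
SAMPLE = """
<text>
  <p>De <w lemma="kanaal">canael</w> loopt langs de stad.</p>
  <p>Het <w lemma="kanaal">kanael</w> is diep, en de <w lemma="kanaal">canael</w> is breed.</p>
</text>
"""


def variants_of(xml_text: str, lemma: str) -> Counter:
    """Count the surface spellings that were tagged with the given lemma."""
    root = ET.fromstring(xml_text.strip())
    counts = Counter()
    for word in root.iter("w"):
        if word.get("lemma") == lemma:
            counts[(word.text or "").strip()] += 1
    return counts


counts = variants_of(SAMPLE, "kanaal")
for spelling, n in counts.most_common():
    print(spelling, n)
```

Once the words are tagged, every spelling can be found through one field (the lemma) rather than by guessing each variant in a free-text search, which is the sense in which tagging brings structure to the text.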

Unstructured data (no pre-defined data model, usually text, although there is always some structure)

The techniques for dealing with unstructured data usually involve text analysis (sometimes statistical) that looks for patterns (semantic, linguistic, or historical, such as ‘dates, numbers, facts’) to aid search and discovery (not analysis, which involves critical humanities scholars). The patterns can be small-scale (e.g. a single author) or large-scale (e.g. a newspaper corpus), but sometimes so large-scale that the results may lack meaning.
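
As a rough illustration of the pattern-finding described above, the sketch below pulls years and numbers out of a short passage and counts its most frequent words. The passage is invented for the example, the regular expressions are deliberately crude assumptions (a "year" is any four-digit number from 1500 to 1999), and the function names are hypothetical.

```python
import re
from collections import Counter

# Invented sample passage; a real project would read a corpus from files.
TEXT = (
    "On 14 March 1788 the court heard 3 cases. "
    "The court sat again in 1790, and the court records list 12 witnesses."
)

YEAR = re.compile(r"\b1[5-9]\d{2}\b")  # crude: four digits, 1500 to 1999
NUMBER = re.compile(r"\b\d+\b")        # any run of digits
WORD = re.compile(r"[a-z]+")           # lower-case alphabetic tokens


def find_years(text: str) -> list[str]:
    """Return every token that looks like a year."""
    return YEAR.findall(text)


def find_numbers(text: str) -> list[int]:
    """Return every number in the text, years included."""
    return [int(n) for n in NUMBER.findall(text)]


def top_words(text: str, n: int = 3) -> list[tuple[str, int]]:
    """Return the n most frequent words with their counts."""
    return Counter(WORD.findall(text.lower())).most_common(n)


print(find_years(TEXT))
print(find_numbers(TEXT))
print(top_words(TEXT, 2))
```

The output only points to where dates, numbers and frequent terms occur. Deciding what they mean is still the interpretive work described above.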



