Geo-referencing Digitised Collections

There are a couple of projects underway here at the Centre for eResearch (CeRch) and the Centre for Computing in the Humanities (CCH) on ‘geo-referencing’. Geo-referencing is a way of ‘tagging’ digital collections so that they can be searched by geographical place name or plotted on a map. Dr Claire Grover of the Language Technology Group, School of Informatics, University of Edinburgh, is working on text-mining methods for extracting geographical information from unstructured (i.e. not encoded) text. She is giving a talk here next week; if you would like to come, just send me an email. Her abstract is below.

There are vast quantities of textual information which people
typically access through standard search queries. Many collections
have added value in metadata associated with texts but this is costly
and time-consuming to generate by hand. Researchers in the field of
natural language processing (NLP) have been working for the past
couple of decades on technologies for information extraction (aka text
mining) that will allow for the automatic extraction of structured
information that currently resides in unstructured text. In this talk
I will describe the NLP system that we have been developing to extract
‘who, where and when’ metadata from textual content. The primary focus
of the system is geo-referencing so that the place names in a text can
be recognised and grounded to a gazetteer entry to provide lat/long
information. In addition the system recognises person names as well as
dates and other temporal expressions.
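
To make the idea of "grounding" a place name concrete, here is a minimal sketch in Python. It is not the Edinburgh system described in the abstract: the gazetteer is a hypothetical three-entry dictionary with approximate coordinates, the `georeference` function is an illustrative name, and recognition is reduced to exact string lookup. A real system would also have to recognise names it has not seen listed and choose between several places that share a name.

```python
import re

# Hypothetical toy gazetteer: place name -> approximate (latitude, longitude).
# A real gazetteer holds many entries per name and needs disambiguation.
GAZETTEER = {
    "Edinburgh": (55.95, -3.19),
    "London": (51.51, -0.13),
    "Belfast": (54.60, -5.93),
}


def georeference(text, gazetteer=GAZETTEER):
    """Return (name, lat, lon, char_offset) for each gazetteer name found in text."""
    # Try longer names first so a multi-word name wins over a shorter one inside it.
    names = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    results = []
    for match in pattern.finditer(text):
        name = match.group(1)
        lat, lon = gazetteer[name]
        results.append((name, lat, lon, match.start()))
    return results


sample = "The papers were debated in Belfast and later printed in London."
for name, lat, lon, offset in georeference(sample):
    print(f"{name} at offset {offset}: lat={lat}, lon={lon}")
```

Keeping the character offset alongside the coordinates means the result can be written back into the text as a tag, which is what makes a collection searchable by place or mappable.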

System development was previously funded as part of EDINA’s
GeoCrossWalk project and we are currently refining it further for use
in the GeoDigRef project, where we are geo-referencing three digitised
collections: Histpop, parliamentary records from BOPCRIS, and metadata
from the British Library’s Archival Sound Recordings. In a parallel
project we are geo-referencing the Stormont Papers. I will discuss the
issues that arise from these different collections and will use them
to illustrate the difficulties in trying to develop a general purpose
tool that can be useful across different text types.

  • ...this blog is obsessively directed at profiling digital humanities developments in a cultural, social, and technical sense and in terms of books and applications...it is an aggregation or 'meta' style blog with the occasional commentary

    Hi, my name is Dr Craig Bellamy and I am a digital humanities analyst for the Victorian eResearch Strategic Initiative, a consortium based at the University of Melbourne. The views expressed in this blog are the responsibility of the author alone.
