Thanks to Gabriel B for the information…(link THATCamp; London)
| dataset | format(s) | size | availability | license |
| Archimedes Palimpsest transcriptions | XML: TEI P5 | 5.6 MB | http://www.archimedespalimpsest.net/ | CC-BY |
| Archimedes Palimpsest images | TIFF | approx 1 TB | http://www.archimedespalimpsest.net/ or on HD | CC-BY |
| British Prints Database: http://www.bpi1700.org.uk | MySQL dump + online images | MySQL dump of metadata: 21.7 MB | CD; images http://image.cch.kcl.ac.uk/bpi/ (not to be redistributed) | CC-BY-NC |
| Centre for History and Analysis of Recorded Music (CHARM) catalogues | bespoke XML + METS | 145 MB | CD | CC-BY-NC |
| Clergy of the Church of England: http://www.theclergydatabase.org.uk/index.html | MySQL dump | dump is 474 MB | CD | CC-BY |
| DEMOS (text of articles) | XML: TEI P5 | 2.4 MB | CD | CC-BY-NC-SA |
| Domesday/Prosopography of Anglo-Saxon England project | Spreadsheet | | CD | CC-BY-NC |
| Duke Databank + HGV + APIS (papyri transcriptions, translations + metadata) | XML: TEI P5 (EpiDoc) | 2.2 GB | git clone http://idp.atlantides.org/git/idp.data.git/ | CC-BY (except APIS) |
| Euripides Scholia | pseudo-TEI P5 | 500 KB | http://euripidesscholia.org/sourceFiles/ | CC-BY-NC-SA |
| Greek, Roman and Byzantine Pottery at Ilion | HTML, JPG, RDFa (+KML) | 345 MB | http://classics.uc.edu/troy/grbpottery/ | CC-BY-NC-ND |
| Hofmeister | TEI XML + Authority files | 115 MB+ | http://www.hofmeister.rhul.ac.uk/2008/content/reference/thesaurus_download.html | CC-BY-NC-SA |
| Homer Multitext images | TIFF, JPEG2000, JPG, Pyramid TIFF, etc. | >500 GB (TIFFs alone), several TB total | http://amphoreus.hpcc.uh.edu/ | CC-BY-NC-SA |
| Inscriptions of Aphrodisias | XML: TEI P4 (EpiDoc) | 6.6 MB | http://insaph.kcl.ac.uk/iaph2007/xml/inscriptions.zip | CC-BY |
| Inscriptions of Aphrodisias: feeds | Atom | 2.2 MB | http://concordia.atlantides.org/examples/iaph2007.atom | CC-BY |
| Inscriptions of Roman Tripolitania | XML: TEI P4 (EpiDoc) | 10.2 MB | http://irt.kcl.ac.uk/irt2009/redist/inscr/irt2009_inscriptions.zip | CC-BY |
| Inscriptions of Roman Tripolitania: feeds | Atom | 2.2 MB | http://irt.kcl.ac.uk/irt2009/index.atom | CC-BY |
| Inscriptions of Roman Tripolitania: geodata | KML | 400 KB | http://irt.kcl.ac.uk/irt2009/redist/maps/tripolitania_earth.kml | CC-BY |
| Jonathan Swift Archive | bespoke XML | 35 MB | CD | CC-BY-NC |
| Khirbat al-Mudayna al-Aliya excavations | Atom + images + structured data | | http://ckan.net/package/khirbat-al-mudayna-al-aliya | CC-BY |
| Nineteenth Century Serials Edition | Plain text | 2.6 GB | DVD | CC-BY |
| Nomisma.org (ancient coins) | RDFa (+KML) | 2.3 MB | http://nomisma.org/nomisma.org.xml | CC-BY-NC |
| Old Bailey Transcripts | bespoke XML | > 1 GB | FTP | non-commercial (license required) |
| Perseus Greek and Roman texts | XML: TEI P4 | 340 MB | http://nlp.perseus.tufts.edu/hopper/opensource | CC-BY-NC-SA |
| Perseus Treebanks (grammatical markup) | XML | 10 MB | http://nlp.perseus.tufts.edu/syntax/treebank/ | CC-BY-NC-SA |
| Petra Great Temple Excavations | Images + KML + Atom | | http://opencontext.org/sets/Jordan/Petra+Great+Temple | CC-BY |
| Prosopography of the Byzantine World | MySQL dump | | CD | CC-BY |
| Stormont Papers (Hansard): text | XML | 47 MB | CD | non-commercial (license attached) |
| Stormont Papers (Hansard): geodata | KML | 78 MB | CD | non-commercial (license attached) |
| Victoria and Albert Museum Collections | JSON via webservice | | API doc: http://www.vam.ac.uk/api | non-commercial (terms online) |
| Vision of Britain relational data (http://www.visionofbritain.org.uk) | postgres dump | 2 GB | DVD | CC-BY-NC-SA |
| Vision of Britain historic mapping | georeferenced rasters | | http://www.visionofbritain.org.uk/maps | (images not for redistribution) |
| WGBH OpenVault Vietnam interview transcripts | TEI with SMIL & RDF | | internet access via OAI-PMH with Fedora repository at http://openvault.wgbh.org | non-commercial (terms online) |
| WW1 Poetry Archive | JPG + metadata CSV | 60 MB sample; full >10 GB | sample on CD; remainder scrapable from http://www.oucs.ox.ac.uk/ww1lit | non-commercial (license attached) |
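
Several of the datasets above are distributed as TEI P5 XML, so a first step with them is usually just getting the text out of the markup. The following is a minimal sketch of that step using only the Python standard library. The sample document, its contents, and the function name `tei_title_and_paragraphs` are hypothetical and not taken from any of the listed datasets; real files will have much richer headers and project-specific markup (EpiDoc, for instance) that this does not handle.

```python
import xml.etree.ElementTree as ET

# TEI P5 elements live in this namespace; TEI P4 files are not namespaced,
# so the "tei:" prefixes below would have to be dropped for P4 data.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical, deliberately tiny sample (not schema-valid TEI, and not
# drawn from any dataset in the table).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Hypothetical sample text</title></titleStmt>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div><p>First paragraph.</p><p>Second <hi>marked-up</hi> paragraph.</p></div>
    </body>
  </text>
</TEI>"""


def tei_title_and_paragraphs(xml_string):
    """Return (title, [paragraph text, ...]) for a TEI P5 document string."""
    root = ET.fromstring(xml_string)

    # The title sits in the header; fall back to "" if it is absent.
    title = root.findtext(
        ".//tei:titleStmt/tei:title", default="", namespaces=TEI_NS
    )

    # Only look for paragraphs inside <body>, so header prose is ignored.
    body = root.find(".//tei:body", TEI_NS)
    paragraphs = []
    if body is not None:
        for p in body.iterfind(".//tei:p", TEI_NS):
            # itertext() flattens inline markup such as <hi> into plain text.
            paragraphs.append("".join(p.itertext()).strip())
    return title, paragraphs


if __name__ == "__main__":
    title, paragraphs = tei_title_and_paragraphs(SAMPLE)
    print(title)
    for paragraph in paragraphs:
        print("-", paragraph)
```

For the multi-gigabyte collections in the table, reading each file into memory this way may not be practical; a streaming parser such as `xml.etree.ElementTree.iterparse` would be the more likely choice there.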
