What to do with 30 million books?


(Posted to that wonderful Digital Humanities list, Humanist.)

Date: Wed, 14 Oct 2009 18:22:57 +0100
From: Jockers Matthew <mjockers@stanford.edu>
Subject: Possible Text Mining Opportunity at Stanford


As I’m sure many of you already know, Stanford has been closely
involved with Google’s book scanning project, and we (Stanford) are
currently preparing a proposal for the creation of a text mining /
analysis Center on campus. The core assets of the proposed Center
would include all of the Google data (approx. 30 million books) plus
all of our Highwire data and all of our licensed content. We see a
wide range of research opportunities for this collection, and we are
envisioning a Center that would offer various levels of interaction
with scholars. In particular we envision a “tiered” service model
that would, on one hand, allow technically challenged researchers to
work with Center staff in formulating research questions and, on the
other, give more technically advanced scholars an opportunity to write
their own algorithms and run them on the corpus. We are imagining the
Center both as a resource and as a physical place, a place that will
offer support to both internal and external scholars and graduate
students. We are looking at creating fellowship opportunities and
post docs as well as other ways of encouraging and supporting

I am writing to you specifically because I think this will be
something you are interested in but also because at this stage of the
proposal we are looking for some external validation that this corpus
would be of value and that the research it would support would inspire
new questions and new knowledge. I have already polled our Stanford
faculty, and the response (especially in the humanities and social
sciences) has been very enthusiastic. My hope is that you might be
able to send a few words (at most a short paragraph) that I could add
to a section of our proposal that is titled “Scholarly Interest and
Research Potential”.

Hope you are all well and getting your abstracts polished for London
in 2010.


Matthew Jockers
Stanford University


