Coming August 11, 2009

Crowdsourcing Scholarly Work

I’d like to be part of a session on crowdsourcing work that has traditionally been led by scholars in an institutional setting.  My own effort along these lines has been FromThePage, a web-based tool for transcribing and annotating handwritten manuscripts.

I demoed FromThePage at THATCamp 2008 while it was still under development, in private alpha.  At the beginning of this year, I started editing a transcription project in earnest and gained some passionate and talented users. These users have transcribed hundreds of pages, researched subjects mentioned within the texts, and even tracked down and scanned lost documents.  At the same time, however, I’ve discovered that many of the anonymous viewers of the site are actually researching the same subjects that I am.  Since even a passing comment can add valuable insights, I would love to be able to engage these fellow researchers, connect with them and draw upon what they know.

Would anyone be interested in a session on crowdsourcing, discussing how to motivate volunteers and engage with the online public to produce high-quality work?

Other conversations I’d like to have at THATCamp Austin are:

  • How to integrate with standard exhibit management systems like ContentDM or Omeka, and whether OAI-PMH or OAI-ORE are at all useful for that.
  • How to mine genealogy and census databases to identify connections between people in historic documents.  (Sometimes this goal is described as a “Facebook of the Dead”.)
  • How to get from project to product — when a piece of software is good enough for in-house use, how much more needs to be done to fit it for release as OSS?

5 comments


in August 5th, 2009 @ 18:33

Sounds interesting. I’ve been following work from various genealogical organizations, such as FamilySearch, on crowdsourcing transcription. Also heard a fascinating presentation at JCDL about using the FamilySearch API in other types of databases.


in August 6th, 2009 @ 08:02

The JCDL presentation sounds suspiciously like Douglas Kennard’s paper on his historic journals project, “Improving Historical Research by Linking Digital Library Information to a Global Genealogical Database”. He and I have corresponded quite a bit for the last month or so, and I’ve tried out his application. It’s a really neat system.

I’d love to see what else you’ve found. I’m familiar with the FamilySearch Indexer, but didn’t know that it had moved from tagging into transcription proper.


in August 6th, 2009 @ 13:00

Thanks for the added info. Hadn’t heard about Kennard’s work. Am very interested in anything of this type, and I’m looking forward to hearing about your FromThePage work next week.


in August 6th, 2009 @ 14:48

It was, indeed, Doug’s paper. I’m exploring the idea of including a session on the FamilySearch API at next year’s THDI meeting, which will be in either San Antonio or Austin, in February.

I assume you’re familiar with the FamilySearch transcription/indexing service. Footnote calls what it offers “annotations,” but in fact the function is transcription, and you can include basic metadata (person, place, date, other). You can see some samples online. When I spoke with Footnote a few years ago about digitizing materials from the state archives, though, they were clear that the transcriptions were only for use on their site; they wouldn’t give copies back to us.


in August 7th, 2009 @ 00:10

Jim, I just got permission from Doug to demo his Historical Journals software to anybody who’s interested. I can’t offer any opinions as to its future direction, but can at least show the thing off.

Danielle, what is “THDI”? I’d like to know more about Texas Heritage Online and its mandate. The only similar government-funded project I’m aware of is Alouette Canada, which has apparently just been merged with a different organization, so I don’t know whether anything I knew still holds. How interested are you in acquiring material from the public?