In a recent post, Rose Holley describes how the National Archives of Australia is using crowdsourcing to help transcribe scanned archival descriptions. Dubbed “The Hive” (a play on [arc]Hive), the site allows users to pick a scanned image marked easy, moderate, or difficult and edit the OCR’d version to better reflect the actual document. Work can be saved as partially complete or fully complete.
Holley reports that of the 800 lists available for description, more than 300 were completed in the first two weeks.
Holley is no stranger to crowdsourcing: she pioneered the hugely successful Australian Newspapers project, now housed at Trove, which enabled the transcription of millions of stories from scanned newspapers.
I have been surprised by how few libraries and archives have followed Holley’s lead. The clear success of letting users transcribe or correct OCR’d documents seems to offer a wonderful opportunity to expand our digitization capabilities at little or no cost.