Library of Congress and Flickr – What Does This Mean for Other Libraries?

As you may have already heard, the Library of Congress and Flickr recently launched a pilot project in which images from LoC’s collections have been added to LoC’s Flickr page. The purpose of this project is to increase access to the collections and to obtain more metadata in the form of tags for the images.

In order to do this, Flickr created a new model of publication for publicly held collections, which they call “The Commons.” This is how anyone is able to tag images on LoC’s Flickr page.

And, the response has been awesome! Look at all the tags!

Since I read this news story, I’ve been pondering whether this might encourage smaller libraries to upload their special collections to Flickr as well. I recognize that digitizing photos is costly, particularly for smaller libraries with limited resources. However, unique collections of historical and cultural items are certainly an asset of our nation’s libraries, and I hope this makes it easier to show them off. We’ll see what the future holds as we continue to watch this project.

Book News: There’s More than One Way to Digitize a Book

2 Models for Digitizing Collections
http://insidehighered.com/news/2007/06/07/google

The 12 universities that make up the Committee on Institutional Cooperation have now partnered with Google to digitize their collections.

However, Emory has chosen a different path. This is discussed in the article, but Library Journal provides a little more explanation.

With Scan Plan, Emory University Takes Control
http://www.libraryjournal.com/info/CA6451403.html?nid=2673#news3

I love the idea of print on demand, and I think this is a great application of that idea. (FYI: I used to be employed by Emory University, and I say YAY! for them.)

Books: Saving the Books – One Word at a Time

reCAPTCHA
http://recaptcha.net/

So, most of us have had to enter a word from warped text when we sign up for a free newsletter or something else. reCAPTCHA has found a way to use this to help in digitizing books.

The quick, very simple explanation of digitizing books is that you can scan the book and have an image available. In order to make the text searchable, you must do another step using optical character recognition (OCR) so that all of the text can be read by the system. With older books and odd fonts, the OCR system may have trouble reading all of the characters correctly.

reCAPTCHA helps the system learn the correct characters by showing these hard-to-read words to humans alongside the warped words, so that humans read the text and enter the correct characters for the system to store and add to the text of the book. The human eye is still better at deciphering strange characters than a computer, and reCAPTCHA is taking advantage of that – one word at a time.

Now, that is a crude explanation, but I hope it makes sense. reCAPTCHA has a much prettier explanation on their page.
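
For the curious, here is a rough sketch of the idea in Python. It is only my illustration, not reCAPTCHA’s actual code: the function names, the three-vote threshold, and the sample words are all made up. The assumption is that each person sees one word the system already knows (the control) next to one word the OCR could not read, and that a reading of the unknown word only counts if the control word was typed correctly.

```python
from collections import Counter


def accept_answer(control_truth, control_answer, unknown_answer, votes):
    """Count a reading of the unknown word only if the control word matches.

    All names here are hypothetical; this is a sketch of the idea only.
    """
    if control_answer.strip().lower() != control_truth.strip().lower():
        return False  # probably not a careful human, so ignore the reading
    votes[unknown_answer.strip().lower()] += 1
    return True


def resolved_word(votes, min_votes=3):
    """Return the most common reading once enough people agree, else None.

    The threshold of 3 is an arbitrary value chosen for illustration.
    """
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count >= min_votes else None


# Votes collected for one word the OCR could not read.
votes = Counter()
accept_answer("morning", "morning", "upon", votes)
accept_answer("morning", "mourning", "open", votes)  # control wrong: ignored
accept_answer("morning", "Morning", "upon", votes)
accept_answer("morning", "morning", "Upon", votes)
print(resolved_word(votes))
```

Once enough people who passed the control word agree on a reading, that reading can be stored as the text for that spot in the book.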

found via Chronicle: Wired Campus blog
(and Brad Baxter’s post to an internal list at work)
