Saturday 13 November 2010

Hacking the Library -- ShelfLife@Harvard

What is Shelflife?
Shelflife is a web application that uses what libraries know (about books, usage and comments) to allow researchers and scholars to access the riches of Harvard’s collections through a simple search.

Researchers will be able to access, read about, and comment on works using common social network features. ShelfLife will bring Harvard results to the forefront of the research process, allowing users to easily access and explore our vast collections.
What makes it unique?

Shelflife is designed to help you find the next book. Each search will retrieve a unique web page providing key information about the thing searched, including basic information, fluid links to related neighborhoods, and analytic data about use, all presented in a clean graphical format with intuitive navigation, designed with discoverability in mind.


From the Harvard Library Innovation Lab. The site provides no information about ShelfLife beyond the above, but Ethan Zuckerman, who's a Berkman Fellow at the moment, has a useful blog post reporting a presentation by David Weinberger and Kim Dulin, who co-direct the project.
Libraries tend to be very knowledgeable about what they hold in their collections. But they’re much less good about helping people discover that information. There are few systems like Amazon or Netflix recommendations that help scholars and researchers discover the good stuff within libraries. Dulin argues that librarians have been pretty passive in the face of new technology – they’ve purchased fairly primitive systems and had to buy back their content from the companies who build those systems.

Researchers tend to start with Google, Dulin tells us. They might move to Google Books or Amazon to find out more about a specific book. And perhaps a library will come into play if the book can’t be downloaded or purchased inexpensively. Libraries would like to move to the front of that process, rather than sitting passively at the end. And lots of libraries are trying to take on this challenge – new librarians often come out of school with skills in web design and application development.

The Lab hopes to bring fellows into the process, much as Berkman does. It works to build software, often proof of concept software. And innovation happens on open systems and standards, so libraries and other partners can adopt the technology they’re developing.

Two major projects have occupied much of the Lab’s time – Library Cloud and ShelfLife, both of which Weinberger will demo today. There are smaller applications under development as well. Stackview allows the visualization of library stacks. Check Out the Checkouts lets us see what groups of users are borrowing – what are graduate divinity students reading, for instance. And a number of projects are exploring Twitter to share acquisitions, checkouts and returns.

Weinberger explains that ShelfLife is built atop Library Cloud, a server that handles the metadata of multiple libraries and other educational institutions and makes that metadata available via API requests and “data dumps”. Making this data available, Weinberger hopes, will inspire new applications, including ones we can’t even imagine. ShelfLife is one possible application that could live atop Library Cloud. Other applications could include recommendation systems, perhaps customized for different populations (experts, versus average users, for instance.)


It turns out that ShelfLife is in a pre-alpha state of development. The metaphor behind it is the "neighbourhood", i.e. the clusters that a given book might sit within.
We see a search for “a pattern language”, referring to Christopher Alexander’s influential book on architecture and urban design. We see a results page that includes a new factor – a score that indicates how appropriate a title is for the search. We can choose any result and we’ll be brought into “stack view”, where we can see virtual books on a shelf as they are actually sequenced on the physical shelf. Paul explains that it’s actually much more powerful than that – many books at Harvard are in a depository and never see the light of a shelf. And many collections have their own special indices – the virtual shelf allows a mix of the Library of Congress categories with other catalogs.

The system uses a metric called “shelfrank” to determine how the community has interacted with a specific book. The score is an aggregate of circulation information for undergraduates, graduates and faculty, information on whether the book has been assigned for a class, placed on reserve, put on recall, etc. That information exists in Library Cloud as a dump from Harvard’s HOLLIS catalog system – in the future, the system might operate using a weekly refresh of circulation data. The algorithm is pretty arbitrary at this point – it’s more a provocation for discussion than a settled algorithm.
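
To make the idea of an aggregate usage score concrete, here is a minimal sketch in Python. It is not ShelfLife's algorithm, which the write-up describes only as "pretty arbitrary" and does not spell out. The field names, the weights and the sample counts are all invented for illustration; the only thing taken from the description above is the list of signals (circulation by undergraduates, graduates and faculty, plus course assignments, reserves and recalls).

```python
from dataclasses import dataclass


@dataclass
class UsageRecord:
    """Usage signals for one title. Field names are hypothetical."""
    title: str
    undergrad_checkouts: int = 0
    grad_checkouts: int = 0
    faculty_checkouts: int = 0
    course_assignments: int = 0
    reserves: int = 0
    recalls: int = 0


# Arbitrary illustrative weights, NOT the values ShelfLife uses.
WEIGHTS = {
    "undergrad_checkouts": 1.0,
    "grad_checkouts": 2.0,
    "faculty_checkouts": 3.0,
    "course_assignments": 5.0,
    "reserves": 4.0,
    "recalls": 4.0,
}


def shelfrank(record, weights=WEIGHTS):
    """Weighted sum of the usage signals for a single title."""
    return sum(w * getattr(record, name) for name, w in weights.items())


def rank_titles(records):
    """Order titles from most-used to least-used by the score above."""
    return sorted(records, key=shelfrank, reverse=True)


# Made-up sample data, purely to exercise the functions.
records = [
    UsageRecord("Title B", grad_checkouts=1),
    UsageRecord("Title A", undergrad_checkouts=10, grad_checkouts=3,
                faculty_checkouts=1, course_assignments=1, recalls=2),
]

for record in rank_titles(records):
    print(record.title, shelfrank(record))
```

A plain weighted sum is the simplest thing that could work; keeping the weights in one dictionary makes them easy to argue about, which fits the "provocation for discussion" spirit of the metric.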


Ethan reports some of the Q&A and generally does a great job of writing up the event. His post is worth reading in full.
