Company Name: Stanford University Libraries LOCKSS/CLOCKSS Initiative

Line of Business: Taking custody of and preserving cultural and social assets for future generations.

Objective: To find storage affordable and reliable enough to allow the member community to build a trusted, distributed dark archive in order to protect online scholarly content from catastrophic events and other long-term interruptions.

Result: Widely distributed deployment of low-cost PetaBox GB series nearline storage, providing a secure, reliable, and scalable long-term archiving solution.

http://www.lockss.org
http://www.lockss.org/clockss

Capricorn Technologies PetaBox Helps Preserve Today’s Materials for Tomorrow’s Citizens

"...let us save what remains: not by vaults and locks which fence them from the public eye and use in consigning them to the waste of time, but by such a multiplication of copies, as shall place them beyond the reach of accident."*

Thomas Jefferson

Ten years ago, web technology forced a change in the business relationship between librarians and publishers. Libraries could no longer take custody of materials: they had to lease subscription materials and were allowed only restricted access to non-subscription materials. The change disrupted the role libraries have played in society for hundreds of years as trusted keepers of information and culture for future generations. Vicky Reich and David S.H. Rosenthal at Stanford University Libraries, alarmed that the scholarly community could lose access to this important content, co-founded the LOCKSS (Lots of Copies Keep Stuff Safe) Program in 1996. The LOCKSS Program and its preservation “system” restore the traditional role of libraries: taking custody of and preserving cultural and social assets. The LOCKSS system provides a collaborative approach to long-term preservation of a library’s local collections. Under the Program, librarians use LOCKSS boxes to collect and locally preserve the subscription journal content they purchase.

The LOCKSS boxes at libraries around the world form a peer-to-peer network in which each box can find other copies of the material it has collected, compare the copies, and repair any damage. This removes the need for libraries to back up their LOCKSS boxes and provides continual reassurance that preservation is being achieved. Thanks to research that won a prestigious ACM award in 2004, the peer-to-peer audit and repair process also helps secure the system against malicious attacks.
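The compare-and-repair idea can be illustrated with a minimal sketch. It is not the actual LOCKSS protocol, which is considerably more elaborate; it only assumes that each peer holds one copy, that copies are compared by SHA-256 digest, and that a strict majority decides which version is intact. The function name and the peer names are hypothetical.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return a SHA-256 fingerprint used to compare copies."""
    return hashlib.sha256(content).hexdigest()


def audit_and_repair(copies: dict[str, bytes]) -> list[str]:
    """Compare all peers' copies and repair any that disagree with the majority.

    `copies` maps a (hypothetical) peer name to the bytes that peer holds.
    Damaged copies are overwritten in place; the repaired peers are returned.
    """
    votes = Counter(digest(content) for content in copies.values())
    winner, count = votes.most_common(1)[0]
    if count <= len(copies) // 2:
        # Without a strict majority this sketch cannot tell which copy is good.
        raise ValueError("no majority version; cannot repair safely")

    good_copy = next(c for c in copies.values() if digest(c) == winner)
    repaired = []
    for peer, content in copies.items():
        if digest(content) != winner:
            copies[peer] = good_copy  # fetch a good copy from an agreeing peer
            repaired.append(peer)
    return repaired


if __name__ == "__main__":
    boxes = {
        "library-a": b"journal issue 12",
        "library-b": b"journal issue 12",
        "library-c": b"journal is\x00ue 12",  # simulated bit rot
    }
    print(audit_and_repair(boxes))
```

Because every box can be repaired from its peers, no single box needs its own backup, which is the point the paragraph above makes.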

The LOCKSS system also has a “migration on the fly” process for converting content from one format to a newer one. In addition to reducing the cost of ingest, this process allows the reader to see the content in the best format available at the time access is required.
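As a rough illustration of migrating at access time rather than at ingest, the sketch below stores content untouched and converts it only when a reader asks for a format the stored copy does not match. The converter registry, the format labels, and the `serve` function are all hypothetical and are not taken from the LOCKSS software.

```python
from typing import Callable

# Hypothetical registry: (stored format, requested format) -> conversion function.
CONVERTERS: dict[tuple[str, str], Callable[[bytes], bytes]] = {
    ("text/latin-1", "text/utf-8"): lambda raw: raw.decode("latin-1").encode("utf-8"),
}


def serve(stored: bytes, stored_format: str, accepted: list[str]) -> tuple[bytes, str]:
    """Return the content in the first accepted format that can be produced.

    The stored bytes are never rewritten, so nothing is converted at ingest.
    """
    if stored_format in accepted:
        return stored, stored_format
    for target in accepted:
        convert = CONVERTERS.get((stored_format, target))
        if convert is not None:
            return convert(stored), target
    # No converter is known: fall back to the original rather than fail.
    return stored, stored_format
```

Adding support for a newer format then means registering one more converter; the preserved originals stay as they were collected.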

In 2006, a group of publishers, librarians, and learned societies launched an initiative that uses the LOCKSS technology to support a large, community-managed archive serving as a failsafe repository for scholarly content. The Controlled LOCKSS (CLOCKSS) Program aimed to give the global research and scholarly community perpetual access to unavailable, orphaned, or abandoned content in the event of a long-term business interruption. Because of the initiative’s funding constraints, the proposed archive had to be housed on extremely affordable and reliable technology.

According to Vicky Reich, LOCKSS Program Director, “The single greatest threat to preserving materials over the long term is money. Societies will have good times and bad. Keeping content safe must be a marginal expense in order to decrease the threats during bad times as well as to maximize available funds for new acquisitions during good times.”

While searching for a storage platform with the right balance of price and performance for CLOCKSS, Reich was directed to the Internet Archive (www.archive.org) and its experience building a massive, continuously scaled datastore based on Capricorn Technologies’ PetaBox devices.

“Until we discovered Capricorn, entry costs for our huge digital preservation project were prohibitive. Thanks to their low-entry price and low overall maintenance costs, we have been able to proceed with our plans for housing large datastores in various locations on a very limited budget.”

Today, nearly 100 publishers have chosen to preserve their materials in libraries using the LOCKSS system. In addition to running LOCKSS boxes to preserve electronic journals, libraries and consortia are now preserving a wide variety of content, including image collections, web sites, electronic theses, and archival and manuscript collections. The U.S. Government Printing Office is leading a document preservation project for federal depository libraries, and many state governments are joining the preservation effort.

As Vicky Reich states, “It’s remarkable what a community and a bit of open source software can accomplish.”

To learn more about LOCKSS and CLOCKSS, visit www.lockss.org and www.lockss.org/clockss

* In Thomas Jefferson: Writings: Autobiography, Notes on the State of Virginia, Public and Private Papers, Addresses, Letters, edited by Merrill D. Peterson. New York: Library of America 1984.