Posts Tagged ‘LOCKSS’

E-journal archiving solutions

Monday, July 14th, 2008

A JISC-funded report on e-journal archiving solutions is an interesting read, and it seems as if no particular solution is the hands-down “winner”.

Terry Morrow, et al. recently wrote a report sponsored by JISC called “A Comparative study of e-journal archiving solutions“. Its goal was to compare & contrast various technical solutions to archiving electronic journals and present the informed opinion on the subject.

Begged and unanswered questions

The report begins by setting the stage. Of particular note is the increased movement to e-only journal solutions many libraries are adopting. This e-only approach begs unanswered questions regarding the preservation and archiving of electronic journals — two similar but different aspects of content curation. To what degree will e-journals suffer from technical obsolescence? Just as importantly, how will the change in publishing business models, where access, not content, is provided through license agreements effect perpetual access and long-term preservation of e-journals?

Two preservation techniques

The report outlines two broad techniques to accomplish the curation of e-journal content. On one hand there is “source file” preservation where content (articles) are provided by the publisher to a third-party archive. This is the raw data of the articles — possibly SGML files, XML files, Word documents, etc. — as opposed to the “presentation” files intended for display. This approach is seen as being more complete, but relies heavily on active publisher and third party participation. This is the model employed by Portico. The other technique is harvesting. In this case the “presentation” files are archived from the Web. This method is more akin to the traditional way libraries preserved and archived their materials. This is the model employed by LOCKSS.

Compare & contrast

In order to come their conclusions, Morrow et al. compared & contrasted six different e-journal preservation initiatives while looking through the lense of four possible trigger events. These initiatives (technical archiving solutions) included:

  1. British Library e-Journal Digital Archive – a fledgling initiative by a national library
  2. CLOCKSS – a dark archive of articles using the same infrastructure as LOCKSS
  3. e-Depot – a national library initiative from The Netherlands
  4. LOCKSS – an open source and distributed harvesting implementation
  5. OCLC ECO – an aggregation of aggregators, not really preservation
  6. Portico – a Mellon-backed “source file” approach

The trigger events included:

  1. cancelation of an e-journal title
  2. e-journal no longer available from a publisher
  3. publisher ceased operation
  4. catastrophic hardware or network failure

These characteristics made up a matrix and enabled Morrow, et al. to describe what would happen with each initiative under each trigger event. In summary, they would all function but it seems the LOCKSS solution would provide immediate access to content whereas most of the other solutions would only provide delayed access. Unfortunately, the LOCKSS initiative seems to have less publisher backing than the Portico initiative. On the other hand, the Portico initiative costs more money and assumes a lot of new responsibilities from publishers.

In today’s environment where information is more routinely sold and licensed, I wonder whether or not what level of trust can be given to publishers. What’s in it for them? In the end, neither solution — LOCKSS nor Portico — can be considered ideal, and both ought to be employed at the present time. One size does not fit all.

Recommendations

In the end there were ten recommendations:

  1. carry out risk assessments
  2. cooperate with one or more external e-journal archiving solutions
  3. develop standard cross-industry definitions of trigger events and protocols
  4. ensure archiving solutions cover publishers of value to UK libraries
  5. explicitly state perpetual access policies
  6. follow the Transfer Code of Practice
  7. gather and share statistical information about the likelihood of trigger events
  8. provide greater detail of coverage details
  9. review and update this study on a regular basis
  10. take the initiative by specifying archiving requirements when negotiating licenses

Obviously the report went into much greater detail regarding all of these recommendations and how they derived. Read the report for the details.

There are many aspects that make up librarianship. Preservation is just one of them. Unfortunately, when it comes to preservation of electronic, born-digital content, the jury is still out. I’m afraid we are suffering from a wealth of content right now, but in the future this content may not be accessible because society has not thought very long into the future regarding preservation and archiving. I hope we are not creating a Digital Dark Age as we speak. Implementing ideas from this report will help reduce the possibility of this problem from becoming a reality.