Open access publishing

Introduction

This essay outlines the history and development of open access publishing from the author's perspective, and it advocates that librarians take a more active role in making open access publishing a norm for facilitating scholarly communication rather than an exception. (This essay is also available as an abbreviated one-page PDF file as well as a set of PowerPoint slides.)

History

The history of open access publishing has been woven with strands from the serials pricing crisis and the development of the Internet.

I, unknowingly, had my first experience with the serials pricing crisis in 1989 or so when I was a circuit-rider medical librarian at the Catawba-Wateree AHEC Library in rural Lancaster, South Carolina. My small collection of 102 journals included, I believe, the Journal of Anesthesiology. For many reasons, this particular title was rarely used, if used at all. Consequently, when I noticed a very large increase in its price, I quickly cancelled the title and never looked back.

A few years later I was working at the North Carolina State University Libraries, and I participated in an ARL Collection Analysis Project. The purpose of the project was to outline a plan for formalizing the newly created Collection Development Department. It was at that time that I learned about the scholarly communication process and the "serials pricing crisis".

Technologically speaking, this was a time (around 1992) of increasing access to Internet email. FTP was the primary way to put and get data on remote computers, and telnet was becoming a more popular way to connect to "online public access catalogs". WAIS and Gopher servers were considered cool. This was also a time of a number of newly created electronic scholarly journals. Some of the more popular were Postmodern Culture, The Public Access Computer Systems Review, Bryn Mawr Classical Review, and Psycoloquy. These titles were distributed via email, were possibly read, and then usually deleted. In general, libraries did not have an idea of how to deal with this new format. The not uncommon solution was to print the texts, bind them, and put them on the shelf. In my own way, I created the Mr. Serials Process, believing that if the library community demonstrated it could provide collection, organization, preservation, and dissemination services, then free electronic serial publishing would thrive.

Psycoloquy is a free title of particular note because it was spearheaded by Stevan Harnad. You might say Harnad has been a driving force behind open access publishing for more than ten years now, and his "subversive proposal", posted to a Virginia Tech mailing list in 1994, is a landmark in the history of open access. In a nutshell, the "subversive proposal" advocated the continuance of peer-reviewed scholarly publishing in print form, but it also advocated that scholarly articles be digitally self-archived by authors and made freely available through the Internet, much like the online network of pre-print archives of the physics community led by Paul Ginsparg:

The scholarly author wants only to publish them [, the articles], that is, to reach the eyes and minds of peers, fellow esoteric scientists and scholars the world over, so that they can build on one another's contributions in that cumulative... If every esoteric author in the world this very day established a globally accessible local ftp archive for every piece of esoteric writing from this day forward, the long-heralded transition from paper publication to purely electronic publication (of esoteric research) would follow suit almost immediately... The subversion will be complete, because the (esoteric -- no-market) peer-reviewed literature will have taken to the airwaves, where it always belonged, and those airwaves will be free (to the benefit of us all) because their true minimal expenses will be covered the optimal way for the unimpeded flow of esoteric knowledge to all: In advance. --Stevan Harnad (June 27, 1994)

Time passed. FTP, telnet, and gopher gave way to the World Wide Web. The plain text, ASCII-based electronic scholarly serials faded away or made the transition to centralized, Web-based distribution methods. (This transition made the Mr. Serials Process break, but in its place I created a poorly named service called Index Morganagus.) An increasing number of new electronic journals started springing up, taking advantage of the graphic capabilities of HTML. More and more people were getting on the Internet, and the convenience of electronic access could not be overstated. People liked the ability to download and read articles without going to the library to find them.

Publishers also took notice and started creating electronic versions of their print titles. The prices of serials were still going through the roof, and fair use, in a time of licensing agreements, was seen as a moot point. The lure of desktop access was too great, and licensing trumped the access libraries traditionally provided. This was the time of the "Big Deal", a phrase coined by Kenneth Frazier to denote "an online aggregation of journals that publishers offer as a one-price, one size fits all package". And that is what it was (and still is), a "package" deal that too many libraries find hard to pass up because it provides increased access to titles, even though it sacrifices archivable collections. All the while, the prices continue to skyrocket, and to meet these increased costs other things, like other serial titles or monographic items, are being cut.

More time passes, and there are more reactions to the events at hand. Some editors of high-priced journals leave their individual titles to go to less expensive titles or create their own titles. The preprint servers of the physics community continue to grow. The library community, specifically the academic research library community, forms SPARC in 1998 in reaction to the "serials crisis". This initiative, whose name is an acronym for Scholarly Publishing and Academic Resources Coalition, had and still has three main goals: 1) to provide alternative titles to the high-priced commercial titles, 2) to encourage leading-edge publishing efforts, and 3) to foster relationships between publishers and scholars. To these ends, SPARC is an advocate for taking control of the scholarly publishing process and helps others publish scholarly journals. SPARC members, mostly libraries, help subsidize these ventures through institutional memberships and subscriptions. Some but not all of the SPARC titles are free in the way alluded to by Harnad, and I find it interesting that one of the reasons SPARC leadership gives for the creation of the Coalition is that "the growing commercialization of scientific communication has turned upside-down the traditional 'gift exchange' between researchers, societies and publishers." More on that later.

The computer technology community is not standing still either. For example, Ed Fox of Virginia Tech leads an effort in 1996 to create a repository known as the Networked Digital Library of Theses and Dissertations (NDLTD), whose concept is very similar to the concept behind pre-print servers: make the information as widely and as easily accessible as possible. It is also around this time that Brewster Kahle (author of WAIS) starts the Internet Archive, and the National Center for Biotechnology Information forms PubMed, essentially a free version of MEDLINE. In terms of the topic at hand, probably the most significant thing to develop is the Open Archives Initiative (OAI) in 1999. OAI is a computer protocol initially designed to harvest metadata from things exactly like the physics preprint servers. Through OAI a person can collect the salient information about sets of bibliographic information and provide services against this information. This protocol filled the niche Harnad saw unresolved when he first articulated his "subversive proposal"; with OAI it is relatively easy to collect the necessary metadata describing journal articles and provide searching services against that data. Incidentally, since the development of the protocol, now called the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), OAI has become a de facto standard in digital library initiatives.
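To make the harvesting idea concrete, here is a minimal sketch in Python of what an OAI-PMH harvester does: it builds a ListRecords request and pulls Dublin Core titles and creators out of the XML that comes back. The base URL, the record identifier, and the sample response are hypothetical and exist only for illustration. A real harvester would also fetch the URL over HTTP and follow resumption tokens for large result sets, both of which are left out here.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH 2.0 responses and unqualified Dublin Core
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# A hypothetical, hand-written response standing in for a real repository's output
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A hypothetical preprint</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for a repository's base URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urllib.parse.urlencode(params)


def parse_records(xml_text):
    """Extract the identifier, titles, and creators of each harvested record."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        titles = [e.text for e in record.iterfind(".//dc:title", NS)]
        creators = [e.text for e in record.iterfind(".//dc:creator", NS)]
        records.append(
            {"identifier": identifier, "titles": titles, "creators": creators}
        )
    return records


if __name__ == "__main__":
    print(list_records_url("http://example.org/oai"))
    for rec in parse_records(SAMPLE_RESPONSE):
        print(rec["identifier"], rec["titles"], rec["creators"])
```

Once records are reduced to simple dictionaries like these, they can be loaded into whatever index or catalog a library already uses, which is the "provide services against this information" half of the idea.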

Serial prices continue to rise. Other things in libraries continue to get cut. More people are beginning to see what is happening, more scholars/researchers are affected, and the phrase "open access" is coined in the Budapest Open Access Initiative (BOAI) in 2002. This Initiative is/was an attempt to see how much an organization called the Open Society Institute (OSI) could help resolve the problems of scholarly communication. One of the outcomes of this attempt was a definition of open access:

By "open access" to this [scientific and scholarly] literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited. --Budapest, Hungary (February 14, 2002)

The BOAI goes on to become the model for other open access statements such as the Bethesda Statement on Open Access Publishing, the Berlin Declaration on Open Access to Knowledge, the ACRL Principles and Strategies for the Reform of Scholarly Communication, and the IFLA Statement on Open Access to Research Data from Public Funding. All the while there is a small but growing number of "independence statements" from journal article editors/staff as well as formal statements from larger and larger institutions of higher education as some of them move away from the Big Deal.

By this time essentially two types of open access have taken shape: green and gold. "Green" open access happens when publishers give the "green light" for authors to self-archive their materials. According to SHERPA (Securing a Hybrid Environment for Research Preservation and Access), as many as 90% of journal publishers surveyed have given authors the "green light". "Gold" open access is the "golden road" of institutional repositories. Institutional repositories can be centered around societies like the physics archives, around government-sponsored repositories like PubMed, or around universities like MIT's DSpace.

Open access is about providing free access to scholarly, peer-reviewed literature, but the processes of publication and peer-review require time and money. Who is going to pay for these services? The Public Library of Science (PLoS) and BioMed Central have an answer to this question, "The author pays through page charges." These fees currently range from $500 to $1,500 per article, and PLoS and BioMed Central advocate these fees be supplemented by authors' institutions or research dollars. David Shulenberger, an economist and the Executive Vice Chancellor of the University of Kansas, does not advocate author-pay models. He believes page charges will increase faster than subscription prices because individual authors have even less leverage than library consortia, and authors have a very great incentive to pay page charges, namely promotion and tenure.

On the library front, folks at Lund University, with help from SPARC and OSI, created the Directory Of Open Access Journals (DOAJ). Presently the directory contains approximately 1,300 open access titles categorized using a limited number of subject terms. The content of the Directory is accessible in many different digital formats. In another attempt to provide value-added access to open access literature in my own way, I mirrored the full-text of a subset of the DOAJ titles and created a full-text index called DOAJ Index. The mirroring and indexing functioned well, but the result was not very usable because the journal content was not very well structured. It contained little metadata. The folks at the DOAJ have used OAI-PMH to harvest the metadata from the OAI-accessible titles to create an index, but again, the metadata is incomplete and the indexing is not as desirable as full-text indexing.

Most recently, governments have begun hearing about the problems. The United States government is a bit behind the times compared to other countries, but on September 17, 2004 the Federal Register recorded a notice by the National Institutes of Health (NIH) in support of open access:

The NIH intends to request that its grantees and supported Principal Investigators provide the NIH with electronic copies of all final version manuscripts upon acceptance for publication if the research was supported in whole or in part by NIH funding. This would include all research grants, cooperative agreements, and contracts, as well as National Research Service Award (NRSA) fellowships. We define final manuscript as the author's version resulting after all modifications due to the peer review process.... The NIH considers final manuscripts to be an important record of the research funded by the Government and will archive these manuscripts and any appropriate supplementary information in PubMed Central (PMC), NIH's digital repository for biomedical research.

You are encouraged to comment on the notice by November 16, 2004.

Librarianship

What do we librarians traditionally do, and how can we help solve these problems?

I like to boil librarianship down to a handful of processes surrounding data, information, and knowledge, namely: collection, organization, archiving and preservation, and dissemination. Traditionally, these processes have revolved around tangible items such as books and journals. If libraries expect to be part of the solution to the "serials crisis", and if open access articles -- whether they be self-archived or contained in institutional repositories -- are a part of that solution, then libraries need to learn how to apply the traditional processes to the new medium. The real challenge will be developing the skills in librarians who will do the implementation.

Technically speaking, collection is easy, almost trivial. Libraries can use mirroring techniques to simply copy data from one location to another. Using OAI-PMH to collect the metadata is the next best thing. Organization is a bit more difficult, but much of traditional cataloging can come into play. The issues of archiving and preservation have not been ironed out. LOCKSS (Lots of Copies Keep Stuff Safe) provides one solution, but migrating data from one format to another will almost undoubtedly be a part of the long-term solution. Dissemination is the hardest problem. Successful indexing techniques are only as good as the structure of the underlying data; if the structure is poor, then search will not be so great. Librarians might consider providing the metadata for open access literature or helping devise systems for deriving metadata automatically. Dissemination does not stop with indexing and search. The wealth of data/information available on the Web has increased people's expectations when it comes to information retrieval. People expect Google-esque simplicity and Amazon.com-like services. At the same time, and at the risk of sounding passé, everybody is still "drinking from the fire hose". What people desire are ways to manage their information and put it to use. We need to ask ourselves, "What can we do to help learners, teachers, and scholars turn their data and information into knowledge and wisdom?" This is the real challenge here in the early 21st Century.
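The point that indexing is only as good as the structure of the underlying data can be illustrated with a small sketch. The Python below builds a toy inverted index over a handful of hypothetical records. The record layout, field names, and sample data are my own assumptions for illustration, not a description of any particular system.

```python
from collections import defaultdict

# Hypothetical records: "a1" arrives with metadata, "a2" is bare full text
RECORDS = [
    {
        "id": "a1",
        "creator": "Smith",
        "title": "Open archives",
        "text": "open archives by smith",
    },
    {
        "id": "a2",
        "text": "a review that mentions smith in passing",
    },
]


def build_index(records, fields):
    """Map (field, word) pairs to the set of record ids containing that word."""
    index = defaultdict(set)
    for record in records:
        for field in fields:
            # A field missing from a record simply contributes no entries
            for word in record.get(field, "").lower().split():
                index[(field, word)].add(record["id"])
    return index


def search(index, field, word):
    """Return the sorted ids of records whose given field contains the word."""
    return sorted(index.get((field, word.lower()), set()))


if __name__ == "__main__":
    index = build_index(RECORDS, ["creator", "title", "text"])
    print(search(index, "text", "smith"))     # full-text search
    print(search(index, "creator", "smith"))  # fielded search
```

A full-text search for "smith" finds both records, while a search limited to the creator field finds only the record that came with metadata. Without structured fields there is no way to ask the second question at all, which is why providing or deriving metadata matters so much for dissemination.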

Besides implementing these traditional processes, libraries, in order to be part of the solution, must build stronger relationships with scholars and publishers. Most importantly, scholars and publishers need to trust librarians. They need to feel confident that librarians can hold up their end of the bargain. "If I publish my stuff electronically, will you, librarian, do the things you do best for the materials I create?" The answer has to be a definitive "Yes", and it has to be backed up by action. SPARC, the recently formed SPARC Europe, the activities they sponsor, and other national activities are great starts, but more has to happen on the local level. Conversations have to take place. A sincere appreciation of what everybody (scholars, publishers, and librarians) desires needs to be shared and taken to heart. As Frazier said, scholarly publishing has been turned upside-down, and in order to turn it right-side up again, each one of us needs to take some sort of action.

Open source software

I would not be the individual librarian I am if I were not to mention open source software.

I believe open source software has a lot in common with open access literature. As alluded to above, scholarly publishing has sometimes been seen as a form of "gift exchange". The more you give, the more you are respected. Eric Raymond compared open source software to gift cultures, and I have tried to make similar comparisons to librarianship elsewhere. Providing open source software and providing open access to scholarly journal literature are both forms of gift exchange.

Both open source software and open access to scholarly journal literature are "free as a free kitten". When you get the free kitten it is soft, cute, and adorable. It makes you feel warm and fuzzy inside. Then you buy the kitten a collar, food, shots, etc. The kitten gets older and claws at the furniture. The kitten escapes the house overnight and you feel ill until it returns. While the kitten was originally "free", you made investments, both financial and emotional, in the kitten, and it is no longer "free". Open source software and open access literature are the same way. They are free up front, but you must make investments down the line in order to maintain them. There still is no such thing as a free lunch.

Both open source software and open access scholarly journal literature are sometimes seen as public goods, created for the benefit of society, not private interests. By restricting access to source code or peer-reviewed articles, control over one's (computing or scholarly) environment is also restricted. On the other hand, open source and open access make it easier to build on other people's achievements. Cooperation is more productive than competition.

Specifically in librarianship, the use of open source software and the curation of open access journals require new skill sets of the profession. Developing these skills should not be seen as an extra expense, but rather as an investment in the future.

Finally, and perhaps most obviously, both open source software and open access journal literature depend on peer-review for acceptance. Peer-review is an essential part of the open access movement. Open access is clearly differentiated from other forms of writing where monetary remuneration is expected and peer-review does not take place. Open source software, because the source code can be read, goes through a sort of peer-review process too. The software is read, people comment, and the software is improved. The same holds true for scholarly literature.

Conclusion

In summary, the open access movement has been fermenting for at least a decade. It is the combined result of and reactions to the "serials pricing crisis" and the development of globally networked computers -- the Internet. Librarians are seen by scholars, open access journal publishers, and administrators as partners in the scholarly communications process. As partners librarians must learn to COAPP with the problem:

These are exciting times in the world of data, information, and knowledge. There are so many opportunities. It is an exciting time to be a librarian.


Creator: Eric Lease Morgan <eric_morgan@infomotions.com>
Source: This presentation was given at an LILRC meeting at Dowling College, NY on October 25, 2005.
Date created: 2004-10-20
Date updated: 2004-12-04
Subject(s): presentations; LILRC (Long Island Library Resources Council); open access publishing;
URL: http://infomotions.com/musings/open-access/