Open Source Software in Libraries: A Workshop


Table of Contents

1. Introduction
Purpose and scope of this text/workshop
2. Open Source Software in Libraries
Introduction
What is OSS
Techniques for developing and implementing OSS
OSS Compared to Librarianship
Prominent OSS Packages
State of OSS in Libraries
National leadership
Mainstreaming, workshops, and training
Usability and packaging
Economic viability
Redefining the ILS
Open source data
Conclusion and next steps
Notes
3. Gift Cultures, Librarianship, and Open Source Software Development
Gift Cultures, Librarianship, and Open Source Software Development
Acknowledgements
Notes
4. Comparing Open Source Indexers
Abstract
Indexers
freeWAIS-sf
Harvest
ht://Dig
Isite/Isearch
MPS
SWISH
WebGlimpse
Yaz/Zebra
Local examples
Summary and information systems
Links
5. Selected OSS
Introduction
Apache
CVS
DocBook stylesheets
FOP
GNU tools
Hypermail
Koha
MARC::Record
MyLibrary
MySQL
Perl
swish-e
xsltproc
YAZ and Zebra
6. Hands-on activities
Introduction
Installing and running Perl
Installing MySQL
Installing Apache
CVS
Hypermail
MARC::Record
swish-e
YAZ
Koha
MyLibrary
xsltproc
7. GNU General Public License
Preamble
GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
NO WARRANTY

This text is a part of a hands-on workshop intended to describe and illustrate open source software and its techniques to small groups of librarians. Given this text, the accompanying set of software, and reasonable access to a (Unix) computer, the student should be able to read the essays, work through the exercises, and become familiar with open source software especially as it pertains to libraries.

I make no bones about it, this text is the combination of previous essays I've written about open source software as well as a couple of newer items. For example, the second chapter is the opening chapter I wrote for a LITA Guide in 2002 ("Open Source Software for Libraries," in Karen Coyle, ed., Open Source Software for Libraries. Chicago: American Library Association, 2002, pp. 7-18). The third chapter, comparing open source software, gift cultures, and librarianship, was originally formally published as a book review for Information Technology and Libraries (volume 19, number 2, March 2000). The chapter on open source software indexers is definitely getting old. It was presented at the O'Reilly Open Source Convention, San Diego, CA, July 23-27, 2001. The following section is built from the content of a 2001 American Library Association Annual Conference presentation. The new materials are embodied in the list of selected software and the hands-on activities.

I believe open source software is more about building communities and less about computer programs. It is more about making the world a better place and less about personal profit. Allow me to explain.

I have been giving away my software ever since Steve Cisler welcomed me into the Apple Library of Tomorrow (ALOT) fold in the very late 1980's. Through my associations with Steve and ALOT I came to write a book about Macintosh-based HTTP servers as well as an AppleScript-based CGI script called email.cgi in 1994.

This simple little script was originally developed for two purposes. First and foremost it was intended to demonstrate how to write an AppleScript Common Gateway Interface (CGI) application. Second, it was intended to fill a gap in the Web browsers of the time, namely the inability of MacWeb to support mailto URL's. Since then the script has evolved into an application taking the contents of an HTML form, formatting it, and sending the results to one or more email addresses. It works very much like a C program called cgiemail. As TCP utilities have evolved over the years so has email.cgi, and to this date I still get requests for technical support from all over the world, but almost invariably the messages start out something like this. "Thank you so very much for email.cgi. It is a wonderful program, but..." That's okay. The program works and it has helped many people in many ways -- more ways than I am able to count because the vast majority of people never contacted me personally.

As I was bringing this workbook together I thought about Steve Cisler again, and I remembered a conference Apple Computer sponsored in 1995 called Ties That Bind: Converging Communities. (A pretty bad travel log documenting my experiences at this conference is available at http://infomotions.com/travel/ties-that-bind-95/.) In the conference we shared and discussed ideas about community and the ways technology can help make communities happen. In between sessions Cisler displayed the original piece of art that became the motif for the conference. He noted that he got the painting in Australia sometime the previous year. He liked it for its simplicity and connectivity. The painting is acrylic, approximately 1' 6" x 2' 6", and is composed of many simple dots of color.

The image at the top of the page is that piece of art, and it is significant today. It too is "a lot" (all puns intended) like open source software and "the Unix way." The value of open source software is measured in terms of its simplicity and connectivity. The simpler and more connective the software, the more it is valued. The Unix way is a philosophy of computing. It posits that a computer program will take some input, do some processing, and provide some output. There is very little human interface to these sorts of programs because they get their input from a thing called standard input (STDIN) and send their output to a thing called standard output (STDOUT). If errors occur, they are sent to standard error (STDERR). Since the applications are expected to get their input from STDIN and send it to STDOUT, it is possible to string many of them together to create a working application. Connectivity. Such a design philosophy allows tiny programs to focus on one thing, and one thing only. Simplicity. This modular approach allows for the creation of new applications by adding or deleting modules from the string.
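The Unix way can be illustrated with a short shell pipeline. Each command below is a tiny program doing one thing only; strung together with pipes they become a word-frequency counter. (The example is a generic illustration, not taken from any package in this workbook.)

```shell
# Count word frequencies by stringing together small programs.
# Each program reads STDIN, does one thing, and writes to STDOUT.
echo "apple banana apple cherry banana apple" |
  tr ' ' '\n' |   # break the input into one word per line
  sort |          # group identical words together
  uniq -c |       # count each group
  sort -rn        # list the most frequent words first
```

Adding or deleting a module in the string yields a different application; dropping the final `sort -rn`, for instance, leaves the counts in alphabetical order instead of frequency order.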

The motif brought to my attention by Cisler is a lot like stringing together open source software applications. Each individual dot does not do a whole lot on its own, but strung together they form a pattern. The pattern's whole is greater than the sum of its parts. This is true of communities as well. Individuals bring something to the community, and the community is made better for the contribution. The open source community exists because of individuals. These individuals have particular strengths (and weaknesses). As people add what they can to the community, the community is strengthened. The rewards for these contributions are rarely monetary. Instead, the contributions are paid for with respect. People who give freely of themselves and their time are rewarded by the community as experts whose opinions are to be taken seriously. True, participation in open source software activities does not always put food on the table, but neither do other community-based activities our society values to one degree or another such as participation in community theater, helping out at the local soup kitchen, being involved in church activities, picking up litter, giving directions to a stranger, supporting charities, participating in fund-raisers, etc. Open source software is about communities, communities that have been easier to create with the advent of globally networked computers. As described later, it is about "scratching an itch" to solve a problem, but it is also about giving "freely" to the community in the hopes that the community will be better off for it in the end.

A few years after writing email.cgi, I participated in another application called MyLibrary. This portal application grew out of a set of focus group interviews where faculty of the NC State University said they were suffering from information overload. In late 1997, when these interviews were taking place, services like My Yahoo, My Excite, My Netscape, and My DejaNews were making their initial appearance. In the Digital Library Initiatives Department, where I worked with Keith Morgan and Doris Sigl, we thought a similar application based on library content (bibliographic databases, electronic journals, and Internet resources) organized by subjects (disciplines) might prove to be a possible solution to the information overload problem. By prescribing sets of resources to specific groups of people we (the Libraries) could offer focused content as well as provide access to the complete world of available information.

Since I relinquished my copyrights to the University, and since the software has been distributed under the GNU General Public License, the software has been downloaded about 350 times, mostly by academic libraries. The specific number of active developers is unknown, but many institutions that have downloaded the software have used it as a model for their own purposes. In most cases these institutions have taken the system's database structure and experimented with various interfaces and alternative services. Such institutions include, but are not limited to, the University of Michigan, the California Digital Library, Wheaton College, Los Alamos Laboratory, Lund University (Sweden), the University of Cattaneo (Italy), and the University of New Brunswick. Numerous presentations have been given about MyLibrary in venues such as Harvard University, Oxford University, the Alberta Library, the Canadian Library Association, the ACRL Annual Meeting, and ASIS.

As I see it, there are three impediments restricting greater success of the project: system I/O, database restructuring, and technical expertise. MyLibrary is essentially a database application with a Web front-end. In order to distribute content, data must be saved in the database. The question then is, "How will the data be entered?" Right now it must be done by content providers (librarians), but the effort is tedious, and as the number of bibliographic databases and electronic journals grows so does the tedium. Lately I have been experimenting with the use of RDF as an import/export mechanism. By relying on some sort of XML standard the system will be able to divorce itself from any particular database application such as an OPAC, and the system will be more able to share its data with other portal applications such as uPortal, My Netscape, or O'Reilly's Meerkat through something like RSS. Yet the problem still remains, "Who is going to do the work?" This is a staffing issue, not necessarily a technical one.

In order to facilitate the needs of a wider audience, the underlying database needs to be restructured. For example, the database contains tables for bibliographic databases, electronic journals, and "reference shelf" items. Each of the items in these tables is classified using a set of controlled vocabulary terms called disciplines. Many institutions want to create alternative data types such as images, associations, or Internet resources. Presently, to accomplish this task oodles of code must be duplicated, bloating the underlying Perl module. Instead, a new table needs to be created to contain a new controlled vocabulary called "formats". Once this table is created all the information resources could be collapsed into a single table and classified with the new controlled vocabulary as well as the disciplines. Furthermore, a third controlled vocabulary -- intended audience -- could be created so the resources could be classified even further. Given such a structure the system could be more exact when it comes to initially prescribing resources and allowing users to customize their selections. Again, the real problem here is not necessarily technical but intellectual. Librarians make judgments about resources in terms of a resource's aboutness, intended audience, and format all the time, but rarely on such a large-scale, systematic basis. Our present cataloging methods do not accommodate this sort of analysis, so how will such analysis become institutionalized in our libraries?
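To make the proposed restructuring concrete, here is a hypothetical sketch of the collapsed schema. It uses SQLite purely for illustration (MyLibrary itself runs against MySQL), and every table and column name below is invented for the example, not taken from the MyLibrary source:

```shell
sqlite3 example.db <<'SQL'
-- Two controlled vocabularies: the proposed "formats" and the existing disciplines.
CREATE TABLE formats     (format_id     INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE disciplines (discipline_id INTEGER PRIMARY KEY, name TEXT);

-- One table replaces the separate tables for databases, journals, and
-- "reference shelf" items; the format column says what kind of thing each row is.
CREATE TABLE resources (
  resource_id   INTEGER PRIMARY KEY,
  title         TEXT,
  format_id     INTEGER REFERENCES formats(format_id),
  discipline_id INTEGER REFERENCES disciplines(discipline_id)
);

INSERT INTO formats     VALUES (1, 'bibliographic database');
INSERT INTO disciplines VALUES (1, 'biology');
INSERT INTO resources   VALUES (1, 'BIOSIS', 1, 1);

-- Prescribing resources becomes a query across the vocabularies:
SELECT title FROM resources WHERE format_id = 1 AND discipline_id = 1;
SQL
```

Adding the third vocabulary for intended audience would then mean one more small table and one more column, rather than another round of duplicated Perl code.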

The comparatively low level of technical expertise in libraries is also a barrier to wider acceptance of the system. MyLibrary runs. It doesn't crash nor hang. It does not output garbage data. It works as advertised, but to install the program initially requires technical expertise beyond the scope of most libraries. It requires the installation of a database program. MySQL is the current favorite, but there are all sorts of things that can go wrong with a MySQL installation. Similarly, MyLibrary is written in Perl. Installing Perl from source usually requires answering a host of questions about your computer's environment, and in all my nine or ten years of compiling Perl I still don't know what some of those questions mean, so I simply go with the defaults. Then there are all the Perl modules MyLibrary requires. They are a real pain to install, and unless you have done these sorts of installs before the process can be quite overwhelming. In short, getting MyLibrary installed is not like the Microsoft wizard process; you have to know a lot about your host computer before you can even get it up and running, and most libraries do not employ enough people with this sort of expertise to make the process comfortable.
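Much of this pain is discoverable before the install begins. The sketch below is a hypothetical pre-flight check, not part of the MyLibrary distribution: it tests whether a handful of Perl modules are present and suggests the standard CPAN command for any that are missing. The module names are illustrative, not MyLibrary's actual requirements list.

```shell
# A hypothetical pre-flight check: verify each prerequisite Perl module
# before attempting an install. (Module names here are only examples.)
for module in DBI CGI; do
  if perl -M"$module" -e 1 2>/dev/null; then
    echo "$module: installed"
  else
    echo "$module: missing -- try: perl -MCPAN -e 'install $module'"
  fi
done
```

Running a check like this turns a mysterious mid-install failure into a plain to-do list, which is exactly the sort of hand-holding the Microsoft wizard process provides and source installs usually do not.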

This workbook brings together much of my experience with open source software. It describes sets of successful open source software projects and tries to enumerate the qualities of a successful project. The workbook has been written in the hopes people will read it, give the exercises a whirl, learn from the experience, and share their newly acquired expertise with the world at large. Through this process I hope we can make the world we live in just a little bit better. Idealist? Maybe. A worthy goal? Definitely.

OSS is both a philosophy and a process. As a philosophy it describes the intended use of software and methods for its distribution. Depending on your perspective, the concept of OSS is a relatively new idea, being only four or five years old. On the other hand, the GNU Project -- a project advocating the distribution of "free" software -- has been operational since the mid '80's. Consequently, the ideas behind OSS have been around longer than you may think. The story begins with a man named Richard Stallman, who worked at MIT in an environment where software was shared. In the mid '80's Stallman resigned from MIT to begin developing GNU -- a software project intended to create an operating system much like Unix. (GNU is pronounced "guh-NEW" and is a recursive acronym for GNU's Not Unix.) His desire was to create "free" software, but the term "free" should be equated with freedom, and as such people who use "free" software should be free to run it, study it, change it, and redistribute copies of it.

Put another way, the term "free" should be equated with the Latin "liberare," meaning to liberate, and not necessarily "gratis," meaning without return made or expected. In the words of Stallman, we should "think of 'free' as in 'free speech,' not as in 'free beer.'"[1]

Fast forward to the early '90's, when Linus Torvalds successfully develops Linux, a "free" operating system on par with any commercial Unix distribution. Fast forward again to the late '90's, when globally networked computers are an everyday reality and the .com boom is booming. There you will find the birth of the term "open source", a term used to describe how software is licensed: the source code must be made available, and anyone must be free to modify and redistribute it.

OSS is also a process for the creation and maintenance of software. This is not a formalized process, but rather a process of convention with common characteristics between software projects. First and foremost, the developer of a software project is almost always trying to solve a specific computer problem, an activity commonly called "scratching an itch." The developer realizes other people may have the same problem(s), and consequently the developer makes the project's source code available on the 'Net in the hopes other people can use it too.

If there seems to be a common need for the software, a mailing list is usually created to facilitate communication, and the list is hopefully archived for future reference. Since the software is almost always in a state of flux, developers need some sort of version control software to help manage the project's components. The most common version control software is called CVS (Concurrent Versions System). Co-developers then "hack away" at the project adding features they desire and/or fixing bugs of previous releases. As these features and fixes are created the source code's modifications, in the form of "diff" files -- specialized files explicitly listing the differences between two sets of programming code -- are sent back to the project's leader. The leader examines the diff files, assesses their value, and decides whether or not to integrate them into the master archive. The cycle then begins anew. Much of a project's success relies on the primary developer's ability to foster communication and a sense of community around a project. Once accomplished, the "two heads are better than one" philosophy takes effect and the project matures.
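The diff-and-patch round trip can be demonstrated in a few lines of shell. Here a co-developer's change to a made-up one-line program is captured with `diff` and integrated by the project leader with `patch`:

```shell
# The leader's copy of the program, and a co-developer's improved copy.
printf 'print "Hello, world\n";\n'         > original.pl
printf 'print "Hello, library world\n";\n' > hacked.pl

# The co-developer captures the difference as a "diff" file and sends it upstream.
diff -u original.pl hacked.pl > fix.diff

# The leader examines fix.diff and, approving the change, integrates it.
patch original.pl < fix.diff
cat original.pl   # now contains the co-developer's improvement
```

Because the diff file lists only the changed lines, the leader can review a contribution without re-reading the whole program, which is what makes this workflow practical for projects with many co-developers.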

Writing computer programs is only one part of software development. Software development also requires things such as usability testing, documentation, beta-testing, and a knowledge of staffing issues. Consequently, any environment where computers are used on a daily basis is a place where the techniques of OSS can be practiced. Knowledge of computer programming is not necessary. In fact, a lack of programming experience can even be desirable. You do not have to know how to write computer programs in order to participate in OSS development.

Anybody who uses computers on a daily basis can help develop OSS. For example, you can be a beta-tester who tries to use the software and finds its faults. You can write documentation instructing people how to use the software. You can conduct usability tests against the software, discovering how easy the software is to use (or not) and how well it meets people's expectations. If computer software is intended to make our lives easier, you can evaluate the use of the software and see what sorts of things can be eliminated or how resources can be reallocated in order to run operations more efficiently. All of these things have nothing to do with computer programming, but rather with the use of computers in the workplace.

One of the most definitive sets of writings describing OSS is Eric Raymond's The Cathedral and the Bazaar.[3] These texts, available online as well as in book form, compare and contrast the software development processes of monolithic organizations (cathedrals) with the software processes of less structured, more organic collections of "hackers" (bazaars).[4] The book describes the environment of free software and tries to explain why some programmers are willing to give away the products of their labors. It describes the "hacker milieu" as a "gift culture" -- a culture, he argues, that arises not out of scarcity but out of abundance.

Raymond alludes to a definition of "gift cultures," but not enough to satisfy my curiosity. The literature, more often than not, refers to "gift exchange" and "gift economies" as opposed to "gift cultures." Probably one of the earliest and most comprehensive studies of gift exchange was written by Marcel Mauss.[6] In his analysis he says gifts, with their three obligations of giving, receiving, and repaying, are present in aspects of almost all societies. The process of gift giving strengthens cooperation, competitiveness, and antagonism. It reveals itself in religious, legal, moral, economic, aesthetic, morphological, and mythological aspects of life.[7]

As Gregory states, for the industrial capitalist economies, gifts are nothing but presents or things given, and "that is all that needs to be said on the matter." Ironically for economists, gifts have value and consequently have implications for commodity exchange.[8] He goes on to review studies about gift giving from an anthropological view, studies focusing on tribal communities of various American Indians, cultures from New Guinea and Melanesia, and even ancient Roman, Hindu, and Germanic societies:

The key to understanding gift giving is apprehension of the fact that things in tribal economics are produced by non-alienated labor. This creates a special bond between a producer and his/her product, a bond that is broken in a capitalistic society based on alienated wage-labor.[9]

Ingold, in "Introduction To Social Life," echoes many of the things summarized by Gregory when he states that industrialization is concerned:

exclusively with the dynamics of commodity production. ... Clearly in non-industrial societies, where these conditions do not obtain, the significance of work will be very different. For one thing, people retain control over their own capacity to work and over other productive means, and their activities are carried on in the context of their relationships with kin and community. Indeed their work may have the strengthening or regeneration of these relationships as its principle objective.[10]

In short, the exchange of gifts forges relationships between partners and emphasizes qualitative as opposed to quantitative terms. The producer of the product (or service) takes a personal interest in production, and when the product is given away as a gift it is difficult to quantify the value of the item. Therefore the items exchanged are of a less tangible nature such as obligations, promises, respect, and interpersonal relationships.

As I read Raymond and others I continually saw similarities between librarianship and gift cultures, and therefore similarities between librarianship and OSS development. While the summaries outlined above do not necessarily mention the "abundance" alluded to by Raymond, the existence of abundance is more than mere speculation. Potlatch, a ceremonial feast of the American Indians of the northwest coast marked by the host's lavish distribution of gifts or sometimes destruction of property to demonstrate wealth and generosity with the expectation of eventual reciprocation, is an excellent example.

Libraries have an abundance of data and information. I won't go into whether or not they have an abundance of knowledge or wisdom of the ages. That is another essay. Libraries do not exchange this data and information for money; you don't have to have your credit card ready as you leave the door. Libraries don't accept checks. Instead the exchange is much less tangible. First of all, based on my experience, most librarians simply take pride in their ability to collect, organize, and disseminate data and information in an effective manner. They are curious. They enjoy learning things for learning's sake. It is a sort of Platonic end in itself. Librarians, generally speaking, just like what they do, and they certainly aren't in it for the money. You won't get rich by becoming a librarian.

Even free information is not without financial costs. Information requires time and energy to create, collect, and share, but when an information exchange does take place, it is usually intangible, not monetary, in nature. Information is intangible. It is difficult to assign information a monetary value, especially in a digital environment where it can be duplicated effortlessly:

An exchange process is a process whereby two or more individuals (or groups) exchange goods or services for items of value. In Library Land, one of these individuals is almost always a librarian. The other individuals include tax payers, students, faculty, or in the case of special libraries, fellow employees. The items of value are information and information services exchanged for a perception of worth -- a rating valuing the services rendered. This perception of worth, a highly intangible and difficult thing to measure, is something the user of library services "pays", not to libraries and librarians, but to administrators and decision-makers. Ultimately, these payments manifest themselves as tax dollars or other administrative support. As the perception of worth decreases so do tax dollars and support. [11]

Therefore, when information exchanges take place in libraries, librarians hope their clientele will support the goals of the library to administrators when issues of funding arise. Librarians believe that "free" information ("think free speech, not free beer") will improve society. It will allow people to grow spiritually and intellectually. It will improve humankind's situation in the world. Libraries are only perceived as beneficial when they give away this data and information. That is their purpose, and they, generally speaking, do this without regard to fees or tangible exchanges.

In many ways I believe OSS development, as articulated by Raymond, is very similar to the principles of librarianship. First and foremost with the idea of sharing information. Both camps put a premium on open access. Both camps are gift cultures and gain reputation by the amount of "stuff" they give away. What people do with the information, whether it be source code or journal articles, is up to them. Both camps hope the shared information will be used to improve our place in the world. Just as Jefferson's informed public is a necessity for democracy, OSS is necessary for the improvement of computer applications.

Second, human interactions are a necessary part of the mixture in both librarianship and open source development. Open source development requires people skills on the part of source code maintainers. It requires an understanding of the problem the computer application is trying to solve, and the maintainer must assimilate patches into the application. Similarly, librarians understand that information seeking behavior is a human process. While databases and many "digital libraries" house information, these collections are really "data stores" and are only manifested as information after values are assigned to the data and inter-relations between data are created.

Third, it has been stated that open source development will remove the necessity for programmers. Yet Raymond posits that no such thing will happen. If anything, there will be an increased need for programmers. Similarly, many librarians feared the advent of the Web because they believed their jobs would be in jeopardy. Ironically, librarianship is flowering under new rubrics such as information architect and knowledge manager.

OSS also works in a sort of peer review environment. As Raymond states, "Given enough eyeballs, all bugs are shallow." Since the source code to OSS is available for anybody to read, it is possible to examine exactly how the software works. When a program is written and a bug manifests itself, there are many people who can look at the program, see what it is doing, and offer suggestions or fixes.

Instead of relying on marketing hype to promote an application, OSS relies on its ability to satisfy particular itches to gain prominence. The better a piece of software works, the more people are likely to use it. User endorsements are usually the way OSS is promoted. The good pieces of software float to the top because they are used the most often. The ones that are poorly written or do not satisfy enough itches sink to the bottom.

In a peer review process many people look at an article and evaluate its validity. During this evaluation process the reviewers point out deficiencies in the article and suggest improvements. The reviewers are usually anonymous but authoritative. The evaluation of OSS often works in the same vein. Software is evaluated by self-selected reviewers. These people examine all aspects of the application, from the underlying data structures, to the way the data is manipulated, to the user interface and functionality, to the documentation. These people then offer suggestions and fixes to the application in an effort to enhance and improve it.

Some people may remember the "homegrown" integrated library systems developed in the '70's and '80's, and these same people may wonder how OSS is different from those humble beginnings. There are two distinct differences. The first is the present-day existence of the Internet. This global network of computers enables people to communicate over much greater distances, and it is much less expensive than twenty-five years ago. Consequently, developers are not as isolated as they once were, and the flow of ideas travels more easily between developers -- people who are trying to scratch that itch. Yes, there were telephone lines and modems, but the processes for using them were not as seamlessly integrated into the computing environment (and there were always long-distance communications charges to contend with).[12]

Second, the state of computer technology and its availability have dramatically increased in the past twenty-five years. Twenty-five years ago computers, especially the sorts of computers used for large-scale library operations, were almost always physically large, extremely expensive, remote devices whose access was limited to a few specialized individuals. Nowadays, the computers on most people's desktops have enough RAM, CPU horsepower, and disk space to support the college campus of twenty-five years ago.[13]

In short, the OSS development process is not like the homegrown library systems of the past simply because there are more people with more computers who are able to examine and explore the possibilities of solving more computing problems. In the times of the homegrown systems people were more isolated in their development efforts and more limited in their choice of computing hardware and software resources.

There are quite a number of mainstream OSS applications. Many of these applications literally run the Internet or are used for back-end support. The Apache Project is one of the more notable (www.apache.org). Apache is a World Wide Web (HTTP) server. It started out its life in the mid '90's as NCSA's httpd application, the Web server beneath the first graphical Web browser. The name of the application -- Apache -- is a play on words. It has nothing to do with Indians. Instead, in an effort to write a more modular computer program, the original httpd application was rewritten as a set of parts, or patches, and consequently the application is called "a patchy server." Few experts would doubt the popularity of the Apache server. According to Netcraft, more Web servers run Apache than any other kind.[14]

MySQL is a popular relational database application. It is very often used to support database-driven websites. It adheres to the SQL standard while adding a number of features of its own (as do Oracle and other database vendors). MySQL is known for its speed and stability. The canonical address for MySQL is www.mysql.org.

Sendmail is an email (SMTP) server used on the vast majority of Unix computers. This application, developed quite a number of years ago, is responsible for carrying much of the email sent throughout the world. Sendmail is a good example of an application supported by both a commercial institution and a non-profit organization. There is a free version of sendmail, complete with source code, as well as a commercial version that comes with formal support. See www.sendmail.org.

BIND is an acronym for the Berkeley Internet Name Domain, a program converting Internet Protocol (IP) numbers, such as 17.112.144.32, into human-readable names such as www.apple.com. It is sort of like an old-fashioned switchboard operator, associating telephone numbers with the telephones in people's homes. BIND is supported by the Internet Software Consortium at www.isc.org.

Perl is a programming language written by Larry Wall in the late '80's. It too runs much of the Internet since it is the language of many common gateway interface (CGI) scripts. Wall originally created Perl to help him do systems administration tasks, but the language worked so well others adopted it and it has grown significantly. Perl is supported at www.perl.com.

Linux is probably the most familiar OSS application. This program is really an operating system -- the software that mediates between applications and the computer's hardware. It is the software that really makes computers run. Linux was originally conceived by Linus Torvalds in the early '90s because he wanted to run a Unix-like operating system on Intel-based computers. Linux is becoming increasingly popular with many information technology (IT) professionals as an alternative to Windows-based server applications or proprietary versions of Unix. See www.linux.org.

Daniel Chudnov has been the library profession's OSS evangelist for the past three or four years. He is also the original author of the open source program jake (jointly administered knowledge environment). Chudnov has done a lot to raise the awareness of OSS in libraries. To that end he and others help maintain a website called OSS4Lib (www.oss4lib.org). The site lists library-related applications, including applications for document delivery, Z39.50 clients and servers, systems to manage collections, MARC record readers and writers, integrated library systems, and systems to read and write bibliographies. For more information visit OSS4Lib and subscribe to the mailing list.

The state of OSS in libraries is more than sets of computer programs. It also includes the environment where the software is intended to be used -- a socio-economic infrastructure. Any computing problem can be roughly divided into 20% technology issues and 80% people issues. It is this 80% of the problem that concerns us here. Given the current networked environment, the affinity of OSS development to librarianship, and the sorts of projects enumerated above, what can the library profession do to best take advantage of the currently available OSS? I posed this question to the OSS4Lib mailing list in April and May of 2000, and it generated a lively discussion. [15] A number of themes presented themselves, each of which is elaborated upon below.

This essay has described what OSS is and compared OSS to the principles of librarianship. The balance of the book details particular OSS systems for libraries. After reading this book I hope you go away understanding at least one thing. OSS provides the means for the profession to take greater control over the ways computers are used in libraries. OSS is free, but it is free in the same way freedom exists in a democracy. With freedom comes choice. With freedom comes the ability to manifest change. At the same time, freedom comes at a price, and that price is responsibility. OSS puts its users in direct control of computer operations, and this control costs in terms of accountability. When the software breaks down, you will be responsible for fixing it. Fortunately, there is a large network at your disposal, the Internet, not to mention the creator of the software, who has most likely faced and addressed the same problem before. Open source provides the means to say, "We are not limited by our licensed software because we have the ability to modify the software to meet our own ends." Instead of blaming vendors for supporting bad software, instead of confusing the issues with contractual agreements and spending tens of thousands of dollars a year for services poorly rendered, OSS offers an alternative. Be realistic. OSS is free, but not without costs.

This being the case, what sorts of things need to happen for OSS to become a more viable computing option in libraries? What are the next steps? The steps fall into two categories: 1) making people more aware of OSS and 2) improving the characteristics of OSS.

Librarians need to become more aware of the options OSS provides. This can be done in a number of ways. For example, a formal study analyzing the desirability and feasibility of libraries making a formal commitment to OSS might demonstrate to other libraries the benefits of OSS. Library boards and directors need to feel comfortable committing funds to OSS installation and development, but before doing so the boards and directors need to know what OSS is and how its principles can be applied in libraries. By mentoring existing librarians to become more computer literate, the concepts of OSS will become easier to understand. Similarly, by mentoring librarians to be more aware of the ways of administration, these same librarians will have more authority to make decisions and direct energies to OSS development. Librarians should not be afraid of the idea of open source software because they think computer programming experience is necessary; there is much more to software development than writing computer programs. Simple training exercises will also make more people aware of the potential of open source software. Finally, communication -- testimonials -- will help disseminate the successes, as well as failures, of OSS.

OSS itself needs to be improved. The installation processes of OSS are not as simple as the installation procedures of commercial software. This is an area that needs improvement; if it were addressed, fewer people would be intimidated by the installation process. Additionally, there are opportunities for commercial institutions to support OSS. These institutions, like Red Hat or O'Reilly & Associates, could provide services installing, documenting, and troubleshooting OSS. These institutions would not be selling the software itself, but services surrounding the software.

The principles of OSS are very similar to the principles of librarianship. Let's take advantage of these principles and use them to take more control over our computing environments.

1. The ideas behind GNU software and its definition as articulated by Richard Stallman can be found at http://www.gnu.org/philosophy/free-sw.html. Accessed April 25, 2002.

2. Much of the preceding section was derived from Dave Bretthauer's excellent article, "OSS: A History" in Information Technology and Libraries 21(1) March, 2002. pg. 3-10.

3. The Cathedral and the Bazaar is also available online at http://www.tuxedo.org/~esr/writings/cathedral-bazaar/. Accessed April 25, 2002.

4. It is important to distinguish here between a "hacker" and a "cracker". As defined by Raymond, a hacker is a person who writes computer programs because they are "scratching an itch" -- trying to solve a particular computer problem. This definition is contrasted with the term "cracker", denoting a person who maliciously tries to break into computer systems. In Raymond's eyes, hacking is a noble art; cracking is immoral. Unfortunately, the distinction between hacking and cracking seems to have been lost on the general population.

5. Raymond, E.S., The cathedral and the bazaar: musings on Linux and open source by an accidental revolutionary. 1st ed. 1999, [Sebastopol, CA]: O'Reilly. pg. 99.

6. Mauss, M., The gift; forms and functions of exchange in archaic societies. The Norton library, N378. 1967, New York: Norton.

7. Lukes, S., Mauss, Marcel, in International encyclopedia of the social sciences, D.L. Sills, Editor. 1968, Macmillan: [New York] volume 10, pg. 80.

8. Gregory, C.A, "Gifts" in Eatwell, J., et al., The New Palgrave : a dictionary of economics. 1987, New York: Stockton Press. volume 3, pg. 524.

9. Ibid.

10. Ingold, T., Introduction To Social Life, in Companion encyclopedia of anthropology, T. Ingold, Editor. 1994, Routledge: London ; New York. p. 747.

11. Morgan, E.L., "Marketing Future Libraries", http://www.infomotions.com/musings/marketing/. Accessed April 25, 2002.

12. As an interesting aside, read "Stalking the wily hacker" by Clifford Stoll in the Communications of the ACM May 1988 31(5) pg. 484. The essay describes how Stoll tracked a hacker starting from a 75-cent accounting error. It is on the Web in many places. Try http://eserver.org/cyber/stoll2.txt. Accessed April 25, 2002.

13. It is believed a past chairman of IBM, Thomas Watson, said in 1943, "I think there is a world market for maybe five computers."

14. See http://www.netcraft.com for more information. Accessed April 25, 2002.

15. An archive of the oss4lib mailing list is available at this URL http://www.geocrawler.com/lists/3/SourceForge/6067/0/. Accessed April 25, 2002.

This short essay examines more closely the concept of a "gift culture" and how it may or may not be related to librarianship. After this examination and with a few qualifications, I still believe my judgments about open source software and librarianship are true. Open source software development and librarianship have a number of similarities -- both are examples of gift cultures.

I have recently been reading a book about open source software development by Eric Raymond. [1] The book describes the environment of free software and tries to explain why some programmers are willing to give away the products of their labors. It describes the "hacker milieu" as a "gift culture":

Gift cultures are adaptations not to scarcity but to abundance. They arise in populations that do not have significant material scarcity problems with survival goods. We can observe gift cultures in action among aboriginal cultures living in ecozones with mild climates and abundant food. We can also observe them in certain strata of our own society, especially in show business and among the very wealthy. [2]

Raymond alludes to the definition of "gift cultures", but not enough to satisfy my curiosity. Being the good librarian, I was off to the reference department for more specific answers. More often than not, I found information about "gift exchange" and "gift economies" as opposed to "gift cultures." (Yes, I did look on the Internet but found little.)

Probably one of the earliest and more comprehensive studies of gift exchange was written by Marcel Mauss. [3] In his analysis he says gifts, with their three obligations of giving, receiving, and repaying, are present in aspects of almost all societies. The process of gift giving strengthens cooperation, competitiveness, and antagonism. It reveals itself in religious, legal, moral, economic, aesthetic, morphological, and mythological aspects of life. [4]

As Gregory states, for the industrial capitalist economies, gifts are nothing but presents or things given, and "that is all that needs to be said on the matter." Ironically for economists, gifts have value and consequently have implications for commodity exchange. [5] He goes on to review studies about gift giving from an anthropological view, studies focusing on tribal communities of various American Indians, cultures from New Guinea and Melanesia, and even ancient Roman, Hindu, and Germanic societies:

The key to understanding gift giving is apprehension of the fact that things in tribal economics are produced by non-alienated labor. This creates a special bond between a producer and his/her product, a bond that is broken in a capitalistic society based on alienated wage-labor.[6]

Ingold, in "Introduction To Social Life" echoes many of the things summarized by Gregory when he states that industrialization is concerned:

exclusively with the dynamics of commodity production. ... Clearly in non-industrial societies, where these conditions do not obtain, the significance of work will be very different. For one thing, people retain control over their own capacity to work and over other productive means, and their activities are carried on in the context of their relationships with kin and community. Indeed their work may have the strengthening or regeneration of these relationships as its principle objective. [7]

In short, the exchange of gifts forges relationships between partners and emphasizes qualitative as opposed to quantitative terms. The producer of the product (or service) takes a personal interest in production, and when the product is given away as a gift it is difficult to quantify the value of the item. Therefore the items exchanged are of a less tangible nature such as obligations, promises, respect, and interpersonal relationships.

As I read Raymond and others I continually saw similarities between librarianship and gift cultures, and therefore similarities between librarianship and open source software development. While the summaries outlined above do not necessarily mention the "abundance" alluded to by Raymond, the existence of abundance is more than mere speculation. Potlatch, "a ceremonial feast of the American Indians of the northwest coast marked by the host's lavish distribution of gifts or sometimes destruction of property to demonstrate wealth and generosity with the expectation of eventual reciprocation", is an excellent example. [8]

Libraries have an abundance of data and information. (I won't go into whether or not they have an abundance of knowledge or the wisdom of the ages. That is another essay.) Libraries do not exchange this data and information for money; you don't have to have your credit card ready as you leave the door, and libraries don't accept checks. Instead the exchange is much less tangible. First of all, based on my experience, most librarians simply take pride in their ability to collect, organize, and disseminate data and information in an effective manner. They are curious. They enjoy learning things for learning's sake. It is a sort of Platonic end in itself. Librarians, generally speaking, just like what they do, and they certainly aren't in it for the money. You won't get rich by becoming a librarian.

Information is not free. It requires time and energy to create, collect, and share, but when an information exchange does take place, it is usually intangible, not monetary, in nature. Information is intangible. It is difficult to assign it a monetary value, especially in a digital environment where it can be duplicated effortlessly:

An exchange process is a process whereby two or more individuals (or groups) exchange goods or services for items of value. In Library Land, one of these individuals is almost always a librarian. The other individuals include tax payers, students, faculty, or in the case of special libraries, fellow employees. The items of value are information and information services exchanged for a perception of worth -- a rating valuing the services rendered. This perception of worth, a highly intangible and difficult thing to measure, is something the user of library services "pays", not to libraries and librarians, but to administrators and decision-makers. Ultimately, these payments manifest themselves as tax dollars or other administrative support. As the perception of worth decreases so do tax dollars and support. [9]

Therefore, when information exchanges take place in libraries, librarians hope their clientele will support the goals of the library to administrators when issues of funding arise. Librarians believe that "free" information ("think free speech, not free beer") will improve society. It will allow people to grow spiritually and intellectually. It will improve humankind's situation in the world. Libraries are only perceived as beneficial when they give away this data and information. That is their purpose, and they, generally speaking, do this without regard to fees or tangible exchanges.

In many ways I believe open source software development, as articulated by Raymond, is very similar to the principles of librarianship. First and foremost is the idea of sharing information. Both camps put a premium on open access. Both camps are gift cultures and gain reputation by the amount of "stuff" they give away. What people do with the information, whether it be source code or journal articles, is up to them. Both camps hope the shared information will be used to improve our place in the world. Just as Jefferson's informed public is a necessity for democracy, open source software is necessary for the improvement of computer applications.

Second, human interactions are a necessary part of the mixture in both librarianship and open source development. Open source development requires people skills of source code maintainers. It requires an understanding of the problem the computer application is trying to solve, and the maintainer must assimilate patches into the application. Similarly, librarians understand that information seeking is a human process. While databases and many "digital libraries" house information, these collections are really "data stores" and only become information after values are assigned to the data and inter-relations among the data are created.

Third, it has been stated that open source development will remove the necessity for programmers. Yet Raymond posits that no such thing will happen; if anything, there will be an increased need for programmers. Similarly, many librarians feared the advent of the Web because they believed their jobs would be in jeopardy. Ironically, librarianship is flowering under new rubrics such as information architecture and knowledge management.

It has also been brought to my attention by Kevin Clarke (kevin_clarke@unc.edu) that both institutions use peer-review:

Your cultural take (gift culture) on "open source" is interesting. I've been mostly thinking in material terms but you are right, I think, in your assessment. One thing you didn't mention is that, like academic librarians, open source folks participate in a peer-review type process.

All of this is happening because of an information economy. It sure is an exciting time to be a librarian, especially a librarian who can build relational databases and program on a Unix computer.

Below are a few paragraphs about each of the indexers reviewed here. They are listed in alphabetical order.

Of the indexers reviewed here, freeWAIS-sf is by far the granddaddy of the crowd and the predecessor of Isite/Isearch, SWISH, and MPS. Yet freeWAIS-sf is not really the oldest indexer because it owes its existence to WAIS, originally developed by Brewster Kahle of Thinking Machines, Inc. as long ago as 1991 or 1992.

FreeWAIS-sf supports a bevy of indexing types. For example, it can easily index Unix mbox files, text files where records are delimited by blank lines, HTML files, as well as others. Sections of these text files can be associated with fields for field searching through the creation of "format files" -- configuration files made up of regular expressions. After data has been indexed it can be made accessible through a CGI interface called SFgate, but the interface relies on a Perl module, WAIS.pm, which is very difficult to compile. The interface supports lots o' search features including field searching, nested queries, right-hand truncation, thesauri, multiple-database searching, and Boolean logic.

This indexer represents aging code. Not because it doesn't work, but because as new incarnations of operating systems evolve, freeWAIS-sf gets harder and harder to install. After many trials and tribulations, I have been able to get it to compile and install on RedHat Linux, and I have found it most useful for indexing two types of data: archived email and public domain electronic texts. For example, by indexing my archived email I can do free text searches against the archives and return names, subject lines, and ultimately the email messages (plus any attachments). This has been very helpful in my personal work. Using the "para" indexing type I have been able to index a small collection of public domain literature and provide a mechanism to search one or more of these texts simultaneously for things like "slave" to identify paragraphs from the collection.

Harvest was originally funded by a federal grant in the mid '90s at the University of Colorado. It is essentially made up of two components: gatherers and brokers. Given sets of one or more URLs, gatherers crawl local and/or remote file systems for content and create surrogate files in a format called SOIF. After one or more of the SOIF collections have been created, they can be federated by a broker, an application that indexes them and makes them available through a Web interface.

The Harvest system assumes the data being indexed is ephemeral. Consequently, indexed items become "stale", are automatically removed from retrieval, and need to be refreshed on a regular basis. This is considered a feature, but if your content does not change very often it is more a nuisance than a benefit.

Harvest is not very difficult to compile and install. It comes with a decent shell script allowing you to set up rudimentary gatherers and brokers. Configuration is done through the editing of various text files outlining how output is to be displayed. The system comes with a Web interface for administering the brokers. If your indexed content is consistently structured and includes META tags, then it is possible to output very meaningful search results that include abstracts, subject headings, or just about any other fields defined in the META tags of your HTML documents.

The real strength of the Harvest system lies in its gathering functions. Ideally, system administrators are intended to create multiple gatherers. These gatherers are designed to be federated by one or more brokers. If everybody were to index their content and make it available via a gatherer, then a few brokers could be created to collect the content of the gatherers and produce subject- or population-specific indexes, but alas, this was a dream that never came to fruition.

MPS seems to be the zippiest of the indexers reviewed here. It can index more data in a shorter period of time than any of the other indexers. Unlike the other indexers, MPS divides the indexing process into two parts: parser and indexer. The indexer accepts what is called a "structured index stream", a specialized format for indexing. Because the indexer expects structured input, it is possible to write output files from your favorite database application and have the content of your database indexed and made searchable by MPS. You are not limited to indexing the content of databases with MPS. Since it too was originally based on the WAIS code, it indexes many other data types such as mbox files, files where records are delimited by blank lines (paragraphs), as well as a number of MIME types (RTF, TIFF, PDF, HTML, SOIF, etc.). Like many of the WAIS derivatives, it can search multiple indexes simultaneously, and it supports a variant of the Z39.50 protocol and a wide range of search syntaxes.

MPS also comes with a Perl API and an example CGI interface. The Perl API comes with the barest of documentation, but the CGI script is quite extensive. One of the neatest features of the example CGI interface is its ability to allow users to save and delete searches against the indexes for processing later. For example, if this feature is turned on, then a user first logs into the system. As the user searches the system their queries are stored to the local file system. The user then has the option of deleting one or more of these queries. Later, when the user returns to the system they have the option of executing one or more of the saved searches. These searches can even be designed to run on a regular basis and the results sent via email to the user. This feature is good for data that changes regularly over time such as news feeds, mailing list archives, etc.

MPS has a lot going for it. If it were able to extract and index the META tags of HTML documents, and if the structured index stream as well as the Perl API were better documented, then this indexer/search engine would rank higher on the list.

SWISH is currently my favorite indexer. Originally written by Kevin Hughes (who is also the original author of hypermail), this software is a model of simplicity. To get it to work for you all that needs to be done is to download, unpack, configure, compile, edit the configuration file, and feed the file to the application. A single binary and a single configuration file is used for both indexing and searching. The indexer supports Web crawling. The resulting indexes are portable among hosts. The search engine supports phrase searching, relevance ranking, stemming, Boolean logic, and field searches.

The hard part about SWISH is the CGI interface. Many SWISH CGI implementations pipe the search query to the SWISH binary, capture the results, parse them, and return them accordingly. Recently, Perl and PHP modules have been developed allowing developers to avoid this problem, but the modules are considered beta software.

Like Harvest, SWISH can "automagically" extract the content of HTML META tags and make this content field searchable. Assume you have a META tag in the header of your HTML document such as this:

<META NAME="subject" CONTENT="adaptive technologies; CIL (Computers In Libraries);">
				

The SWISH indexer would create a column in its underlying database named "subject" and insert into this column the values "adaptive technologies" and "CIL (Computers In Libraries)". You could then submit a query to SWISH such as this:

					subject = "adaptive technologies"
				

This query would then find all the HTML documents in the index whose subject META tag contained this value resulting in a higher precision/recall ratio. This same technique works in Harvest as well, but since the results of a SWISH query are more easily malleable before they are returned to the Web browser, other things can be done with the SWISH results; SWISH results can easily be sorted by a specific field, or more importantly, SWISH results can be marked up before they are returned. For example, if your CGI interface supports the GET HTTP method, then the content of META tags can be marked up as hyperlinks allowing the user to easily address the perennial problem of "Find me more like this one."
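From the command line, the indexing and field-searching steps above can be sketched roughly as follows. The file names (swish.conf, index.swish-e) are hypothetical, and the exact option and query syntax may vary between SWISH versions, so treat this as a sketch rather than a recipe:

```shell
# Index a collection of HTML files as described by a
# configuration file (which names the directories to index,
# the META tags to extract, and the index file to write).
swish-e -c swish.conf

# Search the resulting index for documents whose subject
# META tag contains the phrase "adaptive technologies".
swish-e -f index.swish-e -w 'subject="adaptive technologies"'
```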

The Yaz/Zebra combination is probably the best indexer/search engine solution for librarians who want to implement an open source Z39.50 interface. Z39.50 is an ANSI/NISO standard for information retrieval based on the idea of client/server computing before client/server computing was popularized:

It specifies procedures and structures for a client to search a database provided by a server, retrieve database records identified by a search, scan a term list, and sort a result set. Access control, resource control, extended services, and a "help" facility are also supported. The protocol addresses communication between corresponding information retrieval applications, the client and server (which may reside on different computers); it does not address interaction between the client and the end-user. --http://lcweb.loc.gov/z3950/agency/markup/01.html

Put another way, Z39.50 tries to facilitate a "query once, search many" interface to indexes in a truly standard way, and the Yaz/Zebra combination is probably the best open source solution to this problem.

Yaz is a toolkit allowing you to create Z39.50 clients and servers. Zebra is an indexer with a Z39.50 front-end. To make these tools work for you, the first thing to be done is to download and compile the Yaz toolkit. Once it is installed you can feed documents to the Zebra indexer (it requires a few Yaz libraries) and make the documents available through the server. While the Yaz/Zebra combination does not come with a Perl API, there are at least a couple of Perl modules available from CPAN providing Z39.50 interfaces. There is also a module called ZAP! (http://www.indexdata.dk/zap/) allowing you to embed a Z39.50 client into Apache.

There is absolutely nothing wrong with the Yaz/Zebra combination. It is well documented, standards-based, as well as easy to compile and install. The difficulty with this solution is the protocol, Z39.50. It is considered overly complicated and therefore the configuration files you must maintain and the formats of the files available for indexing are rather obtuse. If you require Z39.50, then this is the tool for you. If not, then something else might be better suited to your needs.
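A typical Zebra session, once Yaz and Zebra are compiled and installed, looks something like the sketch below. The directory name (records) and port number are hypothetical, and Zebra expects a configuration file (zebra.cfg) describing the record format, so consult the documentation for the details:

```shell
# Index a directory of records; zebraidx reads its settings
# from zebra.cfg in the current directory.
zebraidx update records

# Start the Z39.50 server, listening on port 9999,
# in the background.
zebrasrv @:9999 &

# Connect with the demonstration client that comes with Yaz
# and search; at the client prompt, try: find dublin
yaz-client localhost:9999
```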

Indexers provide one means for "finding a needle in a haystack", but don't rely on them alone to satisfy people's information needs; information systems require well-structured data and consistently applied vocabularies in order to be truly useful.

Information systems can be defined as organized collections of information. In order to be accessed they require elements of readability, browsability, searchability, and finally interactive assistance. Readability is another word for usability. It connotes meaningful navigation, a sense of order, and a systematic layout. As the size of an information system increases it requires browsability -- an obvious organization of information that is usually embodied through the use of a controlled vocabulary. The browsable categories of Yahoo! are a good example. Searchability is necessary when a user seeks specific information and when the user can articulate their information need. Searchability flattens browsable collections. Finally, interactive assistance is necessary when an information system becomes very large or complex. Even though a particular piece of information exists in a system, it is quite likely a person will not find that information and may need help. Interactive assistance is that help mechanism.

By creating well-structured data you can supplement the searchability aspects of your information system. For example, if the data you have indexed is HTML, then insert META tags into your documents and use a controlled vocabulary -- a thesaurus -- to describe those documents. If you do this then you can use SWISH or Harvest to extract these tags and provide canned field-searching access to your documents; freetext searches rely too much on statistical analysis and cannot return as high precision/recall ratios as field searches. If your content is saved in a database, then it is an easy process to create your HTML and include META tags. Such a process is described in more detail in "Creating 'Smart' HTML pages with PHP" (http://www.infomotions.com/musings/smart-pages/).

The indexers reviewed here have different strengths and weaknesses. If your content is primarily HTML pages, then SWISH is most likely the application you would want to use. It is fast, easy to install, and since it comes with no user interface you can create your own with just about any scripting language.

If your content is not necessarily HTML files, but structured text files such as database dumps, then MPS or the Yaz/Zebra combination may be more of what you need. Both of these applications support a wide variety of file formats for indexing as well as the incorporation of standards.

This part of the manual outlines the hands-on aspects of the workshop.

The activities outlined below were selected based on the software's popularity, the installation techniques they represent, the length of time and expertise they require, and their applicability to a library setting. This is not a comprehensive list of activities. A glaring omission may be the installation of a number of GNU tools, specifically some sort of text editor, the compiler gcc, and make. Consequently, these activities assume your host computer is duly equipped, or that the activities can be accomplished on top of Windows or Unix/Linux operating systems without compilation.

For the most part, the activities are listed in priority order; many times you must install a previous package before a subsequent package can be installed, but this is not always the case. All the packages to be installed in these exercises are included on the CD. Thus acquiring the software is a matter of using the copy command (cp) to copy the software from the CD to your home directory, or acquiring the software from the distribution site. The choice is yours.

The installation of open source and GNU software follows a pattern. You usually:

Downloading the software is usually done through an FTP or HTTP interface. I like to get the URL of the remote file and feed it to a program called wget which then does all the work.

Uncompressing and un-tarring the archive is the work of gunzip and tar, respectively.

To configure for compilation there is usually some sort of script called configure, or, in the case of Perl modules, you run the command "perl Makefile.PL". In either case the script examines the contents of your downloaded package to make sure it is complete, examines your computer's hardware and software to make sure you have the necessary tools installed, and finally builds some sort of "make" file, a script used to actually make the software. The most often used configuration option is "--prefix". This option denotes where the software will eventually be installed. By default, most software gets installed in /usr/local. This is usually a good place, but circumstances are not always the same from person to person, so running a configuration like this, ./configure --prefix=/disk1/local, might be just what you need. When in doubt, try ./configure --help for a complete list of configuration options.

In almost all cases the next step is to run make and the software is built. If there are problems, then you can usually run "make clean" to remove the mistakes, re-run the configuration script, and try make again.

Once the program is built, hopefully without errors, you might be able to run "make test", which will check whether or not the program works.

Finally, you can run "make install" to put the program onto your file system. Access to /usr/local/bin, /usr/local/man, /usr/local/lib, /usr/local/etc, and /usr/local/include is usually restricted to root-level users. Consequently, you might need root privileges for this last step, but remember the --prefix configuration option. Using this option allows you to save the installation in your home directory. (Hint, hint!)
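The whole pattern can be sketched as a short shell session. The package name and URL below are hypothetical stand-ins; to keep the example self-contained it builds its own tiny tarball instead of downloading one, and the configure/make steps are shown as comments:

```shell
# With a real package you would first download it, e.g.:
#   wget http://www.example.org/foo-1.0.tar.gz
# Here a tiny stand-in tarball is built locally so the example is
# self-contained.
mkdir -p foo-1.0
echo 'A sample file.' > foo-1.0/README
tar -czf foo-1.0.tar.gz foo-1.0
rm -r foo-1.0

# Uncompress and un-tar the archive; -xzf combines gunzip and tar
tar -xzf foo-1.0.tar.gz
ls foo-1.0

# With a real package the remaining steps would be:
#   cd foo-1.0
#   ./configure --prefix=$HOME/local
#   make
#   make test
#   make install
```

Note the --prefix pointing at $HOME/local: that is the trick, mentioned above, for installing without root privileges.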

In this exercise you will install Perl.

You can now run your first Perl script.

Ta da! You have successfully installed Perl and run a Perl program.

Installing MySQL is the goal of this exercise. Be patient.

In this exercise you will install some sample data into MySQL.
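The shape of the task can be sketched like this. The database name, table, and rows below are hypothetical, not the workshop's actual sample data, and the mysql invocations are shown as comments because they require a running MySQL server:

```shell
# write a small file of sample SQL; the database and table names
# here are made up for illustration
cat > sample.sql <<'END_OF_SQL'
CREATE DATABASE IF NOT EXISTS workshop;
USE workshop;
CREATE TABLE books (
  id     INT AUTO_INCREMENT PRIMARY KEY,
  author VARCHAR(255),
  title  VARCHAR(255)
);
INSERT INTO books (author, title)
  VALUES ('Twain, Mark', 'The adventures of Tom Sawyer');
END_OF_SQL

# load it with the mysql client, then verify:
#   mysql -u root -p < sample.sql
#   mysql -u root -p -e 'SELECT author, title FROM workshop.books'
```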

Here the basics of installing Apache are outlined.

Now, create your own home page.
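A home page can be as simple as the file below. The destination path in the comment assumes a default /usr/local/apache installation; adjust it to wherever your DocumentRoot actually is:

```shell
# write a minimal home page
cat > index.html <<'END_OF_HTML'
<html>
<head><title>My Home Page</title></head>
<body><h1>Hello from my new Apache server!</h1></body>
</html>
END_OF_HTML

# copy it into Apache's document root (default for /usr/local/apache):
#   cp index.html /usr/local/apache/htdocs/index.html
```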

In this exercise you will compile and install Hypermail. The process is pretty standard.

Do this exercise to create a browsable archive from a standard mail box file.
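For illustration, here is a tiny one-message mailbox (mbox) file you could practice on; the addresses and message are made up, and the hypermail invocation is shown as a comment on the assumption that hypermail's customary -m (mailbox) and -d (output directory) options are available in your build:

```shell
# create a tiny mailbox (mbox) file with one made-up message
cat > mailbox <<'END_OF_MBOX'
From alice@example.org Mon Jan  7 09:00:00 2002
From: alice@example.org
To: list@example.org
Subject: Hello, list
Date: Mon, 7 Jan 2002 09:00:00 -0500

Greetings everyone!
END_OF_MBOX

# turn it into a browsable HTML archive in the directory "archive":
#   hypermail -m mailbox -d archive
```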

If you have previously installed swish-e, then you can do this exercise where you will create a searchable index of your browsable archive.

In this exercise you will install MARC::Record.

I wish they were all this straightforward.

Next, you will use MARC::Record to extract author and title information from a set of MARC records.

MARC::Record can also write MARC records. Here is an example demonstrating how:
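A minimal sketch follows, assuming the MARC::Record distribution (which supplies MARC::Record, MARC::Field, and the marcdump utility) is installed; the bibliographic data is made up. The script is written to a file and its invocation shown as a comment:

```shell
# write a small program that uses MARC::Record to create a record
cat > write-marc.pl <<'END_OF_PERL'
use strict;
use MARC::Record;
use MARC::Field;

# build a record in memory; the bibliographic data is made up
my $record = MARC::Record->new();
$record->append_fields(
    MARC::Field->new('100', '1', ' ', a => 'Twain, Mark'),
    MARC::Field->new('245', '1', '4', a => 'The adventures of Tom Sawyer'),
);

# write the record to disk as raw (USMARC) data
open my $fh, '>', 'sample.marc' or die $!;
print $fh $record->as_usmarc();
close $fh;
END_OF_PERL

# run it, then examine the result:
#   perl write-marc.pl
#   marcdump sample.marc
```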

If you have installed YAZ, then you can do the following exercise to download MARC records from the Library of Congress.
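The general shape of such a session looks like the sketch below, assuming yaz-client is installed and the Library of Congress Z39.50 server (z3950.loc.gov, port 7090, database Voyager) is reachable; the query is only an example (Bib-1 attribute 1=4 is a title search). The yaz-client invocation is shown as a comment because it needs network access:

```shell
# a scripted yaz-client session
cat > loc-session.txt <<'END_OF_SESSION'
open tcp:z3950.loc.gov:7090/Voyager
format usmarc
find @attr 1=4 "tom sawyer"
show 1
close
quit
END_OF_SESSION

# feed the session to yaz-client on standard input, saving the raw
# records it returns to records.marc:
#   yaz-client -m records.marc < loc-session.txt
```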

Use this process to install swish-e.

Now, let's index and search some data.
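A sketch of the round trip: create a couple of throw-away HTML files, index them, and search the index. The file names and contents are invented, and the swish-e commands are shown as comments to run once swish-e is installed (-i names what to index, -f the index file, -w the query words):

```shell
# create a couple of sample HTML files to index
mkdir -p data
echo '<html><head><title>Apples</title></head><body>apples</body></html>' > data/apples.html
echo '<html><head><title>Oranges</title></head><body>oranges</body></html>' > data/oranges.html

# index the directory, then search the resulting index:
#   swish-e -i data -f index.swish-e
#   swish-e -f index.swish-e -w apples
```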

While swish-e can be run from the command line, its real power is demonstrated though one of its programming interfaces. In this exercise you will install swish-e's Perl module and search the index with a Perl script.

In this exercise you will explore MyLibrary.

  1. Open your Web browser to the patron URL given to you in the workshop. Explore the interface noticing how the searching, browsing, and account creation/customization features operate.

  2. In a second browser window, open the administrative interface with the URL given to you in the workshop.

  3. Select the Global Message option from the Administrative interface, and use the resulting form to edit/submit the content of a global message.

  4. Make the patron interface active by selecting your first browser window and reload the page. You should see the edits you made in the administrative interface.

  5. Return to the administrative interface and use the Message from the Librarian option. Use the resulting form to edit/submit the content of a message for the same discipline you chose when creating your MyLibrary account in Step #1.

  6. Return to the patron interface, reload the page, and notice how the content of your page changes.

  7. Return to the administrative interface and create a link to a new information resource by using the Reference Shelf, Databases, or Electronic Journals menu options.

  8. Again, return to the patron interface, customize the content of your page, and notice how the resource you just added in the administrative interface is now an option in the patron interface.

  9. Make the administrative interface active, and use the Create Static Pages option to create browsable lists of the information resources in the underlying MyLibrary database.

  10. Make the patron interface active, and browse the newly created lists by using the All Resources link.

  11. Make the administrative interface active, and use the Discipline Defaults menu option to create the defaults for a discipline of your choice.

  12. Make the patron interface active. Log out, create a new account (making sure you select the discipline you just modified), and notice how the defaults you created are manifested.

In this exercise you will install libxml2 and libxslt, the libraries necessary to run xsltproc. The process adheres pretty much to the standard GNU installation process: configure, make, make install.

Now you will make a binary application that uses the libxml2 and libxslt libraries: xsltproc.

In this exercise you will transform an XML document into some other type of document using an XSL stylesheet and xsltproc.
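As a sketch of the shape of the exercise, the session below creates a tiny XML file and a stylesheet of my own invention, then shows the xsltproc invocation as a comment (-o names the output file, followed by the stylesheet and the source document):

```shell
# a tiny XML document
cat > sample.xml <<'END_OF_XML'
<?xml version="1.0"?>
<books>
  <book><author>Twain, Mark</author><title>The adventures of Tom Sawyer</title></book>
</books>
END_OF_XML

# a stylesheet transforming it into HTML
cat > sample.xsl <<'END_OF_XSL'
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html><body>
      <xsl:for-each select="books/book">
        <p><xsl:value-of select="title"/> / <xsl:value-of select="author"/></p>
      </xsl:for-each>
    </body></html>
  </xsl:template>
</xsl:stylesheet>
END_OF_XSL

# apply the transformation, writing sample.html:
#   xsltproc -o sample.html sample.xsl sample.xml
```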

You can get a lot of use out of xsltproc, but the fact that its functionality is distributed as libraries that can be compiled into other applications makes it even more powerful.

Version 2, June 1991

Copyright (C) 1989, 1991 Free Software Foundation, Inc. 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.

The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Library General Public License instead.) You can apply it to your programs, too.

When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things.

To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it.

For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights.

We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software.

Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations.

Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all.

The precise terms and conditions for copying, distribution and modification follow.

0. This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you".

Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does.

1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program.

You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee.

2. You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions:

a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change.

b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License.

c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.)

These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it.

Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program.

In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License.

3. You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following:

a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or,

b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or,

c) Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.)

The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable.

If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code.

4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.

5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it.

6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License.

7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program.

If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances.

It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice.

This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License.

8. If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License.

9. The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns.

Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation.

10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally.