Adding Internet resources to our OPACs

Abstract

This essay advocates the addition of bibliographic records describing Internet-based electronic serials and Internet resources in general into library online public access catalogs (OPAC), addresses a few implications of this proposition, and finally, suggests a few solutions to accomplish this goal.

OPACs are finding aids and not simple catalogs

Bibliographic records describing Internet-based electronic serials should be included in our OPACs, just like records for any other information format. The basis of this opinion lies in my definition of a library catalog. The library catalog is a finding aid. More specifically, it is a tool designed to help a defined set of people locate information in a comprehensive collection of data. As we improve the functions of our OPAC software, this finding tool will also become an access tool. This position can be contrasted with the idea of a library catalog as a list of things owned by a library and held within its walls. Put another way, if you were to ask me to address the "access versus ownership" issue, then I would fall, for the most part, in the "access" camp.

The decision to define an OPAC as a finding tool as opposed to a simple catalog came out of a personal and professional debate that raged inside of me for more than a year. Defining an OPAC as a simple catalog or list of materials owned by a library would have made many of the problems listed below seem irrelevant. It would have made our lives as librarians much easier and less complex.

On the other hand, if we limit our OPACs to only items we own, then we are doing our user populations a great disservice. This is because many valid information resources exist beyond our immediate control but still prove very useful to our clientele. If we exclude Internet resources from our OPACs, then we fail to evolve with the times and to provide the sorts of services our user populations have come to expect and desire. Furthermore, if libraries do not provide these sorts of services, then commercial services will. Consequently, libraries would be remiss in fulfilling their mission of equal access to information, especially considering that many Internet resources are freely available.

Incidentally, taken to the extreme, if the OPAC is a finding tool and Internet resources should be included in the OPAC, then we must ask ourselves why we are not including the data from bibliographic journal article indexes as well. If the OPAC is supposed to be a comprehensive finding aid, then a logical conclusion to this proposition seems to point to the inclusion of bibliographic journal article indexes as well as Internet resources. The differences in controlled vocabularies, the skills needed to limit searches to particular formats, and the mixture of formats themselves are all possible rebuttals to this seemingly logical conclusion. Fortunately (or unfortunately), the development of this consequence is not the topic of the present discussion.

Broken URLs and the hopes for URIs

There are a number of obstacles impeding a library's ability to effectively add bibliographic records of Internet-based serials to its OPAC. The first is the dynamic nature of the Internet and, therefore, the dynamic nature of the serials. We have all experienced "file not found" errors on our own local computers as well as on remote Internet-based computers. If a library were to rely on the addition of uniform resource locators (URLs) in the 856 fields of MARC records, then librarians might spend much of their time tracking down "broken" URLs. Hopefully, the concept of the uniform resource identifier (URI) will come to fruition and reduce (if not eliminate) the numerous reasons why the "file not found" error occurs in the first place.

As you may or may not know, URIs are to URLs as the Internet names of computers are to Internet Protocol (IP) numbers. All computers on the Internet are uniquely identified by IP numbers. For example, the IP number of the computer on my desk is 152.1.24.177. This computer has a name as well, emorgan.lib.ncsu.edu. If I were to get a new computer it would be assigned a new IP number, but the domain name service (DNS) of our campus could make sure the new number would be associated with the old name. Thus, I could always tell people to connect to my computer (emorgan.lib.ncsu.edu) and they would find it available.

URIs will work the same way. There will be a database of URIs. Each one will be associated with one or more URLs. As URLs change, the database will be updated. To use the Internet, people would use URIs instead of URLs and, consequently, URIs would never be broken as long as the database was kept up to date.
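
The lookup described above can be sketched in a few lines, assuming the "database" is nothing more than an in-memory table. The class name UriResolver, its methods, and the example identifier and addresses are all hypothetical and invented for illustration; they do not reflect any actual URI proposal.

```python
class UriResolver:
    """Toy stand-in for the URI database: one stable name, changeable URLs."""

    def __init__(self):
        self._table = {}  # maps a URI to the list of its current URLs

    def register(self, uri, urls):
        """Record (or replace) the URLs currently associated with a URI."""
        self._table[uri] = list(urls)

    def resolve(self, uri):
        """Return the current URLs for a URI, or an empty list if unknown."""
        return list(self._table.get(uri, []))


resolver = UriResolver()
# Hypothetical identifier and locations, for illustration only.
resolver.register("uri:example-serial", ["http://old.example.org/serial/"])
# The serial moves: only the database entry changes, not the catalog record.
resolver.register("uri:example-serial", ["http://new.example.org/serial/"])
print(resolver.resolve("uri:example-serial"))
```

The point of the sketch is that a catalog record would store only the stable identifier, so a move requires one update in the database rather than an edit to every record.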

Until the concept of a URI becomes a reality, I can imagine a number of short-term solutions to this problem. The least likely solution is the addition of a new feature to OPAC software by vendors. This feature would examine all the records in the database(s) containing 856 field(s) and check for the validity of the URL(s) found there. Invalid URLs would then be added to a list and regularly sent to a database maintenance team.
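
Such a checking feature might look something like the following sketch. It assumes the URLs have already been pulled out of the 856 fields and grouped by record identifier; the function names and the sample records are hypothetical, and a production checker would also need to handle retries, redirects, and protocols other than those Python's urllib understands.

```python
import http.client
import urllib.error
import urllib.request


def url_is_reachable(url, timeout=10):
    """Return True if the URL opens without error, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except (urllib.error.URLError, http.client.HTTPException, ValueError, OSError):
        return False


def find_broken_urls(records, check=url_is_reachable):
    """Return (record_id, url) pairs whose URL fails the check.

    `records` maps a record identifier to the URLs found in its 856 field(s).
    """
    broken = []
    for record_id, urls in records.items():
        for url in urls:
            if not check(url):
                broken.append((record_id, url))
    return broken


# Hypothetical records; a stub check is used so the example runs offline.
catalog = {
    "record-1": ["http://example.org/serial/"],
    "record-2": ["http://example.org/moved/"],
}
report = find_broken_urls(catalog, check=lambda url: "moved" not in url)
print(report)
```

Leaving out the check argument makes find_broken_urls contact each address over the network; the resulting list is what would be sent to the database maintenance team.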

A more likely solution is to do this ourselves using the report generation services already included in our OPAC software. Another solution, and quite possibly the most implementable, is the creation of a separate, locally maintained database of Internet resources. This database would not necessarily be MARC-based, but it would contain the fields essential to create a complete MARC record. More importantly, it would be able to extract the URL of a record and check for its validity. Then, on a regular basis, when all the URLs had been verified, this database would create a report in the form of MARC records, and these records would be imported into the OPAC, overwriting any duplicates found there. Realistically, none of these solutions is ideal, but they may be necessary for the short term.
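
The export step of such a local database might look roughly like the following sketch. The record layout, the function names, and the plain-text rendering are assumptions made for illustration; a real export would write proper MARC records with whatever tools the library's system supports. Only the use of field 245 subfield $a for the title and field 856 subfield $u for the URL follows MARC convention.

```python
def to_marc_like_lines(record):
    """Render a local record as simplified, human-readable MARC-style lines."""
    lines = [f"245 $a {record['title']}"]
    for url in record["urls"]:
        lines.append(f"856 $u {url}")
    return lines


def export_verified(records, is_valid):
    """Return MARC-style lines for each record whose URLs all pass `is_valid`."""
    exported = []
    for record in records:
        if all(is_valid(url) for url in record["urls"]):
            exported.append(to_marc_like_lines(record))
    return exported


# Hypothetical local records; the validity check is a stub for illustration.
local_db = [
    {"title": "An Example Serial", "urls": ["http://example.org/serial/"]},
    {"title": "A Vanished Serial", "urls": ["http://example.org/gone/"]},
]
batch = export_verified(local_db, is_valid=lambda url: "gone" not in url)
for marc_lines in batch:
    print("\n".join(marc_lines))
```

Records that fail verification are simply held back from the batch, so the OPAC only ever receives entries whose links worked at the time of the last check.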

Integrating identification and access

Another impediment to the effective use of Internet resources in our OPACs is accessing those resources once they have been located. In other words, after an Internet resource has been located in the OPAC, how does the end-user actually get that resource? In most of today's cases, the end-user would have to extract the URL from the MARC record and use their Internet communications software to open up the extracted URL. Not only is this analogous to writing down call numbers, where many end-user mistakes occur, but it should be unnecessary.

A user of the OPAC should be able to access the resource from the same piece of software they use to access the OPAC. Unfortunately, many of the computers used to access the OPAC are not really computers at all. They are "dumb terminals" incapable of opening multiple windows and supporting concurrent applications.

The solution to this problem is three-fold. First, we should eliminate the use of dumb terminals in our libraries and rely on "smart terminals" (computers) to access the OPAC. Second, we should impress upon our OPAC vendors the need for microcomputer-based front-ends to their systems. Third, these front-ends should be aware of 856 fields and allow people to use the URLs found there to access Internet resources. By "aware of 856 fields", I mean the OPAC software could extract the URLs (or URIs) in an 856 field, interpret the protocol of the Internet resource represented there, and then open up a connection to the remote computer using that protocol. This means the OPAC software would also have to be an FTP, gopher, email, telnet, and hypertext transfer protocol (HTTP) client, as well as a client for whatever protocols come in the future.
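
Interpreting the protocol of a URL is the simplest part of this requirement, as the following sketch suggests. The table of clients and the function name are hypothetical; an actual front-end would launch or embed the corresponding client rather than return its name.

```python
from urllib.parse import urlparse

# Hypothetical mapping from URL scheme to the client a front-end would need.
CLIENTS = {
    "ftp": "FTP client",
    "gopher": "gopher client",
    "mailto": "email client",
    "telnet": "telnet client",
    "http": "HTTP client",
}


def choose_client(url):
    """Return the name of the client needed to open the given URL."""
    scheme = urlparse(url).scheme
    if scheme not in CLIENTS:
        raise ValueError(f"no client available for protocol {scheme!r}")
    return CLIENTS[scheme]


# Example addresses are invented for illustration.
for address in ("gopher://gopher.example.org/11/serials",
                "telnet://opac.example.org",
                "http://www.example.org/serial/"):
    print(address, "->", choose_client(address))
```

Supporting a future protocol then amounts to adding one entry to the table, although writing the client behind that entry is of course the larger job.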

An alternative, more likely, solution is the further development of Z39.50 and/or World Wide Web interfaces to OPACs as exemplified by the demonstration interfaces listed in "Library Catalogs with Web Interfaces" and "WWW-to-Z39.50 Gateways." [1, 2] These interfaces can interpret the contents of the 856 field and make them "hot" for our browser software. This is how the Alcuin database works. [3] When searches locate records containing URLs in 856 fields, the interface program extracts the URL and creates a very simple hypertext markup language (HTML) document. Thus, when the document is returned to the client application the URLs are "hot."
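
The following is a minimal sketch of how an interface program of this kind might turn extracted URLs into a simple HTML document. It is not the Alcuin code; the function name, the page layout, and the sample record are assumptions made for illustration.

```python
from html import escape


def record_to_html(title, urls):
    """Build a very simple HTML page in which each URL is a clickable link."""
    safe_title = escape(title)
    links = "\n".join(
        f'<li><a href="{escape(url)}">{escape(url)}</a></li>' for url in urls
    )
    return (
        f"<html><head><title>{safe_title}</title></head>\n"
        f"<body>\n<h1>{safe_title}</h1>\n<ul>\n{links}\n</ul>\n</body></html>"
    )


# Hypothetical search result: a title plus the URL from its 856 field.
page = record_to_html("An Example Serial", ["http://example.org/serial/?v=1&n=2"])
print(page)
```

Escaping the title and the URLs keeps characters such as the ampersand from breaking the generated markup.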

Enhancing the controlled vocabulary

While the incorporation of bibliographic records describing Internet resources into our OPACs presents some technical difficulties, this proposition also challenges our controlled vocabulary systems. The controlled vocabularies of our OPACs have always been hallmarks of their usefulness and integrity. Many North American academic libraries rely on the Library of Congress Subject Headings (LCSH) for their controlled vocabulary. LCSH was intended to be the vocabulary of the Library of Congress and not necessarily of North America; the Library of Congress is not a national library, and its vocabulary is designed for its particular needs. Consequently, LCSH does not always include the vocabulary to adequately describe items in our OPACs. This problem is magnified by the length of time necessary to introduce new terms into LCSH.

Since Internet resources "come to market" much faster than traditional information materials, and the types of information they represent are even more specialized than traditional materials, Internet resources can limit the usefulness of our controlled vocabularies even more. Thus, the incorporation of Internet resources into our OPACs will necessitate a faster method for including and updating our controlled vocabularies. I would like to advocate a new and improved source for our controlled vocabularies, but I do not know how to implement such a thing.

Labor intensity and new skills

Initially, the addition of Internet resources into our OPACs will be a labor-intensive process since many of the records will require original cataloging. Since the Library of Congress is not currently producing 856-aware cataloging records, libraries that want to include these sorts of records will have to create their own. It will take time for our bibliographic utilities to obtain a critical mass of these sorts of bibliographic records.

Until such a time occurs, many records will have to be created individually. This will require more professional catalogers with an in-depth knowledge of the Internet and time to evaluate Internet resources in terms of their bibliographic elements. Since Internet resources do not have title pages and versos, these catalogers will have to reinterpret the rules of cataloging and classification in order to create these new records. (A list of guides and references discussing how to catalog Internet resources can be found in Vianne Tang Sha's "Internet Resources for Cataloging." [4])

Once a critical mass of Internet resources appears in our bibliographic utilities, copy cataloging will again become the norm. Even so, since Internet resources "come to market" so much faster than familiar media, and since relatively few libraries are contributing 856-aware cataloging copy, there will be a constant need for more original cataloging than is traditionally done by our libraries.

End-user education

With the widespread addition of bibliographic records describing Internet resources into our OPACs will come a need to educate our populations on the existence of these records. Furthermore, libraries will have to try to dispel from the population's mindset the notion that an OPAC is "only a list of books."

Numerous collections of Internet resources are appearing on the Internet. Many of these collections support search features. Experience demonstrates that these same search features rely solely on free-text searching; while Boolean logic and relevance ranking are employed by these services, controlled vocabulary and field searching are not supported. Despite these limitations, these services are extremely popular. People may begin to think they are the only useful collections of Internet resources.

Unless libraries quickly, effectively, and aggressively incorporate into their OPACs the Internet resources pertinent to the needs of their user populations, the search services will become the norm and our user populations will not understand the benefits of our OPACs' selectiveness and comprehensiveness. In other words, libraries will continue collecting information resources particularly useful to their clientele and attempting to create as thorough and extensive a collection as possible at the same time.

Additionally, since the OPAC is more than "a list of books" but our populations do not comprehend this, they will have to be educated on how to use and access any Internet resources they locate in the collection.

Summary

The inclusion of Internet resources into our OPACs presents fundamental challenges to our conceptions of a library catalog. It also presents technical difficulties, as well as semantic ones. It necessitates a lot of end-user and librarian education and retraining. None of these obstacles is insurmountable. The solutions require persistence, ingenuity, and a respect for change. In short, they require a propensity for professionalism and a commitment to excellence in service.

Notes

[1] http://www.lib.ncsu.edu/staff/morgan/alcuin/webbed-catalogs.html
[2] http://is.rice.edu/~riddle/webZ39.50.html
[3] http://library.ncsu.edu/drabin/alcuin/
[4] http://asa.ugl.lib.umich.edu/chdocs/libcat/libcat.html

Creator: Eric Lease Morgan <eric_morgan@infomotions.com>
Source: Serials Review 21(4): 70-72, Winter 1995
Date created: 1995-12-21
Date updated: 2004-12-10
Subject(s): cataloging; articles;
URL: http://infomotions.com/musings/adding-internet-resources/