Web Services at OCLC

This text documents some of my experiences on a recent trip to OCLC, December 16-17, 2002, to discuss Web Services.

Introduction

On December 16 and 17, 2002, I visited OCLC as a part of the Invitational Web Services Seminar. The Seminar was attended by about fifteen people from OCLC's Office of Research and fifteen librarians with extensive experience applying Web technologies to library activities. The purpose of the Seminar was to get feedback from the librarians on a number of Web Services applications that OCLC's Office of Research has been investigating. See: http://www.oclc.org/research/events/webservices/.

In preparation for the meeting, the Office of Research had supplied us librarians with a few recommended readings describing the concept of Web Services:

Roy Tennant, Digital Libraries-What To Know About Web Services
Tracy Gardner, An Introduction to Web Services
Heather Kreger, Web Services Conceptual Architecture

In a nutshell, Web Services describe techniques for sharing data/information between computers. There are two Web Services techniques: REST and SOAP. In a REST-based service, one computer sends a URL to a remote computer over an HTTP connection. The URL is interpreted as input for a computer application. The application computes the results and returns them to the first computer in the form of an XML data stream. It is then up to the first computer to process the results. The best example of a "REST-ful" Web Service is the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). SOAP works in much the same way, except the input for the second computer is transmitted as an XML stream, not a URL. Additionally, the transport mechanism for SOAP is usually, but not necessarily, HTTP.
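To make the difference concrete, here is a minimal sketch that expresses the same request both ways without touching the network. The base URL http://example.org/oai and the helper names rest_request and soap_request are assumptions made up for illustration; the verb and metadataPrefix parameters come from OAI-PMH, and the envelope namespace is the SOAP 1.1 one. OAI-PMH itself is not normally offered over SOAP, so the second form only shows what the XML wrapping looks like.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def rest_request(base_url, **params):
    """REST style: the entire request is expressed as a URL."""
    return base_url + "?" + urlencode(params)


def soap_request(operation, **params):
    """SOAP style: the same input is wrapped in an XML envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, operation)  # hypothetical operation element
    for name, value in params.items():
        ET.SubElement(call, name).text = value
    return ET.tostring(envelope, encoding="unicode")


# http://example.org/oai is a placeholder, not a real repository.
rest_url = rest_request("http://example.org/oai",
                        verb="ListRecords", metadataPrefix="oai_dc")
soap_xml = soap_request("ListRecords", metadataPrefix="oai_dc")

print(rest_url)
print(soap_xml)
```

In the REST case the string is simply fetched over HTTP; in the SOAP case the XML document would be sent (usually by HTTP POST) to the remote application.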

Illustrations and discussion

After framing the purpose of the Seminar, outlining how Web Services work, and positing the large (huge, vast, enormous, etc.) amount of metadata directly available to OCLC, illustrations of various Web Services were provided and discussions were facilitated. Specifically, four Web Services were discussed: harvesting metadata, terminology services, schema transformations, and recombinant metadata.

Harvesting Metadata

This was the easiest service to get our minds around. It consists of using the OAI-PMH to collect metadata and create an integrated catalog of available content. Initial content may include theses and dissertations, "learning objects", e-prints, and possibly data sets. Once the metadata is collected, an issue to be addressed will be integrating items in the catalog with items in WorldCat.
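As a rough sketch of the harvesting step, the following reads an OAI-PMH ListRecords response and turns each record into a simple catalog entry. The sample response is invented for illustration (the identifiers and titles are not real records), and the harvest function is a hypothetical helper; only the XML namespaces are taken from the OAI-PMH and Dublin Core specifications. A real harvester would fetch the XML over HTTP and follow resumption tokens, which is omitted here.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up response, shaped like what a repository might return.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A Sample Thesis</dc:title>
          <dc:type>thesis</dc:type>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example.org:2</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A Sample E-print</dc:title>
          <dc:type>e-print</dc:type>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def harvest(xml_text):
    """Collect identifier, title, and type from each harvested record."""
    root = ET.fromstring(xml_text)
    catalog = []
    for record in root.iterfind(".//oai:record", NS):
        catalog.append({
            "identifier": record.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": record.findtext(".//dc:title", namespaces=NS),
            "type": record.findtext(".//dc:type", namespaces=NS),
        })
    return catalog


catalog = harvest(SAMPLE_RESPONSE)
for entry in catalog:
    print(entry["identifier"], "|", entry["title"], "|", entry["type"])
```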

Terminology Services

Libraries spend much of their time creating and managing vocabularies: subject headings, classification systems, authority files, etc. These vocabularies are usually centrally managed in an integrated library system. By decentralizing such services -- breaking them out of proprietary software -- libraries may be able to more easily expand the usefulness of these vocabularies. Tools may be created that convert subject headings to classification numbers, assist users in traversing subject headings and classification systems, mine metadata collections (such as WorldCat) for "more items like this one", automate classification, and enhance/augment metadata descriptions based on statistical analysis.
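Two of these tools, converting a heading to a classification number and traversing a hierarchy of headings, might look something like the sketch below. The VOCABULARY table, its headings, and its class numbers are illustrative placeholders and not an authoritative mapping; classify and broader_terms are hypothetical function names. In a Web Services setting each function would sit behind a URL rather than be called locally.

```python
# Toy vocabulary: each heading has a class number and an optional broader term.
# All values are placeholders for illustration only.
VOCABULARY = {
    "Information science": {"class": "020", "broader": None},
    "Library science": {"class": "020", "broader": "Information science"},
    "Cataloging": {"class": "025.3", "broader": "Library science"},
}


def classify(heading):
    """Return the class number for a heading, or None if it is unknown."""
    entry = VOCABULARY.get(heading)
    return entry["class"] if entry else None


def broader_terms(heading):
    """Walk up the hierarchy, listing each broader heading in turn."""
    path = []
    entry = VOCABULARY.get(heading)
    while entry and entry["broader"]:
        path.append(entry["broader"])
        entry = VOCABULARY.get(entry["broader"])
    return path


print(classify("Cataloging"))
print(broader_terms("Cataloging"))
```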

Schema Transformations

XHTML. Dublin Core. MARC and all of its variants. Encoded Archival Description (EAD). Resource Description Framework (RDF). RDF Site Summary (RSS). DocBook. Text Encoding Initiative (TEI). Each of these things, as well as others, is a structure containing metadata. By facilitating methods for translating/converting one metadata structure to another, libraries can provide means to share and disseminate information/data to wider audiences. Additionally, by transforming metadata from one structure to another, new relationships between information resources may be uncovered.
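A very small transformation, from a Dublin Core record to an RSS item, could be sketched as follows. The DC_TO_RSS field mapping and the dc_to_rss_item function are assumptions chosen for illustration, not a published crosswalk; fields without a mapping (such as creator here) are simply dropped, which hints at how lossy such conversions can be.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from Dublin Core element names to RSS item element names.
DC_TO_RSS = {
    "title": "title",
    "identifier": "link",
    "description": "description",
}


def dc_to_rss_item(dc_record):
    """Build an RSS <item> from the Dublin Core fields that have a mapping."""
    item = ET.Element("item")
    for dc_field, rss_field in DC_TO_RSS.items():
        if dc_field in dc_record:
            ET.SubElement(item, rss_field).text = dc_record[dc_field]
    return ET.tostring(item, encoding="unicode")


record = {
    "title": "Web Services at OCLC",
    "identifier": "http://infomotions.com/musings/oclc-2002/",
    "creator": "Eric Lease Morgan",  # no RSS mapping above, so it is dropped
}
rss_item = dc_to_rss_item(record)
print(rss_item)
```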

Recombinant Metadata

Similar metadata exists in multiple metadata repositories. If a person were to work just with names, then they might be able to find relationships among the information objects described in the repositories, using the names as entry points. A name in one database might be associated with a postal address. The same name in a second database might be associated with a number of writings (books, articles, websites, etc.). The same name might even be associated with a list of people with similar educational backgrounds or mentor/protege relationships. Consequently, by using a name and extracting data surrounding the name from various metadata repositories, information about people could be formulated quickly. The success of such a concept relies on transparent access to the repositories, and such access could easily be facilitated using a Web Services technique.
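The idea can be sketched with three tiny in-memory "repositories" keyed by name. Everything here is invented for illustration: the repositories, the person, and the recombine function. In practice each lookup would be a Web Services call to a remote repository, and matching on a bare name string would need authority control to avoid confusing different people with the same name.

```python
# Three made-up repositories, each knowing something different about a name.
ADDRESSES = {"Jane Doe": {"address": "123 Main Street"}}
WRITINGS = {"Jane Doe": {"writings": ["An Example Article"]}}
MENTORS = {"Jane Doe": {"mentor": "John Roe"}}


def recombine(name, *repositories):
    """Use a name as the entry point and merge whatever each repository holds."""
    profile = {"name": name}
    for repository in repositories:
        profile.update(repository.get(name, {}))
    return profile


profile = recombine("Jane Doe", ADDRESSES, WRITINGS, MENTORS)
print(profile)
```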

Summary

Libraries presently provide many but not all of the services outlined above. A Web Services approach to making these services available makes a lot of technical sense. Assuming a Web Services implementation, what could libraries do with the wealth of metadata available to them? This meeting articulated just a few of the possibilities:

find more like this one
navigate a hierarchy
automate classification
augment metadata descriptions
create collections
augment searching
evaluate the validity of information
provide document delivery
build profiles of people to create personalized services
expose more metadata from libraries
find things written by people with greater authority
organize/classify search results into more meaningful lists

Many of the ideas could easily be integrated into MyLibrary. MyLibrary can easily expose its metadata. As the use of a particular MyLibrary implementation reaches a critical mass, techniques for allowing users to "find more like this one" as well as "show me what my peers use" become increasingly possible using a recombinant metadata service. Content for a MyLibrary system presently requires manual data entry. If a schema transformation service in combination with a harvesting service were available, then it might become possible to systematically import data instead. Right now MyLibrary supports a very simple "search this site" mechanism, but it could be enhanced with a terminology service.

In summary, the meeting was intellectually stimulating. It provided the means to learn about Web Services in general and how they may be applied in a library setting. I had a very interesting discussion with Devon Smith about open source software. We seemed to agree on a number of things concerning the role of open source software in libraries. Additionally, I enjoyed talking to Ralph LeVan about the latest implementation of Z39.50 as a Web Service.

Creator: Eric Lease Morgan <eric_morgan@infomotions.com>
Source: This text was never published.
Date created: 2003-01-03
Date updated: 2004-12-02
Subject(s): Web Services; OCLC (Online Computer Library Center); travel log;
URL: http://infomotions.com/musings/oclc-2002/