Building your library's portal

Introduction

This text proposes a definition of a Web portal, describes how information architecture plays a critical role in the development of a library's website, and briefly describes one database-driven website application designed for libraries, MyLibrary. (A one-page, PDF version of this text designed for printing should be available online.)

What is a portal?

The defining characteristic, in my opinion, of a library portal is the user-driven customizability of a website's content. It is a website whose output is tailored for an individual, with that tailoring retained by the underlying system so the user's customizations are echoed on subsequent visits to the site.

There are as many types of websites as there are types of human activity. These sites can range from a single HTML page to terabytes of data generated from underlying database applications. As the size of a website grows so does the need for better searchability and browsability. As the size of a website grows even more and as the intended audience of the website's content becomes more diversified so does the need for user-driven customization and personalization. The ability to create a one-size-fits-all website is inversely proportional to the amount of content a website tries to communicate and the size of the site's intended audience.

A portal is only one possible component to a library's Web presence. A library website can be made up of three types of content:

Information about the library - staff directories, departmental descriptions, maps of the building, hours, etc.
Electronic versions of traditional library services - online tutorials, book renewals, interlibrary loan requests and status reports, requests for purchase, online chat/reference, virtual tours of the building(s), etc.
Access to library content - catalogs, indexes, full-text magazines and journals, digitized special collections, free and commercial ebooks, government documents, freely accessible Internet resources, electronic encyclopedias and dictionaries, licensed content from vendors, etc.

Our goal as librarians who maintain library websites is to implement these services and collections in a functional, scalable, usable, and aesthetically pleasing manner. A portal is only one possible part of such an implementation. In order to achieve these goals it is necessary to first practice a bit of information architecture.

Information architecture

The process of information architecture begins by answering questions about your institution's purpose, your users' needs and desires, and the types of content you have to communicate. Keeping the answers to these questions in mind, information architecture is then about organizing your content, labeling it effectively, providing the means for browsing and searching the content, and maintaining metadata used to describe it. Finally, since the answers to your initial questions change over time, it is necessary to regularly review your information architecture and modify it accordingly. In the world of information architecture it is not possible to "mark it and park it." This process is outlined in more detail below.

Research

The process of creating and maintaining a website can be divided into three phases: research, strategy, and implementation. During the research phase the goal is to attempt to answer a myriad of questions. Your answers will often not be definitive, and they will change over time. This is okay.

The answers to these questions will help formulate a set of milestones for the website. They will set you up for success and help you establish sets of tasks your website is intended to facilitate. The questions fall into three categories: questions about your organization, questions about your intended audience, and questions about your content. Some of these questions include:

What is the mission of your organization, and in turn, what is the purpose of the website?
Who is the primary audience of your website?
What content does the organization have to communicate via the website?
Who in your organization will do the work to create and maintain the website?
What tasks does your audience expect your website to facilitate, and what technical resources do they have at their disposal?

To answer these questions, talk to your users, read your organization's mission statement, and take a serious inventory of your existing content. Do not rely solely on your professional judgment. Times have changed since you went to library school. The Internet has drastically altered people's expectations about information retrieval. A library is far less the center of the information universe than it once was. There is far more competition for people's attention, and if librarians do not make an effort to meet users' heightened expectations, then users will not use libraries. The only way you are going to get an accurate picture of user expectations is to ask them. Conduct extensive focus group interviews. Facilitate surveys. Analyze Web server log files and online public access catalog transactions.
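As one illustration of analyzing Web server log files, the following is a minimal sketch that counts successful requests per page. It assumes the server writes lines in the widely used Common Log Format; the sample lines, the regular expression, and the function name count_requests are all hypothetical and would need adjusting for a real server.

```python
import re
from collections import Counter

# Matches the request and status portion of a Common Log Format line, e.g.
# 10.0.0.1 - - [10/Oct/2003:13:55:36 -0500] "GET /hours.html HTTP/1.0" 200 2326
REQUEST = re.compile(r'"(?:GET|POST|HEAD) (\S+) [^"]*" (\d{3})')


def count_requests(lines):
    """Count successful (2xx) requests per requested path."""
    counts = Counter()
    for line in lines:
        match = REQUEST.search(line)
        if match and match.group(2).startswith("2"):
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines; in practice, read them from the server's access log.
sample_log = [
    '10.0.0.1 - - [10/Oct/2003:13:55:36 -0500] "GET /hours.html HTTP/1.0" 200 2326',
    '10.0.0.2 - - [10/Oct/2003:13:56:01 -0500] "GET /catalog/ HTTP/1.0" 200 5120',
    '10.0.0.3 - - [10/Oct/2003:13:57:12 -0500] "GET /hours.html HTTP/1.0" 200 2326',
    '10.0.0.4 - - [10/Oct/2003:13:58:40 -0500] "GET /missing.html HTTP/1.0" 404 310',
]

page_counts = count_requests(sample_log)
for path, hits in page_counts.most_common():
    print(f"{hits:5d}  {path}")
```

A tally like this shows only what people requested, not what they wanted, so it complements focus groups and surveys rather than replacing them.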

Strategy

The strategy is your website's blueprint.

After you have at least tentative answers to the research questions, articulate a plan for putting those answers into practice. For example, if your primary audience is in the K-12 age group, then you will organize and label your content one way. On the other hand, if your audience is highly educated, then you will organize and label your content differently.

You will want to organize your content into overarching categories providing the same functionality as a book's table of contents. This overarching organization is a view of the website from 30,000 feet. Do not organize your content in the same way your library is organized; the content should not parallel your institution's organizational chart. Users don't care about your organizational chart and shouldn't have to know it in order to get their work done. Try applying card sorting techniques for organizing the content and grouping it into broad categories.

Libraries are always a means to an end and not an end in themselves. Users want to accomplish specific tasks in order to do something else. "I need a list of articles on... Do you have the book whose title is... What is the status of my interlibrary loan request? I need a synopsis of the American Revolutionary War. I want to know why Copernicus's ideas were revolutionary and not evolutionary." Once you have begun to organize your content based on articulated user needs, your next step is to go back to the users and ask them if it makes sense. Again, focus group interviews are indispensable here. Ask the user.

Next, draw, on pieces of paper, rough outlines and tree structures of your website. In general it is better to create shallow and wide websites as opposed to deep and narrow ones. Shallow websites are easier to browse and provide a better view of a site's total content. Deep websites enable you to get very specific about your content but force the user to make too many choices along the way. If a website is not easy to use, then users won't use it.
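The trade-off between shallow and deep structures can be made concrete with a small sketch. It assumes a site outline is represented as nested dictionaries (page label mapped to child pages); the two sample outlines and the function names max_clicks and widest_menu are hypothetical.

```python
# Each outline is a nested dict: keys are page labels, values are child pages.
shallow_site = {
    "Home": {
        "About": {}, "Hours": {}, "Catalog": {}, "Journals": {},
        "Renewals": {}, "Interlibrary loan": {},
    }
}
deep_site = {
    "Home": {
        "Services": {"Borrowing": {"Renewals": {}, "Interlibrary loan": {}}},
        "Collections": {"Electronic": {"Journals": {}}},
    }
}


def max_clicks(tree):
    """Return the largest number of clicks needed to reach any page from the top."""
    def depth(children):
        if not children:
            return 0
        return 1 + max(depth(grandchildren) for grandchildren in children.values())
    # The top-level page itself costs no clicks.
    return max(depth(children) for children in tree.values())


def widest_menu(tree):
    """Return the largest number of links offered on any single page."""
    widest = len(tree)
    for children in tree.values():
        if children:
            widest = max(widest, widest_menu(children))
    return widest


for name, site in (("shallow", shallow_site), ("deep", deep_site)):
    print(name, "clicks:", max_clicks(site), "widest menu:", widest_menu(site))
```

In this example the shallow outline reaches every page in one click at the cost of a six-item menu, while the deep outline keeps menus to two items but requires up to three clicks.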

As the website becomes larger, say more than twenty-five pages, it becomes more important to index the website's content and provide a search engine against the index. The key to successful indexing and searching mechanisms is not "kewl" searching functionality such as Boolean operations, right-hand truncations, field searching, nesting, and set searching. Instead the key to success is describing each item of the indexed content accurately and thoroughly with user-centered metadata. The vast majority of users will use only two- or three-word phrases when searching an index. Remember, people are bringing their general Internet experiences to the library. Google. You will not be able to change their behavior and get them to use all of the fancy search functions we librarians have come to love with the advent of BRS and DIALOG in the late 1970s and early '80s.
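A minimal sketch of this idea follows: a tiny index built from user-centered descriptions and searched with short, plain phrases. The resource descriptions and the function names build_index and search are hypothetical, and a production search engine would do far more (stemming, ranking, and so on).

```python
from collections import defaultdict

# Hypothetical resources described in words users might actually type.
resources = {
    "ERIC": "education research articles teaching",
    "PubMed": "medicine health research articles",
    "Britannica": "encyclopedia general background facts",
}


def build_index(descriptions):
    """Map each lowercase word to the set of resource names it describes."""
    index = defaultdict(set)
    for name, description in descriptions.items():
        for word in description.lower().split():
            index[word].add(name)
    return index


def search(index, query):
    """Return the resources whose descriptions contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = build_index(resources)
print(sorted(search(index, "research articles")))
print(sorted(search(index, "health articles")))
```

The search itself is deliberately plain. Whether a two-word query finds anything depends almost entirely on the words chosen to describe each resource.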

As your website becomes even larger, say more than seventy-five pages, consider using a relational database to maintain and create your website's content. By putting your content into a database you will be effectively separating your content from presentation. In turn, this will allow you to repurpose your content for different venues. For example, by putting your content into a database and describing it in a user-centric manner, you will be able to create things like:

a comprehensive list of all your content by subject, audience, format, etc.
pathfinders describing very specific subject areas or class assignments
generalized home pages fulfilling the needs of most users
portal applications whose content is tailored to specific individuals
content intended to be syndicated and integrated into your host institution's website
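The separation of content from presentation described above can be sketched with an in-memory SQLite database. The table layout, the sample rows, the example.org URLs, and the function names are all assumptions made for illustration; they are not taken from any particular product.

```python
import sqlite3
from html import escape

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE resources (title TEXT, url TEXT, subject TEXT, audience TEXT)"
)
connection.executemany(
    "INSERT INTO resources VALUES (?, ?, ?, ?)",
    [
        ("Science News for Kids", "http://example.org/kids-science", "Science", "K-12"),
        ("Physics Abstracts", "http://example.org/physics", "Science", "Faculty"),
        ("World History Atlas", "http://example.org/atlas", "History", "K-12"),
    ],
)


def titles_by(column, value):
    """Return (title, url) rows for one subject or audience, sorted by title."""
    if column not in ("subject", "audience"):
        raise ValueError("unsupported column")
    query = f"SELECT title, url FROM resources WHERE {column} = ? ORDER BY title"
    return connection.execute(query, (value,)).fetchall()


def as_text(rows):
    """Present rows as a plain-text list."""
    return "\n".join(f"{title} <{url}>" for title, url in rows)


def as_html(rows):
    """Present the same rows as an HTML list."""
    items = "".join(
        f'<li><a href="{escape(url)}">{escape(title)}</a></li>' for title, url in rows
    )
    return f"<ul>{items}</ul>"


print(as_text(titles_by("subject", "Science")))
print(as_html(titles_by("audience", "K-12")))
```

The same three rows feed a subject list and an audience list, in two different output formats, without the content being edited in more than one place.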

Organizing your content in a relational database application is the key to implementing a portal. The database must contain fields describing users, and those same fields must be used to bring together information resources pertinent to their interests. For example, your database might classify users in terms of their status in the organization such as grade or education level. Similarly, you will have to classify your information resources with these same grades or education levels. You might classify users in terms of their primary areas of subject interest or expertise. That's easy; librarians classify information resources with subjects all the time. But remember to keep these classifications user-centric. Do not use things like Library of Congress Subject Headings. Instead use subject terms that are directly identifiable to your users. Seriously consider creating a sort of used bookstore model for your subject classification.
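The matching described above, where users and resources are classified with the same terms, amounts to a database join. The following is a minimal sketch using an in-memory SQLite database; the table names, columns, sample rows, and the function portal_page are hypothetical and are not MyLibrary's actual schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE patrons (name TEXT, level TEXT, interest TEXT);
    CREATE TABLE resources (title TEXT, level TEXT, subject TEXT);
    INSERT INTO patrons VALUES ('alice', 'undergraduate', 'Astronomy');
    INSERT INTO patrons VALUES ('bob', 'faculty', 'History');
    INSERT INTO resources VALUES ('Intro to the Night Sky', 'undergraduate', 'Astronomy');
    INSERT INTO resources VALUES ('Astrophysical Journal', 'faculty', 'Astronomy');
    INSERT INTO resources VALUES ('Revolutionary War Sources', 'faculty', 'History');
""")


def portal_page(name):
    """List the resources whose level and subject match the named patron."""
    rows = db.execute(
        """
        SELECT r.title
        FROM patrons AS p
        JOIN resources AS r
          ON r.level = p.level AND r.subject = p.interest
        WHERE p.name = ?
        ORDER BY r.title
        """,
        (name,),
    )
    return [title for (title,) in rows]


print(portal_page("alice"))
print(portal_page("bob"))
```

The join only works because both tables draw on one shared vocabulary for level and subject, which is why the choice of user-centric terms matters so much.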

In the strategy part of the process, talk to the people who will be maintaining the website's content and infrastructure. Plan to update their skills, and plan to use computer technology that matches your budget and your personnel resources. There is no use trying to duplicate the functionality of Yahoo! if you have a staff of three and an outdated laptop computer with a dial-up Internet connection.

Create a timetable for implementation (or reimplementation). Deadlines that are written down set people's expectations. Since work fills the available time, writing down when things are to be completed makes things go more smoothly.

Implementation

The final phase of the redesign process is the implementation phase. It is in this phase where the strategy is put into practice. For example, ROT (redundant, outdated, and trivial content) will be removed. Tools and processes for creating and maintaining website content will be refined, documented, and taught. Massive amounts of HTML will be "retrospectively converted" into the new design. Indexing will be more systematically applied. During this part of the process usability testing will come to a head and be more rigorously applied. The website will also be vigorously marketed and promoted in an effort to make the user population more aware of the changes.

Ironically, this part of the process should be easy. All the thinking was done beforehand, and now all you have to do is the work.

MyLibrary

MyLibrary is a database-driven website application designed for libraries. It provides the means for librarians to describe sets of Internet resources in terms of user characteristics. It then provides the means for repurposing this content in the form of home pages, lists of resources organized in various fashions, pathfinders, a portal, and syndicated content in the form of XML streams. MyLibrary also provides the means to report on what resources are being used and by whom. It includes a search engine indexing its content. It facilitates librarians sending targeted email messages to their constituents. It also includes a "virtual new bookshelf" service.

MyLibrary is distributed as open source software. Because it is freely available, anybody can download the software, examine it, pick it apart, and try it out before making a commitment to using it. Technically speaking, MyLibrary is a set of CGI Perl modules and scripts running on top of an HTTP server and against a relational database. The relational database can be either MySQL or PostgreSQL. It can run on just about any ol' computer, Windows or Unix, and you do not need a big bad machine to host the service. Dollars to donuts you could run a production level MyLibrary service off of one of the extra, unused computers you have lying around your institution someplace.

MyLibrary has been available since 1998 and it is in production in about two dozen libraries across the world including the NCSU Libraries, the University of Michigan, Lund University in Sweden and others. It has also been the inspiration for many other database-driven applications and portals including implementations at the Los Alamos National Laboratory and the recently developed MyLibrary service at the University of Rochester. The term MyLibrary is slowly becoming a part of the library vernacular in the same way name brands such as Kleenex and Xerox have become part of our language.

The University Libraries of Notre Dame is in the very beginning stages of redesigning its Web presence. The Libraries' Digital Access and Information Architecture Department is leading this effort and plans to use the processes outlined above to ensure the redesign goes smoothly. Consequently, we are spending the time to learn user needs, explicitly articulating the purpose of the website, and taking a long hard look at the content we have to offer. We will then use some form of a database-driven website to repurpose our content for many venues. One of those venues will most likely be a portal application.