Sometimes the question is more important than the answer
In our profession, sometimes the question is more important than the answer. This column explores ways to better articulate methods for devising search strategies in a globally networked computing environment.
Browsing and searching
We all know the advantages and disadvantages of browsing collections of information as opposed to searching them. Browsing allows users to bring nothing to the information seeking process except the desire to know or learn something. They do not have to articulate an information need. They do not have to know the language of the collection's organizational scheme. They can happily be led by serendipity from one topic to another. Browsing a collection can be quite stimulating.
On the other hand, if the user is looking for specific information, and especially if time is limited, then browsing can prove to be enormously inefficient. Consequently, collections of information are indexed and searching mechanisms are created. The earliest indexes were rudimentary accession lists, tables of contents, and back-of-book indexes. There were then concordances as well as author and title dictionary catalogs. Dewey and Wilson popularized the subject index.
While each of these index tools enhanced access to growing collections of data and information, they did so with sets of intermediary representations otherwise known as surrogates, authority lists, thesaurus terms, controlled vocabularies, or in today's language, metadata. These intermediary representations reflect the thinking processes of the people creating the tools themselves. As a corollary, they do not necessarily reflect the thinking processes of the people who are expected to use these tools. Don't fault the indexers too much for these possible discrepancies. Numerous studies have pointed out the differences between user populations and their respective information seeking behaviors. At present, there is no practical way for indexers to create intuitive indexes for every user population, let alone an index reflecting the thinking processes of individuals.
Free text searching
With the advent of digital computers, free text searching has been seen as an alternative to intermediary-representation indexing methods. Unfortunately, free text searching leads to its own problems, best exemplified by the thousands of hits returned from popular Internet search engines. Furthermore, a particular piece of data or information may satisfy an individual user's information need, but that data or information may be poetically expressed and therefore difficult to locate via a free text search engine. Sorting and relevance ranking mechanisms attempt to filter out the lower quality data and information, but again, these mechanisms embody the thinking processes or algorithms of their creators and not necessarily of their users.
Here's the point. There are advantages and disadvantages to the browsing process. Searching processes have advantages and disadvantages too. Browsing and searching complement each other. Unfortunately, even when browsing and searching are used together, information may be difficult to find efficiently. Minimizing this difficulty is at the heart of librarianship. The future of libraries is, and always has been, hinged on discovering new ways to use technology coupled with the creation of cognitive models to allow people to find, acquire, and integrate data and information into their being as knowledge and wisdom.
With these lofty goals in mind, how can we implement them today? I believe there are two solutions. The first relies on better marketing of library services, since marketing will make more people aware of the services libraries and librarians have to offer. Second, since the number of librarians is limited and the individually based reference interview process is not scalable, I advocate the creation of computer-based "reference assistants" for the purpose of supplementing the reference interview and information seeking process. Through a question-and-answer interaction, these reference assistants will dynamically create search strategies that can be applied against print as well as digital information resources.
The professional literature abounds with models ripe for automating the reference interview. William A. Katz's Introduction to Reference Work, Gerald Jahoda's The Librarian and Reference Queries, Charles T. Meadow's Basics of Online Searching, and C. J. Armstrong's Manual of Online Search Strategies are four such texts. Furthermore, there are a number of books describing previously implemented "reference assistants" (expert systems), including John V. Richardson Jr.'s Knowledge-Based Systems For General Reference Work and Ralph Alberico's Expert Systems For Reference And Information Retrieval. Curiously, much of this literature is more than ten years old, and I cannot figure out why the idea of a reference assistant or an expert system to supplement the reference process has not continued to attract attention.
If I were to try to supplement the reference interview with an automated process today, then this process would almost certainly be implemented through a Web browser. This would ensure the widest possible coverage of people while minimizing operating system specific difficulties.
Next, I would create HTML versions of the forms we all used to require our patrons to complete before performing expensive DIALOG or BRS online searches. Remember those forms? We asked questions like:
- Are you looking for a specific item or a list of items surrounding a subject?
- In two or three sentences, describe what you would like to know.
- What are some key words or phrases describing your topic?
- Do you have any sample citations exemplifying the sort of information you seek? If so, what are they?
- Do you want a comprehensive search, or just a few citations?
- How should the search results be limited?
This process generates lists of words and phrases. It could be supplemented with words from a thesaurus or dictionary. I would then present this list to the user and ask them to prioritize the items of the list.
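The keyword-gathering step described above might be sketched in code like this. The thesaurus here is a toy stand-in for a real controlled vocabulary, and the ranking dictionary represents the priorities the user would supply interactively; both are assumptions for illustration:

```python
# Sketch: expand the patron's keywords with related thesaurus terms,
# then order the combined list by the patron's stated priorities.

# A toy stand-in for a real thesaurus or controlled vocabulary.
THESAURUS = {
    "cats": ["felines", "pets"],
    "behavior": ["ethology"],
}

def expand_keywords(keywords):
    """Return the patron's keywords plus any related thesaurus terms."""
    expanded = list(keywords)
    for word in keywords:
        for related in THESAURUS.get(word, []):
            if related not in expanded:
                expanded.append(related)
    return expanded

def prioritize(keywords, ranking):
    """Order keywords by the patron's ranking; unranked terms sort last."""
    return sorted(keywords, key=lambda w: ranking.get(w, len(ranking)))

terms = expand_keywords(["cats", "behavior"])
terms = prioritize(terms, {"behavior": 0, "cats": 1})
# "behavior" now leads the list, followed by "cats" and the related terms.
```

Keeping expansion and prioritization as separate steps mirrors the interview: the system proposes candidate terms, and the user, not the system, decides which matter most.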
Next, with my knowledge of different types of information resources (dictionaries, encyclopedias, bibliographies, manuals, handbooks, etc.), I would inquire about the type(s) of information needed. Types of information include, but are not limited to:
- lists of citations
- brief overviews
- simple facts
Next, I would present the user with a short list of broad subject headings. Selected items from this list coupled with the feedback garnered from the previous question would generate lists of possible information sources such as a computing dictionary, a general encyclopedia, an agriculture bibliographic database, or an illustrated book.
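The pairing of information type and subject heading could be realized as a simple lookup table. The catalog below is illustrative only; a real system would draw these mappings from a maintained database of resources:

```python
# Sketch: pair the requested type of information with a broad subject
# heading to suggest candidate information sources.

SOURCES = {
    ("simple facts", "computing"): ["a computing dictionary"],
    ("brief overviews", "general"): ["a general encyclopedia"],
    ("lists of citations", "agriculture"): ["an agriculture bibliographic database"],
}

def suggest_sources(info_type, subject):
    """Return candidate sources for a (type, subject) pair."""
    return SOURCES.get((info_type, subject), ["ask a reference librarian"])

suggestions = suggest_sources("simple facts", "computing")
```

Note the deliberate fallback: when the table has no answer, the assistant defers to a human reference librarian rather than guessing.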
Once words or phrases have been acquired and information sources identified, the next step would be to translate the words and phrases into search strategies that can be applied to the information sources. In the case of Internet resources, these search strategies can almost always be implemented as complex URLs. You just need to know what data the remote search engine requires.
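Building such a "complex URL" amounts to filling in the query parameters a remote search engine expects. A minimal sketch using Python's standard library, where the host name and parameter names are hypothetical stand-ins for whatever a given engine requires:

```python
from urllib.parse import urlencode

def build_search_url(base_url, params):
    """Encode a search strategy as the query string of a GET request."""
    return base_url + "?" + urlencode(params)

# Hypothetical engine expecting a 'query' parameter and a result limit.
url = build_search_url(
    "http://search.example.org/find",
    {"query": "cats AND behavior", "max": 10},
)
# → http://search.example.org/find?query=cats+AND+behavior&max=10
```

Because each engine names its parameters differently, the assistant would need one such template per target resource, with the prioritized keywords substituted in.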
Finally, a dynamically created HTML page would be presented to the user. It would include a recapitulation of the interview process, a description of how to use printed resources to satisfy the information need, and a list of URLs allowing the user to automatically search electronic resources. This dynamically created HTML page will get them started with a minimum of false positive hits. If the results are less than useful, then the user will be able to backtrack through the process to regenerate the suggestions. If the suggestions are way off base, then the dynamically created HTML page could be printed and taken to the nearest reference librarian, who would be able to see how the reference interview was handled and be able to fill in its gaps. Ideally, this entire automated reference interview process would be saved to a private database, and records in this database could be used to build a knowledge base for future... reference.
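The final page could be assembled from the outputs of the earlier steps. The sketch below shows one way to do it; the recap text, print suggestions, and URLs are placeholders for values the interview would actually produce:

```python
# Sketch: assemble the dynamically created results page from the
# recap of the interview, suggested print sources, and search URLs.

def build_results_page(recap, print_sources, urls):
    """Return an HTML page recapping the interview and linking to searches."""
    prints = "".join(f"<li>{s}</li>" for s in print_sources)
    links = "".join(f'<li><a href="{u}">{u}</a></li>' for u in urls)
    return (
        "<html><body>"
        f"<h1>Your reference interview</h1><p>{recap}</p>"
        f"<h2>Printed resources</h2><ul>{prints}</ul>"
        f"<h2>Searches to try</h2><ul>{links}</ul>"
        "</body></html>"
    )

page = build_results_page(
    "You asked for brief overviews of cat behavior.",
    ["a general encyclopedia"],
    ["http://search.example.org/find?query=cats+AND+behavior&max=10"],
)
```

Because the page is plain HTML, it satisfies both failure modes mentioned above: it can be regenerated after backtracking, or simply printed and handed to a reference librarian.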
The plan described above is not the end-all of the reference interview. It is intended to supplement the reference interview and provide assistance to those people who do not have access to a reference librarian or do not want to talk to one. The results it generates may not be perfect, but it should at the very least give the patron a head start, and it is scalable. The hard part will be learning how to ask the right questions in order to provide the best possible answers; sometimes the questions are more important than the answers.
Creator: Eric Lease Morgan <firstname.lastname@example.org>
Source: This is a pre-edited article originally published in Computers in Libraries.
Date created: 1999-03-16
Date updated: 2004-11-14
Subject(s): expert systems;