Mass digitization (again)
I recently attended a symposium on the topic of mass digitization, and this blog entry summarizes my personal observations from the event.
On the topic of mass digitization I have a number of personal observations. First, like Tim O'Reilly, I have never considered a book to be a set of pages between covers. Books are containers, and libraries are not about books. Libraries are about what is inside the books. Books are merely manifestations of data, information, and knowledge. Yes, some books are special in and of themselves, but for the most part they are simply "content databases", not things to be treasured and hidden away in dark rooms. "Books are for use", and I write in my books all the time. A well-used book naturally opens up to the most important parts. Don't get me wrong. I appreciate the book, a codex, as a technology. I make and bind my own notebooks. Codices are portable, durable, self-sustained, and last a good long time. At the same time digitized books offer a greater degree of utility than traditional books, as long as the digitized books are not limited by some sort of digital rights management system. Mass digitization will only increase the opportunities for this utility, and with these increases will also come increases in user expectations.
Larger and larger quantities of books and journal articles are being digitized or "born digital". Combine this with the user's ability to locally store gigabytes of data on portable storage devices such as flash drives and iPods. Imagine the ability to carry around the entire corpus of the published literature from the 18th and 19th centuries on such a device. Imagine the ability to supplement this "collection" with all the relevant literary criticism. In such a world, what is the role of the library and the librarian? Obviously it is not about collections because the user has the collection. Instead, the role of the library and the librarian is about services against the collection. The role is the creation and distribution of tools that allow the student, researcher, or casual reader to make better use of the collection. Examples might include:
- look up this word in a dictionary or encyclopedia
- find definitions of this word in my collection
- compare and contrast these definitions
- allow me to annotate my work with marginalia
- allow me to read other people's marginalia
- identify the major theme in this work
- find this theme in other works of my collection
- create a list of citations from works outside my collection dealing with this theme
- get me selected works from the list
- trace this concept forward and backward
- sort this theme by date and author
- search my collection for this word or phrase
- identify other people who are doing work similar to mine
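To make one of these services concrete, here is a minimal sketch of "search my collection for this word or phrase", written in Python. It assumes the personal collection is nothing more than plain text already loaded into memory as a mapping of titles to full texts. The function name, the data layout, and the snippet width are hypothetical, chosen only for illustration, and a real tool would need to handle markup, file formats, and much larger collections.

```python
import re
from typing import Dict, List, Tuple


def search_collection(
    collection: Dict[str, str], phrase: str, width: int = 30
) -> List[Tuple[str, str]]:
    """Return (title, snippet) pairs for each whole-word match of phrase.

    collection maps a work's title to its full plain text (an assumed layout).
    width is the number of characters of context kept on each side of a match.
    """
    # Match the phrase literally, ignoring case, on word boundaries.
    # The boundaries assume the phrase begins and ends with a letter or digit.
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    hits = []
    for title, text in collection.items():
        for match in pattern.finditer(text):
            start = max(0, match.start() - width)
            end = min(len(text), match.end() + width)
            # Collapse line breaks and runs of spaces so the snippet reads cleanly.
            snippet = " ".join(text[start:end].split())
            hits.append((title, snippet))
    return hits
```

Calling `search_collection(my_texts, "river")` would return a list of titles paired with short keyword-in-context snippets, which is roughly what a printed concordance provides. Several of the other services in the list, such as finding a theme in other works or sorting results by date and author, could be layered on top of output like this.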
As the amount of digital content grows so does the likelihood that the content will be duplicated. Increasingly people will be creating their own personal "collections". If we gave people the opportunity to download the entire works of Mark Twain, then it would get downloaded. Once students and researchers create these "collections" they are going to want to do analysis against them. This holds true for current electronic serial literature as well, and if libraries were not bound by licensing restrictions we would allow such downloads. In this case the scientist will want to compare and contrast too, not just the humanist. Track this author. Track this citation. Trace this particular chemical model forwards and backwards. These kinds of services are not the sole purview of librarians, but they do represent interesting library work, and the development of tools like the ones outlined above represents a growth opportunity for libraries. They represent ways libraries can remain relevant. They represent ways librarians can use computers to revolutionize the use of libraries and not just mimic older technologies (the card catalog) with newer ones (the OPAC).
After the time of mass digitization a library's collection will not be as important as it is today. Everybody will be carrying the collection around in their pocket. Instead what people will need are sets of services -- tools -- to apply against the collections, making the content more useful. In a digital environment the things of traditional librarianship (books) will give way to their content, and this makes services increasingly important. Libraries, especially libraries hosting digital materials, need to be about the combination of collections and services. This was alluded to many times throughout the symposium but never explored very thoroughly. O'Reilly touched on these ideas with his "mash ups" and "collaborative intelligence". Keller briefly mentioned them in his remarks. Guedon postulated the creation of interpersonal relationships. Lynch outlined these ideas in the greatest detail. Unfortunately these ideas seemed to generate no sparks among the audience. I was disappointed.
I can summarize my personal observations in this way. Collections without services are useless, and services without collections are empty. You can't have one and not the other and call your thing a library. Librarians need to provide equal amounts of both in order to practice balanced librarianship, especially in a digital environment.
Creator: Eric Lease Morgan <email@example.com>
Source: This essay was originally published on TechEssence at http://techessence.info/node/22.
Date created: 2006-04-04
Date updated: 2007-12-28
Subject(s): mass digitization; TechEssence;