European Conference on Digital Libraries

This travel log documents my experience at the 6th European Conference on Digital Libraries (ECDL), Rome, Italy, September 16-18, 2002. In a sentence, this conference, attended mostly by university computer science faculty, facilitated the sharing of digital library research ideas and experimental results. For more information about the conference, see: http://ecdl2002.iei.pi.cnr.it/

University Gregoriana
Inside University
statue
face
temple
Round Temple
Bocco and me
Nice Shadow

Presentations

I was able to attend the following presentations.

Hector Garcia-Molina (Stanford University) - "WebBase: Web Capture and Distribution"

In his plenary presentation, Garcia-Molina described the techniques he has explored for collecting and indexing content from the Web. He first became interested in this area a number of years ago when he was trying to implement an online repository of scientific reports. He soon gave up on the project after learning people would not put their items into a central repository, but would rather post their reports directly to the Web. This led him to explore Web crawling techniques. More importantly, he noticed there are no "institutions" involved in Web crawling techniques, just processes. The removal of the institutions (i.e., the library or the computer science department) reduced the number of socio-political problems surrounding the gathering and disseminating of reports.

As he built his WebBase he articulated a number of challenges: 1) scalability, 2) consistency, 3) dissemination, 4) topic-specific collection building, 5) archiving, 6) intellectual property management, and 7) quality. He went on to describe a crawling technique using parallel processing. He elaborated on the storage and indexing tools he developed, which were based on Berkeley DB. Of particular note were his possibilities for determining the authenticity of documents. Possibilities included digital signatures, voting (popular opinion), and/or the age of a document (the oldest document is the most authentic).

I did not find the presentation to contain any particularly innovative ideas, but I did find the problems he was addressing were very similar to the problems of traditional library work. The concerns Garcia-Molina expressed are the same concerns of many librarians. Why is there not a greater amount of communication between the Garcia-Molinas of the world and the people who work in libraries?

Inside University (movie)
Roman Forum (movie)
St. Peter's Square (movie)
Inside Coliseum (movie)

Seung-Kyu Ko (Yonsei University) - "Conversion of eBook Documents Based on Mapping Relations"

The purpose of Ko's research was to demonstrate the possibilities of converting various ebook formats into other ebook formats/standards. He first outlined and described the data structures for a number of ebook formats popular in South Korea (EBKS, OEB PS, and JepaX). He then compared and contrasted the elements of each format to one another in an effort to create a mapping technique. Once the similarities were articulated, a set of XSL transformations was applied to ebook data. The transformations proved to be useful but not 100% valid, since each ebook format/standard contains unique elements unmappable to any other format.
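The mapping idea can be sketched in a few lines. This is only an illustration, written in Python rather than XSL, and the element names are invented; they are not taken from the EBKS, OEB PS, or JepaX specifications. It renames the elements that have a counterpart in the target format and reports the ones that do not, which is where the "not 100% valid" result comes from.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from a source format's element names to a target's.
# These names are invented for illustration only.
MAPPING = {"book": "package", "booktitle": "title", "writer": "creator"}

def convert(source_xml, mapping):
    """Rename mapped elements; collect names that have no counterpart."""
    root = ET.fromstring(source_xml)
    unmapped = set()
    for element in root.iter():
        if element.tag in mapping:
            element.tag = mapping[element.tag]
        else:
            unmapped.add(element.tag)
    return ET.tostring(root, encoding="unicode"), sorted(unmapped)

source = "<book><booktitle>Rome</booktitle><writer>Ko</writer><ruby>x</ruby></book>"
converted, unmapped = convert(source, MAPPING)
```

Here `unmapped` lists the source elements left untouched, the ones a real conversion would have to drop or carry along as non-standard content.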

Ruth Wilson (University of Strathclyde) - "Guidelines for Designing Electronic Books"

Wilson described the process and findings of a project called Electronic Books ON-screen Interface (EBONI): http://eboni.cdlr.strath.ac.uk/

Essentially, Wilson described a usability study applied to sets of electronic books (hardware and software versions). She asked sets of users ("actors") to evaluate various ebook functions ("tasks"), and she was able to list characteristics of preferable/usable ebook designs. Some of her findings include: 1) some people like the format of traditional books, 2) don't use small fonts, 3) text must be in short, scannable chunks, 4) summarize each page, 5) cover pages are important, 6) open graphics in new windows, 7) tables of contents are necessary, 8) provide "bookmarks", 9) indexes are necessary for navigation, 10) provide a search tool, and 11) make the book "closed" so users do not get lost. Concerning hardware devices, Wilson says: 1) make sure the screen is big enough to see, 2) the hardware should be light enough to carry, 3) the screen should be back-lit, and 4) the hardware should tolerate physical abuse -- be non-breakable.

Trevi Fountain
Pantheon
Fountain Face
Pantheon

Davide Bolchini (University of Lugano) - "Goal-Oriented Requirements Specification for Digital Libraries"

Through this presentation Bolchini described a method for outlining the use of a digital library. He sees digital libraries as combinations of institutions, content, end-users, and access tools. Every stakeholder of a digital library has its own set of goals, and designers must create their digital libraries under cost (time, money, resources) limitations. In order to manage a digital library process, digital libraries must be traceable, adhere to specifications, and allow for validation. By combining the goals of the user with the creation of "scenarios" -- stories about use -- describing a user's information need, Bolchini believes it is possible to create a set of requirements used to fulfill the goals. The entire process Bolchini described looked a lot like implementing usability efforts throughout the development process.

Jon Heggland (Norwegian University of Science and Technology) - "OntoLog: Temporal Annotation Using Ad Hoc Ontologies and Application Profiles"

Heggland described a method for annotating temporal data, such as the data contained in a video recording: http://www.idi.ntnu.no/~heggland/ontolog/

He began the presentation by describing the inherent difficulties of time-based data. While it is possible to annotate a video, for example, granularity of the annotations becomes an issue. "What if I want to annotate this aspect of the video or that aspect?" By stratifying a video -- breaking it up into layers such as voices, music, visuals, etc. -- it is possible to begin to overcome the granularity problem. OntoLog provides a visual map of a video or temporal content. Each aspect of the map can then be described and annotated. He was using RDF, expressed in XML, to do this description. Like many of the other presenters at the conference, Heggland was creating the means to do real library work -- abstracting and indexing -- and yet he did not consider himself to be a librarian. Interesting.
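The stratification idea can be sketched as a small data structure. This is my own illustration and not OntoLog's actual data model: each annotation belongs to one layer and covers a time interval, so different aspects of the same moment can be described independently. The layer names and concepts are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Annotation:
    layer: str      # stratum, e.g. "voice", "music", "visual"
    start: float    # seconds from the beginning of the recording
    end: float      # end of the interval, exclusive
    concept: str    # term from an ad hoc ontology (hypothetical values below)

def annotations_at(annotations, time, layer=None):
    """Return annotations whose interval covers `time`, optionally in one layer."""
    return [a for a in annotations
            if a.start <= time < a.end and (layer is None or a.layer == layer)]

video = [
    Annotation("voice", 0.0, 12.5, "introduction"),
    Annotation("music", 5.0, 30.0, "theme"),
    Annotation("visual", 10.0, 20.0, "Trevi Fountain"),
]
at_eleven = annotations_at(video, 11.0)
music_only = annotations_at(video, 11.0, layer="music")
```

Because the layers overlap freely, a question like "what is happening at second eleven?" returns one answer per stratum instead of forcing a single description of the whole frame.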

Marcos André Gonçalves (Virginia Tech) - "An XML Log Standard and Tool for Digital Library Logging Analysis"

Gonçalves proposed an XML log file standard. Many things create log files. HTTP servers create log files in Common Log Format, and they illustrate how people browse a website. Search engines generate log files and illustrate the ways people search indexes. To a greater or lesser extent, these types of log files are complementary. At the same time, they sacrifice a bit of user privacy. Obviously there are issues to be addressed in the way log files are created and evaluated. Gonçalves advocated the creation of a log file standard that is comprehensive, reflective, easily readable, and precise. He posited that if such a log file structure were implemented, then not only would more meaningful log file analysis be possible, but other services, such as personalization and the comparison of systems across domains, would become feasible as well.
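To make the idea concrete, here is a minimal sketch that turns one Common Log Format line into an XML record. The element and attribute names (`event`, `type`, and so on) are my own assumptions and are not the names from Gonçalves's proposed standard, and leaving out the client host is only one possible nod to the privacy concern.

```python
import re
import xml.etree.ElementTree as ET

# Fields of a Common Log Format line: host ident user [time] "request" status size
CLF = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)

def clf_to_xml(line):
    """Convert one Common Log Format line into a hypothetical XML event."""
    match = CLF.match(line)
    if match is None:
        raise ValueError("not a Common Log Format line")
    event = ET.Element("event", type="browse")
    # The host, ident, and user fields are deliberately left out.
    for name in ("time", "request", "status", "size"):
        ET.SubElement(event, name).text = match.group(name)
    return event

line = '192.0.2.1 - - [16/Sep/2002:09:30:00 +0200] "GET /ecdl/ HTTP/1.0" 200 2326'
event = clf_to_xml(line)
xml_text = ET.tostring(event, encoding="unicode")
```

A search engine could emit the same kind of record with a different `type`, which is what would let browse and search logs be analyzed together.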

tradesmen
King's Monument
statue
facade

Donna Bergmark (Cornell University) - "Notes from the Interoperability Front: A Progress Report on the Open Archives Initiative"

In the absence of both Herbert Van de Sompel and Carl Lagoze, Bergmark shared with the audience the latest news about the Open Archives Initiative: http://www.openarchives.org/

First off, Bergmark often referred to the Initiative as the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). For the most part, she outlined the features of version 2.0 of the protocol, including: 1) errors are no longer communicated via HTTP error codes; OAI now has its own error codes, 2) "recommended" practices have been removed, 3) a single XML schema now defines the protocol, 4) the definition of a resource has changed, 5) resumption tokens are better defined, 6) dates are now expressed in UTC rather than local time, 7) times are based on a seconds level of granularity, and 8) "mini repositories" are now a feature; a simple XML file enumerating the content of a repository is now allowed, eliminating the need to handle many protocol requests.
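Two of these features, UTC datestamps at seconds granularity and protocol-level error codes, are easy to illustrate. The following is a minimal sketch and not a complete harvester: the base URL is a placeholder, the sample response is hand-written, and the function names are my own.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

def list_records_url(base_url, since, metadata_prefix="oai_dc"):
    """Build a ListRecords request with a UTC datestamp at seconds granularity."""
    stamp = since.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    query = urlencode({"verb": "ListRecords",
                       "metadataPrefix": metadata_prefix,
                       "from": stamp})
    return f"{base_url}?{query}"

def protocol_errors(response_xml):
    """Return (code, message) pairs for any <error> elements in a response."""
    root = ET.fromstring(response_xml)
    return [(e.get("code"), e.text) for e in root.findall(f"{OAI_NS}error")]

# The base URL is a placeholder, not a real repository.
url = list_records_url("http://example.org/oai",
                       datetime(2002, 9, 16, 9, 30, tzinfo=timezone.utc))

# A hand-written response showing an error reported inside the XML itself.
sample = ('<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
          '<error code="badVerb">Illegal verb</error></OAI-PMH>')
errors = protocol_errors(sample)
```

The point of the second function is that a harvester inspects the XML body for errors rather than relying on the HTTP status of the response.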

OAI sure has become a standard in a short period of time. It has become core to many digital library tools even though OAI has only been around for about two years. It behooves us, as librarians, to articulate why OAI has become such a creditable tool and implement its characteristics in other digital library activities.

Suzanne Little (University of Queensland) - "Dynamic Generation of Intelligent Multimedia Presentations through Semantic Inferencing"

Little proposed a semi-automatic method for the creation of multimedia presentations. The first step in the method is to harvest metadata about Internet-accessible information resources through OAI. Next, relationships between sets of harvested metadata are established through semantic inferencing. This results in sets of RDF files. These files are then transformed into SMIL (Synchronized Multimedia Integration Language) files and presented to the user. The key to the process is creating the relationships between the sets of harvested metadata. The example illustrated was one using Abraham Lincoln and Frederick Douglass. A set of harvested metadata contained information about Abraham Lincoln but also described Frederick Douglass. Other sets of harvested metadata contained links to images of Frederick Douglass. Through the consistent use of metadata tags, Little demonstrated the ability to display an image of Frederick Douglass based on the content about Abraham Lincoln. Her experiments proved to be initially successful, but complete success hinges on more complete and richer sets of metadata.

fountain
Coliseum
arch
Coliseum

Kurt Maly (Old Dominion University) - "Technical Report Interchange through Synchronized OAI Caches"

Working in conjunction with a number of other institutions (NASA Langley Research Center, Los Alamos National Laboratory, Air Force Research Laboratory, Sandia National Laboratory and Old Dominion University) Maly demonstrated the possibilities for sharing technical reports via OAI. While technical reports from these institutions could have been saved in a central repository, or they could have been exposed with a single set of OAI compliant programs, these options were not possible for him because of a number of limitations. First, it was assumed that each participating institution would not have to change its behavior in order to participate in the project. Second, participating institutions would not have to change their database's (vendor's) interface in order to participate. Given these assumptions Maly wrote a number of translation programs that successfully communicated with an OAI service provider. The system was functional, but not scalable since the translation programs are specific to each participating institution. Furthermore, not all the information in every repository was intended to be available to every participating institution. Consequently, while the experiment proved successful, the idea's implementation will probably not be developed. The biggest problems were not necessarily technical but socio-political in nature.

Martin Halbert (Emory University), Ed Fox (Virginia Tech), and Eric Lease Morgan (University of Notre Dame) - "OCKHAM: Coordinating Digital Library Development with Lightweight Reference Models"

Working much like a tag team, Halbert, Fox, and Morgan facilitated a panel discussion surrounding the topic of "lightweight reference models" -- OCKHAM: http://ockham.library.emory.edu/

Halbert, in the process of redesigning a website, realized he would benefit from the use of tools others had created. He didn't feel the need to reinvent the wheel. Similarly, he wanted to explore ways to reduce costs, foster interoperability, and build bridges between groups. Because of these things he would like to see OCKHAM -- a set of interchangeable digital library parts -- be developed and maintained.

Fox elaborated on the OCKHAM idea by drawing on his years of digital library experience. He noticed that many of the components (searching, indexing, archiving, collecting, authenticating, authorizing, etc.) he and his team created are redeveloped in each system. Ideally he would like to see these components become modularized. He proposed to build on the strengths of OAI to make such ideas a reality, and specifically, he advocated the implementation of Open Digital Libraries (ODL) as one such way to accomplish the goals of OCKHAM: http://oai.dlib.vt.edu/odl/

Morgan briefly illustrated how the MyLibrary application can begin to participate in an OCKHAM framework by outputting content from its underlying database via OAI streams and/or static XML/RDF files: http://dewey.library.nd.edu/morgan/musings/ockham-ecdl/

The bulk of time was then spent trying to answer the question, "How do we make OCKHAM a reality?", and answers to the question included:

General comments/questions included:

The ideas of OCKHAM seemed to stir a bit of interest, and some of them made it to George Buchanan's ECDL conference report "Report on the Sixth European Conference on Digital Libraries", D-Lib Magazine 8(10) October 2002: http://www.dlib.org/dlib/october02/buchanan/10buchanan.html

Roman Forum
Caesar
door
St. Peter's Cathedral

From the Proceedings

There were a number of very interesting articles from the published proceedings:

Trond Aalberg (Norwegian University of Science and Technology) - "Navigating in Bibliographic Catalogues"

This article outlined a method for applying the Functional Requirements for Bibliographic Records (FRBR) model to existing catalogs. This is done by creating external sets of relationships between items in the catalog. These relationships include things like: 1) Is realized through, 2) Is embodied in, 3) Has a translation, 4) Has adaptation, and 5) Is part of. These relationships are used to describe items in the catalog and consequently provide an alternative means of navigating a bibliographic system. Interesting. Defining relationships between cataloged items is a way of adding value to the catalog beyond mere physical description and subject analysis.
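One way to picture an "external set of relationships" is as a list of triples kept alongside, not inside, the catalog records. The sketch below is my own illustration and not Aalberg's implementation; the record identifiers are hypothetical, and only the relationship names come from the article.

```python
from collections import defaultdict

# (subject, relationship, object) triples kept outside the catalog records.
# The record identifiers are hypothetical.
RELATIONSHIPS = [
    ("work:hamlet", "is realized through", "expr:hamlet-english"),
    ("work:hamlet", "is realized through", "expr:hamlet-italian"),
    ("expr:hamlet-english", "has a translation", "expr:hamlet-italian"),
    ("expr:hamlet-italian", "is embodied in", "manif:amleto-paperback"),
]

def build_index(triples):
    """Group outgoing relationships by the record they start from."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[subject].append((relation, obj))
    return index

def reachable(index, start):
    """Every record reachable from `start` by following relationships."""
    seen, stack = [], [start]
    while stack:
        node = stack.pop()
        for _, target in index.get(node, []):
            if target not in seen:
                seen.append(target)
                stack.append(target)
    return seen

index = build_index(RELATIONSHIPS)
related = reachable(index, "work:hamlet")
```

Starting from a single work, a reader can walk to its expressions and on to a particular manifestation, which is the alternative navigation the article describes.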

Donna Bergmark (Cornell University) - "Focused Crawls, Tunneling, and Digital Libraries"

Going under the assumption that Web crawling and tunneling will be important means for creating future digital libraries, this article outlined a method for accomplishing this goal. It was interesting to note that Bergmark started out with seeds garnered from Google's search results for specific topics and began her crawling with those seeds. The article defined a number of interesting terms: 1) nugget to denote a useful, statistically meaningful document, 2) dud as the opposite of nugget, 3) path as the sequence of links used to get from nugget to nugget, and 4) crawl as the tree of all paths. Based on her research, the techniques described work, but it is important to consider the paths to documents in order to tunnel/crawl effectively.
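The vocabulary above lends itself to a small sketch. This is not Bergmark's algorithm, only my reading of the tunneling idea: keep following a path through off-topic pages (duds) for a limited number of steps in the hope of reaching another nugget. The link graph, the relevance judgements, and the `max_duds` parameter are all invented for illustration.

```python
from collections import deque

# A toy link graph and relevance judgements, both invented for illustration.
LINKS = {
    "seed": ["a", "b"],
    "a": ["c"],
    "b": [],
    "c": ["d"],
    "d": [],
}
NUGGETS = {"seed", "d"}   # pages judged on-topic; everything else is a dud

def crawl(seeds, links, nuggets, max_duds=2):
    """Breadth-first crawl that tunnels through at most `max_duds` duds in a row."""
    found, visited = [], set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        page, duds_in_a_row = queue.popleft()
        if page in nuggets:
            found.append(page)
            duds_in_a_row = 0
        else:
            duds_in_a_row += 1
            if duds_in_a_row > max_duds:
                continue          # path has gone cold; stop following it
        for target in links.get(page, []):
            if target not in visited:
                visited.add(target)
                queue.append((target, duds_in_a_row))
    return found

with_tunneling = crawl(["seed"], LINKS, NUGGETS, max_duds=2)
without_tunneling = crawl(["seed"], LINKS, NUGGETS, max_duds=0)
```

In this toy graph the second nugget sits behind two duds, so it is found only when the crawler is allowed to tunnel.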

George Buchanan (Middlesex University) - "Exploring Small Screen Digital Library Access with the Greenstone Digital Library"

Buchanan explored the possibilities of using handheld devices, such as PalmPilots, with digital library systems, specifically Greenstone. In summary, an outline approach to displaying information is most useful in these cases.

cupola
Alley of Columns
St. Peter
marble table

Donatella Castelli (Consiglio Nazionale delle Ricerche) - "OpenDLib: A Digital Library Service System"

This article described concepts very similar to OCKHAM. It describes a set of digital library functions such as acquisition, storage and preservation, search, browse and retrieval, selection and dissemination of documents, authorization and authentication of users. Castelli describes OpenDLib as a "federation of interoperating services which can be distributed and/or replicated on different servers." The system relies on a "Manager" function that takes input and passes it along to sub components. All communication in and out of the Manager is done using XML and is completely described in Castelli, "OLP: The Open Digital Library Protocol," Istituto di Elaborazione dell'Informazione, Technical Report, 2002. This work was not done in a vacuum. Much of it builds on things like NCSTRL, Dienst, OAI, and ODL (above).

This whole thing sounds like the invention of calculus with Newton and Leibniz, only on a much smaller scale.

Byeong Heui Kwak (Seoul National University of Education) - "A Study on the Evaluation Model for University Libraries in Digital Environments"

Because the library I work for has been going through a lot of self-evaluation lately, and since I seriously wonder whether or not item counts conducted by ARL are truly meaningful measures of a library's value, this article was interesting. The article posited the existence of hybrid libraries, libraries that provide traditional library service as well as digital library service. It then created an evaluation model that included not only traditional measures, but measures that took into account digital library services, such as: 1) metadata management, 2) access to information systems, 3) LAN speed, 4) Internet speed, 5) copyright management, 6) OPAC, 7) library home page, and 8) electronic reference services.

To what extent does ARL consider these qualities?

MacKenzie Smith (Massachusetts Institute of Technology) - "DSpace: An Institutional Repository from the MIT Libraries and Hewlett Packard Laboratories"

This article updates the reader on MIT's DSpace project. DSpace is a system for collecting and archiving digital content created by MIT. It is intended to be more like a gatekeeper and not a tool for scholarly communication. The system uses METS and qualified Dublin Core for description. Metadata is provided by submitters, not library staff. The goal of the system is "to provide MIT faculty with a robust, scalable, preservation-quality institutional repository for its born-digital research output, so that has been the initial focus of development rather than support of digitally reformatted library collections."

I wonder why descriptions of items in DSpace cannot be exported to the library's catalog.

Vatican Library
Sphere Within A Sphere
statue
Fountain Figure

Hussein Suleman (Virginia Tech) - "Designing Protocols in Support of Digital Library Componentization"

On the heels of Castelli and OCKHAM above, this article proposes extensions to OAI that allow for greater modularization of digital library systems. One such extension is a protocol request called PutRecord. Others include: 1) union, 2) filter, 3) search, 4) browse, 5) recent, 6) annotate, 7) review, and 8) submit. Suleman's hope is to change the way people build digital libraries so they can utilize simple and reusable component models based on already established standards.

Summary

I am very glad I had the opportunity to attend this conference. It was eye-opening in many regards. First of all, it certainly was exciting to be in Rome. Second, I met up with a number of people I knew (Gary Marchionini, David Seaman, and of course Martin Halbert and Ed Fox). It was also nice to make the acquaintance of Ian Witten.

The majority of attendees would not call themselves librarians, but they were doing real library work. They were concerned with creating tools allowing for the collection, organization, archiving, dissemination, and sometimes evaluation of data and information for the purposes of expanding knowledge. The only difference was that their data and information were manifested not in physical mediums but in digital ones.

Additionally, the traditional library professions could learn a lot from these people because they were applying the scientific method to evaluate and measure digital library research. Instead of relying heavily on anecdotal evidence, these participants used experiments, even social experiments, to verify and validate their assumptions. Furthermore, they built their ideas on the documented ideas of their predecessors.

Finally, OAI and its simplicity certainly have made an impression on the digital "librarians" at the conference. Everybody appreciates the mathematical elegance of the OAI protocol. It represents something we can all learn from.

Drinking From A Fountain
door
door
door
Taking Lunch
mosaic
temple
Screaming Man

Creator: Eric Lease Morgan <eric_morgan@infomotions.com>
Source: This document was never formally published.
Date created: 2002-10-19
Date updated: 2004-11-28
Subject(s): Rome, Italy; ECDL (European Conference on Digital Libraries); travel log;
URL: http://infomotions.com/musings/ecdl-2002/