Archive for December, 2008

Visit to Ball State University

Wednesday, December 17th, 2008

I took time yesterday to visit a few colleagues at Ball State University.

group photo

Ball State, the movie!

Over the past few months the names of some fellow librarians at Ball State University repeatedly crossed my path. The first was Jonathan Brinley, who is/was a co-editor on Code4Lib Journal. The second was Kelley McGrath, who was mentioned to me as a top-notch cataloger. The third was Todd Vandenbark, who was investigating the use of MyLibrary. Finally, a former Notre Damer-er, Marcy Simons, recently started working at Ball State. Because Ball State is relatively close, I decided to take the opportunity to visit these good folks during this rather slow part of the academic year.

Compare & contrast

After I arrived we made our way to lunch. We compared and contrasted our libraries. For example, they had many public workstations, about 200 or so. The library was hustling and bustling. About 18,000 students go to Ball State, and seemingly many of them go home on the weekends. Ball State was built with money from the canning jar industry, but upon a visit to the archives no canning jars could be seen. I didn’t really expect any.

Shop talk

Over lunch we talked a lot about FRBR and the possibilities of creating work-level records from the myriad of existing item-level (MARC) records. Since the work-related content is oftentimes encoded as free text in some sort of 500 field, I wonder how feasible the process would be. Coincidentally, an article by Kelley, “Identifying FRBR Work-Level Data in MARC Bibliographic Records for Manifestations of Moving Images”, had been published the day before in Code4Lib Journal. Boy, it certainly is a small world.
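To make the feasibility question a bit more concrete, here is a minimal sketch in Python. It is not the method from Kelley's article. It simply assumes that some notes are phrased like "Originally released as a motion picture in 1942" and tries to pull the year out with a regular expression. The pattern, the function name, and the sample notes are all made up for illustration.

```python
import re

# Hypothetical pattern: a note that says something was "originally
# released/produced/broadcast" and gives a four-digit year before the
# sentence ends. Real notes are phrased in many other ways.
ORIGINAL_DATE = re.compile(
    r"\boriginally\s+(?:released|produced|broadcast)\b[^.]*?\b(1[89]\d{2}|20\d{2})\b",
    re.IGNORECASE,
)


def original_year(notes):
    """Return the first 'original' year found in a list of note strings, else None."""
    for note in notes:
        match = ORIGINAL_DATE.search(note)
        if match:
            return int(match.group(1))
    return None


# Invented note texts standing in for the free text of a 500 field.
matching_notes = ["Originally released as a motion picture in 1942."]
missed_notes = ["Videodisc release of the 1942 motion picture."]

found = original_year(matching_notes)   # the phrasing fits the pattern
missed = original_year(missed_notes)    # same fact, different wording, not found
```

The second sample carries the same information as the first, but the pattern misses it. That is the crux of the problem: free text can say the same thing in any number of ways.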

I always enjoy “busman’s holidays” and visiting other libraries. I find we oftentimes have more things in common than differences.

A Day with OLE

Saturday, December 13th, 2008

This posting documents my experience at an Open Library Environment (OLE) project workshop that took place at the University of Chicago on December 11, 2008. In a sentence, the workshop provided an opportunity to describe and flowchart a number of back-end library processes in an effort to help design an integrated library system.

What is OLE

gargoyle

full-scale gargoyle

As you may or may not know, the Open Library Environment is a Mellon-funded initiative in cooperation with a growing number of academic libraries to explore the possibilities of building an integrated library system. Since this initiative is more about library back-end and business processes (acquisitions, cataloging, circulation, reserves, ILL, etc.), it is complementary to the eXtensible Catalog (XC) project, which is more about creating a “discovery” layer against and on top of existing integrated library systems’ public access interfaces.

Why OLE?

Why do this sort of work? There are a few reasons. First, vendor consolidation makes the choices of commercial solutions few. Not a good idea; we don’t like monopolies. Second, existing applications do not play well with other (campus) applications. Better integration is needed. Third, existing library systems are designed for print materials, and they have changed too slowly to keep up with the ever-growing amount of electronic materials.

OLE is an effort to help drive and increase change in Library Land, and this becomes even more apparent when you consider all of the library initiatives Mellon is supporting: Portico (preservation), JSTOR and ArtSTOR (collections), XC (discovery), and OLE (business processes/technical services).

The day’s events

The workshop took place at the Regenstein Library (University of Chicago). There were approximately thirty or forty attendees from universities such as Grinnell, Indiana, Notre Dame, Minnesota, Illinois, Iowa, and of course, Chicago.

After being given a short introduction/review of what OLE is and why, we were broken into four groups (cataloging/authorities, circulation/reserves/ILL, acquisitions, and serials/ERM), and we were first asked to enumerate the processes of our respective library activities. We were then asked to classify these activities into four categories: core processes, shifting/changing processes, processes that could be stopped, and processes that we wanted but don’t have. All of us, being librarians, were not terribly surprised by the enumerations and classifications. The important thing was to articulate them, record them, and compare them with similar outputs from other workshops.

After lunch (where I saw the gargoyle and made a few purchases at the Seminary Co-op Bookstore) we returned to our groups to draw flowcharts of any of our respective processes. The selected processes included checking in a journal issue, checking in an electronic resource, keeping up and maintaining a file of borrowers, acquiring a firm order book, cataloging a rare book, and cataloging a digital version of a rare book. This whole flowcharting process was amusing since the workflows of each participant’s library needed to be amalgamated into a single process. “We do it this way, and you do it that way.” Obviously there is more than one way to skin a cat. In the end the flowcharts were discussed, photographed, and packaged up to ship back to the OLE home planet.

What do you really want?

The final, wrap-up event of the day was a sharing and articulation of what we really wanted in an integrated library system. “If there were one thing you could change, then what would it be?” Based on my notes, the most popular requests were:

  1. make the system interoperable with sets of APIs (4 votes)
  2. allow the system to accommodate multiple metadata formats (3 votes)
  3. include a robust reporting mechanism; give me the ARL Generate Statistics Button (2 votes)
  4. implement a staff interface allowing work to be done without editing records (2 votes)
  5. implement consortial borrowing across targets (2 votes)
  6. separate the discovery processes from the business processes (2 votes)

Other wish list items I thought were particularly interesting included: integrating the collections process into the system, making sure the application was operating system independent, and implementing Semantic Web features.

Summary

I’m glad I had the opportunity to attend. It gave me a chance to get a better understanding of what OLE is all about, and I saw it as a professional development session where I learned more about where things are going. The day’s events were well-structured, well-organized, and manageable given the time constraints. I only regret there was too little “blue skying” by attendees. Much of the time was spent outlining how our work is done now. I hope any future implementation explores new ways of doing things in order to take better advantage of the changing environment as opposed to simply automating existing processes.

ASIS&T Bulletin on open source software

Friday, December 12th, 2008

The following is a verbatim duplication of an introduction I wrote for a special issue of the ASIS&T Bulletin on open source software in libraries. I appreciate the opportunity to bring the issue together because I sincerely believe open source software provides a way for libraries to have more control over their computing environment. This is especially important for a profession that is about learning, teaching, scholarship, data, information, and knowledge. Special thanks go to Irene L. Travis, who brought the opportunity to my attention. Thank you.

Open Source Software in Libraries

It is a privilege and an honor to be the guest editor for this special issue of the Bulletin of the American Society for Information Science and Technology on open source software. In it you will find a number of articles describing open source software and how it has been used in libraries. Open source software or free and open source software is defined and viewed in a variety of ways, and the definition will be refined and enriched by our authors. However, very briefly, for those readers unfamiliar with it, open source software is software that is distributed under one of a number of licensing arrangements that (1) require that the software’s source code be made available and accessible as part of the package and (2) permit the acquirer of the software to modify the code freely to fit their own needs provided that, (3) if they distribute the software modifications they create, they do so under an open source license. If these basic elements are met, there is no requirement that the resulting software be distributed at no cost or non-commercially, although much widely used open source software such as the web browser Firefox is also distributed without charge. 

In This Issue

The articles begin with Scot Colford’s “Explaining Free and Open Source Software,” in which he describes how the process of using open source software is a lot like baking a cake. He goes on to outline how open source software is all around us in our daily computing lives.

Karen Schneider’s “Thick of the Fray” lists some of the more popular open source software projects in libraries and describes how these sorts of projects would not have been nearly as feasible in an era without the Internet.

Marshall Breeding’s “The Viability of Open Source ILS” provides a balanced comparison between open source software integrated library systems and closed source software integrated library systems. It is a survey of the current landscape.

Bob Molyneux’s “Evergreen in Context” is a case study of one particular integrated library system, and it is a good example of the open source adage “scratching an itch.”

In “The Development and Usage of the Greenstone Digital Library Software,” Ian Witten provides an additional case study but this time of a digital library application. It is a good example of how many different types of applications are necessary to provide library service in a networked environment.

Finally, Thomas Krichel expands the idea of open source software to include open data and open libraries. In “From Open Source to Open Libraries,” you will learn that many of the principles of librarianship are embodied in the principles of open source software. In a number of ways, librarianship and open source software go hand-in-hand.

What Is Open Source Software About?

Open source software is about quite a number of things. It is about taking more complete control over one’s computer infrastructure. In a profession that is a lot about information, this sort of control is increasingly necessary. Put another way, open source software is about “free.” Not free as in gratis, but free as in liberty. Open source software is about community – the type of community that is only possible in a globally networked computer environment. There is no way any single vendor of software will be able to gather together and support all the programmers that a well-managed open source software project can support. Open source software is about opportunity and flexibility. In our ever-dynamic environment, these characteristics are increasingly important.

Open source software is not a panacea for libraries, and while it does not require an army of programmers to support it, it does require additional skills. Just as all libraries – to some degree or another – require collection managers, catalogers and reference librarians, future-thinking libraries require people who are knowledgeable about computers. This background includes knowledge of relational databases, indexers, data formats such as XML and scripting languages to glue them together and put them on the web. These tools are not library-specific, and all are available as open source.

Through reading the articles in this issue and discussing them with your colleagues, you should become more informed regarding the topic of open source software. Thank you for your attention and enjoy.

Fun with the Internet Archive

Wednesday, December 10th, 2008

I’ve been having some fun with Internet Archive content.

The process

cover art

More specifically, I have created a tiny system for copying scanned materials locally, enhancing them with word clouds, indexing them, and providing access to the whole thing. Here is how it works:

  1. Identify materials of interest from the Archive and copy their URLs to a text file.
  2. Feed the text file to a wget script (wget.sh) which copies the plain text, PDF, XML metadata, and GIF cover art locally.
  3. Create a rudimentary word cloud (cloud.pl) against each full text version of a document in an effort to supplement the MARC metadata.
  4. Index each item using the MARC metadata and full text (index.pl). Each index entry also includes the links to the word cloud, GIF image, PDF file, and MARC data.
  5. Provide a simple one-box, one-button interface to the index (search.pl & search.cgi). Search results appear much like the Internet Archive’s but also include the word cloud.
  6. Go to Step #1; rinse, shampoo, and repeat.
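Steps #3 through #5 can be sketched in a few lines of Python. To be clear, this is not the code in cloud.pl, index.pl, or search.pl. It is a minimal illustration of the ideas, and it assumes the full text has already been copied locally. The function names, the tiny stop-word list, and the two sample documents are all made up, and the MARC metadata is left out entirely.

```python
import re
from collections import Counter, defaultdict

# A tiny, illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "and", "of", "a", "to", "in", "was", "he", "his", "that", "it"}


def tokenize(text):
    """Lowercase the text and return its runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def word_cloud(text, size=10):
    """Return the `size` most frequent non-stop-words, most frequent first."""
    counts = Counter(word for word in tokenize(text) if word not in STOP_WORDS)
    return [word for word, _ in counts.most_common(size)]


def build_index(documents):
    """Map each word to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of the documents containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)


# Invented stand-ins for the locally copied full text.
documents = {
    "tarzan": "Tarzan of the apes lived in the jungle",
    "mohicans": "The last of the Mohicans lived in the forest",
}
index = build_index(documents)
cloud = word_cloud("Tarzan saw the jungle. Tarzan saw Jane. Tarzan ran.", size=2)
jungle_hits = search(index, "jungle")
lived_hits = search(index, "lived")
```

A one-box, one-button interface then amounts to passing whatever was typed into the box to search() and displaying each hit alongside its word cloud.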

The demonstration

Attached are all the scripts I’ve written for the as-of-yet-unnamed process, and you can try the demonstration at http://dewey.library.nd.edu/hacks/ia/search.cgi, but remember, there are only about two dozen items presently in the index.

The possibilities

There are many ways the system can be improved, and they can be divided into two types: 1) services against the index, and 2) services against the items. Services against the index include things like paging search results, making the interface “smarter”, adding things like faceted browse, implementing an advanced search, etc.

Services against the items interest me more. Given the full text it might be possible to do things like: compare & contrast documents, cite documents, convert documents into many formats, trace ideas forward & backward, do morphology against words, add or subtract from “my” collection, search “my” collection, share, annotate, rank & review, summarize, create relationships between documents, etc. These sorts of features I believe to be a future direction for the library profession. It is about more than just getting documents; it is also about doing things with them once they are acquired. The creation of the word clouds is a step in that direction. It assists in the compare & contrast of documents.
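As a rough illustration of how word clouds might assist in compare & contrast, the following Python sketch takes two clouds as plain lists of words and reports what they share and what is unique to each. The function name and the overlap score (shared words divided by all distinct words) are my own choices for the example. The first list borrows a few words from a cloud of Tarzan of the Apes, and the second list is entirely invented.

```python
def compare_clouds(cloud_a, cloud_b):
    """Compare two word clouds given as lists of words."""
    a, b = set(cloud_a), set(cloud_b)
    shared = a & b
    union = a | b
    return {
        "shared": sorted(shared),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        # Overlap score: 1.0 means identical clouds, 0.0 means nothing in common.
        "overlap": len(shared) / len(union) if union else 0.0,
    }


# A few words from a Tarzan cloud, and a made-up cloud for a second book.
tarzan = ["tarzan", "jungle", "forest", "savage", "tribe"]
other = ["hawkeye", "forest", "savage", "tribe", "scout"]

result = compare_clouds(tarzan, other)
```

Here the two clouds share "forest", "savage", and "tribe", which hints at a common setting, while the words unique to each point at what sets the books apart.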

The Internet Archive makes many of these things possible because it freely distributes its content, including the full text.

InternetArchive++

Snow blowing and librarianship

Sunday, December 7th, 2008

I don’t exactly know why, but I enjoy snow blowing.

snow blower

I think it began when I was in college. My freshman year I stayed on during January, earning money from Building & Grounds. For much of the time they simply said, “Go shovel some snow.” It was quiet, peaceful, and solitary. It was physical labor. It was a good time to think, and the setting was inspirational.

A couple of years later, in order to fulfill a graduation requirement, I needed to design and complete a “social practicum”. I decided to shovel snow for my neighbors. Upon asking them for permission, I got a lot of strange looks. “Why would you want to shovel my snow?”, they’d ask. I’d say, “Because I am more able to do it than you. I’m just being helpful and providing a social service.” Surprisingly, many people did not take me up on my offer, but a few did.

I now live and work in northern Indiana, only forty-five minutes from Lake Michigan, where “lake effect” snow is common. I own a big, bad snowblower. It gives me a sense of power, and even though it disturbs the quiet, I enjoy the process of cleaning my driveway and sidewalk. I enjoy trying to figure out the most efficient way to get the job done. I enjoy it so much I even snow blow around the block.

Snow blowing and librarianship

What does this have to do with librarianship? In reality, not a whole lot. On the other hand, one of the aspects of librarianship, especially librarianship in public libraries, is community service — providing means for improving society. My clearing of snow for my neighbors is done in a similar vein, and it works for me. I can do something for my fellow man and have fun at the same time. Weird?

P.S. Mowing the grass gives me the same sort of feelings.

Tarzan of the Apes

Monday, December 1st, 2008

This is a simple word cloud of Edgar Rice Burroughs’ Tarzan of the Apes:

tarzan  little  clayton  great  jungle  before  d’arnot  jane  back  about  cabin  mr  toward  porter  professor  saw  again  time  philander  eyes  strange  know  first  here  though  never  old  turned  many  after  black  forest  left  hand  own  thought  day  knew  beneath  body  head  see  young  life  long  found  most  girl  lay  village  face  tribe  wild  away  tree  until  ape  down  must  seen  far  within  door  white  few  much  esmeralda  savage  above  once  dead  mighty  ground  stood  side  last  trees  apes  cried  thing  among  moment  took  hands  new  off  without  almost  beast  huge  alone  close  just  tut  canler  nor  way  knife  small  

I found this story to have quite a number of similarities with James Fenimore Cooper’s The Last of the Mohicans. The central character in both was superhuman. Both include some sort of wilderness. In The Last of the Mohicans it was the forest. In Tarzan it was the jungle. In both cases the wilderness was inhabited by savages: Indians, apes, or pirates. Both included damsels in distress who were treated in a rather Victorian manner and were sought after by an unwanted lover. Both included characters with little common sense: David and Professor Porter.

I found Tarzan much more readable and story-like compared to The Last of the Mohicans. It can really be divided into two parts. The first half is character development. Who is Tarzan, and how did he become who he is? The second half is a love story, more or less, where Tarzan pursues his love. I found it rather distasteful that Tarzan was a man of “breeding”. I don’t think people are to be bred like animals.