A "Next generation" library catalog – Introduction and assumptions (Part #2 of 5)

This is an introduction and list of assumptions outlining an idea for a “next generation” library catalog. In two sentences, this catalog is not really a catalog at all but more like a tool designed to make it easier for students to learn, teachers to instruct, and scholars to do research. It provides its intended audience with a more effective means for finding and using data and information.

The full text of this document formatted as a single HTML page is available at:

http://dewey.library.nd.edu/morgan/ngc/

Introduction

In library parlance, an OPAC is an “online public access catalog”. The operative word in this phrase is “catalog”. Traditionally speaking, the OPAC is thought of as an index to the things owned or licensed by a library. It is an electronic version of the venerable card catalog. As such it contains “cards” pointing to books, not the books themselves. It contains “cards” pointing to article indexes but not the articles themselves. For the most part, library OPACs do not contain pointers to Internet resources, both because libraries do not own them and because they are too dynamic in nature to maintain. These things, along with the bibliographic indexes, are relegated to a library’s website, and they, in turn, become information silos. Metasearch technology has tried to bring these things together under one search interface but with little success. As catalogs, OPACs are akin to inventory lists allowing people to search for and identify things in or available from libraries. Beyond identification, OPACs do not provide many services against holdings besides the ability to borrow.

The OPAC should not be equated with an ILS, the integrated library system. The ILS is a suite of software automating many of the traditional processes of librarianship. For example, there are circulation modules enabling libraries to keep track of who has borrowed what. There are acquisition modules allowing bibliographers to keep track of how much money they have spent and allowing other people to do the actual purchasing. There are cataloging modules enabling catalogers to create MARC records and supplement them with controlled vocabularies and authority lists. Other modules provide functionality for interlibrary loan, the reserve book room, serials acquisitions, etc. The OPAC is the public face of a library, and in the big picture, it is traditionally an additional module in the ILS.

In the past decade people’s expectations regarding search and access to information have changed dramatically. The proliferation of desktop computers, the increasing amount of content “born digital”, and the advent of the Internet have all contributed to this phenomenon. People expect to enter one or two words (or maybe a phrase) into a search box, click a button, get a list of relevance-ranked results returned, select an item from the results, and view/download the information. The process is quick, easy, and immediate. Even though people know the information available from libraries is free and authoritative, they often use libraries as a last resort because they see library systems as overwhelmingly associated with books, confusing, difficult to use, time consuming, and inconvenient.

The proposed “next generation” library catalog is intended to address the issues outlined above. It builds on the good work previously done by the library profession through collection, preservation, organization, and dissemination. It takes into account the changing nature of user expectations and tries to meet them. It exploits computer technology and harnesses the knowledge built up by the information retrieval and digital library communities. The goal of the “next generation” library catalog is to create a transparent system enabling library users to get their work done more quickly and efficiently. It will not be seen as an impediment to learning, teaching, and scholarship, but rather as a useful tool for getting an education and increasing the sphere of knowledge.

Assumptions

As the saying goes, “There is more than one way to skin a cat.” Similarly, there will be no such thing as the “next generation” library catalog, only variations on a theme. Each implementation of a “next generation” library catalog will be slightly different, and the differences will probably be rooted in the individual implementation’s underlying assumptions. There are a number of assumptions underlying the design of this particular “next generation” library catalog, and they are enumerated below.

It is not a catalog

The “next generation” library catalog is not a catalog, per se.

The “next generation” library catalog is not intended to be a searchable list of the things a library owns or licenses. It is more than that. It is a list of the things deemed useful for achieving the goals of the library’s parent organization. The world of information has moved beyond books, magazines, and videos. Computers and the Internet have spawned mailing lists (full of names, addresses, and telephone numbers), data sets, images, full-text books and journal articles, archival finding aids, pre-prints & other gray literature, etc. To various degrees all of these things are important to meeting the needs of library users. To the best of its ability, the “next generation” library catalog includes the full text of these items, or at least metadata describing them, in its collection.

It avoids multiple databases

The “next generation” library catalog avoids multiple databases and indexes, thus increasing simplicity and reducing the problem of de-duplication.

Multiple databases and multiple indexes increase complexity and necessitate an increase in computer infrastructure. They create information silos and make it challenging to create holistic systems. By avoiding the creation of multiple databases and indexes it will be easier to do global searching and refinement. The information overload that comes from returning too many hits will be reduced through the use of faceted browsing, an “intelligent” user interface, and the ability to create tightly focused queries against the index.
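The idea of faceted browsing over a single index can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names (format, subject, text), and the functions search and facets are all hypothetical, and a real system would use a proper indexer rather than a list held in memory.

```python
from collections import Counter

# Hypothetical records held in ONE combined index; field names are assumptions.
RECORDS = [
    {"title": "Walden", "format": "book", "subject": "nature",
     "text": "walden pond life in the woods"},
    {"title": "Woods and Ponds", "format": "article", "subject": "nature",
     "text": "a survey of pond life in the woods"},
    {"title": "Pond Photographs", "format": "image", "subject": "photography",
     "text": "photographs of a pond"},
]

def search(records, query, **filters):
    """Return records containing every query word and matching every facet filter."""
    words = query.lower().split()
    hits = []
    for record in records:
        tokens = record["text"].lower().split()
        matches_query = all(word in tokens for word in words)
        matches_facets = all(record.get(k) == v for k, v in filters.items())
        if matches_query and matches_facets:
            hits.append(record)
    return hits

def facets(hits, fields=("format", "subject")):
    """Count how many hits fall under each value of each facet field."""
    return {field: Counter(hit[field] for hit in hits) for field in fields}

# A broad search, the facet counts a user could click on, then a refinement.
broad = search(RECORDS, "pond")
print(facets(broad))
narrow = search(RECORDS, "pond", subject="nature")
print([record["title"] for record in narrow])
```

Because every item lives in the same index, the facet counts describe the whole result set at once, and a refinement is simply the same query with one more filter.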

It is bent on providing services against search results

The “next generation” library catalog’s user interface is bent on doing things with found items beyond listing and providing access to them. Again, the “next generation” library catalog is not a catalog but a tool.

Just think of all the things you can do with a physical book. You can place a hold on it. You can request it for document delivery (borrow it). You can renew it. With a tiny bit of work we could turn the book’s metadata into a Chicago style citation for inclusion in a paper. We could allow people to review the book. We could allow people to purchase the book from a bookstore. Many of the same things can be applied to printed journal articles, but electronic journal articles are a different story. They lend themselves to a greater number of services such as download, print, save, annotate, index along with a set of other downloaded things. With adequate metadata it would be relatively easy to feed a system a book or journal article and request a list of possible email addresses for the author.

Put another way, the “next generation” library catalog provides services against items in its collection. These are value-added services provided by the library, and they, in turn, save the time of the reader and make them more productive.
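As an example of one such service, turning a book’s metadata into a citation might look something like the following Python sketch. It assumes a hypothetical record with the fields last, first, title, place, publisher, and year, and it produces only a simplified Chicago-style bibliography entry for a single-author book (plain text, so the title is not italicized). Real metadata, such as a MARC record, would need more careful handling.

```python
def chicago_book_citation(record):
    """Format a single-author book as a simplified Chicago-style bibliography entry."""
    return "{last}, {first}. {title}. {place}: {publisher}, {year}.".format(**record)

# Sample metadata for illustration; the field names are assumptions.
book = {
    "last": "Thoreau",
    "first": "Henry David",
    "title": "Walden",
    "place": "Boston",
    "publisher": "Ticknor and Fields",
    "year": 1854,
}

print(chicago_book_citation(book))
```

The same pattern, a small function applied to an item’s metadata, could back the other services listed above, such as building a link to a bookstore or a request for document delivery.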

It is built using things open

This “next generation” library catalog is built using open standards, open source software, and open content in an effort to increase interoperability and modularity and to advocate the free sharing of ideas.

Librarianship is a highly collaborative profession. We share cataloging records. We jointly purchase/license materials. We share in the creation of collections. Our professional meetings number in the tens of thousands of people. Much of this is facilitated through the use of open standards like Z39.50, MARC, and AACR2. Hardware and software are crucial to the provision of the “next generation” library catalog, and as the saying goes, “given enough eyeballs, all bugs are shallow.” Therefore open source software will be used in order to keep things transparent. Finally, open content such as articles from open access journals, full-text books from Project Gutenberg, theses & dissertations available from the Networked Digital Library of Theses and Dissertations, and preprints from open archives data repositories will be given priority in an effort to highlight and advocate the use of information unencumbered by licensing agreements. This does not mean commercially available content will not be included, but access to the content or its metadata must be provided through some sort of open standard such as OAI-PMH or XML files.
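To give a sense of what access through an open standard such as OAI-PMH involves, here is a minimal Python sketch. It builds a ListRecords request for unqualified Dublin Core (the oai_dc metadata prefix) and pulls titles and creators out of a response. The base URL, the sample response, and the function names are assumptions made for illustration. No network request is made, and resumption tokens and error responses are ignored.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for a repository."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query

def titles_and_creators(xml_text):
    """Extract (title, creator) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    results = []
    for record in root.iterfind(".//oai:record", NS):
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        creator = record.findtext(".//dc:creator", default="", namespaces=NS)
        results.append((title, creator))
    return results

# A hand-written sample response; a real one would come from the repository.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Walden</dc:title>
          <dc:creator>Thoreau, Henry David</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("http://example.org/oai"))  # hypothetical repository
print(titles_and_creators(SAMPLE))
```

Harvested records like these could then be added to the same single index as everything else, whether the content is open or commercially licensed.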