2005

Giving them "Google-like" Searching

Implementing a Federated Search Tool

Speakers:

  • Peter Webster, St. Mary’s University
  • Marvin Pollard, California State University
  • Robert Sathrum, Humboldt State University
  • Joseph Fisher, Boston Public Library

Peter Webster led off the panel with an overview of the basics of federated searching.

First he defined the concept:

  • Too @#& many interfaces
  • “One stop shopping”
  • “Google like searching”
  • Silo busting
  • Cross-file searching

He reminded us that cross-file searching isn’t really a new idea. Remember Dialog? And think of Ovid and FirstSearch. If you buy all your databases from one aggregator, you have it now.

Next he listed the current array of tools:

  • WebFeat “the original federated search” and patent holder
  • Muse Global, “the World’s leading federated search tool,” which claims the company began in 1997, predating WebFeat
  • CSA Multisearch
  • Serials Solutions Central Search
  • Ovid Search Solver (Muse Global)

Most ILS’s now offer a federated search tool:

  • Ex Libris Metalib
  • Sirsi Single search (Muse Global)
  • Endeavor (Muse Global)
  • Innovative (Muse Global)

Comments:

Should federated searching be patentable?

Practically all libraries are automated, so the ILS vendors have to keep adding new features and services to generate new revenue streams. Federated search is one of the current hot examples.

What’s “under the hood” of a federated search tool?

  • Custom “targets” / “database translators” / “Source packages”
  • You have to translate for each individual database or vendor
  • HTML, Z39.50 (known to be slow with more than 10 different targets), XML, SQL, OAI-PMH, API programming
  • Screen scraping
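
To make the "translator" idea concrete, here is a minimal sketch in Python. It assumes two made-up targets, one that takes a plain HTTP GET and one that expects a fielded prefix syntax. The vendor names, URL, field names and prefixes are all hypothetical and do not come from any product discussed in the session.

```python
from dataclasses import dataclass
from typing import Callable, Dict
from urllib.parse import urlencode


@dataclass
class Query:
    """A vendor-neutral query: keywords plus an optional field restriction."""
    terms: str
    field: str = "any"  # "any", "title" or "author" (hypothetical field names)


def to_http_get(query: Query) -> str:
    # Hypothetical vendor that accepts a simple HTTP GET with q= and field=.
    return "https://vendor-a.example/search?" + urlencode(
        {"q": query.terms, "field": query.field}
    )


def to_prefix_query(query: Query) -> str:
    # Hypothetical target that expects a fielded prefix syntax such as ti="...".
    prefixes = {"any": "kw", "title": "ti", "author": "au"}
    return f'{prefixes[query.field]}="{query.terms}"'


# One translator per target: the federated layer only ever sees Query.
TRANSLATORS: Dict[str, Callable[[Query], str]] = {
    "vendor_a": to_http_get,
    "vendor_b": to_prefix_query,
}


def translate_all(query: Query) -> Dict[str, str]:
    """Return the native request each target would receive for one query."""
    return {name: translate(query) for name, translate in TRANSLATORS.items()}
```

Calling `translate_all(Query("federated search", "title"))` yields one native request per target. Every new database or vendor means writing and maintaining one more translator, which is where much of the ongoing cost of these tools lies.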

Search issues:

  • Target selection (should you include bib records for books along with virtual content in the same search?)
  • Results sort order (first back? Not good)
  • Deduplicating
  • Search feature variability
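
The sort order and deduplicating issues can be illustrated with a short Python sketch. It assumes each target returns a list of records with `title` and `year` keys, and it matches duplicates on a normalized title plus year. That matching rule is an assumption made for illustration, not the method any of the vendors above uses.

```python
import re
from typing import Dict, Iterable, List


def dedup_key(record: Dict[str, str]) -> str:
    """Build a crude match key from title and year (an assumed heuristic)."""
    title = re.sub(r"[^a-z0-9]+", " ", record["title"].lower()).strip()
    return f"{title}|{record.get('year', '')}"


def merge_results(result_sets: Iterable[List[Dict[str, str]]]) -> List[Dict[str, str]]:
    """Merge per-target result lists, drop duplicates, sort newest first."""
    seen: Dict[str, Dict[str, str]] = {}
    for results in result_sets:
        for record in results:
            # Keep the first copy of each record; later duplicates are ignored.
            seen.setdefault(dedup_key(record), record)
    # Sort by year (descending), then title, instead of by arrival order.
    return sorted(
        seen.values(),
        key=lambda r: (-int(r.get("year", 0) or 0), r["title"].lower()),
    )
```

Sorting after all targets have answered avoids the "first back" ordering, at the price of making the user wait for the slowest target.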

In a federated search environment, searches are generally only as good as the lowest common denominator, which means you’re often reduced to keyword searching, losing functionality, especially from specialized databases. Search vendors are working on this all the time and making improvements, but current offerings still have a long way to go.
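
The lowest common denominator effect amounts to a set intersection. The sketch below uses invented feature lists for three hypothetical targets; real targets would have to be surveyed one by one.

```python
from typing import Dict, Set

# Hypothetical capability lists, invented for illustration only.
TARGET_FEATURES: Dict[str, Set[str]] = {
    "general_aggregator": {"keyword", "title", "author", "date_limit"},
    "medical_index": {"keyword", "title", "author", "subject_heading", "date_limit"},
    "screen_scraped_site": {"keyword"},
}


def common_features(targets: Dict[str, Set[str]]) -> Set[str]:
    """Return only the search features that every selected target supports."""
    feature_sets = list(targets.values())
    if not feature_sets:
        return set()
    return set.intersection(*feature_sets)
```

With all three targets selected, only `keyword` survives. Dropping the screen-scraped site brings back title, author and date limits, which is one argument for careful target selection.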

What does the future hold?

  • Even better E-content integration
  • We won’t be buying separate tools to bolt on top of our existing content for long
  • Most database providers will be providing XML search gateways and API’s, so as to provide cross functionality between databases.

Examples of current initiatives to watch:

  • Crossref Search Pilot
  • NISO Metasearch Initiative — TG3 (We’re still a long way from standards but they will come)
  • Ontario Scholar’s Portal, CSA’s interface to search all kinds of content
  • Google Scholar (the 5000 pound gorilla)

Future trends:

  • Rapid change
  • More standardized e-content searches and interfaces
  • Simple and diverse cross-search and fed search options
  • Near universally web-searchable e-content indexing

Final comment:
Soon you may not need a federated search tool. In a few years you may be able to link at least some databases without any extra expense or effort via built-in XML, API’s and the like.

Second Speaker:

Marvin Pollard began by briefly detailing the history of federated search efforts in the California State University system. They started trying to build a system back in 1997, when there weren’t any off-the-shelf applications. Currently they are using MetaLib.

Major challenges have included

  • User interface design
  • Authentication across 23 institutions
  • Configuring searchable resources for 23 institutions
  • Promoting federated searching to librarians

They created a User Services Task Force. Members included:

  • CSU public service librarians with years of experience designing library web sites
  • People with information competence expertise
  • The team was supported by programmers with expertise in web development

What do users want? Full text NOW

Federated search is only half the solution. OpenURL link resolution is the other half. Document delivery is made available when full text is not.
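
For readers who have not looked inside an OpenURL, here is a minimal Python sketch that builds one from citation metadata. It uses key/value names from the Z39.88-2004 (OpenURL 1.0) journal format. The resolver address and the sample citation are placeholders, and nothing here is taken from the Cal State configuration.

```python
from urllib.parse import urlencode

# Hypothetical link resolver address; each library would substitute its own.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def build_openurl(citation: dict) -> str:
    """Encode article metadata as an OpenURL query for the link resolver."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.genre": "article",
    }
    # Copy across only the citation elements the search result supplied.
    for key in ("atitle", "jtitle", "issn", "volume", "issue", "spage", "date"):
        if citation.get(key):
            params[f"rft.{key}"] = citation[key]
    return f"{RESOLVER_BASE}?{urlencode(params)}"


# Placeholder citation, not a real article.
link = build_openurl(
    {"atitle": "An Example Article", "issn": "1234-5678", "volume": "12", "spage": "34"}
)
```

The resolver, not the federated search tool, then decides whether the library has full text for that citation or should offer document delivery instead.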

The Cal State project had two initial goals:

  1. Searching multiple databases simultaneously
  2. Searching different databases individually, but using the same interface for each

After reviewing all of the currently available applications, Cal State chose MetaLib.

According to Jakob Nielsen’s May 9 Alertbox, user mental models for search are getting firmer. Designs that invoke this mental model but work differently are confusing.

As Pollard succinctly put it, a user interface designed by committee will not be loved by anyone. In the Cal State federated search project, responsibilities are divided. The Chancellor’s office handles such responsibilities as:

  • Licensing for applications
  • Installing upgrades and updates
  • System set up and troubleshooting
  • Analyzing and resolving OpenURL issues
  • Providing first line support and training
  • Liaison with vendor for application support

The individual CSU libraries are responsible for aspects such as:

  • Customizing style sheets and banners (they can do it themselves or have the Chancellor’s office do it to their specification)
  • Configuring database categories & types
  • Assigning databases/resources to categories
  • Integrating MetaLib into library services
  • Including MetaLib in bibliographic instruction

The Cal State system has one MetaLib server, running 23 instances. Each library can administer their own instance, localizing the knowledgebase to their own resources.

The system provides a Google-like search capability and can dedupe the results and sort them by year or by date. A search form that includes database descriptions has also been implemented. The system includes a “find journal” component using SFX, which is routed through the link resolver. Users can create their own “My Databases” page replete with alerts, saved searches, journal lists, database lists, and the like.

System development is now moving toward the use of Web API’s. Work remaining to be done includes:

  • Continue refining the user interface
  • Extend federated search capabilities
  • Promote the idea of federated searching
  • Integrate with course management systems on campus

Third Speaker:

Robert Sathrum of Humboldt State University provided a closer look at how an individual library within the Cal State system has implemented the Cal State product. Some of the issues that had to be addressed included basic organization of the resources to be included:

  • Which databases should be included?
  • How many categories should there be?
  • Who would make these decisions?

It was hoped that subject librarians would assist in this process. About half actually did so. After conducting user surveys, they ended up with 10 broad subject categories, with a maximum of 8 resources under each.

Each database configuration has to be tweaked for optimum search functionality and display.

Setup issues include:

  • How many resources to search at once
  • Effects on performance, and on vendor servers
  • Effect on licenses and costs (especially if you are paying per search, or have licensed a limited number of simultaneous users)
  • User authentication/authorization
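
One way to keep a federated search from swamping vendor servers or exhausting a simultaneous-user license is to cap how many targets are queried at once and to give up on slow ones. The Python sketch below shows the idea. The function name, the cap of 5 and the 10-second timeout are illustrative assumptions, not settings from MetaLib or from Humboldt State.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

SearchFn = Callable[[str], Awaitable[List[str]]]


async def search_all(
    query: str,
    targets: Dict[str, SearchFn],
    max_concurrent: int = 5,   # assumed cap on simultaneous target searches
    timeout: float = 10.0,     # assumed per-target time limit in seconds
) -> Dict[str, List[str]]:
    """Query every target, but never more than max_concurrent at a time."""
    limiter = asyncio.Semaphore(max_concurrent)

    async def run(name: str, search: SearchFn):
        async with limiter:
            try:
                return name, await asyncio.wait_for(search(query), timeout)
            except asyncio.TimeoutError:
                # A slow target yields no results rather than stalling the rest.
                return name, []

    pairs = await asyncio.gather(*(run(n, s) for n, s in targets.items()))
    return dict(pairs)
```

A lower cap is kinder to licenses and vendor servers but lengthens the user's wait, so the right number is a local decision.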

How do you integrate federated search into your library’s web site? Some of the options include:

  • Wait for a “perfect” system (you may be waiting a LONG time)
  • Fully replace existing tools
  • Incorporate into the site as an additional 51st tool

Fourth Speaker:

Finally, Joseph Fisher of the Boston Public Library presented a case study of using federated searching in a large public library setting. Unfortunately, my notes from this section of the program did not transfer from my handheld to my laptop when I got back to my hotel, and my memory is not accurate enough to reconstruct them at this point.

A brief question and answer session followed the formal presentations. All of the presentation slides will be placed on the LITA web site after the conference.