SUSHI: The NISO Standardized Usage Statistics Harvesting Initiative

The full title of this presentation was ‘Building a Web Service for the Library World, from the Ground Up’ and that’s exactly what the three presenters covered. We heard about the project’s beginnings and current status, what it means for content providers, and how it affects vendors of ERMs (Electronic Resource Management packages). The presenters gave the audience a thorough introduction to SUSHI and its implications for libraries, and I left feeling much better informed about the project and its significance for libraries, publishers, and ERM package vendors.

Adam Chandler, Co-ordinator, Service Design Group, Information Technology and Technical Services, Cornell University Library, and also co-chair of the SUSHI Working Group, started by saying that retrieval of COUNTER usage statistics is currently a bottleneck. Most libraries currently do this by visiting individual publisher or aggregator websites, locating the desired statistics, and either displaying them or downloading an Excel file. The process is inefficient and time-consuming. SUSHI (an acronym for Standardized Usage Statistics Harvesting Initiative) is a simple web service designed to offer them a simpler alternative.

SUSHI began as a result of a conversation in November 2004 about the best way of entering COUNTER statistics into an Electronic Resource Management module. The concept was discussed further in June 2005, and work began in earnest with a range of vendors in July 2005. In October 2005, there was a recommendation to make SUSHI a NISO initiative. From November 2005 through summer 2006, SUSHI 0.1 was developed and tested; the vendors involved were Ebsco, Swets, and Project Euclid. Most recently, SUSHI 0.1 became SUSHI 1.0, and NISO and COUNTER have signed a Memorandum of Understanding saying that NISO/SUSHI will maintain the COUNTER XML schemas.

The SUSHI protocol is simple, supporting only two types of messages: Request and Response. The Request message includes information about the requesting organisation, the name and release (version) of the desired report, and the date range. Messages are sent using SOAP (Simple Object Access Protocol), which also provides security features.
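To make the shape of a Request message more concrete, here is a minimal sketch in Python. It only illustrates the three pieces of information mentioned above (requesting organisation, report name and release, date range) wrapped in a SOAP envelope. The element names, the `urn:example:sushi-sketch` namespace, and the sample values are assumptions made for illustration; they are not taken from the actual SUSHI schema, which should be consulted for real work.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SKETCH_NS = "urn:example:sushi-sketch"  # hypothetical; NOT the real SUSHI namespace


def build_report_request(requestor_id, report_name, report_release, begin, end):
    """Return a SOAP envelope (as a string) wrapping an illustrative report request."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{SKETCH_NS}}}ReportRequest")

    # Who is asking (the requesting organisation).
    requestor = ET.SubElement(request, f"{{{SKETCH_NS}}}Requestor")
    ET.SubElement(requestor, f"{{{SKETCH_NS}}}ID").text = requestor_id

    # Which report, and which release (version) of it.
    report = ET.SubElement(request, f"{{{SKETCH_NS}}}Report")
    report.set("Name", report_name)
    report.set("Release", report_release)

    # The period the report should cover.
    date_range = ET.SubElement(report, f"{{{SKETCH_NS}}}DateRange")
    ET.SubElement(date_range, f"{{{SKETCH_NS}}}Begin").text = begin
    ET.SubElement(date_range, f"{{{SKETCH_NS}}}End").text = end

    return ET.tostring(envelope, encoding="unicode")


# Example values are made up for this sketch.
xml_text = build_report_request("example-library", "JR1", "1", "2006-01-01", "2006-01-31")
```

In practice a SOAP toolkit would normally generate this envelope from the service description, so hand-building the XML as above is only useful for seeing what travels over the wire.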

Vendors who are (or will be) supporting SUSHI include III, Ex Libris, Serials Solutions, and Endeavor; ScholarlyStats and Thomson Scientific are also interested.

Outstanding issues include extending SUSHI to cover the COUNTER books and reference codes of practice, promoting it with other vendors, and managing the expectations of libraries. The SUSHI Draft Standard for Trial Use (PDF format) was released on 9/20/06.

Adam’s presentation will be available from:

http://www.people.cornell.edu/pages/alc28/sushi/lita2006AdamChandler.htm

The NISO/SUSHI web site is:

http://www.niso.org/committees/SUSHI/SUSHI_comm.html

The second presenter, David Ruddy from Cornell’s Center for Innovative Publishing, covered what is involved in implementing SUSHI from a content provider’s perspective. David acknowledged Joshua Santelli’s work in Project Euclid’s SUSHI implementation. He began with an overview of Project Euclid, a Cornell University electronic publishing initiative in the field of theoretical and applied mathematics and statistics. Project Euclid has a complex business model, involving agreements with commercial publishers of hard copy journals in this field.

Publishers first need to collect, store, and provide access to COUNTER usage statistics. Implementing SUSHI involves developing the capability to recognise and respond to SUSHI requests. This includes being able to recognise that an incoming request is a SUSHI Request and interpret the data about the desired report characteristics (report type and date range). They then need to prepare the COUNTER report, which may be static (i.e. already stored on the system) or dynamic (generated upon request), build the XML report, and return the response using a SOAP envelope.
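The provider-side steps described above can be sketched roughly as follows: check whether an incoming message is a report request, read the report type and date range, fetch a stored report or generate one, and return the result in a SOAP envelope. This is an illustrative sketch only, not Project Euclid's implementation. The element names, the `urn:example:sushi-sketch` namespace, the `STORED_REPORTS` table, and the `generate_report` stand-in are all hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SKETCH_NS = "urn:example:sushi-sketch"  # hypothetical; NOT the real SUSHI namespace

# Hypothetical store of pre-built ("static") reports: key -> list of (title, count).
STORED_REPORTS = {
    ("JR1", "2006-01-01", "2006-01-31"): [("Example Journal of Statistics", 42)],
}


def q(ns, tag):
    """Build a namespace-qualified tag name in ElementTree's {ns}tag form."""
    return f"{{{ns}}}{tag}"


def generate_report(name, begin, end):
    """Stand-in for a "dynamic" report; a real provider would query its usage data."""
    return []


def handle_message(xml_text):
    """Return a SOAP response string, or None if this is not a report request."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find(q(SOAP_NS, "Body"))
    request = body.find(q(SKETCH_NS, "ReportRequest")) if body is not None else None
    if request is None:
        return None  # not a message this sketch recognises

    # Interpret the desired report characteristics: report type and date range.
    report = request.find(q(SKETCH_NS, "Report"))
    date_range = report.find(q(SKETCH_NS, "DateRange"))
    key = (
        report.get("Name"),
        date_range.findtext(q(SKETCH_NS, "Begin")),
        date_range.findtext(q(SKETCH_NS, "End")),
    )

    # Static report if one is stored, otherwise generate it on request.
    rows = STORED_REPORTS.get(key)
    if rows is None:
        rows = generate_report(*key)

    # Build the XML report and wrap it in a SOAP envelope.
    out_env = ET.Element(q(SOAP_NS, "Envelope"))
    out_body = ET.SubElement(out_env, q(SOAP_NS, "Body"))
    response = ET.SubElement(out_body, q(SKETCH_NS, "ReportResponse"))
    response.set("Name", key[0])
    for title, count in rows:
        item = ET.SubElement(response, q(SKETCH_NS, "Item"))
        item.set("Title", title)
        item.set("Requests", str(count))
    return ET.tostring(out_env, encoding="unicode")


sample_request = (
    f'<s:Envelope xmlns:s="{SOAP_NS}" xmlns:k="{SKETCH_NS}"><s:Body>'
    '<k:ReportRequest><k:Report Name="JR1" Release="1"><k:DateRange>'
    "<k:Begin>2006-01-01</k:Begin><k:End>2006-01-31</k:End>"
    "</k:DateRange></k:Report></k:ReportRequest></s:Body></s:Envelope>"
)
response_xml = handle_message(sample_request)
```

The sketch leaves out authentication and error reporting, both of which a production service would need.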

Their experience was that implementation is fairly easy. David concluded by warning us that there are some challenges to counting usage of electronic resources, such as double-clicks and the effects of web harvesters. He also questioned whether it is possible to make cost-per-use comparisons across disciplines, and what the most appropriate time frame is for meaningful analysis.
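The double-click problem is easy to illustrate. One common approach, sketched below in Python, is to ignore a request when the same user asked for the same item only moments earlier. The 10-second window, the event format, and the function name are assumptions for this example; the applicable COUNTER code of practice defines the actual rules a compliant provider must follow.

```python
def count_requests(events, window_seconds=10):
    """Count requests per item, collapsing rapid repeats by the same user.

    events: iterable of (user, item, timestamp_in_seconds) tuples.
    A request is treated as a repeat click, and not counted, when the same
    user requested the same item less than window_seconds before it.
    """
    last_seen = {}  # (user, item) -> timestamp of that user's previous request
    counts = {}     # item -> number of counted requests
    for user, item, ts in sorted(events, key=lambda e: e[2]):
        previous = last_seen.get((user, item))
        if previous is None or ts - previous >= window_seconds:
            counts[item] = counts.get(item, 0) + 1
        last_seen[(user, item)] = ts
    return counts


# Made-up events: user u1 double-clicks at 0s and 5s, then returns at 20s.
events = [
    ("u1", "article-a", 0),
    ("u1", "article-a", 5),
    ("u1", "article-a", 20),
    ("u2", "article-a", 3),
]
totals = count_requests(events)  # {"article-a": 3}
```

Filtering out web harvesters is a harder problem, since it depends on identifying automated clients rather than on timing alone.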

The third speaker, Ted Koppel from Ex Libris, gave us the ERM vendor perspective on SUSHI. He began by explaining that the purpose of an ERM is to manage the lifecycle of electronic products, including selection, trial, licensing, access, cost, usage, workflow, and renewing/cancelling. Usage statistics need to be available at a range of levels, such as all titles from a specific vendor or a specified title from all vendors. Usage statistics are available from content providers, subscription agents, licensers or resellers, third-party specialists such as ScholarlyStats, and link resolvers/metasearch engines (though he cautioned us that the last is by definition incomplete because many users bypass these tools).
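As a small illustration of what "a range of levels" means in practice, the Python sketch below totals the same usage records in two ways: all titles from each vendor, and each title across all vendors. The record layout, vendor names, titles, and counts are invented for the example and do not reflect how any particular ERM stores its data.

```python
from collections import defaultdict

# Hypothetical request counts for one month: (vendor, title, requests).
usage = [
    ("Vendor A", "Journal of Examples", 40),
    ("Vendor A", "Annals of Samples", 15),
    ("Vendor B", "Journal of Examples", 25),
]


def totals_by(records, field):
    """Sum request counts grouped by "vendor" or by "title"."""
    index = {"vendor": 0, "title": 1}[field]
    totals = defaultdict(int)
    for record in records:
        totals[record[index]] += record[2]
    return dict(totals)


by_vendor = totals_by(usage, "vendor")  # all titles from each vendor
by_title = totals_by(usage, "title")    # each title across all vendors
```

Here `by_vendor` is `{"Vendor A": 55, "Vendor B": 25}` and `by_title` is `{"Journal of Examples": 65, "Annals of Samples": 15}`.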

The primary value of SUSHI is that it automates a manual system, and requests can be scheduled for appropriate times (such as the middle of the night). The process is simple, and there will be an audit trail.

As we move to a more disaggregated environment for service delivery, we will see more services such as SUSHI, which provide functional interoperability between vendors. He speculated that future developments might include the electronic transfer of license information from the vendor/publisher to the library, or on a more day-to-day level, information about resource availability/downtime.

Q. There is currently a spectrum of compliance, and COUNTER data quality varies. Is there a solution?

A. An auditing process will examine COUNTER compliance, and this should lead to improved quality. Third parties like ScholarlyStats can provide some quality checking.

Q. Can SUSHI be used outside an ERM?

A. Yes, but the library will need to be able to generate its own SUSHI requests and handle responses. An open source project might be the answer for people who can’t afford a commercial ERM.