Pervasive XML for the Digital Library: Tools, Tricks, and Techniques

Beth Goldsmith, Los Alamos National Laboratory

A nice session from yesterday that had the feel of an XML workshop. Beth offered a quick introduction to XML and XSLT (Extensible Stylesheet Language Transformations) and then got into the nitty-gritty of how Los Alamos is applying the technology.

Some of the highlights:

Beth mentioned one of the themes of Roy Tennant’s keynote: agility. When speaking of agility, Roy was talking about our need to develop applications and systems faster, but to do it in a way where different pieces of our applications can be re-used in different contexts – a modular approach moving away from the proprietary, vendor-specific model. (The poor OPAC was the whipping boy example again.) Anyway, Beth’s point was that XML enables the modular approach and allows us to take data and re-purpose it.

Currently, the Los Alamos workflow looks something like:

  1. Work with vendors to buy data
  2. Transform the data with XSLT into MARCXML
  3. Create search and retrieval applications to display data to users
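
To make step 2 a little more concrete, here is a minimal sketch of a crosswalk from a vendor record to MARCXML. It is not Los Alamos’s code, and it uses Python’s standard-library ElementTree rather than XSLT, simply because that runs without extra software. The vendor element names, the `CROSSWALK` table, and the `vendor_to_marcxml` function are all made up for illustration; only the MARCXML namespace and the MARC tags 100 and 245 are real.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Library of Congress MARCXML schema.
MARC_NS = "http://www.loc.gov/MARC21/slim"

# Hypothetical vendor record; these element names are invented for this sketch.
VENDOR_XML = """<record>
  <title>Pervasive XML for the Digital Library</title>
  <author>Goldsmith, Beth</author>
</record>"""

# Hypothetical crosswalk: vendor element -> (MARC tag, ind1, ind2, subfield code).
CROSSWALK = {
    "author": ("100", "1", " ", "a"),
    "title": ("245", "1", "0", "a"),
}


def vendor_to_marcxml(vendor_xml: str) -> ET.Element:
    """Map known vendor elements to MARCXML datafields, ordered by MARC tag."""
    source = ET.fromstring(vendor_xml)
    mapped = []
    for child in source:
        mapping = CROSSWALK.get(child.tag)
        if mapping is None:
            continue  # skip vendor elements the crosswalk does not cover
        mapped.append((mapping, (child.text or "").strip()))
    mapped.sort(key=lambda item: item[0][0])  # sort by MARC tag

    record = ET.Element(f"{{{MARC_NS}}}record")
    for (tag, ind1, ind2, code), value in mapped:
        field = ET.SubElement(
            record,
            f"{{{MARC_NS}}}datafield",
            {"tag": tag, "ind1": ind1, "ind2": ind2},
        )
        subfield = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": code})
        subfield.text = value
    return record


ET.register_namespace("marc", MARC_NS)
marc_record = vendor_to_marcxml(VENDOR_XML)
print(ET.tostring(marc_record, encoding="unicode"))
```

An XSLT stylesheet would express the same mapping declaratively as templates, which is presumably part of why it is approachable for metadata librarians: the crosswalk table is the interesting part, not the plumbing around it.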

Most of us wouldn’t have the personnel to do this type of work in house, but it’s an intriguing idea. In many ways, we outsource work and are bound to programmers outside of our industry to make our applications. Beth’s point was that we can start to take more control and empower from within. She pointed out that by spending a day or two teaching Los Alamos metadata librarians how to use XML and XSLT, she was able to bring those parties with intimate knowledge about metadata into the programming process – cutting out intermediaries.

The rest of the session got into the “how to” side of things. I don’t want to make your eyes glaze over…

Strengths of XML – can be checked for well-formedness, can be validated against a schema, can be transformed into many different objects

Drawbacks of XML – bloated file size (a simple, delimited text file transformed to MARCXML blew up to nearly 3 times its original size).
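
The size trade-off is easy to see for yourself. The sketch below is an assumption-laden toy, not the file Beth measured: the pipe-delimited layout (MARC tag, subfield code, value) and the `delimited_to_marcxml` function are hypothetical, and the ratio you get will depend heavily on how long the field values are relative to the markup around them. It also re-parses the output, which is a quick well-formedness check.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"

# Hypothetical pipe-delimited record: MARC tag | subfield code | value.
DELIMITED = (
    "100|a|Goldsmith, Beth\n"
    "245|a|Pervasive XML for the Digital Library\n"
    "260|c|2005\n"
)


def delimited_to_marcxml(delimited: str) -> bytes:
    """Wrap each delimited line in a MARCXML datafield/subfield pair."""
    record = ET.Element(f"{{{MARC_NS}}}record")
    for line in delimited.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        tag, code, value = line.split("|", 2)
        field = ET.SubElement(
            record,
            f"{{{MARC_NS}}}datafield",
            {"tag": tag, "ind1": " ", "ind2": " "},
        )
        subfield = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": code})
        subfield.text = value
    return ET.tostring(record, encoding="utf-8")


ET.register_namespace("marc", MARC_NS)
marcxml_bytes = delimited_to_marcxml(DELIMITED)

# Parsing raises ParseError if the output is not well-formed XML.
reparsed = ET.fromstring(marcxml_bytes)

delimited_size = len(DELIMITED.encode("utf-8"))
ratio = len(marcxml_bytes) / delimited_size
print(f"delimited: {delimited_size} bytes, MARCXML: {len(marcxml_bytes)} bytes")
print(f"MARCXML is {ratio:.1f}x the size of the delimited text")
```

Short values make the overhead look worse, since every field pays for the same opening and closing tags no matter how little data sits between them.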

XML Tools and tips:

  • Altova XMLSpy – good for coding and modeling; the companion MapForce tool can start to crosswalk between XML schemas
  • Browsers – IE and Mozilla have XML rendering engines that can read XSL stylesheets, great for debugging
  • References: check Standard XML libraries at SourceForge, get stylesheets at TopXML

It was a long and rich session and I’m really only scratching the surface. Check out the full version of the talk at http://library.lanl.gov/lww/articles/LITA2005/Presentation/