MODS, MARC, and Metadata Interoperability PART 2

Speaker 4 (first speaker of second half): Ann Caldwell, Brown University

Overview of digital initiatives @ Brown. The CDI (Center for Digital Initiatives) was created in October 2001; the metadata specialist position (Ann's) was created in October 2002.

Brown metadata model: Ann's position covers all metadata, not just descriptive. They use METS to package and chose MODS over DC. Their model enables both shallow and deep discovery. For example: an art image can be searched in its native VRA format in Luna, but in the central repository as MODS. Everything has a MODS record.

Early projects. At first they were dealing only with library materials – sheet music, etc. – and used existing MARC-to-MODS tools. They still have no metadata creation staff but got interns from the University of Rhode Island library school (150 hours/semester = 3 credit hours). Many students are on second careers and very focused on their work. Began using NoteTab Pro, which they had also been using for EAD creation.

Broadening the base. Word got around campus very quickly that this was going on. Faculty and other groups began coming in with very creative projects, some hybrid of own materials and library materials.

CDI dropped their current work to help faculty. The Scholarly Technology Group (STG, part of IT) was contacted to be sure the CDI was not duplicating their efforts; it wasn't: STG wasn't doing any metadata work to speak of.

How to build MODS records: some from MARC, some from scratch, some extracted from other databases (FMPro) and converted to XML.
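The database-to-MODS path can be sketched roughly like this. This is a hypothetical illustration, not Brown's actual code: the field names (`title`, `creator`, `type`) stand in for whatever columns an FMPro export would carry, and the handful of MODS elements used is a simplified assumption.

```python
# Hypothetical sketch: building a minimal MODS record from one row of a
# flat database export (e.g. FileMaker Pro data serialized as a dict).
# Field names and the chosen MODS elements are illustrative assumptions.
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"

def row_to_mods(row):
    """Build a bare-bones MODS record from a flat export row."""
    ET.register_namespace("mods", MODS_NS)
    mods = ET.Element(f"{{{MODS_NS}}}mods")

    # Title is the one element assumed to always be present.
    title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
    ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = row["title"]

    # Creator is optional in the export.
    if row.get("creator"):
        name = ET.SubElement(mods, f"{{{MODS_NS}}}name")
        ET.SubElement(name, f"{{{MODS_NS}}}namePart").text = row["creator"]
        role = ET.SubElement(name, f"{{{MODS_NS}}}role")
        ET.SubElement(role, f"{{{MODS_NS}}}roleTerm", type="text").text = "creator"

    ET.SubElement(mods, f"{{{MODS_NS}}}typeOfResource").text = row.get(
        "type", "still image")

    return ET.tostring(mods, encoding="unicode")

record = row_to_mods({"title": "View of College Hill", "creator": "Unknown"})
```

The same skeleton works for records built from scratch; only the source of the `row` dict changes.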

NoteTab Pro: cheap. Downloaded the EAD “clip library” and modified it. Very flexible. All MODS and METS records are built in NTP, which they have programmed to prompt the user through a series of templates. Constantly making changes to this.

VRA and EAD records are mapped to MODS and transformed with XSLT. VRA records are exported from the image cataloging system (FMPro-based). Not all elements are retained going from VRA to MODS (“subjects in VRA get a little squishy”). EAD component-level content is captured and converted to MODS on a 1-to-1 basis.
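To make the lossy nature of such a crosswalk concrete, here is the mapping idea in plain Python rather than XSLT (Brown used XSLT; this is only an illustration of what the stylesheet does conceptually). The VRA element names and the mapping table are simplified assumptions, not the actual Brown crosswalk.

```python
# Illustrative crosswalk table: which VRA elements land where in MODS.
# Unmapped fields fall out as "spare parts" -- content with no home.
VRA_TO_MODS = {
    "vra:title":   "mods:titleInfo/mods:title",
    "vra:creator": "mods:name/mods:namePart",
    "vra:date":    "mods:originInfo/mods:dateCreated",
    # Subject mapping is deliberately crude -- "subjects get a little squishy".
    "vra:subject": "mods:subject/mods:topic",
}

def crosswalk(vra_fields):
    """Map a flat dict of VRA fields to MODS element paths.

    Returns the mapped fields plus the leftovers that had no mapping,
    so a human can decide whether to discard them or handle them specially.
    """
    mods, leftovers = {}, {}
    for element, value in vra_fields.items():
        if element in VRA_TO_MODS:
            mods[VRA_TO_MODS[element]] = value
        else:
            leftovers[element] = value
    return mods, leftovers

mods, spare = crosswalk({"vra:title": "Untitled", "vra:technique": "etching"})
```

Surfacing the leftovers explicitly, instead of silently dropping them, is what makes the "retain or discard?" decision visible.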

What’s in the records? They have established a bare minimum; every MODS record is validated with a stylesheet against that minimum plus certain local requirements. They don’t have subject analysis on all records.
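A "bare minimum" check of this kind is easy to picture. Brown does it with a stylesheet; this Python sketch shows the equivalent logic, and the particular required-element list is a guess, since the notes don't record what Brown's local minimum actually contains.

```python
# Hedged sketch of a minimum-record check. The REQUIRED_PATHS list is an
# assumption -- the actual Brown local requirements are not in the notes.
import xml.etree.ElementTree as ET

NS = {"m": "http://www.loc.gov/mods/v3"}
REQUIRED_PATHS = ["m:titleInfo/m:title", "m:typeOfResource", "m:identifier"]

def find_missing(mods_xml):
    """Return the list of required MODS paths absent from the record.

    An empty list means the record meets the minimum.
    """
    root = ET.fromstring(mods_xml)
    return [p for p in REQUIRED_PATHS if root.find(p, NS) is None]

record = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo><title>Sheet music sample</title></titleInfo>
  <typeOfResource>notated music</typeOfResource>
</mods>"""
missing = find_missing(record)
```

Running such a check before records enter the repository is what lets records skip subject analysis while still guaranteeing a searchable baseline.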

Storage and display: records are loaded into a homegrown PHP/MySQL system, all mapped into relational tables to enable cross-collection searching. Records retrieved through search are displayed with stylesheets.

Ann showed several examples of table displays and a schematic diagram of the system [see her PPT], and demoed searching the Brown repository.

Current status. In July 2004 they added 1.66 professional positions and some additional paraprofessional staff (didn’t catch the number), but still no additional staff for the metadata component. 20+ active projects now. They have started to work with audio and video; audio hasn’t been a problem, but video serving is still being addressed at the university level.

There are now some main Technical Services staff generating MODS.

Future directions: NoteTab works OK for some, but some users (particularly outside the library) really want a web interface. The scientific/medical communities at Brown are very interested in adding content but don’t have time for description. Looking at TEI this summer; STG has had great success training students to do TEI encoding. Also looking at overall staffing for efficiency opportunities. The digital backlog is now larger than the analog backlog.
Brown digital library site.

DEMO: NoteTab Pro. Showed the MODS tools (building MODS through prompts) and using NTP to create a METS record and package. THIS WAS A VERY COOL DEMO! Can’t really do it justice here in the notes.

Speaker 5 (second speaker from second half): Terry Reese, Oregon State University

Terry is the Digital Production Unit Head at OSU and was named a 2005 “Mover and Shaker” by Library Journal. He has a software development background.

Started by giving some background on metadata interoperability and metadata tools: proliferation of metadata schemes; differences in best practices are also a source of problems. Cited the Indiana study that put metadata creation costs at about $3/book for copy cataloging, $27/book for original cataloging, and $20/thesis.

In some cases, things were being cataloged more than once: things that go into DSpace or CONTENTdm. Now they create only one record and derive/repurpose it for other uses.

Challenges of interoperability: one-to-many and many-to-one transformations (the problem of going from less to greater semantic richness, or vice versa; the same problem Moen touched on in his talk). Other problems include different hierarchies and “spare parts,” leftover content that doesn’t fit anywhere. It may be better to discard such data than to try to force non-fitting data to fit.

MARCEdit’s crosswalking tool uses MARCXML as a control schema to facilitate transformations. Because of its hub-and-spoke (star) design, no more than two transformations ever take place: source schema to MARCXML, then MARCXML to target (the diagram looks like a wheel).
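The two-hop guarantee of a hub design is worth spelling out. This sketch models the routing logic only; the converter functions are stand-ins, not MARCEdit's actual API, and the record dicts are placeholders for real serialized metadata.

```python
# Sketch of the star/hub design: every schema crosswalks to and from
# MARCXML, so any source->target path takes at most two transformations.
# With N schemas this needs 2N crosswalks instead of N*(N-1) pairwise ones.

def to_marcxml(record, source_schema):
    """Stand-in for the source-schema -> MARCXML crosswalk."""
    return {"schema": "MARCXML", "payload": record, "from": source_schema}

def from_marcxml(hub_record, target_schema):
    """Stand-in for the MARCXML -> target-schema crosswalk."""
    return {"schema": target_schema,
            "payload": hub_record.get("payload", hub_record)}

def transform(record, source, target):
    """Route source -> MARCXML -> target, counting the hops taken."""
    hops = 0
    if source != "MARCXML":
        record = to_marcxml(record, source)
        hops += 1
    if target != "MARCXML":
        record = from_marcxml(record, target)
        hops += 1
    return record, hops

result, hops = transform({"title": "Finding aid"}, "EAD", "MODS")
```

When either endpoint is already MARCXML the path collapses to a single hop, which is why the wheel never needs more than two spokes per trip.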

DEMO of MARCEdit. Transformed an EAD record to MARC. It also has a MARC editor for people who aren’t comfortable editing MODS directly.

It also has an OAI harvester built in to grab OAI records and transform them into MARC; OSU uses it to grab DSpace records and load them into the library catalog.
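The harvest-then-transform workflow looks roughly like this. A minimal sketch, not MARCEdit's implementation: the base URL is a placeholder, only a Dublin Core title extraction is shown, and the actual MARC conversion step is omitted.

```python
# Sketch of an OAI-PMH ListRecords harvest (the base_url below is a
# placeholder, not a real DSpace endpoint; the MARC conversion that
# MARCEdit performs afterwards is not shown).
import urllib.request
import xml.etree.ElementTree as ET

def fetch_list_records(base_url, prefix="oai_dc"):
    """Issue a ListRecords request and return the raw XML response."""
    url = f"{base_url}?verb=ListRecords&metadataPrefix={prefix}"
    with urllib.request.urlopen(url) as resp:
        return resp.read()

def extract_titles(response_xml):
    """Pull dc:title values out of a ListRecords response."""
    root = ET.fromstring(response_xml)
    dc_title = "{http://purl.org/dc/elements/1.1/}title"
    return [t.text for t in root.iter(dc_title)]

# Canned response standing in for fetch_list_records("https://example.edu/oai")
sample = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
 <ListRecords><record><metadata>
  <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
   <dc:title>A sample thesis</dc:title>
  </oai_dc:dc>
 </metadata></record></ListRecords>
</OAI-PMH>"""
titles = extract_titles(sample)
```

A full harvester would also follow OAI-PMH resumption tokens for large sets; each harvested record would then go through the MARCXML hub on its way into the catalog.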

This was a great demo, and there is a lot more to Terry’s presentation than I’m reflecting in these notes; his PPT will have more detail. It was a very impressive tool and a wonderful way to end this long session. It gave me the sense that non-programmers could get their hands on some tools and actually do some transforming, which is a great way to become familiar with these various schemas. See Terry’s site for links to MarcEdit and other goodies.