New Standard: WARC (Web ARChive) file format

LITA has received news of new standards activity of interest from Cindy Hepfer, ALA Representative to NISO (HSLcindy@buffalo.edu):

“A new ballot has been presented to TC46 Ballot Advisory Group: ‘ISO/DIS 28500, WARC file format’. This is a new standard at the Draft International Standard (DIS) stage, and this may be the last chance to provide substantive comments to this standard. If all member bodies vote Yes, this standard can proceed directly to publication.”

NISO’s Summary: The WARC (Web ARChive) file format offers a convention for concatenating multiple resource records (data objects), each consisting of a set of simple text headers and an arbitrary data block, into one long file. The WARC format is an extension of the ARC File Format [ARC] that has traditionally been used to store “web crawls” as sequences of content blocks harvested from the World Wide Web. The WARC format is expected to be a standard way to structure, manage and store billions of resources collected from the web and elsewhere. It will be used to build applications for harvesting (such as the open source Heritrix web crawler), managing, accessing, and exchanging content.
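For readers who want a concrete picture of "text headers plus a data block, concatenated into one file," here is a minimal Python sketch. It is an illustration only, not the draft's definition: the version line, the header names (WARC-Type, WARC-Target-URI, Content-Length), the CRLF separators, and the helper functions write_record and read_records are all assumptions made for this example, so anyone building real tools should work from the review copy of the draft.

```python
import io

CRLF = b"\r\n"

def write_record(out, headers, block):
    """Append one record: version line, text headers, blank line, data block."""
    out.write(b"WARC/1.0" + CRLF)  # assumed version line; check the draft
    for name, value in headers.items():
        out.write(f"{name}: {value}".encode("utf-8") + CRLF)
    out.write(f"Content-Length: {len(block)}".encode("utf-8") + CRLF)
    out.write(CRLF)         # blank line ends the headers
    out.write(block)        # arbitrary bytes, stored untouched
    out.write(CRLF + CRLF)  # separator before the next record

def read_records(stream):
    """Yield (headers, block) for each record in a concatenated stream."""
    while True:
        line = stream.readline()
        if not line:
            return          # end of file
        if line.strip() == b"":
            continue        # skip separator lines between records
        # Any other line is taken to be the version line; headers follow it.
        headers = {}
        while True:
            line = stream.readline()
            if line in (CRLF, b""):
                break
            name, _, value = line.decode("utf-8").rstrip("\r\n").partition(": ")
            headers[name] = value
        # The block is read by length, so it may itself contain blank lines.
        block = stream.read(int(headers["Content-Length"]))
        yield headers, block

# Two records written back to back into one in-memory "file".
buf = io.BytesIO()
write_record(buf, {"WARC-Type": "resource",
                   "WARC-Target-URI": "http://example.org/"},
             b"<html>hello</html>")
write_record(buf, {"WARC-Type": "metadata"}, b"line one\r\n\r\nline two")
buf.seek(0)
records = list(read_records(buf))
```

Because each record states the length of its own data block, a reader can step from record to record without interpreting the content, which is what makes simple concatenation workable for very large collections.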

ALA’s vote options are Yes, No, or Abstain. Comments are required for all votes other than Yes. As a result, in the absence of other recommendations, ALA will recommend that NISO vote to confirm ‘ISO/DIS 28500, WARC file format’. The final deadline for ALA to vote is Thursday, Sept. 18, 2008, and Cindy asks that you respond to her at least one week in advance of this final deadline (that is, by Thursday, Sept. 11).

Cindy also notes that this is not a NISO standard, but is being balloted by ISO’s TC46. ALA is not voting on the standard itself but rather is providing feedback to NISO as to whether to approve or disapprove the standard. NISO will review and consider this feedback prior to submitting the U.S. vote. Cindy can provide copies of this draft to ALA members who wish to review it; please contact her directly for a review copy.

Diane Hillmann
LITA Standards Coordinator