Libraries as Digital Publishers: A New Model for Scholarly Access to Information

This panel featured six speakers who are involved in a new project to digitize books and make them available both online and as print-on-demand titles via Amazon. Two of the speakers, Lotfi Belkhir and Robin Asbury, work for the companies behind the project (Kirtas Technologies and BookSurge, respectively), and the other four are with institutions that are digitizing books: Martin Halbert and Lisa Macklin, from Emory University; Joyce Rumery, from the University of Maine; and Linda McKenzie, from the Toronto Public Library.

This project differentiates itself from Google’s scanning project by focusing on quality control. As Lotfi explained in his presentation, Google and their partner libraries are privileging quantity—digitizing the most books possible in the shortest period of time—over quality—creating the most complete, accurate, and usable digital copies of books possible. (To demonstrate the problems in the Google method, he showed a set of images of one book that Google scanned that contained a very intricately manicured set of fingernails, and, in some of the images, the entire hand, earning some chuckles from the audience.) In his view, there’s no point in doing a project with such low quality control. The cost of scanning books is only a tiny fraction of the total cost of a digitization process; most of the cost will come in the following years as storage costs. In Lotfi’s opinion, there is no point in scrimping on the scanning and then spending all of that money to store a low-quality product—especially since the institution is unlikely to be able to afford to scan books again any time in the near future.

Robin’s company, BookSurge, is a print-on-demand publisher that was recently purchased by Amazon. BookSurge partners with Kirtas and the libraries to make scanned books available through Amazon, complete with the ability to search inside the book. The book shows up as in stock on Amazon, but no “stock” actually exists—BookSurge prints a copy of the book only when it is ordered. This printed book is branded as being from the library, with such features as the library’s logo on the front cover and images of and text about the library on the back cover. When the book is sold to the customer, Kirtas distributes some of the profits from the sale back to the library, helping to defray the cost of digitization.

Emory University, the University of Maine, and the Toronto Public Library are all doing their digitization work with their own staffs and their own purchased Kirtas automated scanners, giving them complete control over the process. The libraries are able to keep a preservation copy of the digital files, separate from the digital files used to print the books on demand. The libraries maintain control of the digitized books as far as dissemination, access, search, organization, etc. are concerned, and the libraries maintain the right to give the public full access to digital representations of the book.

Martin, the director for digital programs and systems at Emory University, spoke about the process that Emory uses in its digitization project. Emory is focused on digitizing collections, not just books. They have a Digital Collections Steering Committee that identifies and prioritizes collections for digitization (beginning with the Southern Methodist collection), and they provide scholarly contextualizations of the digitized materials. Emory is also planning to work with the Kirtas/BookSurge partnership to publish new digital peer-reviewed scholarly monographs and, in the future, some theses and dissertations written at Emory.

Joyce and Linda both spoke about the collections that are being digitized at their institutions. At the University of Maine (which has partnered with the Maine State Library), they first digitized the old University yearbooks, then moved on to town reports from Maine dating back to the early 1880s. Their next projects are going to be pre-1923 Maine history books, biographies of Maine citizens, and Maine travel books. The Toronto Public Library is going to begin by digitizing its Canadiana collection, which contains around 11,000 items, and then move on to other special collections.

Lisa, from the intellectual property rights office at Emory, covered the legal issues that must be considered when launching a digitization project. The biggest legal risk in a digitization project is being sued for accidentally digitizing something that is still under copyright. Having the digitized books available for sale on Amazon heightens this risk, because it makes it very easy for the copyright owner to discover that his or her work has been digitized. The easiest way to lessen this risk is to digitize only books that were published in the United States before 1923 and that have copyright dates printed on the items, since such books are almost certainly in the public domain. Lisa also emphasized the importance of keeping good metadata about how the institution determined that the item was in the public domain, since the penalties for copyright infringement can vary widely depending on whether the judge determines that the copyright was willfully infringed (damages up to $150,000) or that the infringer was acting on a good faith belief that the copy was not infringing (damages as low as $200).
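
As a rough illustration of the screening rule and record-keeping Lisa described, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `RightsRecord` fields, the `screen_item` helper, and the `"US"` country code are hypothetical and were not shown by any of the panelists. It only encodes the rule of thumb from the talk (published in the United States, copyright date printed on the item, date before 1923) and stores the reasoning alongside the result.

```python
from dataclasses import dataclass, asdict
from datetime import date
from typing import Optional

# Cutoff year mentioned in the panel; kept as a constant so it can be adjusted.
PUBLIC_DOMAIN_CUTOFF = 1923


@dataclass
class RightsRecord:
    """Hypothetical metadata record documenting how an item was screened."""
    title: str
    country: str
    printed_copyright_year: Optional[int]
    low_risk: bool
    rationale: str
    reviewed_on: str  # ISO date on which the item was reviewed


def screen_item(title: str, country: str,
                printed_copyright_year: Optional[int],
                reviewed_on: Optional[date] = None) -> RightsRecord:
    """Apply the panel's rule of thumb and record why the item passed or failed."""
    reviewed = (reviewed_on or date.today()).isoformat()
    if country != "US":
        low_risk, why = False, "not published in the United States"
    elif printed_copyright_year is None:
        low_risk, why = False, "no copyright date printed on the item"
    elif printed_copyright_year >= PUBLIC_DOMAIN_CUTOFF:
        low_risk, why = False, (
            f"printed copyright date {printed_copyright_year} "
            f"is not before {PUBLIC_DOMAIN_CUTOFF}"
        )
    else:
        low_risk, why = True, (
            f"published in the United States with printed copyright date "
            f"{printed_copyright_year}, before {PUBLIC_DOMAIN_CUTOFF}"
        )
    return RightsRecord(title, country, printed_copyright_year,
                        low_risk, why, reviewed)


if __name__ == "__main__":
    # Example title is made up; it stands in for something like a Maine town report.
    record = screen_item("Annual Town Report", "US", 1885)
    print(asdict(record))
```

Items that fail the screen are not necessarily under copyright; in this sketch they are simply flagged for closer review, and the stored rationale is the kind of record that could later show a good faith determination.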

For any libraries that are interested in joining this book digitization project, the person to contact for more information is Lisa Stasevich at Kirtas: lisa@kirtastech.com.

This panel was taped, so I assume that there will be a video of it available on the Web shortly.


I was supposed to blog a third session, The Future of Information Retrieval, but I was unable to do so because of the session’s popularity and because it takes just shy of forever to get from Bethesda (where I’m staying) into downtown DC during the morning rush hour. By the time I got there, not only were all of the seats taken, but so was all of the standing room, both inside the room and in the part of the hallway close enough to the door to hear the speakers. So this will be my last entry on the LITA Blog for ALA 2007. But please feel free to check out my own newly launched blog at folksweb.blogspot.com! I’m going to be blogging about the Semantic Web, folksonomies, Wikipedia, Freebase, and all of the other innovative new ways coming out to organize information on the Web.


  1. Maurice York

    (Comments below are based on having been at the program myself, as well as the post above…)

    This partnership and program have a great deal going for them: a very interesting model, lots of good ideas, and potentially great materials being made available by exploiting a print-on-demand structure that will likely be the future of publishing.

    I am skeptical about Lotfi’s statement that storage costs more than digitization. Admittedly, I have not done an appropriate cost analysis on mass digitization (perhaps someone who has can offer some numbers and tell me I’m way off base), but I am very doubtful that the costs of storage (which I *have* priced out recently) can amount to anything more than a fraction of the cost of the Kirtas machine and the labor to feed it books, convert the images, create the metadata, analyze the copyright, etc. for 15-20 years (the amount of time it would take to digitize Emory’s 200,000 books at the rate of “tens of thousands” a year on one Kirtas machine).

    It also seems to me that Kirtas/Amazon/Emory et al. are doing themselves a disservice by comparing themselves to the Google Library project. I am a great friend of Emory’s, but I have to call it like I see it. I think it is unfortunate that the presenters chose to set their tone as our-model-versus-google-and-we-win. There is a lot of room in the digitization space for everyone to thrive without targeting other libraries and saying their digitization “isn’t worth the cost of storage” (as Lotfi put on the screen in BOLD CAPS referring to the Google Library project) or saying that your aims are more noble because you have the public interest and welfare at heart, unlike the supposed “competition”. I’m sure Michigan, Harvard, Oxford, the University of California, the CIC, etc, etc would be surprised to find out that they have abandoned the public good by partnering with Google. It’s also not great to make false statements in trying to distinguish yourself, such as that Google Book Search has no color images and that the Google libraries don’t have ownership or control of any of the images.

    To be fair, perhaps Lotfi was just trying to say that Google had no interest in the public good because they are a commercial search company. That accusation wears a little thin when it’s hurled at a company that does all of the digitization, conversion, markup, indexing, discovery, and delivery for not one penny on millions of books so far and counting; not only that, but has donated millions of dollars to the Library of Congress for digital library initiatives. It starts to sound like little more than a shrill marketing stance when it comes from a company that is in the business of selling scanners at $100,000 a pop and an online bookseller that wants to sell those out-of-print books for $25 a copy.

    These aren’t criticisms of any of the members of this project per se. I think the Kirtas machine is a very cool device, the collections are certainly worthwhile, Emory is a great library with great people, and it’s a remarkable idea to use Print on Demand technology to such a worthy end. But it is a very different animal than the Google Library project, and if it is going to beg the comparisons, I don’t think it will stand up very well. Let it ride on its own merits.

  2. Biswaroop Todi

    We were supposed to blog a third session. and We take pleasure in introducing ourselves as an organization offering drafting, scanning (raster to vector), raster to vector conversion digitize your old paper drawings & raster images that needs to be updated into perfect CAD output.

  3. Linda Becker

    Excellent recap- Thank you

