Institutional Repositories: Design and Development, Panel Discussion

Developing an Institutional Repository: Implementation of DigiTool at Colorado State University Libraries

Shu Liu, Colorado State University

Yongli Zhou, Colorado State University

The first panelists, Shu Liu and Yongli Zhou, described implementing an IR with DigiTool, Ex Libris’s digital repository software. Their talk focused on using an out-of-the-box product. Colorado State has used CONTENTdm from 2001 to the present, but will migrate to DigiTool (which they have also been using since 2007) by 2009.

DigiTool has a series of web-based clients through which users and staff interact with the database. There are also access and maintenance components. Aspects of the DigiTool product can be customized: the icons, menu, header and footer, etc. They also customized the metadata display, and there were both automated and manual ways to do this. They implemented handles for their documents as well, though this was somewhat difficult and took time working directly with DigiTool. Colorado State purposely chose to limit their customization and wanted it done by experienced programmers.

DigiTool offered a lot of sophisticated tools for metadata, including support for Dublin Core (DC), batch loading of .csv files, and native XML files. In addition to this descriptive metadata, METS was used for structural metadata. This has allowed them to handle different types of complex objects.
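To make the .csv batching idea concrete, here is a minimal sketch of turning spreadsheet rows into Dublin Core XML. It is not DigiTool's actual ingest format or API, which the talk did not detail. The function name `csv_to_dc_records`, the `<record>` wrapper element, the semicolon convention for repeated values, and the sample rows are all assumptions made for illustration; only the Dublin Core namespace URI is standard.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Standard namespace for the 15 simple Dublin Core elements.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def csv_to_dc_records(csv_text):
    """Turn each CSV row into a <record> element holding dc:* children.

    Assumptions (hypothetical, not DigiTool's rules): column headers are
    Dublin Core element names, empty cells are skipped, and a cell that
    contains ';' is split into repeated elements.
    """
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = ET.Element("record")
        for field, cell in row.items():
            for value in (cell or "").split(";"):
                value = value.strip()
                if value:
                    child = ET.SubElement(record, f"{{{DC_NS}}}{field}")
                    child.text = value
        records.append(record)
    return records


# Made-up sample rows, for illustration only.
SAMPLE_CSV = """title,creator,date
Campus Photograph,"Doe, Jane; Roe, Richard",1925
Annual Report,,1931
"""

dc_records = csv_to_dc_records(SAMPLE_CSV)
for rec in dc_records:
    print(ET.tostring(rec, encoding="unicode"))
```

A real batch load would also need the structural (METS) side and the repository's own wrapper format, which this sketch leaves out.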

The panelists stress the importance of collaboration in a project like this. They appreciate the relatively short and easy implementation (although they think CONTENTdm is easier) and that less tech support is needed once the system is established. The tools for batch upload, the handles, and the API are quite useful.

The repository can be found at: http://digitool.library.colostate.edu/R

EPrints as the Cardinal Scholar Institutional Repository at Ball State University: Bringing an Institutional Repository to the Ball State University Community through Cardinal Scholar (CS)

Bradley D. Faust, Ball State University

The second panelist, Bradley Faust from Ball State, spoke about using EPrints to create “Cardinal Scholar,” their IR (http://cardinalscholar.bsu.edu/). Ball State originally thought CONTENTdm would be their solution but ultimately decided against it due to a lack of open deposit, difficulties integrating IR objects with existing collections, and the need for too much customization.

After reviewing a few other options, Ball State eventually settled on EPrints because it used Perl, the Apache web server, and MySQL, had accessible code, and had an active development community. The cost considerations were also good for this product.

They ran a development installation on Fedora Core for several months, then built the production EPrints server on Windows in November 2007. After testing during the first three weeks of December, they made a soft release. Using feedback, they continued to tweak the system from December to January.

Issues with implementing EPrints mostly involved coordination with the campus computing center. The system needed: a domain name (cardinalscholar.bsu.edu); a system security scan to evaluate the server configuration; internet access to the CS server through the campus firewall; an SSL certificate; and a review of the informational pages for consistency with other computing policies.

They did some customization, such as modifying the informational and deposit pages (for easier use), changing the defaults for new accounts, and adding a left-hand navigation column.

They would like to customize it further, especially to reorder the deposit fields, implement LDAP authentication, further develop their strategy for supporting small group access (i.e., shared user accounts), and add more reporting and tracking functions.

Cardinal Scholar has been running for 10 months in production. The library has uploaded some university archives documents and special collections material. Some personnel issues slowed down development of the project, but they were able to add about 150 digital assets so far.

Libraries as publishers: Using the Open Journal System in a Smaller Academic Library

Tabatha Becker, University of Colorado at Colorado Springs

David Hodgins, University of Colorado at Colorado Springs

Tabitha opened by clarifying that this presentation focused on the creation of an open access journal at UC Colorado Springs. It does have some IR qualities, but is not really an IR project.

UC@CS, as a smaller institution, works closely with campus IT for technology projects. When Tabitha started in 2007, she sent out a brief web survey to her faculty about their attitudes toward open access. She found that open access was misunderstood and not viewed favorably by faculty for their own work, but that they felt it was important for their students. This caused a change in the library’s focus. They switched from faculty work to undergraduate research and created a journal on the topic. Their goals were to promote and showcase student research and collect previously uncollected items. They considered this a precursor to developing an IR, since they were beginning to collect some of these objects before an IR was in place. Ultimately, they also wanted to be more involved in campus research and scholarship.

They felt that they would be successful if:

  • They could efficiently host the journal; it couldn’t be too much of a stressor to the library.
  • They could get content regularly, so that they could put this product out.
  • They had the faculty and students involved as editors. The library would only play the role of publisher and would not be involved in the vetting process.

Next they needed to determine how to get the system built. Eventually, they decided to go with OJS, an open source, out-of-the-box service that allows for easy customization if needed. They also liked the features of the software for metadata handling, user registration, and statistics. They feel that it will play well with an IR when they develop one. The system can create roles for active participants who interact with the system through the web, so the library does not need to coordinate the work of editors, reviewers, etc. in-house.

Due to the relationship with campus IT, the installation of the software was done by campus IT, but collaboration was relatively easy. The library does have complete control of look and feel and navigation using HTML, CSS, and PHP. Content uploading happens through a GUI and does not require technical skills. The system comes with an out-of-the-box interface as well, so it can be mounted with no customization.

The first issue was released on August 22nd, 2008. Though the submissions were small, the quality was good. The scope has widened slightly to include faculty introductions to themed issues. After the soft release, there was not a lot of access, but usage has increased as it has been promoted.

In terms of their goals, they feel they have efficiently hosted the journal, but their ability to get content has been mixed. Some of the faculty are still uncomfortable with the concept. At first they had very little buy-in from the campus, but Tabitha has found that the faculty and students are getting more on board. She feels that with more publicity, things will improve.

The next issue will come out in November, and there are plans to work with the Colorado Springs Undergraduate Research Forum to publish their proceedings. More publicity plans are in development and faculty collaborations are being sought. Tabitha hopes that this experience may convince faculty to consider open access for their own work.

http://ojs.uccs.edu/index.php/urj/index

Questions were asked regarding the use of CONTENTdm. The panelists said there are some good projects using it and it will probably develop further, but the first two panelists found that it was not the best fit for them. Shu also mentioned that her institution would like to support only one repository and they felt that DigiTool offered them more sophisticated tools.

Bradley also clarified that the IR at Ball State will only collect materials from Ball State.

Shu said that many file types are ingested into DigiTool; there are no format limits. With many objects to migrate from CONTENTdm, they are hoping to retain the same interface pages and just have them point to DigiTool rather than CONTENTdm.

Tabitha addressed the need to have technical support for the faculty/scholar collaboration. She feels that they will really only need to learn the web tools and understand the architecture of the system. They have decided to use the DC scheme for OJS, which does create metadata. An audience member suggested working with cataloging to create records in the catalog.

Participation and Power: Combining Community Features with Existing metadata in NextGen Public Interfaces

Dinah Sanders, Innovative Interfaces

Kelly M. Vickery, University of Kentucky

Instead of just talking about Encore, Dinah will discuss how metadata is exposed for patrons to leverage and how it is extended to cover gaps in controlled vocabulary. The majority of Americans use the interwebs every day. This means they are coming in with savvy web skills, and we can leverage metadata to give them tools that are powerful and that users recognize. Encore tries to bring these patron skills together with the library strength of good metadata. However, there are limits to controlled vocabulary; in particular, as was mentioned in the opening session, “cookery” is not a common term. Encore tries to bring together the formal controlled vocabulary and folksonomies to rectify these problems. Searching can be done across library metadata and user-supplied tags.

Early attempts at this, like PennTags, SOPAC, and LibraryThing for Libraries, exposed key problems:

  • Have to use parallel interfaces for tags (they aren’t intermixed with library data)
  • Tags are stored and searched separately
  • The participation then is really not IN the library catalog

These initiatives forged new ground but still retained separate systems.

Community tagging success comes from:

  • Eliminating hurdles to participation (like creating a profile)
  • Creating credibility and local relevance (requiring authentication with an existing profile)
  • Giving an immediate return on investment (immediate indexing)

There was apprehension about letting users in, that they would add irrelevant data. Since implementation in June, no tags have been deleted. Staff also add tags. Tags and subject headings are used together in the display and retrieval interfaces.
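As a rough illustration of what searching library metadata and tags together with immediate indexing could look like, here is a toy sketch. It is not how Encore is built, which the talk did not describe at this level. The class `CatalogIndex`, its methods, the record id `b1`, and the sample subject words are all invented for the example; the "lost generation" tag on The Sun Also Rises is the one mentioned in the talk.

```python
from collections import defaultdict


class CatalogIndex:
    """Toy inverted index: term -> {record_id: set of field names}."""

    def __init__(self):
        self._postings = defaultdict(lambda: defaultdict(set))

    def add_text(self, record_id, field, text):
        # Index every whitespace-separated word, lowercased.
        for term in text.lower().split():
            self._postings[term][record_id].add(field)

    def add_tag(self, record_id, tag):
        # A user tag goes into the same index as the library metadata,
        # so it is searchable as soon as it is added.
        self.add_text(record_id, "tag", tag)

    def search(self, query):
        """Return {record_id: sorted fields matched by any query term}."""
        hits = defaultdict(set)
        for term in query.lower().split():
            for record_id, fields in self._postings.get(term, {}).items():
                hits[record_id] |= fields
        return {rid: sorted(fields) for rid, fields in hits.items()}


index = CatalogIndex()
index.add_text("b1", "title", "The Sun Also Rises")
index.add_text("b1", "subject", "Americans France Fiction")  # made-up subject words
index.add_tag("b1", "lost generation")

print(index.search("generation"))   # hit comes only from the tag
print(index.search("sun fiction"))  # hit comes from title and subject
```

Because each posting records the field it came from, a results page can show whether a hit matched the title, a subject heading, or a tag. Terms in a query are OR'd here; a production index would add ranking, stemming, and phrase handling.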

Existing ontologies are underrated. Subject headings aren’t perfect, but there is really good information there that can be useful if it is presented properly. For example, relying on recognition (OR’d results) gives users a way to use the subject headings. The results have links to many ways to broaden the search.

User tags not only enhance metadata; they can also cover areas that traditional cataloging wouldn’t get to. For example, The Sun Also Rises has a user tag for “lost generation.”

A tour of the user experience for community tagging in Encore:

  • A search for “drm” shows that the other subjects are “intellectual property,” “rights management,” etc. The results also show which part of the record matched the term, so you can see if the hit was from the title, subject, or tag.
  • Adding a tag uses a simple text box. Simple authentication, based on an existing profile ID, is needed when a tag is added. Users can also delete their own tags.
  • Administrators can delete tags, but so far this hasn’t been a problem. There are other administrative tools that allow you to blanket approve a tag (“University of Illinois” would probably never be offensive and would not need to be reviewed). You can also whitelist a tag for just a particular book (so “gay” might be relevant to a book about gender studies, and you wouldn’t want to continually review it there, although it might be offensive in other records). You can block an individual from tagging as well.
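The administrative rules in the last bullet can be sketched as a small decision function. This is only an illustration of the logic as described in the talk, not Encore's actual implementation. The class `TagModeration`, the status strings, the order in which the rules are checked, and the patron and record ids are all assumptions; whether a tag awaiting review is still displayed is not something the talk specified.

```python
from dataclasses import dataclass, field


@dataclass
class TagModeration:
    """Hypothetical model of the three administrative controls."""
    approved_everywhere: set = field(default_factory=set)  # blanket-approved tags
    approved_for_record: set = field(default_factory=set)  # (record_id, tag) pairs
    blocked_patrons: set = field(default_factory=set)      # patrons barred from tagging

    def decide(self, patron_id, record_id, tag):
        tag = tag.strip().lower()
        # Assumed order: a blocked patron is rejected before anything else.
        if patron_id in self.blocked_patrons:
            return "rejected"
        if tag in self.approved_everywhere:
            return "approved"
        if (record_id, tag) in self.approved_for_record:
            return "approved"
        # Anything else is flagged for staff attention.
        return "needs_review"


moderation = TagModeration(
    approved_everywhere={"university of illinois"},
    approved_for_record={("b42", "gay")},  # b42: a gender studies title (made-up id)
    blocked_patrons={"p9"},
)

print(moderation.decide("p1", "b42", "gay"))  # approved for this record only
print(moderation.decide("p1", "b7", "gay"))   # needs_review on another record
print(moderation.decide("p9", "b42", "University of Illinois"))  # rejected
```

Keying the per-book whitelist on (record, tag) pairs is what lets the same word pass without review on one title and still be flagged elsewhere.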

Encore was built in rapid iterative development with many partners. A large part of the customer base has been working with Encore on the development to ensure that it is useful. Encore can plan their development cycle around library planning cycles.

Kelly from UK spoke about integrating Encore with Voyager. UK had been considering adopting a discovery layer, but couldn’t find the right product. AquaBrowser wasn’t right, and Endeca required too much work. They attempted to adopt Primo but it didn’t work out, so they went with Encore. Since Encore had been developed with Millennium, adapting it to Voyager required some work. Discussions began in January 2007, and a test interface was ready by May. By September, they were ready to begin updating Encore nightly with changes that had been made to Voyager, and it was offered as an alternative search interface. By December it was integrated with the regular OPAC and is now the default. There were some particular issues with the relevance of serials in the results list (they tended to rank really low). This was worked out and works well now. Through 2008, development on Encore 2.0 continued, and UK officially upgraded last month. A few issues are still being worked out with 2.0 though.

The integration they chose (to have both simultaneously, but default to Encore) works well for them. Navigation between the two is easy (via tabs). This gives the power users the option to continue to use Voyager. Data sent from Voyager to Encore:

  • Marc/serial holdings
  • Item info
  • Item circ.
  • Location/collection
  • Patron records

The nightly update is no good for circ though, so that info is not in Encore. There are solutions to this, but they were too difficult to implement. They are still working on integrating LDAP into Encore for authentication. Encore sends back to the ILS:

  • Request an item
  • Find out more
  • Advanced search
  • Circ. stats

One out of three users use Encore. It has not been promoted very much yet, but this number is encouraging to UK. Undergrads tend to use Encore, but faculty tend to switch back to Voyager. This seems pretty believable. They have gotten some negative comments, but only a few. The return visitor rate is about 45%. Voyager has a return rate of about 60%.

Tags in the academic environment might be really useful for journals. The subject headings for journals tend to be very general. UK hopes that users will tag journals to make them more useful. Another area that might be very useful is tagging for specific courses. This would also allow the professor to update these continually. The course reserve list could be indicated through tags. Students of course could tag things for courses as well. A third is animal names. Subject headings tend to use scientific names (equine, bovine, etc.); tags can add “horse” and “cow.”

Tagging might start to eliminate the problem where users need to create many separate bibliographies for their discipline (A/V, women’s studies, etc.), some of which are even made into a shadow database. If the catalog gives groups the ability to dynamically create and re-sort their own lists, they could do it through the catalog. This would be even better because the lists could be updated as the catalog is updated.

Dinah closed the session with examples of public library tagging. Fan or community groups tagging with their own terms, informal vocabulary and disambiguation, and emerging vocabularies are all represented. Emerging vocabularies are also a good point for academic libraries, since disciplines often evolve before their terms are made into subject headings. Some of the users are adding very library-centric things like ISBNs and “young adult fiction.” Tags enable iterative “berry picking” approaches, since users don’t have to start over and do sophisticated pre-coordinated searches.

U. Glasgow found that only 4% of users did a subject search in the old catalog – the terminology was confusing. Now however, 66% of users use the tag cloud.

To close Dinah notes that not only the materials, but also your community’s conversations are collected. And in an age of increasing digitization, it is the local voice which distinguishes one library from another.

Questions covered other search suggestion features of Encore (“Did you mean”), catalogers’ use of the tags (when appropriate), and the database for tags. Tags are keyed to the record ID. The tag database stays in Encore and does not go back to the ILS. If a library switches to a new ILS, Encore could map the old bib number to the new one. Encore links back to the ILS when other ILS functions (like requests) are needed. Tags are associated with patrons forever, even if the patron record goes away; however, no one has expressed any privacy concerns because tagging is completely opt-in. Tags for courses could be a problem in this way: if the course curriculum changes, items would need to be untagged. Encore provides tools for that, but only at the administrative level. At this time, users cannot batch add tags.
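Since the tags are keyed to the record id, carrying them across an ILS migration comes down to re-keying them with an old-to-new bib number map. A minimal sketch of that idea follows. It is not Encore's migration tooling; the function `remap_tags`, the dictionary layout, and the sample ids are hypothetical.

```python
def remap_tags(tags_by_bib, old_to_new):
    """Re-key a {bib_id: set_of_tags} store using an old->new bib id map.

    Returns (remapped, unmapped). Tags whose old id has no new
    counterpart are set aside rather than silently dropped.
    """
    remapped, unmapped = {}, {}
    for old_id, tags in tags_by_bib.items():
        new_id = old_to_new.get(old_id)
        if new_id is None:
            unmapped[old_id] = set(tags)
        else:
            # Two old records may merge into one new record, so union the tags.
            remapped.setdefault(new_id, set()).update(tags)
    return remapped, unmapped


# Made-up ids: v* are old bib numbers, n* are new ones.
old_tags = {
    "v100": {"lost generation"},
    "v101": {"drm"},
    "v102": {"horse"},
}
id_map = {"v100": "n1", "v101": "n1"}

migrated, leftover = remap_tags(old_tags, id_map)
print(migrated)
print(leftover)
```

Keeping the unmapped tags in a separate structure gives staff a list to reconcile by hand after the migration.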