technical skills of librarianship

The other day someone asked me how they could move from reference work to more systems-oriented work in libraries. I was happy to share my thoughts on the topic, and below is what I said.

Food for thought.

Someone wrote:

> Anyway, I am seeking your advice. I am very hungry to move from my
> reference position into a systems related position, preferably one
> that centered on web application development for libraries….
>
> Can you offer any advice?

I can afford lots o’ advice, and I’m always happy to help out in these regards. ;-)

I sincerely believe librarianship is overflowing with opportunities for people who want to exploit the use of computers to facilitate library-like activities. While it is difficult to ignore the changes in Library Land brought on by the advent of globally networked computers (the Internet), I do not think libraries will be disappearing. Certainly the processes of librarianship (collecting, organizing, archiving, and disseminating data and information for the purposes of increasing knowledge and understanding) are still needed and desired by all sorts of communities. Put another way, there are plenty of opportunities for librarians who want to apply the use of computers to traditional library tasks.

There are five technologies, listed in priority order, I suggest you spend time learning in order to increase your skill set:

1) XML – XML is a sort of modern-day alchemy. It represents a way to turn data into information, as well as a way to unambiguously transmit that information from one computer to another computer. Take the following string of characters: 1776. It is just an integer. It has no context. One thousand seven hundred and seventy-six what? By marking up the integer in an agreed-upon fashion, say 1776, the integer takes on a new meaning. It has value and thus it becomes information. Magic. Since the syntax for marking up XML is simple and agreed upon, there are many communities (government, business, academia, etc.) who use tools to read and use XML information. This represents a giant opportunity for the library community.
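To make the 1776 idea concrete, here is a minimal sketch in Python of what a program on the receiving end might do with marked-up data. Writing the markup itself needs no programming. The element and attribute names (`year`, `price`, `currency`) are made up for illustration and do not come from any particular standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup; element and attribute names are illustrative assumptions.
SAMPLES = [
    "<year>1776</year>",
    '<price currency="USD">1776</price>',
]

def describe(xml_text):
    """Return a human-readable reading of a marked-up integer."""
    element = ET.fromstring(xml_text)
    value = int(element.text)
    if element.tag == "year":
        return f"the year {value}"
    if element.tag == "price":
        return f"{value} {element.get('currency')}"
    return f"{value} (unknown meaning)"

for sample in SAMPLES:
    print(describe(sample))
```

The same four digits read differently depending on the tag wrapped around them, which is the point: the markup carries the context.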

All you need to learn about XML is a plain-text editor and a relatively modern Web browser such as Firefox. Mark-up, by hand, things like electronic texts. Associate them with (CSS or XSLT) stylesheets. Process them with an XML processing application like a Web browser, and you end up with human-readable information. No programming. Just information and display.

Attached is an example I was working on yesterday. The XML file is the search results against an SRU-accessible index. The XSL file transforms the XML into HTML and displays it. Save both the XML and XSL file in the same directory, and then open the XML file with Firefox (or some other ‘modern’ Web browser) to see what I mean.

2) Relational databases – Libraries love lists. We maintain lists of books, journals, Internet resources, authority files, controlled vocabulary terms, etc. In an electronic environment lists are best created and maintained using relational databases. Learn how to design, create, and maintain relational databases using the lingua franca of relational database technology, SQL (Structured Query Language). There are many relational database applications to explore. Microsoft Access. Filemaker Pro. MySQL. Postgres. Access is widely supported in the desktop environment. Filemaker is cross-platform, has a great interface, exports XML, and includes an integrated HTTP (Web) server. Postgres is very standards compliant but technologically a bit challenging. MySQL is widely popular but comes with little or no interface.

If you have access (no puns intended) to Filemaker, then I suggest it first, simply for the interface. Second, I suggest MySQL. It will install nicely on your desktop as well as server computer. Get a book like Databases for Mere Mortals in order to become familiar with the SQL language and normalization techniques. Learn the huge difference between “flat files” and real databases.
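As a rough sketch of what normalization buys you, the example below stores each author once and lets books point to the author by id. It uses SQLite through Python's built-in `sqlite3` module, swapped in for the products named above only because it needs no installation. The table names, column names, and sample rows are all made up.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE books (
        id        INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES authors(id)
    );
""")

# Each author is stored once; books refer to the author by id.
conn.execute("INSERT INTO authors (id, name) VALUES (1, 'Example Author')")
conn.executemany(
    "INSERT INTO books (title, author_id) VALUES (?, ?)",
    [("First Sample Title", 1), ("Second Sample Title", 1)],
)

def titles_by(author_name):
    """List the titles for one author, sorted alphabetically."""
    rows = conn.execute(
        """
        SELECT books.title
        FROM books JOIN authors ON books.author_id = authors.id
        WHERE authors.name = ?
        ORDER BY books.title
        """,
        (author_name,),
    )
    return [title for (title,) in rows]

print(titles_by("Example Author"))
```

In a flat file the author's name would be retyped on every row. Here a correction to the name is made in one place and every book picks it up through the join.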

Excel and tab-delimited files are not “real” database applications!

3) Indexing – Believe it or not, databases suck at facilitating search, especially considering today’s user expectations regarding relevance ranking. In order to search a (relational) database, queries must be applied against specific fields; relational databases do not support free text searching. Databases can fake it by applying queries to many fields, but this gets overly complicated very quickly. Furthermore, relational databases do not have the facility to rank search results according to some statistical analysis; relational databases can only sort results numerically or alphabetically.

Indexers overcome these problems. Indexers read sets of documents, break them up into their atomistic parts (words), and write these parts to a file along with a pointer back to the original document. Searches are then applied to these lists. They work exactly like the back-of-book indexes except all of the words in documents are included, not just the ones a human thought were important.
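The process just described can be sketched in a few lines of Python. This is a toy that ranks by raw word counts, not how swish-e or any real indexer scores results, and the sample documents are made up.

```python
import re
from collections import Counter, defaultdict

# Made-up sample documents, keyed by a document identifier.
DOCUMENTS = {
    "doc1": "Libraries love lists of books and journals",
    "doc2": "Indexes point from words back to documents",
    "doc3": "Books about books and lists of books",
}

def tokenize(text):
    """Break text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map each word to a count of its occurrences in every document."""
    index = defaultdict(Counter)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word][doc_id] += 1
    return index

def search(index, query):
    """Return document ids ranked by how often the query words occur."""
    scores = Counter()
    for word in tokenize(query):
        for doc_id, count in index.get(word, {}).items():
            scores[doc_id] += count
    # Highest score first; ties are broken by document id.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ranked]

index = build_index(DOCUMENTS)
print(search(index, "books lists"))
```

Every word is searchable without naming a field, and the results come back in an order that reflects the content rather than the alphabet.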

To learn about indexing get a program called swish-e. Then acquire bunches of HTML or XML documents and save them in a directory. Turn swish-e loose on the directory to create an index. Use swish-e to search the resulting index. As your experience matures you will learn how to write reports against your database, pipe them to your indexer, and search your database that way. Either way, you will end up with a much more powerful search interface when compared to the use of SELECT statements in SQL.

Databases and indexing are two sides of the same information retrieval coin.

4) Web serving – Increasingly people expect to acquire the information they require for learning, teaching, and research through a Web browser. In order to meet these expectations libraries need to host Web (HTTP) servers.

Believe it or not, this is really easy. Get a copy of Apache, install it on any Internet-accessible computer, and start filling it up with stuff. You do not need a big bad computer for this, and I challenge you to fill it up with so much stuff that it becomes too slow. (Actually, the challenge here will be putting into practice the principles of good information architecture. Specifically asking and then answering questions regarding the context, content, and users of your server.) When your server gets full you will have learned a whole lot and be ready to go to the next step.
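If you want to see the basic idea before installing Apache, the sketch below serves a directory of files using the small HTTP server in Python's standard library. It is a stand-in for experimenting on your own machine, not a substitute for Apache. The directory and the sample page are made up.

```python
import functools
import http.server
import pathlib
import tempfile
import threading
import urllib.request

def start_server(directory, port=0):
    """Serve the files in `directory` over HTTP on a background thread."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    # Port 0 asks the operating system for any free port.
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

# Put one made-up page in a temporary directory and fetch it back.
site = pathlib.Path(tempfile.mkdtemp())
(site / "index.html").write_text("<h1>Hello from the library</h1>", encoding="utf-8")

server = start_server(site)
port = server.server_address[1]

# Bypass any configured proxy, since the server is on this machine.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
with opener.open(f"http://127.0.0.1:{port}/index.html") as response:
    page = response.read().decode("utf-8")
print(page)

server.shutdown()
server.server_close()
```

The server only listens on 127.0.0.1, so nothing is exposed to the network while you experiment.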

5) Programming/scripting – Finally, you will want to “glue” all of the above technologies together into a coherent whole. You will want to streamline the data-entry and reporting processes. You will want these processes to run regularly and automatically. You will want graphical user interfaces to your XML data, relational databases, searching functionality, etc.

To make this happen you will need to write computer programs. I prefer Perl, but just about any computer language will do. Each has its own strengths and weaknesses. Java is probably on your desktop computer. Perl is particularly strong for manipulating texts. PHP is particularly adept at Web applications. Programming requires a person to think very systematically, almost mathematically. The programmer must understand the step-by-step processes required of a system. There are no leaps of comprehension in computer programs. Computer programming is keen on syntax (as XML is), but once that syntax is mastered real productivity occurs. (Ironically, marking up cataloging records using AACR II requires similar skills and attention to detail.)
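Here is one small example of such glue, written in Python rather than Perl purely for illustration: a script that reads rows from a database and writes them out as XML, the kind of report you might later feed to an indexer. The `books` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

def export_books(conn):
    """Turn every row of a hypothetical `books` table into an XML string."""
    root = ET.Element("books")
    query = "SELECT title, year FROM books ORDER BY title"
    for title, year in conn.execute(query):
        book = ET.SubElement(root, "book")
        ET.SubElement(book, "title").text = title
        ET.SubElement(book, "year").text = str(year)
    return ET.tostring(root, encoding="unicode")

# Made-up sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [("Sample Title B", 2001), ("Sample Title A", 1999)],
)

print(export_books(conn))
```

Run on a schedule, a script like this turns a manual reporting chore into something that happens regularly and automatically.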

Finally, as far as I know, there are not very many accrediting agencies teaching these skills to any great degree. Our profession is aging quickly and there is not a critical mass of library practitioners who can apply these technologies as well as understand the principles of librarianship. At the same time, the processes of librarianship, with the possible exception of archiving, can be closely associated with the technologies outlined above:

* collection – XML, databases, and programming
* organization – XML, databases, HTTP servers, and programming
* preservation – XML (maybe)
* dissemination – XML, indexing, HTTP servers, and programming

Please do not be overwhelmed. All of these things can be learned and practiced on your desktop or home computer. They lend themselves better to server-class operating systems such as Unix/Linux, but learning about these operating systems is challenging in itself and not readily applicable to librarianship. All you need is the ability to read books, the desire to learn, and the time to do it.

Good luck, and I hope this helps.


Eric Lease Morgan
University Libraries of Notre Dame

August 7, 2005

9 thoughts on “technical skills of librarianship”

  1. I added lita blog to my home page because your article is interesting. The RSS only shows a person’s name and a date…. no title. I will probably remove it because I do not have time to read everything which is why I use RSS.

    On your article: I got my MLS in 1973 and looked for a job for 2 years while doing volunteer work. I never found one and ended up in data processing. I have picked up all the skills you mentioned and you are correct they are not that difficult with the exception of Java which is not just one piece of technology but the tip of an enormous iceberg.

    I have tried to get into the indexing business as my days in corporate America wind down but it is a very closed society.

    I would love to get into special libraries again but there seems to be a shelf life on the MLS degree also.

  2. Great article which covers the most important bases. In libraryland these days there is much talk about collaborative web interfaces such as wikis, blogs and RSS. I expect that the use of these types of technologies will be growing in the future; they are more peripheral in terms of what your article is about, but worthy of a comment:) Thanks for the article!

  3. I am a current MLIS student, former web software engineer with a bachelor’s degree in CS. I have some interest in pursuing a systems librarian career. So I am happy to see that I am quite familiar with all the technologies you list there (well, less so for indexing).

    But now I want some advice on: How do I find a job as a systems librarian? I’m not even sure where such jobs are listed—most of the general listings of librarian jobs I’ve been looking at don’t seem to have many systems librarian postings.

    What is the field like for systems librarians? What sort of libraries even hire systems librarians? Do public libraries have systems librarians? (No such postings can be found on most public library web pages, although they all seem to continually accept applications for user service librarians). What are systems librarian jobs actually like (I’d guess they are not all the same).

    So… if you wrote a little post about that someday, or if you have any resources to suggest to me, they would be welcome.

  4. Your post on this topic is fantastic — thank you. And you are right that there isn’t enough emphasis on skills such as these in LIS graduate programs. Even in the more “tech-oriented” schools, the emphasis seems to be more on collaborative and other social tools and on the theory of computer technology than on the practical knowledge that you list, which the programs seem to expect you to acquire on your own time as an adjunct (when, as you suggest, this knowledge is in no way supplementary or secondary).

    I suppose the reality is that some people enter library school expecting the standard curriculum of cataloging, reference, and collection development (with some professional practice material tossed in) and may become petrified by the idea of learning anything related to technology. The fear is understandable, but the fear also puts you at a disadvantage. I will save this list and add as many as I am able of these abilities that I don’t already have to my skill set. Thanks again.

    On the same topic, here’s another excellent recent post on acquiring tech skills:

    http://meredith.wolfwater.com/wordpress/index.php/2005/07/21/the-kept-up-distance-learning-librarian/

  5. eric

    im a software developer and former network administrator. just wanted to say i think you’re misleading your readers a bit.

    while the basic concepts of programming are not overly difficult to pick up, the nuts and bolts and quirks of a language are; programming requires constant education, (time consuming with books, expensive with courses and workshops) practice and learning. programming is becoming more technical, not easier.

    xml is a good skill to know, altho i might point out that the easy part to learn you refer to is not “syntax” but “format”. format is the architecture, or specification, of the language; syntax refers to the specific wording of commands in each implementation of the format.

    you’re a bit off on databases as well. a database is simply a collection of data and can just as easily be contained in flat files, spreadsheets, documents or relational databases. a database management system (dbms) such as Access, MySql, Oracle, and Microsoft SqlServer provide ways to manipulate the data, including free text search (e.g., oracle, sqlserver).

    finally, careful about saying that setting up a web server is really easy. acquiring proper hardware with appropriate warranties and technical support, installing and configuring web server software, programming it for all those nice things you talk about, providing for security, patching, updating and upgrading, system redundancy, and data recovery and backup (to name a few little chores) require knowledge and experience to do well.

    some people have a knack for these technical aspects, some dont. better take that into account if you want to learn this stuff.

    to get your feet wet, check out http://www.xmlsoftware.com/.

    j

  6. Hey John Baker,

    That’s a way to rain on someone’s parade. Thanks for reinforcing the attitudes & roadblocks that prevent people from exploring topics like programming, server admin, and xml. We librarians use this stuff every day and usually we have more specialized folks backing us up to help out with the difficult parts. I don’t think this post is misleading readers at all. In fact, this post has been a good list for librarians and like-minded people to run through and assess our strengths, weaknesses, & what to read up on in the future as we move our collections from an analog to a digital world. As our systems demand more of this technology, we need to be just as aware of these tools as the programmers and administrators that we deal with every day. That way we can talk to our IT people and they don’t turn into aloof know-it-alls that poo-pooh every suggestion that we might have & try and scare us with the complexity and difficulty of the things that they have specialized training in.

    Best,

    Jason
