by Kelley McGrath
Coordinating Editor, Issue 11
I sometimes feel like a bit of an odd duck on the Code4Lib Journal editorial committee because I am not a coder. I am a cataloger.
However, when I think about this a little more, it no longer seems so odd for a cataloger to be on the editorial committee. The future of library cataloging and metadata is inextricably bound up with technology. This is nothing new—as Karen Coyle points out, the card catalog was an innovative technological breakthrough in its day. The pace and degree of technological change only continue to increase. The rising ubiquity of full-text searching, our expanding ability to mine large pools of electronically-manipulable data (Halevy, Norvig, & Pereira, 2009), the interconnected linking system of the internet, and the rise of online social networks coupled with the ability to analyze the collective output of the hive mind have all changed the information-seeking landscape.
Although many predict doom and obsolescence for library metadata, I do believe that bibliographic data still has a role to play in the universe of information and that libraries bring significant resources to the table. For one thing, we have a large and valuable corpus of largely-standardized bibliographic data created over a long period of time that would be expensive and time-consuming to replicate. Many other entities, such as Google Books and the Open Library project, would like to utilize this data resource. Libraries also have a pool of people well aware of the challenges in creating useful, consistent, accurate, and complete metadata. We are dedicated to helping people connect with information resources that they want or need rather than to commercial aims. We have been thinking about the description and organization of documents for a long time and have a rich tradition of theory and practice.
However, there are significant obstacles that make it difficult for library metadata to reach its full potential in today’s world. Our legacy data is a mixed blessing. We have all this bibliographic data, but our ability to use it in today’s environment is limited by the fact that our data was designed in an era of very different technological constraints. Although people’s expectations have expanded along with the capabilities of technology, the traditional aims of the catalog as set out by figures like Cutter and Ranganathan are still valid. Understandably, but unfortunately, when the library world transitioned from the card catalog to the computer-based catalog, we largely tried to carry over the same means of achieving these aims. If we were starting over today, I think we would do things very differently—not so much in terms of what, but in terms of how. We need to find ways to meet users’ needs that are more compatible with the strengths of contemporary technology. We must find practical means to transition our underfunded, interdependent, cooperation-based but locally-customized, and monolithic but decentralized systems for creating, structuring, maintaining, and sharing bibliographic data into systems that work in the Internet age.
In many library world dialogues, such as those on NGC4LIB, a list for discussing next generation library catalogs, there often seem to be battle lines drawn between traditional catalogers and the evangelists of the power of new technology. Both sides often seem to be talking past each other, as well as past the budget-stressed, penny-pinching library administrators who are looking for a greater return on their investment in metadata. If we cannot build an effective contemporary framework for creating, structuring, and sharing library metadata, the path of reducing library metadata to the lowest common denominator rather than manifesting its tremendous potential power in an online environment may be chosen for us.
I don’t think the aims of traditional cataloging and the changes that are necessary to optimize library metadata for today’s automated and interconnected world are in conflict. However, the cooperation and knowledge of both technology experts and cataloging and metadata experts are needed to navigate the transition from where we are now to where we need to go. For this, people from all backgrounds and positions must be willing to take part in open-minded conversations and reexamine their basic assumptions. It is necessary to understand the problems that traditional bibliographic control is trying to solve and the assumptions that are built into the historical solutions. It is also necessary to understand the potential of new tools to expose our data and to help users navigate the ever-expanding bibliographic universe. Finally, it is necessary to understand how our existing data is often not well-suited for exploiting the many possibilities or exposing the many connections that computer manipulation could make possible. We need to create data that is functional both for direct human consumption and for automated processes that can enhance and mine that data for value-added forms of human consumption.
I see my role on the editorial committee as a way to play a small part in helping to facilitate this conversation and to forge a path to more effective and powerful bibliographic data. The ethos of the Code4Lib Journal manifests many of the qualities I most admire in the library world, including cooperation, free and honest sharing of ideas and solutions, open-mindedness, a supportive and collegial environment, and a focus on practical solutions. It is one of many venues where ideas about the synergy of metadata and technology are being hashed out and one that I am proud to participate in.
This issue of the Code4Lib Journal once again contains a solid line-up of useful articles, many of which discuss the interaction between metadata and technology in libraries.
In “Interpreting MARC: Where’s the Bibliographic Data?” Jason Thomale discusses the mismatch between the expectations about data structure that he brought from the modern computing environment and the structure that he found in MARC. Thomale aims to help other programmers frame MARC/AACR records in a way that will help them more effectively manipulate library data stored in MARC. The article is a good example of how different underlying paradigms held by catalogers and programmers can obstruct communication and impede effective data use.
Better tools for inputting and editing metadata are desperately needed. In “XForms for Libraries, An Introduction,” Ethan Gruber, Bill Parod, and Scott Prater describe the use of XForms applications to support consistent and correctly-structured metadata.
In “Why Purchase When You Can Repurpose? Using Crosswalks to Enhance User Access,” Teressa M. Keenan discusses how The University of Montana used MarcEdit and XSLT to modify publisher-supplied metadata for the U.S. Congressional Serial Set, 1817-1994, making it more useful to their users and better integrating it into their catalog. MarcEdit, a free MARC editing tool developed by Terry Reese, is an excellent example of how the power of technology can be harnessed to improve the quality and efficiency of library metadata. MarcEdit also demonstrates how mutual respect and open communication between technology and metadata experts can lead to improved results for the whole community. For example, on MarcEdit-L, the MarcEdit user community can suggest improvements and learn how to get more out of this valuable tool.
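For readers who are not metadata specialists, a crosswalk is simply a mapping from one metadata scheme's fields to another's. The following minimal sketch illustrates the idea in Python; the actual project used MarcEdit and XSLT, and the field mappings, indicator values, and sample record below are hypothetical simplifications, not Keenan's workflow.

```python
# Hypothetical crosswalk sketch: map a Dublin Core-style publisher
# record into a bare-bones MARCXML record. Field choices and indicator
# values are simplified for illustration only.
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
MARC_NS = "http://www.loc.gov/MARC21/slim"

def crosswalk(dc_xml: str) -> ET.Element:
    """Map dc:title -> MARC 245 $a and dc:date -> MARC 260 $c."""
    src = ET.fromstring(dc_xml)
    record = ET.Element(f"{{{MARC_NS}}}record")
    # (source field, MARC tag, subfield code)
    mappings = [("dc:title", "245", "a"), ("dc:date", "260", "c")]
    for dc_field, tag, code in mappings:
        el = src.find(dc_field, {"dc": DC_NS})
        if el is None or not (el.text or "").strip():
            continue  # skip absent or empty source fields
        df = ET.SubElement(record, f"{{{MARC_NS}}}datafield",
                           {"tag": tag, "ind1": " ", "ind2": " "})
        ET.SubElement(df, f"{{{MARC_NS}}}subfield",
                      {"code": code}).text = el.text.strip()
    return record

sample = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>United States Congressional Serial Set</dc:title>
  <dc:date>1817</dc:date>
</record>"""
marc = crosswalk(sample)
```

Real crosswalks must also handle repeated fields, character encoding, and indicator logic, which is where tools like MarcEdit earn their keep.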
In “Hacking Summon,” Michael Klein describes flexible, locally-maintainable techniques for adjusting communication between a consolidated discovery interface, such as Summon, and data stores, such as the local catalog, in order to customize library-specific data and produce more desirable results for users.
Najko Jahn, Mathias Lösch, and Wolfram Horstmann describe a project to create lists of publications of Bielefeld University faculty with full-text links in “Automatic Aggregation of Faculty Publications from Personal Web Pages.” They faced a tough metadata challenge when they tried to transform the unstructured and inconsistent data they harvested into sensible citations.
Finally, in “Managing Library IT Workflow with Bugzilla,” Nina McHale takes a look at how the Auraria Library Systems Department used Bugzilla as a low-cost, easy-to-set-up tool to improve its workflow for tracking and resolving IT problems in a timely manner.
Halevy, A., Norvig, P., & Pereira, F. (2009). The Unreasonable Effectiveness of Data. IEEE Intelligent Systems, 24(2), 8-12. Retrieved from http://accelerationwatch.com/downloads/HalevyNorvigPereiraUnreasonableEffectivenessofDataIS2009.pdf