Issue 6, 2009-03-30


Conference reports from the 4th Code4Lib conference, held in Providence, RI from February 23 to 26, 2009. The Code4Lib conference is a collective volunteer effort of the informal Code4Lib community of library technologists. Included are four brief reports on the conference from the recipients of conference scholarships.

By Jie Chen, Joanna DiPasquale, Lauren Ko, and Andreas Orphanides.

The 4th Code4Lib Conference was held in Providence, RI, from February 23 to 26, 2009.  As in years past, the Code4Lib community was able to offer scholarships focused on gender diversity and minority representation, this year sponsored by Oregon State University and Brown University.  Following are conference reports from the scholarship recipients.

For more notes, many informal, by other Code4Lib community members, find content on the internet tagged “c4l09” or “#c4l09”, a label attendees were encouraged to use to aid collocation.

From Jie Chen

As an ILS system administrator, I did not come to this conference with the goal of improving my programming skills. Instead, I wanted to better understand the thriving and energetic Code4Lib community and find out the latest in open source library technologies. I was very impressed by what I saw. There was so much more than mind-blowing coding discussions and demos: Code4Lib offers valuable dialogue about where libraries stand today and where we need to go.

While I enjoyed all sessions, I found the three keynote speeches particularly interesting. A recurrent theme in all three seemed to be an emphasis on data openness, which goes beyond simply making use of open source software and tools for libraries. Stefano Mazzocchi talked about the close-to-zero marginal cost of communication, and how it’s changing the role of libraries. His demo of Freebase showed how collective common knowledge contributed by users can be translated into tremendous value. Sebastian Hammer pointed out that even though standards suck, they are central in allowing data to flow freely and are essential for collaboration; hence his call for systems and organizations to surrender their data. Ian Davis talked about the importance of open data, and I thought he put it very well when he said the goal is not to build a web of data, but to enrich people’s lives through access to information.

A spirited game of ‘Werewolf’ during conference down time.

I loved the conference format: 20-minute sessions, a single track, and the favorite of many people including me, 5-minute lightning talks. Jodi Schneider and William Denton gave a great discussion on the FRBR model and demanded strong FRBRization in library applications, which made the long-lost cataloger in me want to stand up and applaud. For the same reason, I liked the demo of the open source web-based metadata editor ‡ by LibLime. Godmar Back introduced the LibX 2.0 community platform that allows sharing and building upon existing LibX services. I was really excited to see that librarians will get to play a role in this platform: non-programmers like us could be adopters who combine modules into libapps, and reuse and share them as packages. Attending his pre-conference workshop allowed me to see the editing of libapps in action.

In addition to beer and wine, the jokes in the IRC backchannel, and the addictive werewolf game, this conference offered me an insight into programmers’ view of the library world, which has been really fascinating. Because coders create applications to handle and process data, I think they have a very keen sense of how data could be more efficiently shared, linked, and retrieved. As Ian said in his keynote, because there is more structured data than unstructured data, people who understand structure matter. I walked away from Code4Lib 2009 with a deeper understanding and appreciation of developers’ and coders’ contributions to the library community.

From Joanna DiPasquale

The themes of interoperability, portability, scalability, cooperation, and, of course, change were on display at the fourth annual Code4Lib conference in Providence, RI. It was an excellent opportunity for library technologists to come together and showcase their innovative software and brainstorm new ideas. But there was much more, for the conference’s messages were both simple and powerful: this community wants to design to interoperate, it wants to share, it wants to create things that make life easier for library users.

Perhaps Sebastian Hammer, in his keynote address on Wednesday, February 25, said it best: the assumption that the marketplace changes libraries inexorably – where our choices are only to get out of the way or adapt – is coming into serious question as market forces shift.* Code4Lib provides some of the innovation needed not just to be part of the larger conversation about the “next phase” of libraries, but to lead that conversation. It wants the library to win, and it is doing something about it.

The conference provided a wide array of insights into the current and next great library applications. The range of projects was amazing: from geospatial data, to information visualization and dashboard styles, to metadata standards, to more effective searching and indexing, Code4Lib showed the innovation of library and archive technology (see Program, Breakout Sessions, and Lightning Talks for more information). Through the presentations and breakout sessions, the themes above provided us with two main goals, “what we can do right now” and “what we need to do for the future.” The desire to provide ways to interoperate with already-on-hand systems – from Blacklight’s plugin-based customization files for local instances, to the improvements in vendor tools and APIs from OCLC, Ex Libris, and Serials Solutions – gave viable solutions for the present. The push for better standards – from EAD to Semantic Web to SWORD techniques – provided insight into current and future solutions. From the provocative Stefano Mazzocchi challenging us to embrace electronic books, to the experiments with EAD, Djatoka, and dashboard views, challenges became opportunities.

Yet, as Ian Davis stated in his keynote address, code may change, but data remains. Our standards and interoperability are key; as we learn more ways to expose our library data to the greater community, we also know that we’ll find better solutions in the future to work with our repositories and catalogs. The conference met these challenges head-on by providing the techniques and the thinking behind them to enable all of us to do more.

* If you attended Code4Lib 2008, you probably thought I was going to write, “Perhaps Hammer said it best when he noted: ‘I’m not much of a keynote guy. I always thought that, if you have something to say, you should release it as code.’” But I didn’t (until now). It was an excellent thing to say, but the argument he proceeded to develop was much more powerful than that quip.

From Lauren Ko: Bacon, Kittens, and Brewpubs


My experience at code4lib 2009 began with the Linked Data pre-conference. Speakers focused on the use of RDF for describing and sharing data to enable the linking of resources via the crawling of URIs. The pre-conference was great, not only because my FOAF file won me a book, but because its focus on enabling the sharing and connecting of data was a fitting indication of what would come at the main sessions.

Opening Remarks

The conference began surreally with Roy Tennant, whose words I used to read monthly in Library Journal, showing off his face on a pair of thong underwear. Despite this image, his words were significant as he encouraged all attendees to take part in the conference beyond the role of viewer. Mark Matienzo followed him on stage with a presentation on how to have fun at code4lib 2009. His recommendations included the IRC channel, some sort of bacon-donut (plus food in general), and of course, beer.


The idea and implications of opening up data to the Web pervaded the three keynote addresses of the conference.

First keynote speaker Stefano Mazzocchi began with a history of information technology from cave paintings to electronic publishing. He spoke about how cost-ineffective it is for libraries to keep physical copies of every book, a problem institutions could ease by moving to the virtually infinite storage space of servers. He then showed us Freebase as an example of marked-up data that can be utilized by any number of to-be-developed applications to enrich the lives of users.

Sebastian Hammer, Wednesday’s keynote speaker, issued a call to arms. He implored all library coders to become advocates within their organizations and prevent a death of libraries brought about by dependence on the idea of the book. He also spoke of the problem of APIs forcing loyalty above interoperability and the need to surrender data freely to help remedy such problems.

Ian Davis, the final keynote speaker, continued the push for libraries to set their data free and contribute to building a more useful Semantic Web. Because data outlives code, it is more important for institutions to push for open data over open source.

The Rest of It

Other sessions addressed various areas within the realm of libraries. We heard about indexing collections (Solr, LuSql), tools for facilitating access to library resources (LibX, Jangle), and content management systems (Drupal, VuFind, Blacklight). Being new to the code4lib community and having limited knowledge of some of its topics, I was most satisfied with speakers who began with a general overview of their topic, followed with further description, shared a demo, and concluded with what it means to the community. Particularly fun was Richard Wallis’s presentation (complete with sound effects) on using the JavaScript-based framework, JUICE, to extend OPACs through embedding related external content.

The conference facilitated an amazing amount of participation/contribution with its breakout sessions and lightning talks. The breakout sessions allowed attendees to informally discuss current projects, issues, and recommendations. The lightning talks, covering a variety of topics (applications, specifications, protocols, etc.), introduced me to new ideas, resources, and projects for further investigation.

My experience at the code4lib Conference was overwhelmingly positive. While I don’t pretend to understand the fascination/running joke of bacon and kitten images or the love of brewpubs that permeates, I gained a great respect for the community of programmers and librarians that is working together on ways of enriching the lives of its users.

From Andreas Orphanides: Code4Lib 2009 With a Critical Eye

One of the things that struck me about my first Code4Lib conference is the level of bright-eyed idealism that the presenters, and the participants, brought to the conference’s discussions. And certainly, it is a right and proper thing for us Code4Libbers to keep focused on the horizon as we think about technology, data, and the future of libraries. It’s important to remember, though, that while optimism is all well and good, we must also be mindful of the practical problems we may encounter in implementation. With that in mind, I present a few points of critical analysis of the presentations at Code4Lib 2009 – places where we should pause to think about how the ideal will fit into a real, and very sloppy, world.

The Linked Data preconference was both an exploration of the possibilities of decentralized, interconnected data repositories and a rallying cry for developers to make use of those repositories. Linked Data as a concept is also one of the areas where reality is likely to end up in conflict with an ideal system. Fundamentally, Linked Data relies on high-quality metadata creation and robust linking technology.

The presenters acknowledged the problems inherent in linking, such as expiring URIs, and suggested locally caching linked data as a solution. This kind of workaround, however, runs right up against the problem of quality metadata creation. What happens as cached metadata ages, or as false or incorrect metadata propagates through the system? A decentralized system will either have to accommodate conflicting metadata, which could have nasty side effects, or it will need a heuristic to resolve the conflict, a notoriously difficult prospect. It seems to me that the only viable alternative would be to cede some of the ad-hoc, democratic nature of this system to establish an authority control system of some kind, sacrificing some of the decentralization that makes the Linked Data concept so appealing.
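To make the two failure modes concrete, here is a minimal Python sketch of a local cache of linked-data statements. Everything in it is an assumption for illustration: the `CachedStatement` record, the `is_stale` and `find_conflicts` helpers, the 90-day age limit, and the example.org URIs are hypothetical and were not shown at the preconference. It also assumes each property should have a single value, which is not true of every property.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class CachedStatement:
    subject: str        # URI of the resource being described
    predicate: str      # property being asserted
    value: str          # cached value of that property
    source: str         # repository the statement was copied from
    fetched: datetime   # when the local copy was made


def is_stale(stmt: CachedStatement, now: datetime, max_age: timedelta) -> bool:
    """True when the cached copy is older than max_age."""
    return now - stmt.fetched > max_age


def find_conflicts(statements):
    """Map (subject, predicate) to its set of values wherever sources disagree."""
    values = {}
    for stmt in statements:
        values.setdefault((stmt.subject, stmt.predicate), set()).add(stmt.value)
    return {key: vals for key, vals in values.items() if len(vals) > 1}


# Hypothetical data: two repositories disagree about one author's birth year.
now = datetime(2009, 2, 26)
cache = [
    CachedStatement("http://example.org/author/1", "birthYear", "1920",
                    "http://repo-a.example.org", datetime(2009, 2, 1)),
    CachedStatement("http://example.org/author/1", "birthYear", "1921",
                    "http://repo-b.example.org", datetime(2008, 6, 1)),
    CachedStatement("http://example.org/author/1", "name", "A. Writer",
                    "http://repo-a.example.org", datetime(2009, 2, 1)),
]

stale = [s for s in cache if is_stale(s, now, timedelta(days=90))]
conflicts = find_conflicts(cache)
print(len(stale), "stale;", len(conflicts), "conflicting properties")
```

Detecting staleness and disagreement is the easy part. The sketch deliberately stops short of choosing a winner, because that choice is exactly the heuristic, or the authority control, that the paragraph above describes as the hard problem.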

Consider also the Freebase project and its Genderizer tool. Stefano Mazzocchi, in his keynote, demonstrated Freebase, an open-access database that hopes to catalog, well, everything about everything. He also demonstrated the Genderizer tool, a web application where users vote on the gender of objects in the Freebase database. Stefano indicated that a possible goal of the Genderizer would be to determine the gender of every relevant (i.e., “genderizable”) object for which information exists on the internet, something on the order of four billion objects.

This presents an immense problem of scale: the Genderizer assigns gender to approximately 150,000 objects per month; at the current rate, it would take over 2,200 years to reach that goal. I won’t even address the problems inherent in assigning genders through voting or the potential for dissonance between someone’s self-identified gender and its majority-assigned one. There’s the potential for similar problems in assigning properties in this way to any large collection of objects.
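The estimate above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: both inputs are the approximate figures quoted in this report, the monthly rate is assumed to stay constant, and the function name is made up for illustration.

```python
# Approximate figures quoted in the report.
TOTAL_OBJECTS = 4_000_000_000   # "on the order of four billion objects"
OBJECTS_PER_MONTH = 150_000     # "approximately 150,000 objects per month"


def years_to_finish(total: int, per_month: int) -> float:
    """Years needed to process `total` objects at a constant monthly rate."""
    months = total / per_month
    return months / 12


years = years_to_finish(TOTAL_OBJECTS, OBJECTS_PER_MONTH)
print(f"{years:,.0f} years")  # prints "2,222 years"
```

Even a hundredfold increase in the voting rate would still leave the job taking more than two decades.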

Although I’ve pointed out flaws in these two projects, each one has incredible promise, and each will doubtless serve as the foundation for a host of next-generation information tools. And while I don’t have recommendations for how to address the flaws, I think it’s important to acknowledge them, and to think about potential solutions as we explore the possibilities these tools offer. This same earnest, realistic perspective – balancing the ideal of a perfectly crafted design against a reality where implementation is far from perfect – must be maintained if we are to build technologies that are relevant, practical, and practicable, and which serve as the platform for real-world applications that are useful to the information consumers that we serve.

About the Authors

Yu-Jie Chen is the ILS Librarian at Loudoun County Public Library in Leesburg, Virginia.

Joanna DiPasquale is a web developer for Columbia University Libraries’ Digital Program Division.

Lauren Ko has a bachelor’s degree in computer science and a master’s in information sciences. She is currently Web Archiving Programmer for the University of North Texas Libraries Digital Projects Unit.

Andreas Orphanides has a BA in mathematics from Oberlin College, and after a stint as a high school mathematics teacher, he earned his MSLS from the UNC-Chapel Hill School of Information and Library Science. He is currently a Libraries Fellow at the North Carolina State University Libraries, where he works in the Libraries’ Information Technology and Research & Information Services departments.
