by Ron Peterson
It is time for the summer edition of the Code4Lib Journal. While you have been making vacation plans, the editors have been busy putting together issue 33. If you are looking for something to inspire a summer project or just looking for something to read on the beach, we have a collection of page-turners for you.
You may want to start by reading Netanel Ganin’s post-mortem on the development of a web-based DVD browsing interface, “Emflix – gone baby gone.” Netanel takes us through the triumphs and defeats of his project and reminds us that “enthusiasm is no substitute for experience.” I’m sure that many of us can relate to his experience.
If you have been considering diving into R during the summer, Monica Maceli’s “Introduction to Text Mining with R for Information Professionals” tutorial may be just the thing that you are looking for. Monica shows us how she used the tm: Text Mining Package and RStudio to analyze the text of 51 course catalogs and create visualizations of the data.
In his article, “Data for Decision Making: Tracking Your Library’s Needs With TrackRef,” Michael Carlozzi describes how he developed TrackRef in order to get a better understanding of the types of information that users were seeking. With a better understanding of how users were using the library and what issues they were running into, Michael’s library was able to redeploy staff in ways that would better address the needs of their community, such as freeing up staff to work one-on-one on technology.
Looking to add a little more fun to your summer? Read about how the Biodiversity Heritage Library used games to improve access to their digital texts. Max J. Seidman, Dr. Mary Flanagan, Trish Rose-Sandler, and Mike Lichtenberg describe the development of two online games, Smorball and Beanstalk, and how they used them to improve the accuracy of OCRed texts in “Are games a viable solution to crowdsourcing improvements to faulty OCR? – The Purposeful Gaming and BHL experience.”
For those who are looking for ideas on how to improve the efficiency of processing their electronic theses and dissertations, we have “From Digital Commons to OCLC: A Tailored Approach for Harvesting and Transforming ETD Metadata into High-Quality Records” by Marielle Veve. Marielle takes us through the process that the University of North Florida went through to adapt a workflow for transforming DSpace’s Qualified Dublin Core records into MARC records so that it would work with Digital Commons.
Also in this issue, Zsolt Bánki, Tibor Mészáros, Márton Németh, and András Simon describe how they created a namespace for authors of texts written in Hungarian in their article, “Checking the identity of entities by machine algorithms: the next step to the Hungarian National Namespace.” They take us step by step through the process of merging and deduplicating data from multiple sources in order to create a usable, semantic-web-related namespace. They hope that the namespace will provide some foundation for establishing a National Name Authority in Hungary.
For the quintessential beach read, pack your blanket, your e-Reader, and a copy of Corey Harper’s “Metadata Analytics, Visualization, and Optimization: Experiments in statistical analysis of the Digital Public Library of America (DPLA).” The article covers processing the metadata to prepare it for analysis, creating visualizations, and optimizing the data. If you have been looking to dive into statistics more deeply, this is the article for you. Corey discusses various statistical models and their applicability to predicting item usage. You won’t be able to put it down.
I hope that you enjoy the articles presented in issue 33. The authors and the editors have worked very hard to make them available to you.
Follow up on the gender representation in the Code4Lib Journal
As I was writing this introduction, I was reminded of the introduction that I wrote for issue 24, which addressed the gender diversity of the Code4Lib Journal. I thought it would be interesting to see how we are doing now. Sadly, there doesn’t seem to have been much movement towards a more inclusive Journal. Of the 7 articles in this issue, only 3 include female authors: 2 by solo female authors and 1 authored by both men and women. In contrast, men are included as authors on 5 of the 7 articles (3 by solo male authors, 1 authored by 4 men, and 1 authored by both men and women).
The makeup of the Editorial Committee has also not changed much since then. The current Editorial Committee is made up of 4 women and 8 men. The number of female editors is down from issue 24, when it was 5. Since issue 24, we have added 3 female editors, but we have lost 4. I was surprised to see that women made up only a third of the Committee because my sense was that they accounted for the majority of the activity for issue 33. As it turns out, my sense was correct and women were the primary editors of 4 of the 7 articles in issue 33.
While these numbers can only tell us part of the story, I do think they are an indication that there is still work to be done to make the Code4Lib Journal better reflect the diversity of the Code4Lib and library communities.