Issue 17, 2012-06-01
Coordinating Editor Tim Lepczyk salutes change in this issue, welcoming new editors to the Journal and announcing his departure.
While creating content in LibGuides is quite easy, link maintenance is troublesome, and the built-in link checker offers only a partial solution. The authors describe a method of using PURLs and a third-party link checker to effectively manage links within LibGuides.
Google Analytics has advanced features for tracking search queries and events such as clicks on external links or file downloads, which lets you capture user behavior that is normally difficult to track with traditional web logging software. Once this behavior is tracked, you can use the Google Analytics API to extract the data and integrate it with data from your digital repository to show granular data about individual items. With this information, digital libraries can learn how users use the site without extensive HCI studies and can improve the user experience accordingly.
Using the Martha Berry Digital Archive Project as an exploratory case study, this article discusses experimental methods in digital archive development, describing how and why a small project team is leveraging undergraduate student support, a participatory (crowdsourced) editing model, and free and open source software to digitize and disseminate a large documentary collection.
Using Semantic Web Technologies to Collaboratively Collect and Share User-Generated Content in Order to Enrich the Presentation of Bibliographic Records–Development of a Prototype Based on RDF, D2RQ, Jena, SPARQL and WorldCat’s FRBRization Web Service
In this article we present a prototype of a semantic web-based framework for collecting and sharing user-generated content (reviews, ratings, tags, etc.) across different libraries in order to enrich the presentation of bibliographic records. The user-generated data is remodeled into RDF, utilizing established linked data ontologies. This is done in a semi-automatic manner using the Jena and D2RQ toolkits. For the remodeling, a SPARQL CONSTRUCT statement is tailored for each data source. In the data source used in our prototype, user-generated content is linked to the relevant books via their ISBN. By remodeling the data according to the FRBR model, and expanding the RDF graph with data returned by WorldCat’s FRBRization web service, we are able to greatly increase the number of entry points to each book. We make the social content available through a RESTful web service with ISBN as a parameter. The web service returns, in RDF/XML format, a graph of all user-generated data registered to any edition of the book in question. Libraries using our framework would thus be able to present relevant social content in association with bibliographic records, even if they hold a different version of a book than the one that was originally accessed by users. Finally, we connect our RDF graph to the linked open data cloud through the use of Talis’ openlibrary.org SPARQL endpoint.
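The lookup idea in this abstract (an ISBN for one edition retrieving content attached to any edition of the same work) can be sketched as follows. This is only an illustrative sketch in plain Python, not the authors' RDF-based implementation: the ISBN-to-work table, the sample items, and the function names are all hypothetical, and the table stands in for the grouping that the prototype obtains from a FRBRization service.

```python
from collections import defaultdict

# Hypothetical ISBN -> work identifier table (placeholder values, not real
# ISBNs). It stands in for edition grouping supplied by a FRBRization service.
ISBN_TO_WORK = {
    "isbn-a1": "work-a",
    "isbn-a2": "work-a",
    "isbn-b1": "work-b",
}

# Hypothetical user-generated items, each attached to a single edition.
ITEMS = [
    {"isbn": "isbn-a1", "type": "rating", "value": "4"},
    {"isbn": "isbn-a2", "type": "tag", "value": "classic"},
    {"isbn": "isbn-b1", "type": "review", "value": "A slow start."},
]


def build_work_index(items, isbn_to_work):
    """Group items by the work their edition belongs to."""
    index = defaultdict(list)
    for item in items:
        work = isbn_to_work.get(item["isbn"])
        if work is not None:  # skip items whose ISBN is not mapped to a work
            index[work].append(item)
    return index


def content_for_isbn(isbn, index, isbn_to_work):
    """Return items registered to any edition of the work behind this ISBN."""
    work = isbn_to_work.get(isbn)
    if work is None:
        return []
    return list(index.get(work, []))


index = build_work_index(ITEMS, ISBN_TO_WORK)
shared = content_for_isbn("isbn-a2", index, ISBN_TO_WORK)
```

Here a request for `isbn-a2` also returns the rating recorded against `isbn-a1`, because both editions map to the same work in the sample table.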
The GLIMIR project at OCLC clusters and assigns an identifier to WorldCat records representing the same manifestation. These include parallel records in different languages (e.g., a record with English descriptive notes and subject headings and one for the same book with French equivalents). It also clusters records that probably represent the same manifestation, but which could not be safely merged by OCLC’s Duplicate Detection and Resolution (DDR) program for various reasons. As the project progressed, it became clear that it would also be useful to create content-based clusters for groups of manifestations that are generally equivalent from the end user's perspective (e.g., the original print text with its microform, ebook and reprint versions, but not new editions). Lessons from the GLIMIR project have improved OCLC’s duplicate detection program through the introduction of new matching techniques. GLIMIR has also had unexpected benefits for OCLC’s FRBR algorithm by providing new methods for identifying outliers, thus enabling more records to be included in the correct work cluster.
Case Study: Using Perl and CGI Scripts to Automate a Quality Control Workflow for Scanned Congressional Documents
The Law Library Digitization Project of the Rutgers University School of Law in Camden, New Jersey, developed a series of scripts in Perl and CGI that take advantage of the open-source module PerlMagick to automatically review the image quality of scanned government documents. By implementing these procedures, Rutgers was able to reduce staff working hours for document quality control by an estimated 25 percent compared with the previous manual-only workflow. These scripts can be adapted by novice Perl and CGI programmers to review and manipulate large numbers of text and image files using commands available in PerlMagick and ImageMagick.
At Yale University Library (YUL), recorded reference transactions revealed that, after finding a book in the catalog, patrons had difficulty using the call number to find the book on the shelf. The Library created a mobile service to help locate the call number in the library stacks. Given the call number of any book in Sterling Memorial Library at YUL, the service displays a map that highlights that call number’s general area on a floor in the stacks. YUL introduced the mapping application in Yufind, a catalog in place at Yale since 2008 which is based on Vufind.
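The core lookup described above, from a call number to a general stack area, might look something like the following sketch. It is not the Library's implementation: it assumes Library of Congress style call numbers that begin with one to three class letters, and the floor ranges and function names are invented for illustration.

```python
import re

# Hypothetical ranges of class letters and the stack floor that holds them.
# Upper bounds use three letters so that any 1-3 letter class sorts inside.
FLOOR_RANGES = [
    ("A", "DZZ", "Floor 1"),
    ("E", "PZZ", "Floor 2"),
    ("Q", "ZZZ", "Floor 3"),
]


def class_letters(call_number):
    """Extract the leading class letters, e.g. 'PR' from 'PR6019 .O9 U4'."""
    match = re.match(r"\s*([A-Za-z]{1,3})", call_number)
    if match is None:
        raise ValueError(f"no class letters found in {call_number!r}")
    return match.group(1).upper()


def floor_for(call_number):
    """Return the floor label whose range contains the call number's class."""
    letters = class_letters(call_number)
    for first, last, floor in FLOOR_RANGES:
        # Plain string comparison is enough for uppercase class letters.
        if first <= letters <= last:
            return floor
    return None


area = floor_for("PR6019 .O9 U4")
```

A real service would also need to compare the numeric part of the call number to narrow the result to a range of shelves, and to select the matching region on a floor map image.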
Amy Unger is one of the recipients of the Gender Diversity Scholarships to attend the Code4Lib 2012 conference. The Journal is pleased to present her conference report here.