Issue 38, 2017-10-18

Editorial: The Economics of Not Being an Organization

Carol Bean

Our successes have caught up with us. Now we get to choose the next step in our evolution.

Usability Analysis of the Big Ten Academic Alliance Geoportal: Findings and Recommendations for Improvement of the User Experience

Mara Blake, Karen Majewicz, Amanda Tickner, Jason Lam

The Big Ten Academic Alliance (BTAA) Geospatial Data Project is a collaboration between twelve member institutions of the consortium that works towards providing discoverability and access to geospatial data, scanned maps, and web mapping services. Usability tests and heuristic evaluations were chosen as methods of evaluation, as they have long been used to measure and manage website engagement and are essential to the process of iterative design. The BTAA project is publishing the results of its usability findings in the hope that they will benefit other portals built with GeoBlacklight.

Using the ‘rentrez’ R Package to Identify Repository Records for NCBI LinkOut

Yoo Young Lee, Erin D. Foster, David E. Polley, and Jere Odell

In this article, we provide a brief overview of the National Center for Biotechnology Information (NCBI) LinkOut service for institutional repositories, a service that allows links from the PubMed database to full-text versions of articles in participating institutional repositories (IRs). We discuss the criteria for participation in NCBI LinkOut for IRs and the current methods for participating, and we outline our solution for automating the identification of eligible articles in a repository using R and the ‘rentrez’ package. Using our solution, we quickly processed 4,400 open access items from our repository, identified the 557 eligible records, and sent them to the NLM. Direct linking from PubMed resulted in a 17% increase in web traffic.

The Drawings of the Florentine Painters: From Print Catalog to Linked Open Data

Lukas Klic, Matt Miller, Jonathan K. Nelson, Cristina Pattuelli, and Alexandra Provo

The Drawings of the Florentine Painters project created the first online database of Florentine Renaissance drawings by applying Linked Open Data (LOD) techniques to a foundational text of the same name, first published by Bernard Berenson in 1903 (revised and expanded editions, 1938 and 1961). The goal was to make Berenson’s catalog information, still an essential resource today, available in a machine-readable format, allowing researchers to access the source content through open data services. This paper provides a technical overview of the methods and processes applied in the conversion of Berenson’s catalog to LOD using the CIDOC-CRM ontology; it also discusses the different phases of the project, focusing on the challenges and issues of data transformation and publishing. The project was funded by the Samuel H. Kress Foundation and organized by Villa I Tatti, The Harvard University Center for Italian Renaissance Studies.

Catalog: http://florentinedrawings.itatti.harvard.edu
Data Endpoint: http://data.itatti.harvard.edu

Web-Scraping for Non-Programmers: Introducing OXPath for Digital Library Metadata Harvesting

Mandy Neumann, Jan Steinberg, and Philipp Schaer

Building up new collections for digital libraries is a demanding task. Available data sets have to be extracted, which is usually done with the help of software developers, as it involves custom data handlers or conversion scripts. In cases where the desired data is only available on the data provider’s website, custom web scrapers are needed. This may be the case for small to medium-size publishers, research institutes, or funding agencies. Data curation is typically done by people with a library and information science background, who are usually proficient with XML technologies but are not full-stack programmers. We therefore present a web scraping tool that does not require digital library curators to program custom web scrapers from scratch: the open-source tool OXPath, an extension of XPath that allows the user to define the data to be extracted from websites in a declarative way. Taking one of our own use cases as an example, we guide you in more detail through the process of creating an OXPath wrapper for metadata harvesting. We also point out some practical things to consider when creating a web scraper (with OXPath). Finally, we present a syntax highlighting plugin for the popular text editor Atom that we developed to further support OXPath users and to simplify the authoring process.

DIY DOI: Leveraging the DOI Infrastructure to Simplify Digital Preservation and Repository Management

Kyle Banerjee and David Forero

This article describes how staff with modest technical expertise can leverage the DOI (Digital Object Identifier) infrastructure in combination with third-party storage and preservation solutions to build safer, more useful, and easier-to-manage repositories at much lower cost than is normally possible with standalone systems. It also demonstrates how understanding the underlying mechanisms and questioning the assumptions of technology metaphors such as filesystems can lead to seeing and using tools in new and more powerful ways.

Direct Database Access to OCLC Connexion’s Local Save File

Rebecca B. French

A feature of OCLC’s Connexion cataloging client unknown to most librarians is the ability to directly work with the Microsoft Access database underlying the local save file. This article provides an overview of the metadata made available through this method, including fields that cannot be accessed through the regular Connexion interface, and discusses factors to be considered when deciding whether to migrate the data to another database system instead of continuing to work with Access. Descriptions of three projects illustrate how this functionality has been applied to efficiently catalog a gift collection, find OCLC numbers for e-books, and create bibliographic records for Early English Books Online/Text Creation Partnership titles using data from multiple sources. With the option to rely only on common, off-the-shelf software, this method of directly accessing the local save file database offers a way to expand Connexion’s functionality for those unable or unwilling to work with OCLC APIs. Other benefits include the ability to import external data and to use SQL for more advanced querying. A number of limitations are also discussed, and their implications for metadata access and use are explored.

Between the Sheets: a Library-wide Inventory with Google

Craig Boman and Ray Voelker

When it comes to taking an inventory of physical items, libraries often rely on their traditional integrated library system’s (ILS) à la carte add-ons; outside vendors; or other possibly outdated, complex, and often expensive methods. For libraries with shrinking budgets and other limited resources, high costs can put these methods out of reach.

At the University of Dayton Libraries, we set out to develop an inexpensive and reasonably easy-to-use method for conducting a library-wide physical item inventory. In this article, we explain a custom-built Google Sheets-based library inventory system, along with some code for the implementation of a RESTful API (written in PHP) that interacts with our ILS. We also explain our use of Google Apps Script in our Google Sheet, which is crucial to our system.

Although this method used a specific ILS (Innovative Interfaces’ Sierra product) and custom-built RESTful APIs, it may be possible to use similar approaches with other ILS software. Additional notes include areas for improvement and recommendations for interoperability with other ILSs.

Tools and Workflows for Collaborating on Static Website Projects

Kaitlin Newson

Static website generators have seen a significant increase in popularity in recent years, offering many advantages over their dynamic counterparts. While these generators were typically used for blogs, they have grown in usage for other web-based projects, including documentation, conference websites, and image collections. However, because of their technical complexity, these tools can be inaccessible to content creators depending on their level of technical skill and comfort with web development technologies. Drawing from experience with a collaborative static website project, this article will provide an overview of static website generators, review different tools available for managing content, and explore workflows and best practices for collaborating with teams on static website projects.

Leveraging Python to Improve Ebook Metadata Selection, Ingest, and Management

Kelly Thompson and Stacie Traill

Libraries face many challenges in managing descriptive metadata for ebooks, including quality control, completeness of coverage, and ongoing management. The recent emergence of library management systems that automatically provide descriptive metadata for e-resources activated in system knowledge bases means that ebook management models are moving toward both greater efficiency and more complex implementation and maintenance choices. Automated and data-driven processes for ebook management have always been desirable, but in the current environment they have become necessary. In addition to initial selection of a record source, automation can be applied to quality control processes and ongoing maintenance in order to keep manual, eyes-on work to a minimum while providing the best possible discovery and access. In this article, we describe how we are using Python scripts to address these challenges.

Testing Three Types of Raspberry Pi People Counters

Johnathan Cintron, Devlyn Courtier, and John DeLooper

The Hudson County Community College (HCCC) Library tested three different types of Raspberry Pi-based people counters between June 14 and July 9, 2017. This article describes how we created each type of counter, compares the accuracy of each sensor, and compares the counters against the college’s existing 3M 3501 gate counters. It also describes why and how our team decided to undertake this project, discusses lessons learned, and provides instructions for how other libraries can create their own gate counters.