Issue 37, 2017-07-18

Editorial: Welcome New Editors, What We Know About Who We Are, and Submission Pro Tip!

Sara Amato

Want to see your work in C4LJ? Here’s a pro tip!

A Practical Starter Guide on Developing Accessible Websites

Cynthia Ng and Michael Schofield

There is growing concern about the accessibility of online content and services provided by libraries and public institutions. While many articles cover legislation, general benefits, and common surface-level opportunities to improve web accessibility (e.g., alt tags), few discuss web accessibility in more depth, and those that do are typically not specific to library web services. This article is meant to fill that gap by providing practical best practices and code.

Recount: Revisiting the 42nd Canadian Federal Election to Evaluate the Efficacy of Retroactive Tweet Collection

Anthony T. Pinter and Ben Goldman

In this paper, we report the development and testing of a methodology for collecting tweets from periods beyond the Twitter API’s seven-to-nine-day limit. To accomplish this, we used Twitter’s advanced search feature to find tweets older than that limit, and then used JavaScript to automatically scan the resulting webpage for tweet IDs. These IDs were then rehydrated (their full tweet metadata retrieved) using twarc. To examine the efficacy of this method for retrospective collection, we revisited the case study of the 42nd Canadian Federal Election. Comparing the two datasets, we found that our methodology does not produce results as robust as those of real-time streaming, but that it might be useful as a starting point for researchers or collectors. We close by discussing the implications of these findings.
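
As a rough sketch of the rehydration step (assuming a plain-text file of tweet IDs scraped from the advanced-search results; the credentials and file names are placeholders, not the authors' code), twarc's Python API can be used along these lines:

```python
# Minimal sketch of rehydrating tweet IDs with twarc (v1 API).
# Credentials and file names are placeholders.
import json
from twarc import Twarc

t = Twarc("CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")

with open("tweet_ids.txt") as ids, open("tweets.jsonl", "w") as out:
    # hydrate() accepts an iterable of tweet IDs and yields full tweet JSON,
    # batching requests against Twitter's statuses/lookup endpoint.
    for tweet in t.hydrate(line.strip() for line in ids):
        out.write(json.dumps(tweet) + "\n")
```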

Extending Omeka for a Large-Scale Digital Project

Haley Antell, Joe Corall, Virginia Dressler, Cara Gilgenbach

In September 2016, the department of Special Collections and Archives, Kent State University Libraries, received a Digital Dissemination grant from the National Historical Publications and Records Commission (NHPRC) to digitize roughly 72,500 pages from the May 4 collection, which documents the May 1970 shootings of thirteen students by Ohio National Guardsmen at Kent State University. This article highlights the project team’s efforts to adapt their Omeka instance, modifying the interface and ingestion processes to better present unique archival collections online. The modifications include an automated method for creating folder-level links on the relevant finding aids upon ingestion; the open source Tesseract engine to provide OCR for uploaded files; automated PDF creation from the raw image files using Ghostscript; and Mirador integration to present a folder-level display reflecting the archival organization of the physical collections. These adaptations, which have been shared via GitHub, will be of interest to other institutions looking to present archival material in Omeka.
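
The OCR and PDF-generation steps might be chained from a script roughly as follows; the folder layout and flags are illustrative assumptions, not the project's actual pipeline (which is available on GitHub):

```python
# Illustrative sketch: OCR each page image with Tesseract, then merge the
# per-page PDFs into one folder-level PDF with Ghostscript.
import subprocess
from pathlib import Path

folder = Path("box01_folder05")          # hypothetical folder of page scans
page_pdfs = []

for tiff in sorted(folder.glob("*.tif")):
    base = tiff.with_suffix("")          # Tesseract appends its own extension
    # The "pdf" config produces a searchable PDF with an invisible text layer.
    subprocess.run(["tesseract", str(tiff), str(base), "pdf"], check=True)
    page_pdfs.append(str(base) + ".pdf")

# Concatenate the per-page PDFs into a single folder-level PDF.
subprocess.run(
    ["gs", "-dBATCH", "-dNOPAUSE", "-q", "-sDEVICE=pdfwrite",
     f"-sOutputFile={folder.name}.pdf"] + page_pdfs,
    check=True,
)
```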

Annotation-based enrichment of Digital Objects using open-source frameworks

Marcus Emmanuel Barnes, Natkeeran Ledchumykanthan, Kim Pham, Kirsta Stapelfeldt

The W3C Web Annotation Data Model, Protocol, and Vocabulary unify approaches to annotations across the web, enabling their aggregation, discovery and persistence over time. In addition, new javascript libraries provide the ability for users to annotate multi-format content. In this paper, we describe how we have leveraged these developments to provide annotation features alongside Islandora’s existing preservation, access, and management capabilities. We also discuss our experience developing with the Web Annotation Model as an open web architecture standard, as well as our approach to integrating mature external annotation libraries. The resulting software (the Web Annotation Utility Module for Islandora) accommodates annotation across multiple formats. This solution can be used in various digital scholarship contexts.
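
For readers unfamiliar with the model, a Web Annotation is a small JSON-LD document that links a body to a target. A minimal sketch of constructing one in Python follows; the identifiers and target URL are invented for illustration and are not drawn from the module itself:

```python
# Minimal W3C Web Annotation (JSON-LD) built as a Python dict.
# The id and target URL are hypothetical examples.
import json

annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "id": "http://example.org/anno/1",
    "type": "Annotation",
    "body": {
        "type": "TextualBody",
        "value": "A handwritten marginal note appears here.",
        "format": "text/plain",
    },
    "target": {
        "source": "http://example.org/islandora/object/demo:1/datastream/OBJ",
        "selector": {
            "type": "FragmentSelector",
            "value": "xywh=100,100,250,80",  # image region being annotated
        },
    },
}

print(json.dumps(annotation, indent=2))
```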

The FachRef-Assistant: Personalised, subject specific, and transparent stock management

Eike T. Spielberg, Frank Lützenkirchen

We present in this paper a personalized web application for the weeding of printed resources: the FachRef-Assistant. It offers an extensive range of tools for evidence-based stock management, based on thorough analysis of usage statistics. Special attention is paid to the criteria of individualization, transparency of the parameters used, and generic functionality. Currently, it is designed to work with the Aleph system from Ex Libris, but effort was made to keep the application as generic as possible. For example, all procedures specific to the local library system have been collected in one Java package. The inclusion of library-specific properties such as collections and classification schemes has also been designed to be highly generic, by mapping the individual entries onto an in-memory database. Hence, simple adaptation of the package and the mappings would make the FachRef-Assistant compatible with other library systems.

The personalization of the application allows for the inclusion of subject-specific usage properties as well as variations between different collections within one subject area. The parameter sets used to analyse the stock and to prepare weeding and purchase proposal lists are included in the output XML files to facilitate a high degree of transparency, objectivity, and reproducibility.
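
As a rough illustration of that last point (element names are invented here and do not reproduce the FachRef-Assistant's actual output schema), a weeding report might embed its parameter set alongside the proposal list so the analysis can be rerun later:

```python
# Illustrative only: embed the analysis parameters in the report XML so a
# proposal list can be reproduced later. Element names are hypothetical.
import xml.etree.ElementTree as ET

params = {"subject": "chemistry", "min_loans_per_year": "0.5", "observation_years": "10"}

report = ET.Element("weedingReport")
param_el = ET.SubElement(report, "parameters")
for name, value in params.items():
    ET.SubElement(param_el, "parameter", name=name).text = value

proposals = ET.SubElement(report, "proposals")
ET.SubElement(proposals, "item", barcode="123456789").text = "Example title proposed for weeding"

ET.ElementTree(report).write("weeding_report.xml", encoding="utf-8", xml_declaration=True)
```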

The Semantics of Metadata: Avalon Media System and the Move to RDF

Juliet L. Hardesty and Jennifer B. Young

The Avalon Media System (Avalon) provides access and management for digital audio and video collections in libraries and archives. The open source project is led by the libraries of Indiana University Bloomington and Northwestern University and is funded in part by grants from The Andrew W. Mellon Foundation and the Institute of Museum and Library Services.

Avalon is based on the Samvera Community (formerly Hydra Project) software stack and uses Fedora as the digital repository back end. The Avalon project team is in the process of migrating digital repositories from Fedora 3 to Fedora 4 and incorporating metadata statements using the Resource Description Framework (RDF) instead of XML files accompanying the digital objects in the repository. The Avalon team has worked on the migration path for technical metadata and is now working on the migration paths for structural metadata (PCDM) and descriptive metadata (from MODS XML to RDF). This paper covers the decisions made to begin using RDF for software development and offers a window into how Semantic Web technology functions in the real world.
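
To make the shift concrete, the same descriptive statement that MODS XML expresses as nested elements becomes a set of subject-predicate-object triples in RDF. The toy example below uses rdflib with Dublin Core terms purely for illustration; the object URI and vocabulary choices are assumptions, not Avalon's actual mapping:

```python
# Toy example of expressing descriptive metadata as RDF triples rather than
# MODS XML. Vocabulary choices and the object URI are illustrative only.
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCTERMS

g = Graph()
item = URIRef("http://example.org/media_objects/abc123")  # hypothetical object URI

g.add((item, DCTERMS.title, Literal("Oral history interview, 1968")))
g.add((item, DCTERMS.creator, Literal("Doe, Jane")))
g.add((item, DCTERMS.date, Literal("1968-05-04")))

print(g.serialize(format="turtle"))
```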

OpeNumisma: A Software Platform Managing Numismatic Collections with a Particular Focus on Reflectance Transformation Imaging

Avgoustinos Avgousti, Andriana Nikolaidou, Ropertos Georgiou

This paper describes OpeNumisma, a reusable web-based platform focused on digital numismatic collections. The platform provides an innovative merger of digital imaging and data management systems that offers great new opportunities for research and the dissemination of numismatic knowledge online. A unique feature of the platform is the application of Reflectance Transformation Imaging (RTI), a computational photography method that offers powerful image analysis possibilities for numismatic research. This technique allows users to examine, in the browser, minor details invisible to the naked eye, simply by moving the computer mouse rather than handling the actual object. The first successful implementation of OpeNumisma has been the creation of a digital library for the medieval coins in the collection of the Bank of Cyprus Cultural Foundation.
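
RTI viewers commonly relight each pixel from a fitted polynomial texture map (PTM): a biquadratic function of the light direction (lu, lv). The sketch below simply evaluates that standard model for a given light direction; the coefficient array is a placeholder and is not taken from OpeNumisma:

```python
# Evaluate a per-pixel polynomial texture map (PTM) for a light direction (lu, lv):
#   L = a0*lu^2 + a1*lv^2 + a2*lu*lv + a3*lu + a4*lv + a5
# A real RTI file stores six such coefficients (plus chroma) for every pixel;
# here the coefficients are random placeholders.
import numpy as np

def relight(coeffs: np.ndarray, lu: float, lv: float) -> np.ndarray:
    """coeffs has shape (height, width, 6); returns per-pixel luminance."""
    a0, a1, a2, a3, a4, a5 = np.moveaxis(coeffs, -1, 0)
    return a0 * lu**2 + a1 * lv**2 + a2 * lu * lv + a3 * lu + a4 * lv + a5

coeffs = np.random.rand(480, 640, 6)          # placeholder coefficients
luminance = relight(coeffs, lu=0.3, lv=-0.2)  # light raking in from one side
```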

DuEPublicA: Automated bibliometric reports based on the University Bibliography and external citation data

Eike T. Spielberg

This paper describes a web application that generates bibliometric reports based on the University Bibliography and the Scopus citation database. Our goal is to offer an alternative to the easy-to-prepare automated reports available from commercial sources, which often suffer from incomplete coverage of publication types and difficult attribution of publications to people, institutes, and universities. Using our University Bibliography as the source for selecting relevant publications solves both problems. As a local system, maintained and set up by the library, it can include every publication type we want, and because it is linked to the university’s identity management system, it enables easy selection of publications for people, institutes, and the whole university.

The program is designed as a web application, which collects publications from the University Bibliography, enriches them with citation data from Scopus and performs three kinds of analyses:
1. A general analysis (number and type of publications, publications per year etc.),
2. A citation analysis (average citations per publication, h-index, uncitedness), and
3. An affiliation analysis (home and partner institutions)

We tried to keep the code highly generic, so that other databases (Web of Science, IEEE) or other bibliographies can easily be included. The application is written in Java and XML and uses XSL transformations and LaTeX to generate bibliometric reports as HTML pages and in PDF format. Warnings and alerts are automatically included if the citation analysis covers only a small fraction of the publications from the University Bibliography. In addition, we describe a small tool that helps collect author details for an analysis.
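
To make the citation analysis concrete, the h-index and uncitedness rate can both be derived from a list of per-publication citation counts. DuEPublicA itself is written in Java; the sketch below, with made-up counts, only illustrates the calculation:

```python
# Illustration of the citation analysis: h-index, average citations, and
# uncitedness computed from a (made-up) list of per-publication citation counts.
def h_index(citations):
    # h = the largest h such that at least h publications have >= h citations each.
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)

citations = [25, 12, 9, 7, 3, 1, 0, 0]                        # hypothetical counts
print("h-index:", h_index(citations))                          # -> 4
print("average citations:", sum(citations) / len(citations))   # -> 7.125
print("uncitedness:", citations.count(0) / len(citations))     # -> 0.25
```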

New Metadata Recipes for Old Cookbooks: Creating and Analyzing a Digital Collection Using the HathiTrust Research Center Portal

Gioia Stevens

The Early American Cookbooks digital project is a case study in analyzing collections as data using HathiTrust and the HathiTrust Research Center (HTRC) Portal. The purposes of the project are to create a freely available, searchable collection of full-text early American cookbooks within the HathiTrust Digital Library, to offer an overview of the scope and contents of the collection, and to analyze trends and patterns in the metadata and the full text of the collection. The digital project has two basic components: a collection of 1450 full-text cookbooks published in the United States between 1800 and 1920 and a website to present a guide to the collection and the results of the analysis.

This article focuses on the workflow for analyzing the metadata and full text of the collection. The workflow covers: 1) creating a searchable public collection of full-text titles within the HathiTrust Digital Library and uploading it to the HTRC Portal, 2) analyzing and visualizing legacy MARC data for the collection using MarcEdit, OpenRefine, and Tableau, and 3) using the text analysis tools in the HTRC Portal to look for trends and patterns in the full text of the collection.
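
The article's metadata workflow relies on MarcEdit and OpenRefine. Purely as an illustration of the kind of field extraction involved in step 2, publication place and date could also be pulled from the MARC records with pymarc before visualization; file names here are hypothetical:

```python
# Illustration only (the project uses MarcEdit/OpenRefine): extract publication
# place and date from MARC 260 fields into a CSV for later visualization.
import csv
from pymarc import MARCReader

with open("cookbooks.mrc", "rb") as marc_file, open("pub_data.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["title", "place", "date"])
    for record in MARCReader(marc_file):
        titles = record.get_fields("245")
        title = titles[0].get_subfields("a")[0] if titles else ""
        for field in record.get_fields("260"):
            place = " ".join(field.get_subfields("a"))
            date = " ".join(field.get_subfields("c"))
            writer.writerow([title, place, date])
```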

Countering Stryker’s Punch: Algorithmically Filling the Black Hole

Michael J. Bennett

Two current digital image editing programs are examined in the context of filling in missing visual data from hole-punched United States Farm Security Administration (FSA) negatives. Specifically, Photoshop’s Content-Aware Fill feature and GIMP’s Resynthesizer plugin are evaluated and their results contrasted against comparable images. A possible automated workflow geared towards large-scale editing of similarly hole-punched negatives is also explored. Finally, potential future research based upon this study’s results is proposed in the context of leveraging previously enhanced, image-level metadata.
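
As a rough analogue only (neither of the tools evaluated in the article), OpenCV's inpainting function shows what an automated, mask-driven fill step over a hole-punched scan can look like; the file names and the assumption that the punch reads as a near-black disc are illustrative:

```python
# Rough analogue only: the article evaluates Photoshop Content-Aware Fill and
# GIMP's Resynthesizer; OpenCV inpainting merely illustrates an automated,
# mask-driven fill over a hole-punched scan.
import cv2
import numpy as np

image = cv2.imread("fsa_negative.tif")              # hypothetical scanned negative
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Assume the punched hole reads as a near-black disc: threshold it into a mask.
_, mask = cv2.threshold(gray, 10, 255, cv2.THRESH_BINARY_INV)
mask = cv2.dilate(mask, np.ones((5, 5), np.uint8))  # pad the hole's edge slightly

# inpaint(src, mask, inpaintRadius, flags) fills the masked region from its surroundings.
filled = cv2.inpaint(image, mask, 15, cv2.INPAINT_TELEA)
cv2.imwrite("fsa_negative_filled.tif", filled)
```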

ISSN 1940-5758