Issue 34, 2016-10-25

OSS4EVA: Using Open-Source Tools to Fulfill Digital Preservation Requirements

This paper builds on the findings of a workshop held at the 2015 International Conference on Digital Preservation (iPRES), entitled “Using Open-Source Tools to Fulfill Digital Preservation Requirements” (OSS4PRES hereafter). This day-long workshop brought together participants from across the library and archives community, including practitioners, proprietary vendors, and representatives from open-source projects. The resulting conversations were surprisingly revealing: while OSS’ significance within the preservation landscape was made clear, participants noted that there are a number of roadblocks that discourage or altogether prevent its use in many organizations. Overcoming these challenges will be necessary to further widespread, sustainable OSS adoption within the digital preservation community. This article will mine the rich discussions that took place at OSS4PRES to (1) summarize the workshop’s key themes and major points of debate, (2) provide a comprehensive analysis of the opportunities, gaps, and challenges that using OSS entails at a philosophical, institutional, and individual level, and (3) offer a tangible set of recommendations for future work designed to broaden community engagement and enhance the sustainability of open source initiatives, drawing on both participants’ experience and additional research.

by Marty Gengenbach, Shira Peltzman, Sam Meister, Blake Graham, Dorothy Waugh, Jessica Moran, Julie Seifert, Heidi Dowding, and Janet Carleton. Edited by Marty Gengenbach, Shira Peltzman, and Sam Meister

Introduction

Open-source software (OSS) has played an increasingly prominent role in digital preservation over the past two decades.[1] Starting with LOCKSS, DSpace, and DROID, OSS was integrated into a number of high profile preservation initiatives in the early 2000s, and quickly gained traction within the preservation community thereafter for reasons that ran the gamut from greater accessibility to lower costs. In the years that followed, OSS came to play an increasingly integral role in the back-end technology that comprised the preservation landscape. As the adoption of OSS solutions to support the curation of digital collections grew, and as both the number and variety of OSS tools increased, there was a growing need among preservationists to assess how and when to adopt particular tools so that they could better support their institutions’ specific requirements and workflows.

In response to these growing needs, organizers convened a workshop at the 12th International Conference on Digital Preservation (iPRES2015) entitled, “Using Open-Source Tools to Fulfill Digital Preservation Requirements” (OSS4PRES hereafter). OSS4PRES, which was promoted as “a space to talk about open-source software for digital preservation, and the particular challenges of developing systems and integrating them into local environments and workflows,”[2] was a unique event in two ways: whereas most OSS workshops or events typically focus on a specific tool, user group, or region, this one was international in scope and brought together participants from across the OSS community, including practitioners, proprietary vendors, and representatives from open-source projects and tool providers. Not only did the diversity of participants enable a uniquely productive discussion, but it also resulted in an evenhanded consideration of the issues at the heart of the OSS tools discussed.

Another factor that set OSS4PRES apart was its design. The day-long workshop consisted of morning and afternoon sessions. While the morning was devoted to lightning talks–principally in the form of demos and case studies–that addressed a particular usage or implementation of OSS, the afternoon was devoted to focused group discussions on OSS’ opportunities, gaps, and challenges. Workshop organizers strove to make the event participatory and placed a heavy emphasis on encouraging participants to interact with one another. This format allowed for dynamic and engaging discussions around the selection and integration of OSS into preservation workflows.

The resulting conversations were surprisingly revealing as much for their depth and breadth as for their consistency. The striking universality of the themes, concerns, and questions raised during OSS4PRES indicates a need to take stock of the current state of OSS as it has been applied within the digital preservation community, particularly in order to identify and highlight the challenges and opportunities associated with implementing these tools. While OSS’ significance on both a practical and theoretical level within the preservation landscape is clear, currently there are a number of roadblocks that discourage or altogether prevent its use.[3] The persistence of these challenges suggests that solutions must encompass voices from across and beyond the field, and must specifically address the stumbling blocks that participants have identified in conversation with one another at forums such as OSS4PRES.

This paper will build upon the rich and multi-layered discussions that took place at OSS4PRES by offering a set of recommendations designed to illuminate a path forward for the preservation community. In order to achieve this task, the authors will begin with an analysis of the opportunities and challenges encountered by many organizations implementing OSS, followed by a summary of the gaps in the OSS landscape that have yet to be filled. In closing, the authors will draw on the conversations that took place at OSS4PRES along with additional research to offer a tangible set of recommendations for future work that will: (1) bolster existing open-source initiatives; (2) broaden community engagement; and (3) provide preservationists with a more sustainable model for open-source projects in the future.

Opportunities

The collaborative model upon which the concept of OSS relies offers numerous opportunities for fostering communities of practitioners and promoting knowledge and resource sharing at the local, national, and international levels. One example of this is the BitCurator software environment, a suite of digital forensics tools that have been modified and packaged for increased accessibility. Cal Lee and Kam Woods’ presentation on BitCurator at OSS4PRES focused on the twin goals of community engagement and making digital forensics tools more accessible to digital preservationists.[4] Of particular note with regard to this software is the central role that the BitCurator Consortium (BCC) plays as both the tool’s host and center for administrative, user, and community support. BCC is an independent, community-led membership association geared toward libraries, archives, museums, and other institutions that seek a collaborative approach to exploring and applying forensics solutions to their digital collections. The BCC maintains an active membership of libraries and archives that strive to increase the accessibility of the BitCurator software suite through outreach and documentation.

The ethos of ‘give as you can’ allows institutions to develop specific tools that they can afford, while also providing these tools to organizations that cannot engage in development activities. Open-source projects also provide an avenue for smaller or under-resourced institutions to actively participate in the development and improvement of tools by reporting bugs and providing feedback. In theory, the benefit of this arrangement is that the user community can contribute by developing and documenting tools so that there are resources available for all institutions working with OSS.

This points to another key opportunity for using OSS in preservation environments: the ability to share experiences and build relationships among fellow practitioners and tool developers. Some of the best examples of this at the workshop were conversations focused on the establishment of institutional workflows. Andrew Berger, for example, described the challenges he faced implementing Archivematica, an open-source preservation management tool, at the Computer History Museum.[5] Berger’s lightning talk touched on the usefulness of reaching out to colleagues and institutions working on similar projects and the importance of engaging with and participating in the wider open-source community. In addition to creating a more useful product, participating in a tool’s development contributes to the institution’s increased sense of commitment, both to that OSS tool and to the community supporting it.

There are mutual benefits to participation in an active OSS community: practitioners can have a stronger voice in guiding the development of tools that fit their needs, while developers can refine their work and have more productive development cycles based on clear feedback from users. This relationship between tool users and developers is particularly important within the field of digital preservation, where adherence to community-developed practices and standards is paramount to success. By having direct relationships with developers, practitioners are able to advocate for tools that meet field-specific standards.

OSS4PRES also demonstrated a number of tangible opportunities that exist at an organizational level. Chief among these is the possibility of intra-institutional collaboration. Using OSS, particularly when it comes to installing, hosting, maintaining and upgrading software, often requires the skillsets and expertise of staff across multiple departments. This presents staff with a valuable opportunity to leverage one another’s knowledge in order to work toward a shared goal.

For example, UCLA Library recently embarked on a project to implement Archivematica so that it can be used by several different departments throughout the library.[6] Although the digital archives program within Library Special Collections (LSC) will be the primary user of this software, the Library’s Core Operations and Developer Operations teams will be responsible for implementing, virtualizing, and ultimately supporting it. In order for this project to be effective, the two departments have worked together closely. This process has required that Core Ops and Dev Ops staff become familiar with digital preservation tools and concepts like the Reference Model for an Open Archival Information System (OAIS)[7], while LSC staff have had to become familiar with Linux-based operating systems.

There are two results of cross-pollinating the various knowledge bases within an organization. Firstly, interdisciplinary collaboration can help prevent the development of departmental or organizational silos of information. Secondly, projects that incorporate staff from a wide range of positions and departments throughout an organization reinforce the notion that, in order to be successful, digital preservation must be supported at all levels of an organization. The more people involved in preservation activities, the higher preservation’s profile within an organization will be–an important component in advocacy for OSS implementations.

In addition to the opportunities that intra-institutional collaboration affords, there’s even greater potential for collaboration on OSS projects across institutions. In fact, many OSS projects have been the result of inter-institutional collaboration. The ability to combine several organizations’ experience, resources, and existing knowledge bases can be a powerful tool that results in something much greater than the sum of its parts.

A particularly good example of inter-institutional collaboration is the Hydra Project. A multi-institutional collaboration with over 30 listed partners, the Hydra Project “gives like-minded institutions a mechanism to combine their individual repository development efforts into a collective solution.”[8] Banner projects like this one that leverage the resources of multiple institutions are particularly valuable because the final output–in this case a growing suite of tools for digital asset management and digital preservation, among many others–can be used by other organizations in the Hydra community. The motto of the whole project is “if you want to go fast, go alone. If you want to go far, go together.”[9] As one of the project’s founding partners puts it, “working collaboratively makes us work better. Some of the best and brightest technologists in libraries are engaged in Hydra, and the community model puts them on ‘our team.’”[10]

In addition to the philosophical and institutional opportunities described above, OSS4PRES workshop participants saw many benefits for digital preservation practitioners related to using open-source tools in both existing and developing workflows. Many participants stated that using OSS helped them share and improve workflows, gave them greater flexibility in establishing their own workflows, and improved the visibility of workflows across institutions, making them easier to comprehend. Participants also cited the ability to freely download and experiment with open-source tools to determine how well they fit into existing workflows. They also appreciated the option to implement open-source tools incrementally, as the tools become more familiar. The advantage of trying out and adding tools to a workflow as needs and requirements change is that practitioners do not have to implement an entire system at once to start testing and experimenting with OSS, as is the case with some commercial software. As an added benefit, once an open-source tool has been tailored to fit local needs, these customized solutions can be shared with the preservation community at large and adapted by other individuals or organizations with similar needs.

Challenges

For all the opportunities that arise from the community-driven nature of OSS, there are challenges as well. Indeed, the underlying commitment to freely-shared resources and collaborative development can be a double-edged sword: OSS4PRES participants agreed that OSS both fosters and depends on an engaged community of practitioners. Without continued involvement in and support from users, funders, and developers, open-source tools can flounder and may become defunct or orphaned. While there are many variables that can contribute to decreased levels of active community support in OSS projects, OSS4PRES participants identified four primary factors: (1) lack of clear leadership and governance structures for OSS; (2) administrative misunderstanding and/or lack of institutional buy-in; (3) limited or uncertain funding for the project; and (4) the unique challenges of OSS implementation within a specific institutional context.

Of these factors, a lack of clear leadership is perhaps the most philosophical in nature. This is a dilemma that arises as a result of the OSS model of distributed ownership, wherein responsibility for funding, development, and outreach is shared across several groups or institutions. Although there are benefits to such a model, it can also be a source of confusion as to the direction and objectives of the tool in question for developers and users alike. Similarly, participants observed that the organizations behind existing OSS may exhibit very strong leadership in certain areas, such as technical development, but may lack the skills or resources necessary to secure extended funding or provide training and documentation. This becomes an increasingly serious problem as tools become more widely used, particularly if software developers don’t have the resources necessary to keep up with demand.

Another challenge that can arise from the collaborative nature of OSS is the perception–particularly among administrators–that because of OSS’ decentralized ownership, these tools are inherently unstable and therefore present a risk. Participants noted that as a result of this belief, administrators often express reluctance to commit resources to an OSS project because they see it as a ‘bottomless pit’ that will drain financial resources.

Conversely, some administrators assume there will be no costs associated with using OSS. This mistaken belief is the genesis of the oft-referenced anecdote about the difference between “free beer” and “free kittens”: namely, that while “free beer” can be unconditionally free, “free kittens” come with associated long-term costs that ensure that the animal survives and thrives.

In both cases, these misperceptions of OSS require practitioners to engage administrators in order to ensure institutional buy-in. Advocacy for and education about open source projects must demonstrate both the return on investment in OSS and the significant benefits to be gained from a collaborative, community-based approach to software development. Thus, a key component of successful institutional OSS adoption lies in ensuring that major stakeholders understand what using a particular open-source tool will mean. This includes an appreciation of the required investment over time to install, maintain, upgrade, and potentially further develop open-source tools.

The last point is especially important because, like any software product, OSS requires regular patches, updates, and upkeep. Ignoring these over time may leave an institution with an OSS tool that is outdated and therefore subject to bugs and security holes. The choice to maintain an unsupported version of a particular open-source tool simply because it meets (or has been customized to meet) an organization’s needs is problematic. For what an institution may stand to gain from this tool in terms of functionality and local integration, it may stand to lose in terms of the stability of a mainstream code release, the risk to information security, and the likelihood that the tool in question will become increasingly less functional and reliable as it ages.

The source and duration of funding are other factors that can threaten the long-term sustainability of OSS. Tools that begin as grant-funded projects face the challenge of identifying and securing lasting sponsorship after the seed money dries up. One solution to this challenge lies in making a successful transition from a grant-funded project to a member-supported, consortium-based model. Participants pointed to BitCurator as an example of a tool that has made this transition effectively. Within the BCC, an elected executive council composed of nominees from member institutions helps govern ongoing development and outreach. Distributing this responsibility among consortium members serves to help ensure the tool’s continued upkeep and viability.

The final challenge participants identified is in integrating open-source tools into institutional workflows. Understanding the software dependencies, system requirements, and local configuration necessary to bring many OSS tools online in a production environment can require a considerable investment of both time and resources. Participants were quick to note that while one of the most significant benefits that OSS has to offer is the ability to customize a tool for use within a specific context, the decision to do so comes with its own hurdles: while customization may result in a solution that is ideally suited to that particular organization, it may also reduce the potential value of other OSS strengths, such as the benefit of a wide user community from which to draw experience and insight.

Furthermore, highly customized open-source tools may require a lengthy period of time to properly test, implement, and integrate. Even if this is not the case, open-source tools that have been heavily tailored to a specific institutional context may have limited documentation that applies only to a small subset of users or be vulnerable to single points of technological and/or human failure. This siloization increases the odds that a particular tool will eventually become obsolete and therefore represents a risk to the tool’s long-term sustainability.

Workshop participants noted two related issues. First, there are a limited number of tools and documented workflows for pre-ingest tasks, which is perhaps due to the inherent need for customized implementation during this phase. Particularly with regard to born-digital materials, pre-ingest activities were described by one presenter (to much agreement around the room) as the “wild west” of digital preservation. Second, most of the tools that are available have been built on the assumption of simple and increasingly automated workflows. Although automation is useful for organizational contexts with sufficient resources and technical expertise, for many practitioners the reality is far more rudimentary. The heterogeneity of born-digital content requires intervention or assessment at the file level, and the set of activities involved in this process is currently difficult to automate. In addition, many institutions utilize multiple workflows depending on the type of digital material being processed (e.g. digitized material, published and unpublished born-digital, audio/visual material, research data, theses, reports, etc.). In each case, the diversity and specificity of potential workflows make the development of a one-size-fits-all tool nearly impossible.

Gaps

The preceding sections in this article point to a number of gaps in the current landscape for archives, libraries, and museums attempting to implement OSS. Perhaps the most significant of these is the paucity of successful models for OSS leadership and governance. As suggested above, this is likely a result of the inevitable fluctuations in OSS’ funding, development and maintenance. However, there are exceptions to this rule which, albeit still relatively few in number, may serve as a valuable stepping stone to a more widely replicable solution. These include the growth of the Hydra community and the governance models of DuraSpace and the BCC, where development activities are coordinated via steering groups or similar leadership bodies. Building on and extending these governance models could assist in fostering collaboration across the digital preservation OSS development community.

At the same time, there is a desperate need for transparent examples of effective cost-modeling for OSS development projects. It is hard to judge, and therefore plan, the amount of time and resources it may take to either develop or implement an open-source solution for digital preservation. This can result in a lack of appropriate technological, financial, or staffing resources available to institutions–especially those that are less well funded. Identifying and aggregating successful models for cost-sharing development and maintenance of OSS efforts could prove enormously beneficial to practitioners in need of more realistic estimates. These models could also be used to advocate for OSS in meetings with administrators who control the budget.

Another gap is the absence of a centralized hub or forum where practitioners could share software and workflow documentation, post case studies, and address technical implementation challenges. While recognizing the existence and value of current forums, such as the Community Owned digital Preservation Tool Registry (COPTR), participants nonetheless articulated the need for such a space. Shared evaluation of individual tools is valuable, but an illustration of the variety of approaches for sequencing and integrating tools is clearly a current priority for digital preservation practitioners. A centralized resource would assist in informing institutional decision-making about local adoption, implementation, and integration among multiple OSS tool options.

When it comes to specifics in functionality and workflows, requirements don’t always translate well. Particular examples cited at OSS4PRES were requirements for tools that address rights management and intellectual property values, redaction and versioning, and security. Participants also noted a lack of standardization around metadata, along with the desire and need for an integrated or accessible way to reuse, map, and/or synchronize metadata between systems and tools. Many workshop participants brainstormed ideas for bridging procedural and technical gaps at the local or individual level. Several of the discussions about gaps focused on metadata.

Although a wide range of metadata standards and guidelines exist, some practitioners lamented that these documents do not always provide clear roadmaps for adopting new metadata standards and practices into existing environments. Detailed guides for integrating, or entirely replacing, standards and practices seem either nonexistent or unidentifiable. Moreover, metadata-integration methods between software applications and databases were also identified as a challenge. In order to bridge this gap, participants suggested that stronger communication lines be forged amongst practitioners for sharing issues, obstacles, and problem-solving activities for micro-tasks. One of the core features of, and benefits to, adopting OSS is the willingness to share knowledge. It is up to the professional community to exploit multiple communication methods and platforms, including documentation, in a way that addresses the recurrent metadata-application challenges that surface across institutions.

There is also a need for resources to support succession planning and the long-term sustainability of OSS tools designed to achieve archival goals. This could include successful case studies for succession planning, which may help highlight the necessary components to ensure the sustainability of digital preservation tools over time. Succession planning materials should also look at failed projects, in order to provide as much information as possible into what makes tools succeed or fail in the long run.

In order to be truly effective, succession planning must occur at multiple levels. Put another way, it is not enough for a grant-funded project to have a succession plan; any institution choosing to adopt OSS must understand the tool’s functionality and requirements, and must establish a plan to mitigate any loss of productivity in instances where the institution is no longer able to meet those requirements. Such planning could guard against loss of service by identifying similar tools with different requirements or technical components and providing technological road-mapping for institutions to navigate large-scale infrastructure changes.

Recommendations

Through the wide-ranging and dynamic conversations at OSS4PRES, participants brainstormed a number of different challenges, opportunities, and gaps in the existing digital preservation OSS landscape. By compiling and distilling the resulting notes, the authors of this work have identified three recommendations to ensure the long-term viability of open-source projects. This section will provide additional detail for each of these recommendations, including references to similar resources where applicable.

1. Develop sustainable and replicable models for governance, collaboration, and cost-sharing

This recommendation is most concerned with the administration and organization of both inter- and intra-institutional OSS projects. OSS4PRES participants noted a dearth of resources (or a lack of familiarity with existing resources) for organizations in the planning phases of OSS projects. The lack of material to help organizations plan ahead at this stage is problematic since this early phase is where many of the most fundamental factors to long-term success are determined, including:

  • Deciding whether the project will be independently or collaboratively run–whether it is intra- or inter-institutional–and what that will mean for governance and cost-sharing;
  • Understanding and clearly documenting the roles and responsibilities of each participating party;
  • Establishing clearly defined goals and criteria for success.

To ensure the long-term success of future projects in the digital preservation space, there is a need for more examples of successful models of OSS project management and project planning. One potential starting point for this work is the ARL Spec Kit 340: Open Source Software. The Spec Kit provides examples of OSS contributor agreements that can help define how an institution will contribute to OSS projects.[11] Other useful resources are the many existing examples of tools and strategies to use in project scoping and management–from the mnemonic SMART goals (specific, measurable, assignable, realistic and time-related) to highly sophisticated project management systems such as JIRA or Daptiv PPM. What’s missing, however, are documented examples of the implementation of these tools and systems in an organizational setting to support digital preservation OSS.

2. Establish a centralized resource for OSS documentation and workflows

While there is significant potential for sharing experiences about adopting and enhancing OSS, there are still many issues that continue to hinder the implementation of a specific tool in a particular institutional context. Whether it is an improper software configuration, incompatible hardware, or a limited understanding of a successfully installed tool, OSS workflows for digital preservation may break down in any number of ways. One of the most commonly voiced recommendations from OSS4PRES attendees was the desire for a centralized location for technical and instructional documentation, end-to-end workflows, case studies, and other resources related to the installation, implementation, and use of OSS tools. This resource–which participants envisioned as perhaps taking the form of a wiki–would serve as a hub that would enable practitioners to freely and openly exchange information, user requirements, and anecdotal accounts of OSS initiatives and implementations.

Many OSS tools have their own sites dedicated to hosting documentation, and there are other resources that are dedicated to hosting and aggregating wide-ranging technical and user-oriented information about tools, formats, and platforms for digital preservation.[12] However, what is notable about the proposed hub is its unique emphasis on support throughout all stages of the OSS lifecycle, including but not limited to:

  • Strategies and resources for successful OSS advocacy with administrators, funders, and other stakeholders, including FAQs about OSS projects;
  • Case studies that include software and hardware requirements, estimated time from testing to production, and project budget information that other organizations could use or re-purpose in scoping or executing similar OSS projects;
  • Long-term cost estimates that take into account factors like technical resources for installation and maintenance, and recommended hardware or software configurations and upgrades;
  • Information on OSS project status (is it currently supported, and if not, when was it last updated?);
  • End-to-end workflows for acquiring, processing, preserving, and providing access to born-digital materials using OSS tools.

It’s also important to note that this proposed hub does not need to be standalone; in fact, it would be better integrated with existing listings of digital preservation tools, such as COPTR or the Preserving digital Objects with Restricted Resources (POWRR) Project. One example of such a resource for the library community is the LYRASIS-supported FOSS4LIB site. As XKCD, the “webcomic of romance, sarcasm, math, and language” has observed, attempts to counter proliferation may inadvertently exacerbate the problem they intend to solve.[13] A recent Code4Lib Journal article on the barriers to initiation of OSS projects notes that the use of GitHub as a widely accepted repository for OSS project code has largely mitigated concerns about common platforms for sharing code; the authors of this article now seek a similar resource that will enable the free and open exchange of documentation, user requirements, and workflow information.[14]

3. Develop resources to improve succession planning in OSS projects

The third recommendation is to promote succession planning as an integral part of OSS development, and specifically, to ensure that after the project has ended there is a clearly documented strategy for maintaining the OSS code base. Just as the National Institutes of Health, National Science Foundation, and an increasing number of other research-funding enterprises have mandatory data management and reproducibility requirements to ensure the data created by those projects remains available for the long term, so too should digital preservation-focused OSS projects have viable succession plans in place from their outset.[15]

Fortunately, there are many data management plans from which the digital preservation community can draw. Ironically, many of these data management requirements have been established by archivists and records managers. This raises broader questions, outside the scope of this work: is succession planning less an issue of adequate resources (guidelines, best practices, etc.), and more an issue of not practicing what we preach? Or, as Bram van der Werf suggested in a 2012 blog post, is it the limited number of skilled practitioners with the knowledge and ability to maintain the tools that are created in the digital preservation community?[16] Regardless of the source of these issues, organizations approaching open-source projects should closely examine those that have made a successful transition from grant-funded standalone initiatives to robustly supported ongoing programs, and consider how to best emulate them.

Conclusion

As the digital preservation field has matured, so have the purpose and function of related OSS. In the span of fewer than 15 years, the OSS landscape has evolved from a scattered set of standalone tools designed to accomplish discrete tasks (e.g., generating and validating checksums, directory mapping) to complex software environments that bundle multiple open-source tools together to provide a suite of preservation services (e.g., Archivematica, BitCurator). Projects like these have radically expanded the horizon of possibilities for preservationists. Nevertheless, these tools are still not watertight. Reflecting on the work that remains to be done, OSS4PRES participants noted that there are real concerns about open-source tools at both the individual and institutional levels. Often these issues stem from the philosophical underpinnings of OSS, particularly its distributed ownership model. Additionally, as participants were quick to point out, many of these challenges are not minor; they pose serious risks to collections both large and small.

But for all these threats, participants also recognized that OSS represents an unprecedented opportunity for practitioners. Implementing the recommendations outlined in this paper will require time and consideration, and will likely entail its own set of challenges. But given everything that stands to be gained from OSS, it is imperative that progress be made toward improving the arsenal of resources available for implementing and supporting OSS initiatives. It is the authors’ hope that realizing the above recommendations will achieve this, and thereby enable the field at large to move forward toward a better, more sustainable open-source landscape.

Acknowledgement

The authors wish to give sincere thanks to the numerous editors, note-takers, and OSS4PRES participants who made this article possible.

Endnotes

[1] The 2014 ARL SPEC Kit on Open Source Software notes that 53% of 66 academic and public libraries responding to its survey use a locally hosted and supported OSS solution for digital preservation. See Curtis J. Thacker, Charles D. Knutson, and Mark Dehmlow, “ARL SPEC Kit 340: Open Source Software,” Association of Research Libraries (2014). Available: http://publications.arl.org/Open-Source-Software-SPEC-Kit-340/.

[2] Courtney Mumma et al., “Using Open-Source Tools to Fulfill Digital Preservation Requirements.” Proceedings of the 12th International Conference on Digital Preservation. Chapel Hill, NC: University of North Carolina, School of Information and Library Science, 2015. Available: http://phaidra.univie.ac.at/o:429623. Workshop proceedings are available at: http://oss4pres2015.web.unc.edu/.

[3] Other Code4Lib Journal articles have discussed the challenges to implementing and sustaining OSS projects. See: Dale Askey, “We Love Open Source Software. No, You Can’t Have Our Code,” Code4Lib Journal no. 5, December 15, 2008: http://journal.code4lib.org/articles/527#Correction; Sibyl Schaefer, “Challenges in Sustainable Open Source: A Case Study,” Code4Lib Journal no. 9, March 22, 2010. Available: http://journal.code4lib.org/articles/2493; Curtis Thacker and Charles Knutson, “Barriers to Initiation of Open Source Software Projects in Libraries,” Code4Lib Journal no. 29, July 15, 2015. Available: http://journal.code4lib.org/articles/10665; Bret Davidson and Jason Casden, “Beyond Open Source: Evaluating the Community Availability of Software,” Code4Lib Journal no. 31, January 28, 2016. Available: http://journal.code4lib.org/articles/11148.

[4] Cal Lee and Kam Woods, “Demonstration of the BitCurator Environment,” November 6, 2015. Available: http://oss4pres2015.web.unc.edu/contributed-talks/#lee

[5] Andrew Berger, “Implementing Archivematica at the Computer History Museum,” November 6, 2015. Available: http://oss4pres2015.web.unc.edu/contributed-talks/#berger

[6] One of the co-authors of this work is employed by UCLA Library.

[7] Consultative Committee for Space Data Systems. Reference Model for an Open Archival Information System (OAIS), CCSDS 650.0-M-2, Magenta Book, Issue 2, June 2012. Available: https://public.ccsds.org/Pubs/650x0m2.pdf

[8] Hydra Project Homepage: https://projecthydra.org/

[9] Ibid.

[10] Why Use Hydra at Stanford?, available: https://projecthydra.org/about-hydra-2/why-use-hydra/.

[11] SPEC Kit 340: Open Source Software, 16.

[12] For example, see: Community Owned digital Preservation Tool Registry (COPTR); Open Preservation Foundation Knowledge Base Wiki; Digital Curation Centre; and Program for Electronic Records Training, Tools, and Standards (PERTTS).

[13] xkcd: Standards, July 20, 2011. Available: https://xkcd.com/927/.

[14] Curtis Thacker and Charles Knutson, “Barriers to Initiation of Open Source Software Projects in Libraries,” Code4Lib Journal no. 29, July 15, 2015. Available: http://journal.code4lib.org/articles/10665.

[15] Butch Lazorchak, “All That Big Data is Not Going to Manage Itself, Part One,” The Signal, May 27, 2014. Available: https://blogs.loc.gov/digitalpreservation/2014/05/all-that-big-data-is-not-going-to-manage-itself-part-one/.

[16] Trevor Owens, “Open Source Software and Digital Preservation: an Interview with Bram van Der Werf of the Open Planets Foundation,” The Signal, April 4, 2012. Available: http://blogs.loc.gov/digitalpreservation/2012/04/open-source-software-and-digital-preservation-an-interview-with-bram-van-der-werf-of-the-open-planets-foundation/. Van der Werf suggests, “maybe we have a need for tools but we have a much bigger need for skilled people and an active preservation community.”

Bibliography

Askey, Dale. “We Love Open Source Software. No, You Can’t Have Our Code,” Code4Lib Journal no. 5, December 15, 2008: http://journal.code4lib.org/articles/527#Correction.

Consultative Committee for Space Data Systems. Reference Model for an Open Archival Information System (OAIS), CCSDS 650.0-M-2, Magenta Book, Issue 2, June 2012. Available: http://public.ccsds.org/publications/archive/650x0m2.pdf.

Davidson, Bret and Jason Casden. “Beyond Open Source: Evaluating the Community Availability of Software,” Code4Lib Journal no. 31, January 28, 2016. Available: http://journal.code4lib.org/articles/11148.

Lazorchak, Butch. “All That Big Data is Not Going to Manage Itself, Part One,” The Signal, May 27, 2014. Available: https://blogs.loc.gov/digitalpreservation/2014/05/all-that-big-data-is-not-going-to-manage-itself-part-one/.

Mumma, Courtney, et al. “Using Open-Source Tools to Fulfill Digital Preservation Requirements.” Proceedings of the 12th International Conference on Digital Preservation. Chapel Hill, NC: University of North Carolina, School of Information and Library Science, 2015. Available: http://phaidra.univie.ac.at/o:429623.

Owens, Trevor. “Open Source Software and Digital Preservation: an Interview with Bram van Der Werf of the Open Planets Foundation,” The Signal, April 4, 2012. Available: http://blogs.loc.gov/digitalpreservation/2012/04/open-source-software-and-digital-preservation-an-interview-with-bram-van-der-werf-of-the-open-planets-foundation/.

Schaefer, Sibyl. “Challenges in Sustainable Open Source: A Case Study,” Code4Lib Journal no. 9, March 22, 2010. Available: http://journal.code4lib.org/articles/2493.

Thacker, Curtis and Charles Knutson. “Barriers to Initiation of Open Source Software Projects in Libraries,” Code4Lib Journal no. 29, July 15, 2015. Available: http://journal.code4lib.org/articles/10665.

Thacker, Curtis J., Charles D. Knutson, and Mark Dehmlow. “ARL SPEC Kit 340: Open Source Software,” Association of Research Libraries, 2014. Available: http://publications.arl.org/Open-Source-Software-SPEC-Kit-340/.

About the Editors

Martin Gengenbach is an Archivist at the Gates Archive in Seattle, WA. He holds an MSLS with a concentration in Archives and Records Management from the School of Information and Library Science at the University of North Carolina at Chapel Hill.

Shira Peltzman is the Digital Archivist for the UCLA Library where she leads the development of a sustainable preservation program for born-digital material. Shira received her M.A. in Moving Image Archiving and Preservation from New York University’s Tisch School of the Arts and was a member of the inaugural class of the National Digital Stewardship Residency in New York (NDSR-NY).

Sam Meister is the Preservation Communities Manager, working with the MetaArchive Cooperative and BitCurator Consortium communities. Previously, he worked as Digital Archivist and Assistant Professor at the University of Montana. Sam holds a Master of Library and Information Science degree from San Jose State University and a B.A. in Visual Arts from the University of California San Diego. Sam is also an Instructor in the Library of Congress Digital Preservation Education and Outreach Program.
