Issue 6, 2009-03-30

Integrating Process Management with Archival Management Systems: Lessons Learned

The Integrated Digital Special Collections (INDI) system is a prototype of a database-driven Web application designed to automate and manage archival workflow for large institutions and consortia. This article discusses how the INDI project enabled the successful implementation of a process to manage large technology projects in the Harold B. Lee Library at Brigham Young University. It highlights how the scope of these technology projects is set and how the major deliverables for each project are defined. The article also describes how the INDI project followed the process and still failed to be completed. It examines why the process itself is successful and why the INDI project failed. It further underscores the importance of process management in archival management systems.


Archivists are constantly looking for ways to better manage the resources that they are responsible for. The Integrated Digital Special Collections (INDI) was envisioned as a prototype of a database-driven Web application to automate and manage archival workflow for large institutions and consortia. This article discusses how the INDI project enabled the successful implementation of a process for managing large technology projects in the Harold B. Lee Library at Brigham Young University. It highlights how the scope of these technology projects is set and how the major deliverables for each project are defined. The article also talks about how the INDI project followed the process and still failed to be completed. It examines why the process itself is successful and why the INDI project failed. It further underscores the importance of process management in archival management systems.

The INDI project developed out of ongoing efforts to improve the management of archival and manuscript resources in the L. Tom Perry Special Collections (hereafter Perry Special Collections) at Brigham Young University. The Perry Special Collections is a mid-sized repository that makes available primary source materials (including rare books, manuscripts, photographs, folklore, and corporate archives) to students, faculty and other researchers at Brigham Young University. The Perry Special Collections employs twelve full-time curators with responsibility for a variety of collecting areas ranging from arts and communications to 21st century Utah history. Several paraprofessionals help us with processing manuscript collections and managing our workflow. A large number of students also help process collections, provide reference service and perform a variety of other tasks.

In order to maximize efficiency the staff in the Perry Special Collections developed a distributed workflow designed to leverage the unique abilities and different levels of training of our employees. This workflow segmented traditional archival processes and assigned each segment to specific individuals based on their training and experience. Let’s examine how this workflow occurs with a manuscript acquisition. The manuscript curator contacts a potential donor and works with them to facilitate the transfer of their materials to Special Collections. Once the material is physically in our custody, the curator creates a case file that tracks the various steps that occur before a collection is placed in our stacks. The curator makes sure that a deed of gift or other transfer instrument is obtained and that the donor has been appropriately acknowledged. The curator then assigns a processing student to accession the material, minimally process it, and create a basic finding aid (either a basic inventory or a catalog record) in XML utilizing the Encoded Archival Description (EAD) DTD. The student then takes the case file to our Workflows technician (another student employee). The Workflows technician reviews the case file checklist to make sure that all of the necessary actions have been taken by the curator and the processing student. The technician also performs an editorial check of the finding aid. The technician then retrieves the collection and makes sure that it is labeled, bar-coded and placed in the stacks. The technician then passes the case file off to our manuscripts cataloger so that a collection catalog record can be entered into the library’s on-line catalog. The technician also makes sure that the EAD file is sent to the appropriate individual so that it can be placed on the World Wide Web. If conservation work, digitization work or microfilming is necessary, additional individuals become involved in the process.

Project Background

In its earliest incarnation, INDI was a simple automation of this paper-based workflow, designed to ensure that established procedures were followed in preparing archival and manuscript collections for use by our research base. We quickly realized that the automation of a paper-based workflow would not meet our underlying needs. These needs included the ability to manage the series of tasks that occur for each archival or manuscript collection, the need to manage the distribution of these tasks among several different contributors before the collection was considered finished and made available to researchers, and the need to ensure that each of these tasks was completed properly, as a large portion of them were completed by student employees and interns. We also needed to develop a system that would allow us to work more cooperatively with the other two campuses of the Brigham Young University system.[1] Finally, we needed a system that would gather all of the information that we maintained about our archival and manuscript collections into one central location. We needed to eliminate the various standalone databases used in the department to accession items, to create Encoded Archival Description (EAD) tagged finding aids, to manage location information, and to handle other associated tasks. These standalone databases were not only unsupported by our library’s Information Technology unit, but they also fostered the redundant gathering of information and wasted time that could be spent on other activities.

Once these needs had been articulated in mid-2005, a project team was formed in the Perry Special Collections to determine whether there were any open source systems for the management of archival materials. The project team discovered that there was only one open source tool for archival management under development: the Archivist’s Toolkit.[2] The project team carefully analyzed the documentation for this application and found that many of our needs would be met by the completed version of this project. However, it would not meet what was emerging as one of our most important needs: managing our distributed workflow. Given this information, the project team recommended to the chair of the Perry Special Collections that we approach our library administration with a proposal to build a system internally that would meet all of our needs. The department chair took this proposal to the library administration and we received permission to assemble a new project team and begin planning in September of 2005. The project team was composed of subject specialists from the Perry Special Collections and technology experts from the Library Information Technology (LIT) division.

The INDI project was the largest programming effort that the LIT division of the Harold B. Lee Library had undertaken, and the division recommended that we utilize project management processes to manage the project. Project management provided us with processes and techniques for defining the project scope, estimating the resources necessary to complete the project, tracking our progress, and developing adequate documentation. The first step in this process is project definition. Project definition has three components: defining the project objective, creating a flexibility matrix, and identifying appropriate team members. Our project objective was to “streamline and improve special collections (SC) workflow processes and integrate SC best practices into a workflow database system.” The flexibility matrix forced us to decide what our constraints were in building the system. Questions that had to be answered included: Is the scope of the project the most flexible? Are there sufficient resources available, and are these resources limited or expandable? What deadlines are associated with the project, and are they firm or flexible? After much deliberation we decided that our scope was the most flexible, that our schedule would allow us a little wiggle room, and that the resources available to us were unlikely to change. Finally, we decided that the project team would be composed of LIT staff, Special Collections staff, and non-departmental staff.

Once we completed the project definition, we turned to developing requirements for the INDI system. We used the “Is/Is Not” process to identify and define the major deliverables for the INDI system. The Is/Is Not process works like this in the Harold B. Lee Library. Giant sticky notes are placed on the walls of the meeting location. Each sticky note has one major deliverable written on it and then two columns: Is and Is Not. Team members are given small sticky notes and then write what they think the deliverable should be and place it in the appropriate column. If they are not sure which column to place an item in, then the sticky note is placed on the line dividing the columns for further discussion by the project team. Eventually all of the sticky notes must be placed in either the Is column or the Is Not column. This process played out over the course of several meetings. The major deliverables included the core functionality of the INDI system and the various modules that would allow us to complete important archival tasks. They also included the ability to manage archival business processes. Once the major deliverables were identified we had to define each of them. We utilized the Is/Is Not process to do this as well.

The most important of the major deliverables defined the core functionality of INDI and the series of modules that would enable curators, paraprofessionals, and students to complete their work. It was during this phase that it became clear that one of the most important deliverables of the project was the ability of the INDI system to manage business processes. We were particularly interested in managing workflows. Workflows are a very specific type of business process and can be defined as business processes that deliver services or informational products. By integrating workflow management into the INDI system, the system would deliver the right piece of work to the right resource at the right time. This would enable us to achieve maximum efficiency in our archival processes. Workflow management would enable us to define specific tasks in the system and associate those tasks with specific resources that had specific roles. This would allow us to fully manage our distributed workflow and it would give us the ability to monitor key indicators to ensure that we were meeting our established goals. Integrating workflow management directly into the INDI system would also enable us to avoid having to rekey data multiple times in multiple systems or having to programmatically transfer data from one system to the next. Finally, integration would allow us to monitor who was completing which tasks and design workflow logic that would help the system know who to give a task to once the previous task was completed.
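The routing idea described above, delivering each task to the resource holding the right role once the previous task is finished, can be illustrated with a minimal sketch. This is not INDI code: the step names, roles, and staff names are hypothetical, loosely modeled on the acquisition workflow described earlier, and Python is used only for brevity.

```python
from dataclasses import dataclass, field

# Hypothetical workflow: each step names the role responsible for it.
WORKFLOW = [
    ("create_case_file", "curator"),
    ("accession_and_describe", "processing_student"),
    ("review_case_file", "workflows_technician"),
    ("catalog_collection", "manuscripts_cataloger"),
]

# Hypothetical staff roster mapping roles to people.
ROSTER = {
    "curator": ["Alice"],
    "processing_student": ["Ben", "Chloe"],
    "workflows_technician": ["Dana"],
    "manuscripts_cataloger": ["Eli"],
}


@dataclass
class Project:
    """Tracks one collection as it moves through the workflow."""
    name: str
    position: int = 0
    history: list = field(default_factory=list)

    def current_task(self):
        """Return (step, role, assignee) for the pending task, or None when done."""
        if self.position >= len(WORKFLOW):
            return None
        step, role = WORKFLOW[self.position]
        # Simplest possible assignment rule: the first person holding the role.
        return step, role, ROSTER[role][0]

    def complete(self, user):
        """Mark the pending task done, but only if `user` is its assignee."""
        task = self.current_task()
        if task is None:
            raise ValueError("workflow already finished")
        step, _role, assignee = task
        if user != assignee:
            raise PermissionError(f"{step} is assigned to {assignee}, not {user}")
        self.history.append((step, user))
        self.position += 1
        return self.current_task()
```

Calling `Project("Example papers").complete("Alice")` returns the next pending task together with its assignee, and the `history` list records who completed which step, which is the kind of monitoring the paragraph above describes.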

Based on the information gathered during the definition of the major deliverables, we decided to divide the project into a number of phases that would be completed sequentially. We also determined that the Is/Is Not process would be repeated for each module in order to help us carefully define them. Phase I was to consist of Pre-Arrangement (acquisition and accessioning, location guide, import process, reports, workflow management, and contacts management) while Phase II was to consist of Arrangement and Description (finding aid creation, catalog record creation, metadata creation, etc.). Business process management was to be built into the underlying architecture of the system and was included in Phase I.

Once the major deliverables were defined the project team turned to creating documentation to help the programmer understand how archival processes worked within the Perry Special Collections. A number of workflow diagrams were created to graphically illustrate the archival processes and the steps that were required to complete each process (see Figure 1). These workflow diagrams were supplemented by textual descriptions of the processes. These materials were handed over to the programmer assigned to the project and he compiled them into the first version of the INDI Design Document. The design document described the overall project and the inter-relationship between the core functionality and the various modules. The documentation process took most of the calendar year 2006.

Figure 1. Acquisition and Accession workflow diagram


With the completion of version one of the Design Document, the programmer began to program the INDI system using a pre-existing PHP framework of his own design. The core functionality of INDI was completed and presented to the INDI committee for review during fall semester 2006. The committee requested several changes to the main INDI tools and recommended that the Pre-Arrangement module be separated out into component parts to be built individually. The programmer took this information and went to work building a prototype.  There was very little communication between the project team and the programmer for several months while the prototype was being built. The programmer presented the INDI prototype to the committee in late March 2007 and the committee was satisfied enough with the results to recommend internal testing. The prototype included the three core INDI tools and the Acquisition and Accession module.

Several curators and students were assigned to test the prototype vigorously between April 2007 and August 2007. The curators and students testing the prototype identified several minor bugs which were reported to the programmer and fixed. In August 2007 the INDI committee recommended that the prototype be released for production use in the Perry Special Collections and that planning begin for the programming of the Arrangement and Description module.

Although not perfect, the project management process that the INDI committee followed was extremely useful and helped the project team successfully create enough documentation for the programmer to begin building the INDI system. The process of defining the entire project and its component parts enabled team members to gain a better understanding of what we needed from an archival management system. The process utilized to define the INDI project has been adapted for use by another ongoing technology project in the Perry Special Collections. The critical problem with INDI was not process-based but rather technology-based, a fact that wasn’t discovered until after the INDI prototype had been released.

System Features

At the time of its implementation in the Perry Special Collections, most of the core functionality of the system was complete. This included three main tools: a contact manager, a task manager, and a search application. These were built as plug-ins to the underlying Rich Internet Application (RIA) user interface (UI) used to develop the system. The system also took advantage of other features of the UI, though these were not the focus of our development work.

At the base of the system is sierra-php[3], a PHP framework developed by our programmer prior to his employment at the library. Sierra-php is an open source project that provides INDI with Web services, authentication, database abstraction, and other features. It also provides workflow management functionality, using XML descriptor files. These descriptions (as shown in Sample 1) included data entry, data validation, and task assignment based on decision points in the workflow.

<?xml version="1.0" encoding="ISO-8859-1"?>
<workflow due-date="+0-+0-+30" notify-from="" resource="preArrangement.type" resources="etc/plugins/indi/l10n/pre-arrangement etc/plugins/indi/l10n/indi" role-attr="name" role-attr-members="members" role-entity="OsGroup" start="startPreArrangement" user-attr="userName" user-attr-displ="name" user-attr-email="email" user-entity="OsUser" user-key="user" user-key-type="global">
  <!-- acquisition information -->
  <step key="accessionInfo" entity="Accession" entity-id="accession" next="commitAccession" resource="step.accessionInfo">
    <decision next="bocReviewForm">
      <!-- boc approved !== TRUE -->
      <constraint attr="bocApproved" operator="832" />
      <!-- boc approval applies only to byu -->
      <constraint entity-id="collection" attr="institution" value="byu" />
      <!-- criteria for boc approval -->
      <constraint-group connective="or">
        <!-- changes to the contract -->
        <constraint attr="changesToContract" operator="64" />
        <!-- value gt $1000 -->
        <constraint attr="value" operator="2" value="1000" />
        <!-- size is gt 10 linear feet AND material type is not UA -->
        <constraint-group connective="and">
          <constraint attr="linearFeet" operator="2" value="10" />
          <constraint attr="type" entity-id="collection" operator="257" value="ua" />
        </constraint-group>
      </constraint-group>
    </decision>

    <task resource="task.descriptiveSummaryForm" validate="1" view="preArrangementAccessionForm" />
    <task resource="task.acqInfoForm" validate="preArrangementAcquisitionForm" view="preArrangementAcquisitionForm">
      <![CDATA[
      // first assign collection to entity
      $data = array();
      $data['collection'] =& $collection;
      return TRUE;
      ]]>
    </task>
    <task resource="task.mergeAppraisal">
      <![CDATA[
      // merge appraisal data into the accession when an appraisal exists
      if ($appraisal) {
        SRA_Util::mergeObject($accession, $appraisal);
      }
      return TRUE;
      ]]>
    </task>
  </step>
</workflow>

Sample 1. XML description of the Acquisition and Accession Workflow.
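Read as boolean logic, the decision element in Sample 1 appears to route an accession to the BOC review form when approval has not yet been given, the collection belongs to BYU, and at least one of the listed criteria holds. The sketch below restates that reading in Python. It assumes the comments in the sample describe the constraints accurately (the numeric operator codes are sierra-php internals and are not decoded here); the function name and dictionary keys are illustrative and are not part of sierra-php.

```python
def needs_boc_review(accession, collection):
    """Return True when the accession should be routed to the BOC review form.

    Mirrors the comments in Sample 1; it does not interpret the numeric
    operator codes used by the real descriptor.
    """
    if accession.get("bocApproved") is True:
        return False  # already approved, so no review is needed
    if collection.get("institution") != "byu":
        return False  # approval applies only to BYU collections
    # Any one of the following criteria triggers review.
    return bool(
        accession.get("changesToContract")
        or accession.get("value", 0) > 1000
        or (accession.get("linearFeet", 0) > 10 and collection.get("type") != "ua")
    )
```

Expressing the rule declaratively in the workflow descriptor, rather than in application code like this, is what allowed the workflow engine to evaluate decision points directly against the accession data.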

Descriptive XML files were also used to define the plug-in application data model and to generate forms for data entry (see Sample 2).

<view key="preArrangementAccessionForm" template="model/sra-grid.tpl">
      <param id="class" type="table-attrs" value="indiFormTable"/>
      <param id="colSetLabelClass" type="col-set-config" value="myContactsHelpLink"/>
      <param id="colSetLabelOnclick" type="col-set-config"
        value="Core_HelpManager.load('indi', 'AccessioningHelp', '[entity].[attr]')"/>

      <param id="colSetFormat" type="col-set-config1-0" value="horz"/>
      <param id="title" type="col-set1-0" value="input"/>
      <param id="collectionArea" type="col-set1-0" value="input"/>

      <param id="colSetFormat" type="col-set-config2-0" value="horz"/>
      <param id="colSetWidth" type="col-set-config2-0" value="2"/>
      <param id="creators" type="col-set2-0" value="input"/>

      <param id="colSetFormat" type="col-set-config3-0" value="horz"/>
      <param id="linearFeet" type="col-set3-0" value="input"/>
      <param id="volumes" type="col-set3-0" value="input"/>

      <param id="colSetFormat" type="col-set-config4-0" value="horz"/>
      <param id="spanDateStart" type="col-set4-0" value="input"/>
      <param id="spanDateEnd" type="col-set4-0" value="input"/>

      <param id="colSetFormat" type="col-set-config5-0" value="horz"/>
      <param id="spanDateStartFormat" type="col-set5-0" value="input"/>
      <param id="spanDateEndFormat" type="col-set5-0" value="input"/>

      <param id="colSetFormat" type="col-set-config6-0" value="horz"/>
      <param id="bulkDateStart" type="col-set6-0" value="input"/>
      <param id="bulkDateEnd" type="col-set6-0" value="input"/>

      <param id="colSetFormat" type="col-set-config7-0" value="horz"/>
      <param id="bulkDateStartFormat" type="col-set7-0" value="input"/>
      <param id="bulkDateEndFormat" type="col-set7-0" value="input"/>

      <param id="colSetFormat" type="col-set-config8-0" value="horz"/>
      <param id="colSetWidth" type="col-set-config8-0" value="2"/>
      <param id="materialCondition" type="col-set8-0" value="input"/>

      <param id="colSetFormat" type="col-set-config9-0" value="horz"/>
      <param id="colSetWidth" type="col-set-config9-0" value="2"/>
      <param id="intellectualDescription" type="col-set9-0" value="input"/>

      <param id="colSetFormat" type="col-set-config10-0" value="horz"/>
      <param id="colSetWidth" type="col-set-config10-0" value="2"/>
      <param id="restrictions" type="col-set10-0" value="input"/>
</view>

Sample 2. XML Definition of Accession Data Form

This framework is overlaid by sierra-os[4], a PHP-based RIA UI built as part of the INDI development process. Sierra-os provides user management functions to the system, as well as windowing and search functionality. Sierra-os was designed to allow developers to build applications as plug-ins, which could both extend system features and leverage existing functionality. The system features a core plug-in, which includes a federated search tool, spell checker, help manual, and Unix-style terminal (see Figure 2). INDI itself was built as a plug-in application to sierra-os.

Figure 2. Terminal component of the sierra-os core plug-in


The first of the tools included in the INDI plug-in was INDI Desktop, designed for searching and displaying accession data. This tool works independently of the federated search, and queries information associated with an accession rather than with associated projects. INDI Desktop allows for fielded searches, as well as general keyword and saved searches (see Figure 3). As part of its display, the application provides summary information together with the status of pending tasks.

Figure 3. INDI Desktop search results

The second tool, MyProjects, is a project management tool that tracks and provides access to the tasks defined in a workflow. In this project management approach, each workflow is treated as a project and managed under that paradigm. Tasks are assigned to users sequentially as defined in the XML workflow descriptor, and completion time is tracked. MyProjects provides views of project information, including discussions, tasks, and files associated with projects. Its dashboard view tracks latest activity, as well as tasks that are upcoming or late (see Figure 4). While data entry forms and the information they hold are managed separately by the system, MyProjects provides users access to them.
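As a rough illustration of the kind of bookkeeping a dashboard like this performs, the following sketch sorts open tasks into late and upcoming groups by due date. The task fields, the function name, and the seven-day horizon are assumptions made for the example; the rules MyProjects actually applied are not documented in this article.

```python
from datetime import date, timedelta


def classify_tasks(tasks, today, horizon_days=7):
    """Split open tasks into 'late' and 'upcoming' lists by due date."""
    late, upcoming = [], []
    for task in tasks:
        if task["completed"]:
            continue  # finished tasks do not appear on the dashboard
        if task["due"] < today:
            late.append(task["name"])
        elif task["due"] <= today + timedelta(days=horizon_days):
            upcoming.append(task["name"])
    return {"late": late, "upcoming": upcoming}
```

Open tasks due beyond the horizon are deliberately left out of both lists, so the dashboard shows only work that needs attention soon.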

In order to further improve access to project data, the MyProjects tool also includes search and serialization services. Basic and advanced search capabilities are included, allowing users to complete keyword and fielded searches of project data. Users may also save search queries, which can then be used to generate RSS and iCalendar feeds.

Figure 4. MyProjects Tool


The third tool included in the INDI plug-in is MyContacts, a contact management tool. Based on the vCard standard, and supplemented with fields from the Encoded Archival Context (EAC) beta, MyContacts manages information about individuals and corporate bodies that can then be associated with various projects (see Figure 5). Contacts can also be associated with groups, either for management by the user or based on system linkages. The system also allows users to define relationships between contacts, creating a rich web of references between entries.
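For readers unfamiliar with vCard, the sketch below shows what a minimal contact record looks like when serialized. It assumes vCard 3.0 (RFC 2426), which the article does not specify, and it omits the EAC-derived fields and contact relationships that MyContacts added; the function names and sample values are illustrative only.

```python
def escape(value):
    """Escape characters that are special in vCard text values (RFC 2426)."""
    return (
        value.replace("\\", "\\\\")
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\n", "\\n")
    )


def to_vcard(family, given, org=None):
    """Serialize a person, optionally tied to a corporate body, as vCard 3.0."""
    lines = [
        "BEGIN:VCARD",
        "VERSION:3.0",
        # N is structured: family; given; additional; prefixes; suffixes
        f"N:{escape(family)};{escape(given)};;;",
        f"FN:{escape(given)} {escape(family)}",
    ]
    if org:
        lines.append(f"ORG:{escape(org)}")
    lines.append("END:VCARD")
    # vCard content lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"
```

A call such as `to_vcard("Doe", "Jane", org="Example Archives")` yields a record that standard address-book software can import, which is one practical benefit of basing a contact manager on an existing standard.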

Figure 5. View of vCard data in the MyContacts tool

Using the functionality of the INDI plug-in tools, we implemented two workflows: appraisal, and acquisition and accession. The appraisal workflow includes tasks involved in assessing potential acquisitions, such as documenting collection information and appraisal decisions. The acquisition and accession workflow walks users through the process of acquiring materials and documenting their arrival at the institution. Planning had been completed for an arrangement and description workflow, and preliminary plans were in place for managing other related archival activities (collection management, digitization, conservation, etc.).

Project Termination

As the core development of INDI was coming to a close, the project began to encounter problems. In September 2007, our programmer announced that he was taking another position effective the end of October and that he would be leaving the Lee Library IT staff. A new programmer was hired quickly and worked closely with the departing programmer for about a month to learn how to continue the INDI development. Together they built the appraisal workflow, which was tested and approved by the INDI project team. Unfortunately, it quickly became apparent after the original INDI programmer left in late October that the INDI system was so complex that the new programmer would need six to nine months to begin to understand how to support the system, and it wasn’t clear whether the programmer would be able to continue development utilizing the existing INDI framework. Between November 2007 and March 2008, the INDI committee and the staff of the Perry Special Collections identified several bugs in the INDI system and reported them to the programmer. The programmer spent a lot of time trying to fix these bugs, and on several occasions the library IT unit had to contract with the previous programmer to fix them. Very little time was spent on programming additional components of the INDI system.

By the end of April 2008, because the new programmer was unable to further develop the existing system, our IT department recommended halting work on the existing code base and beginning planning for a replacement. The existing INDI system would be maintained until the new application was complete and the data was migrated. Nevertheless, to meet our original project goals, the committee completed basic user documentation for the existing system, and the source code for the project was made available to the archival community in August 2008.[5]

Lessons Learned

During the course of developing INDI, the most significant lesson learned by the project team was the need for adequate documentation throughout the development process. When the project was begun the committee members had never been involved in a large-scale software development project, and did not understand the level of documentation needed to support that effort long-term.

On the committee’s side, critical documentation to guide the programmer’s efforts was lacking. The initial design documents primarily consisted of the project definition and workflow diagrams. No functional requirements document or software requirements specification was provided, leaving the programmer unclear about which system features were needed to meet the project’s goals. This directly contributed to the second programmer’s inability to extend the application as the committee hoped, as critical features were not included in the underlying framework.

After the departure of our initial programmer, it became clear that documentation for the existing code base was inadequate as well. While the decision to build INDI on our programmer’s own PHP framework reduced development time significantly early on, the lack of accessible documentation for the system hindered further development when he left the project. In order to learn the system, the new programmer had to rely on the code itself, reading in-line comments and XML DTDs to understand the system’s structure and functions.

It also became clear that there was a greater need for communication between the different parts of the project team. Throughout the project there were disagreements over the development schedule, which were compounded by a lack of communication between the IT staff and the subject specialists. Decisions about the development of the system, such as the use of sierra-php and the temporary use of a proprietary XSL-FO processor, were also made by programming staff without consultation with the rest of the committee. While these development choices were clearly within their purview, the ramifications of these decisions ultimately undermined the project. An open discussion with the committee about how the system was being developed might have led to a different solution.

Developing the INDI system also helped us learn the importance of design in software development. In the course of user testing of the completed INDI system, it became clear that there were many aspects of the interface design that our staff found confusing. Much of this came from the project management concepts used throughout the application, and the flexibility that it provided. In the end, the system did far more than our staff wanted or needed it to do.

Finally, we came to understand the need to plan for sustainability. Throughout the development cycle concerns about the long-term viability of the project often intruded in discussions within the committee. These included the need to plan for platform upgrades, security patches, and the addition of new features. In spite of the fact that these issues were discussed, solutions were never agreed upon and these longstanding questions about the long-term maintenance of code generated during the project contributed directly to the ultimate suspension of development work.


While the development of the INDI system as an archival management application was unsuccessful, the process has been a learning experience for all involved. It has led to a deeper understanding of archival processes and descriptive standards among the subject specialists that served on the committee, and has led to the adoption of development and documentation standards by our Library Information Technology division. It has also resulted in a greater integration between the archives and IT staff within the library as we have come to understand each other’s work better.

As the INDI project came to a close, we began planning for its replacement. We began by developing the functional requirements documents and software requirements specifications that had been missing from the original INDI project, as well as building case studies, refining workflow models, and building a new data model. However, with the recent economic downturn the committee has decided to suspend development work.

In its place we are planning to adopt a two-application solution, using separate workflow management and archival management tools. With a two-system solution, information about the processes would be managed apart from the archival information. In this scenario, a process such as accessioning or processing would be initiated in a business process management (BPM) application such as ProcessMaker or Intalio BPM. This application would assign tasks to archives staff, and could notify subsequent users of pending tasks. But to complete the assigned tasks, staff would need to use a separate archives management tool, such as the Archivist’s Toolkit or ICA-AtoM.[6] We are currently involved in a review of existing products, using our documentation as selection and testing criteria.

Although this configuration may appear workable, it is far from optimal. By maintaining the data in separate silos, the BPM system would be unable to automatically process decision points in the workflow as the information needed to make these decisions would be held independently in the archival management system. In order for the BPM application to complete a process, information about the archival materials would need to be entered into that system, leading to data redundancy and constant rekeying of descriptive information. While it may be possible to integrate these two systems using Web services or APIs, current archival management systems have not implemented such interfaces.

While dividing the workflow from the data is not ideal, there is still no tool available that combines the two for archival management. In other industries system designers are beginning to integrate BPM software into applications to manage workflow.[7] In the library environment, the development of the Web interface for Resource Description and Access (RDA), the proposed replacement of the Anglo-American Cataloging Rules, features workflows as a central component in the use of the proposed standard. The INDI Project prototype has demonstrated the power and flexibility provided to archivists when business process management is integrated into archival management systems. It is our hope that the developers of the archival management systems currently available will consider how to integrate BPM/workflow management into their systems.

About the Authors

J. Gordon Daines III is the Brigham Young University Archivist and Assistant Department Chair, Manuscripts in the Perry Special Collections at Brigham Young University. He holds a master’s degree in history from the University of Chicago and received archives and records management training at Western Washington University.

Cory L. Nimer is manuscripts cataloger and metadata specialist for the Perry Special Collections at Brigham Young University. He holds a master’s degree in history from Sonoma State University, and a master’s in library and information science from San José State University.


  1. Brigham Young University Hawaii in Laie, Hawaii and Brigham Young University-Idaho in Rexburg, Idaho.
  2. Information about the Archivist’s Toolkit project is available online at
  3. Additional information about sierra-php is available on-line at
  4. Additional information about sierra-os is available on-line at
  5. The source code for the INDI plug-in is available on-line at
  6. Information on the ICA-AtoM project is available on-line at
  7. One example is the integration of ProcessMaker (BPM software) into KnowledgeTree (document management software).

ISSN 1940-5758