Arizona State Library, Archives and Public Records
Using Automated Business Rules to
Process Electronic Records
PeDALS: Persistent Digital Archives and Library System
►Arizona State Library, Archives and Public Records
►Florida State Library and Archives
►South Carolina Department of Archives & History
►South Carolina State Library
►New York State Library
►New York State Archives
►Wisconsin State Archives
►Two more partners
►Kudos to Washington State Archives
Curatorial Rationale
►Question traditional, paper-based practices in order to transform them into the digital era
Appraisal
Acquisition
Arrangement and description
Housing and storage
Reference and access
Preservation
►Preserving archival principles of provenance, context, collective control, and authenticity and integrity
Technical Goals
►To demonstrate the use of middleware to implement business rules in software as an integrated workflow to process collections of records and publications
►To build “digital stacks” using LOCKSS as the basis of an inexpensive storage network that can preserve the authenticity and integrity of the materials
Additional Goals
►To build a community of shared practice that meets the needs of a wide range of repositories
For best practices ~ appropriate practices, what works, what’s practical
For resource sharing ~ avoid redundant work
►To remove barriers to preservation by keeping costs as low as possible
Preliminary Results: A New Research Agenda
►Vulcan mind meld
Immediate understanding, no confusion
►Cloning
No time wasted in job search, known results
►Time travel
All the time you need while meeting deadlines
PeDALS at 50,000 feet
►Based on OAIS Reference Model
►Metadata
Transforms and normalizes received metadata
Enhances received metadata
►Archival Information Package
Creates and stores in LOCKSS
►Dissemination Information Package
Creates and publishes to the web
Appropriate Record Sets
►Ideal scenario
Created, stored in a recordkeeping system
Indexed
►Likely to succeed
Certificates
Email
Indexed documents
►Less likely to succeed
Hard drives with no index
►Sufficient number and consistency to allow rules
Curatorial Rationale
►Focus on why, not just how
►Strategic shift in how we work
Not limited to doing things differently
Doing different things
►Curators work with rules, not records
Describe business processes (rules)
Monitor the process for quality assurance
Metadata and Queries
►Single schema
Administration, discovery, preservation
►Elements common to all government records
Definition and cross-walks
Rationale
►What is it
►Who uses it
►For what purpose
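A minimal sketch of how a cross-walk into a single schema might look in code. The field names (`doc_title`, `filed_on`, `item_title`, `date`) and the `normalize` helper are hypothetical, chosen only for illustration; they are not the PeDALS core metadata elements.

```python
# Hypothetical cross-walk: received field name -> common schema element name.
CROSSWALK = {
    "doc_title": "item_title",
    "filed_on": "date",
}


def normalize(received: dict, crosswalk: dict) -> dict:
    """Map received metadata onto the common schema.

    Fields with no cross-walk entry are kept separately so that nothing
    received from the producer is silently dropped.
    """
    normalized = {}
    unmapped = {}
    for field, value in received.items():
        # Trim stray whitespace on string values; leave other types alone.
        cleaned = value.strip() if isinstance(value, str) else value
        target = crosswalk.get(field)
        if target is None:
            unmapped[field] = cleaned
        else:
            normalized[target] = cleaned
    return {"normalized": normalized, "unmapped": unmapped}
```

Keeping the cross-walk as data rather than code means a curator could adjust the mapping for a new record set without touching the processing logic.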
Example: Item Title
►Definition: The word or phrase, taken from a prescribed source, by which a work is known
►Rationale: Serves as a "handle" to represent the object at an abstract level in lists, such as search results. A supplied title should contain sufficient information to aid patrons in the selection of materials. Because date is preferred and included in search results by default, the title need not include date information.
Integrated Workflow
Creation
►Prepare Submission Information Package
Extract records for transmission
Extract metadata
Create shipping manifest
►Negotiation
File formats for records, metadata, shipping manifest
Transfer methodology
Frequency of transfer
Description
►Traditional archival description
Provenance
Series
Acquisition
►Rules-based description
Metadata mapping
AIP schema
DIP schema
Submission
►Transfer
sFTP, disk, tape, sneakernet
Deposit on Point of Ingest server
►Data wrangling
Virus scan
Normalize process during initial transfers
Run manual processes to prep data
Create AIP
►Simple schema for single files
Normalized metadata
Received metadata
Record (typically Base64 encoding)
►Compound schema for multiple files
Normalized metadata
Received metadata
Structural metadata
Files (typically Base64 encoding)
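The simple, single-file package described above can be sketched as follows. The XML element names (`aip`, `normalizedMetadata`, `receivedMetadata`, `record`) and the `build_simple_aip` function are assumptions made for this illustration; the actual PeDALS AIP schema is not shown in these slides.

```python
import base64
import xml.etree.ElementTree as ET


def build_simple_aip(normalized: dict, received: dict,
                     record_bytes: bytes, filename: str) -> str:
    """Wrap one record and its metadata in a single XML document.

    Element names here are hypothetical, not the PeDALS schema.
    """
    aip = ET.Element("aip")

    # Normalized metadata: values already mapped to the common schema.
    norm_el = ET.SubElement(aip, "normalizedMetadata")
    for name, value in normalized.items():
        ET.SubElement(norm_el, "element", name=name).text = value

    # Received metadata: kept as delivered by the producer.
    recv_el = ET.SubElement(aip, "receivedMetadata")
    for name, value in received.items():
        ET.SubElement(recv_el, "element", name=name).text = value

    # The record itself, Base64-encoded so binary content is XML-safe.
    record_el = ET.SubElement(aip, "record", filename=filename, encoding="base64")
    record_el.text = base64.b64encode(record_bytes).decode("ascii")

    return ET.tostring(aip, encoding="unicode")
```

A compound package would add a structural metadata section and one Base64-encoded element per file; that extension is not sketched here.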
Ingest
►Update administrative catalog
►Encapsulate AIPs in Superpackage
►Expose to LOCKSS
Automatic integrity checking
Automatic error correction
Distributed preservation model
Sustainable business model
Inexpensive
Testing a 16TB system
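The idea behind integrity checking and error correction across distributed copies can be illustrated with a toy majority vote on checksums. This is a simplified stand-in, not the LOCKSS polling protocol, which is considerably more elaborate; the `audit_and_repair` function and the peer names are hypothetical.

```python
import hashlib
from collections import Counter
from typing import Dict


def audit_and_repair(copies: Dict[str, bytes]) -> Dict[str, bytes]:
    """Compare copies held by several peers and repair any that disagree.

    Toy illustration only: the copy whose SHA-256 digest is held by a
    strict majority of peers is treated as correct and replaces the rest.
    """
    digests = {peer: hashlib.sha256(data).hexdigest()
               for peer, data in copies.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]

    # Without a strict majority there is no safe basis for repair.
    if votes <= len(copies) / 2:
        raise ValueError("no majority agreement; manual review needed")

    good = next(data for peer, data in copies.items()
                if digests[peer] == winner)
    return {peer: good for peer in copies}
```

With three peers, one damaged copy is outvoted and overwritten; with only two peers that disagree, the function refuses to guess.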
Dissemination
►Create DIP
Browser friendly format
►Update public catalog
►Publish to website
Simplification
►Community of shared practice
Many hands make light work
Resource sharing
Support network
►Generic, modular processes
Code reuse
►Standard schema
Catalog databases
Packages
Automated Processing
►Open source v. proprietary software
►Middleware
Microsoft BizTalk
►Metadata tools
New Zealand Metadata Extractor
BagIt file transfer validation
►Agile-Scrum project management methodology
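Transfer validation in the BagIt style rests on a manifest that lists a checksum and a relative path for each payload file. The sketch below checks only that payload manifest using the Python standard library; it is not a full BagIt validator, and the `verify_manifest` function is a hypothetical helper, not part of any BagIt tool.

```python
import hashlib
from pathlib import Path


def verify_manifest(bag_dir: str,
                    manifest_name: str = "manifest-sha256.txt") -> list:
    """Return the relative paths that are missing or fail their checksum.

    Each manifest line is expected to hold a SHA-256 hex digest, then
    whitespace, then a path relative to the bag directory.
    """
    bag = Path(bag_dir)
    failures = []
    manifest_text = (bag / manifest_name).read_text(encoding="utf-8")
    for line in manifest_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, rel_path = line.split(None, 1)
        target = bag / rel_path
        if not target.is_file():
            failures.append(rel_path)
            continue
        actual = hashlib.sha256(target.read_bytes()).hexdigest()
        if actual != expected.lower():
            failures.append(rel_path)
    return failures
```

An empty list means every file named in the manifest arrived intact; anything else can be reported back to the producer before ingest continues.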
Project Status – Completed
►Technical infrastructure installed
►Core metadata defined
►Schema for a simple AIP
►Developed administrative catalog
►AZ marriage certificates ingested, metadata transformed and created, packaged as AIPs, and deposited in LOCKSS
►Demonstrated reuse of code by adapting rules for marriage certificates to SC Public Service Commission orders
Project Status – To Do
►Complete Administrative Catalog Interface
►Develop AIP for compound records
►Develop DIP
►Develop Public Catalog web interface
►Write rules to ingest additional records and publications