Preservation

From Digital Libraries - Fall 2009

Contact & Liaison Information

Name      Role    Contact
Liz       Chair   Elizabeth.Knuth@simmons.edu
Danny             Pucci@simmons.edu
Matthew           Matthew.Baker@simmons.edu
Sarah             Sarah.Burke@simmons.edu


White Paper

Preservation White Paper

Preservation Metadata Sheet

Preservation Metadata (to be filled out during scanning)

Naming the Preservation Metadata document

When naming a preservation metadata file, use the same file name you used for the JPEG/TIFF image (per digitization), but DELETE the ".jpg" or ".tif" extension and APPEND "-pres.doc" to the end.

Example
helyar-57-0-pres.doc
  • Remember, you need to create a Preservation Metadata sheet for every TIFF image you create (but not for JPEG images).
  • Please be sure to remove/replace the items in RED so that we know the sheet is complete.
  • Return your completed forms to elizabeth.knuth@simmons.edu. The subject line of your email should say “462 Preservation Metadata [name of operator].”
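The naming rule and the email subject line above can be sketched in a few lines of Python. This is only an illustration, not part of the committee's workflow: the function names are hypothetical, the operator name is made up, and accepting ".jpeg" and ".tiff" alongside ".jpg" and ".tif" is an assumption.

```python
from pathlib import PurePath

# Extensions treated as image files. The instructions name only .jpg and .tif;
# .jpeg and .tiff are included here as an assumption.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".tif", ".tiff"}


def preservation_sheet_name(image_name: str) -> str:
    """Drop the image extension and append "-pres.doc" (hypothetical helper)."""
    path = PurePath(image_name)
    if path.suffix.lower() not in IMAGE_SUFFIXES:
        raise ValueError(f"not a JPEG/TIFF file name: {image_name!r}")
    return f"{path.stem}-pres.doc"


def email_subject(operator: str) -> str:
    """Build the subject line requested for returned forms (hypothetical helper)."""
    return f"462 Preservation Metadata {operator}"


print(preservation_sheet_name("helyar-57-0.tif"))  # helyar-57-0-pres.doc
print(email_subject("Jane Doe"))                   # 462 Preservation Metadata Jane Doe
```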

Meetings

Minutes from Meeting, 10/1

Liaised with digitization committee to discuss the finalized template and scanning instructions for preservation purposes, to ensure they were in line with the digitization committee's plans.

  • Question as to how Simmons Archive is to be recorded under "object identifier (type)"; Marta will send email to Simmons Archive to determine if they have a nationally recorded repository number. If not, identifier will be "Simmons Helyar" per digitization team.

Finalized specifications for digitization and preservation metadata template for distribution to scanner operators.

  • Scanning specifications will be: RGB, 24-bit (8 Bits per Channel in Photoshop), 600 dpi saved as uncompressed TIFF.
  • Decided that we will not record Gamma correction and color calibration in preservation metadata as they do not appear to be applicable in the Simmons Archive scanning equipment setup.
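As a rough sketch, the agreed scanning specifications could be written down as a small record and used to check an operator's reported settings. Everything here other than the values themselves (RGB, 8 bits per channel, 600 dpi, uncompressed TIFF) is an assumption for illustration: the class, field names, and function are hypothetical and do not come from the metadata template.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ScanSettings:
    """Hypothetical record of the settings an operator used for one scan."""
    color_mode: str
    bits_per_channel: int
    resolution_dpi: int
    file_format: str
    compression: str


# Values from the minutes: RGB, 24-bit (8 bits per channel), 600 dpi,
# saved as uncompressed TIFF.
ARCHIVAL_SPEC = ScanSettings(
    color_mode="RGB",
    bits_per_channel=8,
    resolution_dpi=600,
    file_format="TIFF",
    compression="none",
)


def spec_violations(settings: ScanSettings) -> list[str]:
    """Return one message per field that differs from the archival spec."""
    problems = []
    for field in fields(ScanSettings):
        expected = getattr(ARCHIVAL_SPEC, field.name)
        actual = getattr(settings, field.name)
        if actual != expected:
            problems.append(f"{field.name}: expected {expected!r}, got {actual!r}")
    return problems


# A 48-bit scan (16 bits per channel) would be flagged:
print(spec_violations(ScanSettings("RGB", 16, 600, "TIFF", "none")))
# ['bits_per_channel: expected 8, got 16']
```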

Finalized technical metadata template:

  • Completed metadata template; will send to digitization committee with instructions to have scanner operators complete form electronically and return electronic form to Liz K.
  • Email subject title should be: "462 Preservation Metadata [name of operator]"
  • Final form of metadata specs includes 24 semantic elements, with 10 items to be entered or confirmed by the scanner operator. This will be sent to the digitization committee for incorporation into scanning instructions (which should include technical details about where the information can be found in Photoshop).
  • Discussed "source type" element in preservation metadata; decided to record content as "scrapbook, reflective material"

Finalized "practical" scanning specs:

  • Maintaining the intellectual integrity of the original artifact is an important principle for any reformatting project.
  • With regard to scanning, we need to be sure to crop out the scanner bed but not any of the object's edges (i.e., leave a small margin around the object).
  • Basically, for the archival copy we want to make images that are as much like the physical artifact as possible, with no enhancements or edits. DO NOT make adjustments/enhancements at time of scanning.
  • Scan all items unfolded UNLESS an item's folded structure is one that would be compromised and could not be reconstructed (e.g., if it's an origami crane, can the scanner operator refold it to its original state?).
  • Loose items should be scanned front and back, when possible without removing from page (don't unanchor the original item if it is affixed to page and removal would be irreversible).
  • When relevant, items should be scanned both folded (as on scrapbook page) and unfolded (off page if possible).
  • It is important to remember that we can't predict what information will be useful to future users.

Minutes from Meeting, 9/24

Discussed requirements/recommendations to present to digitization team.

After conversation with digitization team, determined we ONLY need to provide requirements for the archival master file (any adjustments that team describes to class will be included in digitization instructions).

We determined we will give digitization team TECHNICAL (resolution, etc.) and PRACTICAL (preserving image’s intellectual integrity) requirements.

Conversations around TECHNICAL included: Archival community seems to prefer to store 24-bit RGB TIFF files with at least 5000 pixels on the long dimension (no manipulation); to inform our recommendations, we conducted a scanner test in the archive (see below).

Conversations around PRACTICAL included:

  • Maintenance of intellectual integrity of the objects created (e.g., don’t do things you think will make it look better for public; retain original LOOK of item even if dark, or oddly colored).
  • Cropping: Crop out scan bed, but leave small border around scrapbook page, preserve WHOLE item, including ragged edges, etc.
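The cropping rule can be pictured as a small calculation: start from the bounding box of the whole item (ragged edges included), grow it by a margin, and never go past the edge of the scan. The sketch below is illustrative only; the function name, the pixel coordinates, and the 150-pixel margin are assumptions, and in practice the operator judges the crop by eye.

```python
def crop_box(item_box, scan_size, margin_px):
    """Expand the item's bounding box by a margin, clamped to the scan area.

    item_box  -- (left, top, right, bottom) of the whole item, in pixels
    scan_size -- (width, height) of the full scan, in pixels
    margin_px -- border to leave around the item (hypothetical value)
    """
    left, top, right, bottom = item_box
    width, height = scan_size
    return (
        max(0, left - margin_px),
        max(0, top - margin_px),
        min(width, right + margin_px),
        min(height, bottom + margin_px),
    )


# Made-up numbers: an item occupying most of a 5100 x 7000 pixel scan.
print(crop_box((100, 200, 5000, 6000), (5100, 7000), 150))
# (0, 50, 5100, 6150)
```

Because the box only ever grows from the item's own bounds, the crop always contains the whole item.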

Also determined we should have each person keep a record of technical metadata; preservation team to provide form for completion at time of scanning (some info will be pre-populated/boilerplate).

  • To be determined where technical metadata is ultimately stored, but preservation metadata file SHOULD travel with archival TIFF image; likely save metadata as PDF (not Word or Excel).


Another detail: We learned that the Simmons Archives saves these archival images on CDs kept in cold storage; could we recommend that they use gold CDs? CDs are not very stable.

Scan test in Simmons Archive

Note that the complete scanner specs are available at: http://www.epson.com/cgi-bin/Store/consumer/consDetail.jsp?BV_UseBVCookie=yes&infoType=Specs&oid=17065&category=Products

Tested at 24-bit (8 bits per channel) RGB and 48-bit (16 bits per channel) RGB, both at 600 ppi (we wanted to end up with at least 6000 pixels on the long dimension when the scan is complete).

Test at 24-bit produced a ~100 MB file and 48-bit a ~200 MB file. In the test, we could not detect a noticeable image quality difference between the two files, but there could be a noticeable difference in a continuous-tone photo image (we tested an inkjet print, not a continuous-tone photo).
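The file sizes in the test are roughly what uncompressed pixel arithmetic predicts, and doubling the bit depth doubles the size. The sketch below assumes a hypothetical 9 x 10 inch scan area, since the actual dimensions of the test scan were not recorded here; the function names are also made up for illustration.

```python
def pixels(length_in, ppi):
    """Number of pixels along one dimension at a given resolution."""
    return round(length_in * ppi)


def uncompressed_bytes(width_in, height_in, ppi, bits_per_pixel):
    """Size of the raw pixel data, ignoring TIFF header overhead."""
    return pixels(width_in, ppi) * pixels(height_in, ppi) * bits_per_pixel // 8


# A 10-inch long dimension at 600 ppi gives the 6000 pixels we wanted.
print(pixels(10, 600))  # 6000

# Hypothetical 9 x 10 inch scan area at 600 ppi.
for depth in (24, 48):
    size = uncompressed_bytes(9, 10, 600, depth)
    print(f"{depth}-bit: {size / 1_000_000:.1f} MB")
# 24-bit: 97.2 MB
# 48-bit: 194.4 MB
```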

To Do

  • Create TECHNICAL and PRACTICAL requirements for digitization team (team should use wiki to begin to compile list).
  • Come up with a list of preservation metadata we want to capture; create a form for members to complete at time of scanning, separate from the descriptive metadata form that will be completed post-scan with files in hand (Danny has test scans and will create a table with boilerplate info from scan metadata).

Technical Specifications from NARA Document

Images from: NARA -- Technical Guidelines for Digitizing Archival Materials for Electronic Access: Creation of Production Master Files – Raster Images (Note that the LC doc "Technical Standards for Digital Conversion Of Text and Graphic Materials" is similarly useful.)

Image:TechSpecsNara.jpg

Technical Metadata example from NARA Document

Image:TechMetadata.jpg

Developing bullet points for TECHNICAL and PRACTICAL requirements

TECHNICAL

  • we will recommend to the digitization team a particular set of requirements for the archival images (as opposed to the various formats of use/access images)
  • we will be scanning in color (RGB)
  • bit depth, resolution, file type are some of the major considerations; we are pretty settled on TIFF as a file type and are still deciding about other requirements
  • decisions will be based on best practices suggested by other institutions AND by the capabilities of the scanner in the Simmons Archives


PRACTICAL

  • Maintaining the intellectual integrity of the original artifact is an important principle for any reformatting project.
  • With regard to scanning, we need to be sure to crop out the scanning bed but not any of the object's edges.
  • Loose items should be scanned front and back, when possible.
  • When relevant, items should be scanned both folded and unfolded, both closed and open, etc.
  • Basically, for the archival copy we want to make images that are as much like the physical artifact as possible, with no enhancements or edits.
  • It is important to remember that we can't predict what information will be useful to future users.