Copy of DigiFair_draft


Keri Thompson

on 9 September 2010

Transcript of Copy of DigiFair_draft

Developing Metadata Collection and Workflow Software for In-House Mass Digitization of Texts

Biodiversity Heritage Library (BHL): a consortium of 12 natural history museum libraries,
botanical libraries, and research institutions
organized to digitize, serve, and preserve
the legacy literature of biodiversity.

84,000 items / 31 million pages scanned in 2 years
11,560 items / 4.5 million pages scanned by SIL
46 TB of files generated so far (hi-res images are JPEG 2000)

Descriptive metadata supplied by partner libraries
Scanning is (mostly) done by the Internet Archive
Digitized books are available at archive.org/details/biodiversity
BHL has its own interface hosted at Missouri Botanical Garden

"Boutique" Scanning Operation
one hi-res scanning back camera on a copy stand
one person enters title level metadata and administrative workflow information *manually* into an Access database
one person manually creates page level descriptive and structural metadata in Excel

This works fine for a small book or a few pages of a larger book
But what if you want to scan 800-1200 pages a day?
If only it were that easy.

New system and workflow requirements
keep system modular so it can grow and change
connect to existing data sources
accommodate existing workflows
repackage data for use in both internal and external delivery systems

Ornithology, Pl. 20
Plate captions
1. Aprosmictus splendens. Peale;
2. Aprosmictus personatus. (G.R. Gray.);
Artist: T.R. [Titian Ramsay] Peale
Lithograph Artist: T. House
Parrot
Fiji
Birds
United States South Seas Exploring Expedition

meet Internet Archive and BHL image and metadata needs
automate capture of as much metadata as possible
make it easy to input that which can't be captured automatically
automate file transfer and manipulation

Title level data
Item/piece level data
Page level and structural metadata

Minimum amount of metadata needed to scan and deliver book-like items

Keri Thompson
Smithsonian Institution Libraries

the problem(s):
Our mass-scanning (Internet Archive) equipment can't accommodate folios or books that are wider than they are tall
we want to store/manage our own scans for peace of mind and so we can easily repurpose data and images
we need to ramp up our in-house digitizing capability so we can digitize and deliver books for other disciplines

???

Macaw - Metadata Collection and Workflow

Thank You
an English transcript of this presentation can be provided upon request.

Keri Thompson
Smithsonian Institution Libraries

not tied to one specific repository or delivery system
uses common, open source tools
designed to work for books, but is extensible so it can accommodate other models
works with SIRIS but can use other metadata sources

Administrative/workflow data

capture additional page level (image) metadata for internal uses
capture and re-package metadata and images for use in other systems
embed metadata in images following SI common model
enable creation of METS
make tool flexible and share-able

minimum requirements:
bonus features:

Journal or series title: Smithsonian Miscellaneous Collection
Publisher: Gov't. Printing Office
Volume or issue numbering: Vol. 4 Number 6, April 1908
Intellectual Property data: Public Domain

scanned
needs conservation
modified 8/2/2009

Page xii
[sequence number]
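
The three metadata levels named in the slides (title level, item/piece level, page level and structural) could be modeled roughly as below. This is only an illustrative sketch, not Macaw's actual data model: the class names, field names, and the `struct_map` helper are all assumptions made for the example, and the sequence numbers and the second page are made up, since the slides do not give them. The output is a bare METS `structMap` fragment, not a complete or schema-validated METS document.

```python
from dataclasses import dataclass, field
from typing import List
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"  # METS namespace URI


@dataclass
class PageRecord:
    """Page level and structural metadata (hypothetical field names)."""
    sequence: int  # position of the image within the scanned item
    label: str     # printed page label, e.g. "Page xii"


@dataclass
class ItemRecord:
    """Item/piece level data plus administrative/workflow status."""
    volume: str
    rights: str
    status: List[str] = field(default_factory=list)
    pages: List[PageRecord] = field(default_factory=list)


@dataclass
class TitleRecord:
    """Title level data, e.g. as supplied by a catalog record."""
    title: str
    publisher: str
    items: List[ItemRecord] = field(default_factory=list)


def struct_map(item: ItemRecord) -> ET.Element:
    """Build a bare METS structMap listing the item's pages in scan order."""
    root = ET.Element(f"{{{METS_NS}}}structMap", {"TYPE": "physical"})
    piece = ET.SubElement(
        root, f"{{{METS_NS}}}div", {"TYPE": "item", "LABEL": item.volume}
    )
    for page in sorted(item.pages, key=lambda p: p.sequence):
        ET.SubElement(
            piece,
            f"{{{METS_NS}}}div",
            {"TYPE": "page", "ORDER": str(page.sequence), "LABEL": page.label},
        )
    return root


# Values taken from the example slide; sequence numbers and the
# second page ("Page xiii") are invented for illustration only.
item = ItemRecord(
    volume="Vol. 4 Number 6, April 1908",
    rights="Public Domain",
    status=["scanned", "needs conservation"],
    pages=[PageRecord(2, "Page xiii"), PageRecord(1, "Page xii")],
)
title = TitleRecord(
    title="Smithsonian Miscellaneous Collection",
    publisher="Gov't. Printing Office",
    items=[item],
)

ET.register_namespace("mets", METS_NS)
xml_text = ET.tostring(struct_map(item), encoding="unicode")
```

Keeping the three levels as separate records mirrors the modularity requirement in the slides: title data can come from an existing catalog source, while page data is captured at scan time and repackaged for whichever delivery system needs it.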