Using GIS to Model Cartobibliographic Data 1.
...and what to do with:
Cote d'Ivoire in the ESRI shapefile--why French in this one instance?
Paradigm shift in the way people look for information
Increase Relevance and Discoverability
We need a way to work synchronously.
We have a theme for our talk.
Stating the problem for the audience is a good starting point:
our organizations hinder our users' work.
our collections are barely described.
There must be a way to describe
our corpus of information.
Indexes to large sets
Choropleth maps of collection intensity
Crosslinking between indexes and OPACs and other systems (re: thesis maps in institutional repositories?)
Using GIS to Model Cartobibliographic Data
Using GIS to create access to the library's paper map collection
Huge collections of spatial information bearing objects.
The library catalog is only 1/2 complete. and...
OPAC is 1/2 complete for the sheetmaps after 30 years of effort.
Bespoke MySQL database for airphotos
Deceased ADL database for satellite/tiled raster sets.
No cataloged vector information
The unit of analysis is often wrong
Set level, not sheet level.
Disk, not layer.
The catalog is not built to organize geospatial information; it's built as an inventory and access mechanism for books and other objects.
Geospatial information is best displayed visually. The catalog lacks a visual interface.
Users are using increasingly complex visualizations. Interactive maps are now an established part of the technoculture.
Airphotos: I need a top down view for planning future systems
This is a collection of aerial photographs, primarily of Southern California, produced by Fairchild Aerial Surveys, Inc. between 1927 and 1965.
Acquired from Whittier College in December of 2012. Transferred to Annex 2 in January, 2013.
The current best description of the holding is the spreadsheet:
MIL:\IMAGERY RELATED\Fairchild-Whittier\Whittier Fairchild Flights Catalog.xlsx
this spreadsheet is maintained in various forms for use as a 'master list' of Whittier-Fairchild flights
Collection is approximately 650,000 images consisting of:
~300,000 paper silver-gelatin prints in ~1,050 document boxes
Flight numbers C-17 (1927) through C-24931 (1965)
A small number of non-C flights: Airborne Systems, Inc; American Aerial; Thompson Aerial, USDA
~350 rolls of safety film
Flight numbers C-1526 (1931ish) through C-24733 (1964)
162 boxes of nitrate film stored at the UCLA Film & TV Archive in Santa Clarita
Flight numbers C-113 (1927) through C-7716 (1942).
(Flight list available as a spreadsheet created by extraction from the Whittier Fairchild master list)
7 boxes of 9 X 18" prints and negatives
Flights C-12375 (1948) through C-24544 (1963)
A small number of non-C flights.
(acetate and polyester films)
65 document boxes labeled "copy film"
Flights C-17 (1927) through C-24954 (1963? - no date on photo or list)
A small number of non-C flights.
59 document boxes labeled "camera negatives"
Flights C-7108 (1941) through C-24951 (1963? - no date on photo or list)
9 bankers boxes in sleeves slightly larger than our standard document boxes.
Flights C-2847 (1933) through C-24940 (1963)
Standard library units
Most immediately, I needed to do space planning.
Do we really collect "world topographic coverage" ?
These new countries really need to get busy and produce a national map for me to collect. The question remains open: are these areas covered by my Soviet maps? Only indexing will say.
It took a lot of work to get my data to match the Office country data.
Match country names to shapefile
Still working on temporal data.
Need to extract begin and end dates from the old spreadsheet and edit them by hand. No fuzziness is allowed in ESRI date searching.
"1966-1988 +/-" becomes a begin year of 1966 via =LEFT(J430,SEARCH("-",J430,1)-1)
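The same begin/end split can be sketched outside Excel. This is a minimal Python sketch, not the workflow actually used for the spreadsheet: the function name `split_date_range` is hypothetical, and it assumes every year is written with four digits.

```python
import re

# Four-digit years from 1500 through 2099 (an assumed range for map dates).
YEAR = re.compile(r"\b(1[5-9]\d{2}|20\d{2})\b")

def split_date_range(raw):
    """Return (begin, end) years as ints from text like '1966-1988 +/-'.

    Returns (None, None) when no four-digit year is found, so those
    rows can be set aside for editing by hand.
    """
    years = [int(y) for y in YEAR.findall(raw)]
    if not years:
        return None, None
    return min(years), max(years)

print(split_date_range("1966-1988 +/-"))
```

Abbreviated end years such as "1957 - 67" are not expanded by this sketch and would still need hand editing or an extra rule.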
This problem exists for those matching MARC metadata to GIS--e.g. Yale, Harvard, others?
My latest and greatest idea is to create a website/data set/something that shows all the indexes of the set. Libraries can go to that and use it as a template to index their information. Do all this by point and click--that highlights a box and creates an inventory. I don't know if this is possible to do. Sounds more like an inventory/sales database than something to do with GIS.
What's stopping us?
GIS Librarians tend not to be Map Librarians.
Entrenchment in MARC and the OPAC
and possibly a bifurcation between public and technical services attitudes/people.
No accessible sandbox
WAML has a website, but it's a commercial host provider--we need a geoEnabled sandbox.
Maybe ArcGIS Online provides that.
A lack of time and expertise
and vice versa--I worry that in creating all this GIS stuff, I am scaring off my fellow map librarians. They might look at all this and say "that is a lot of work and I don't have the expertise". I was a little disappointed in HI that no one other than you and Jane seemed enthused by my project.
Also, we have to make this thing slick.
Keep your eye out for some commercial website where you click on things to order--more than radio buttons or entering quantities--more like click to highlight. I can try to go rip the code.
A collection-description like this is "library speak"--stuff only librarians understand.
It's from the point of view of inventory control--not the user.
As the world has changed, using interactive maps has become the default method of using a map--the users have caught up to us.
Now we can move towards using our collections visually.
And more and more map librarians are gaining these skills.
While I often think that's a sign of overall lameness on the part of the librarian community, I can't really say that, now can I? One thing I secretly wonder: maybe a lot of us aren't actually doing our jobs very well. I fret about it--99% of my materials are never used, or haven't been used in 25 years. Why would users come to the map library? It's almost impossible to use, and the librarians often don't inspire confidence. We tend to emphasize how difficult things are, how understaffed we are, how we are not getting replaced when we retire.
1) here's the base map
2) Click on the maps you own
3) generates list of maps--with all the appropriate data you need to import into MarcEdit and then into your catalog. Also creates an index map in some form--preferably online/dynamic or Google Earth, etc.
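Step 3 of the list above could be sketched like this. It is only an illustration of the idea, not an existing tool: the `INDEX` records, the field names, and the function `holdings_csv` are all hypothetical, and the "clicks" are stood in for by a plain list of sheet numbers.

```python
import csv
import io

# Hypothetical index records; a real index would come from the GIS attribute table.
INDEX = [
    {"sheet": "A-1", "title": "Example Sheet A", "date": "1958", "publisher": "Example Agency"},
    {"sheet": "A-2", "title": "Example Sheet B", "date": "1961", "publisher": "Example Agency"},
    {"sheet": "A-3", "title": "Example Sheet C", "date": "1964", "publisher": "Example Agency"},
]

def holdings_csv(index, selected_sheets, fields=("sheet", "title", "date", "publisher")):
    """Return CSV text listing only the sheets a library marked as owned."""
    selected = set(selected_sheets)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, extrasaction="ignore",
                            lineterminator="\n")
    writer.writeheader()
    for record in index:
        if record["sheet"] in selected:
            writer.writerow(record)
    return out.getvalue()

# Sheets "clicked" by the librarian.
print(holdings_csv(INDEX, ["A-2", "A-3"]))
```

A delimited file like this is a neutral hand-off format; how it would then be mapped to MARC fields is left open here.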
I think we're both working on the same problem but from different ends/angles. I'm using GIS to create base indexes. I pull metadata from other places (including the Library's catalog) to help populate the original base indexes. Once I have that done, I populate the index with holdings information that I gleaned from our library catalog. My holdings metadata is re-entered by hand. I populate the indexes, then use the end result (the highlighted indexes in whatever form: printed piece of paper, jpg, Google Earth overlay, ArcGIS Online map).
Stop me if I'm wrong, but you start with metadata--from a multitude of origins. You plug the metadata into GIS and try to make the data match countries/continents/provinces that have been defined in a base map.
In the end, we are both trying to produce the same thing---a visual (geographic) representation of what our library does and doesn't own. This unto itself is cool, but we are using the results for PRACTICAL purposes including
1) patrons can see what we do/don't have--no more fumbling with the questions of "do you have a map of 'X'?" People can see the area of coverage.
2) This can identify what sheets are missing & where.
3) Makes it easy to identify where holes exist in the collection and thus can help guide us in making future acquisitions.
4) Gives us the ability to plan for growth or analyze the need for current space redistribution--goes to "We have nothing on Central Asia, yet I know we are going to buy a lot of stuff of that area in the future. How many sheets cover Kazakhstan? How many drawers does that mean?"
Do I have this right?
What am I missing?
Here's what I think our point is:
The airphoto use case
the kml trick.
Again: printed index maps (multiple kinds)
It's simply a repetition of the quad index process--except with points. Lots and lots of points.
An interim product. PDFs with ESRI's topo basemap, a flightprint, and frame numbers. Point and shapefiles are forming the core of a very large geodatabase
Since Fall of 2010: ~121,000 frames georeferenced
1995-2006(ish): ~89,000 frames georeferenced. Stored in a non-compliant MySQL database that we just cracked open last week. 65,000 points recovered thus far.
When combined: will offer an overview of our most requested California images
>1.5 million images total.
This map is a combination of 8 maps (8 different countries in Central America). I created each grid using the fishnet feature in ArcGIS.
After trimming the net and adjusting the boxes, I added the sheet number information. I exported the data to Excel, where I populated it with information (metadata) gathered from a variety of sources including library catalogs (mine as well as others), map dealers' websites, and official government agency websites. I used the updated table to produce static maps, as well as KML files for Google Earth. I saved the Google Earth files and uploaded them to ArcGIS.com. Lastly, I combined the 8 different KML files to make this map. Making the map only took about 10 minutes, but getting all the files in the correct format, and playing with settings on ESRI online, took weeks.
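The grid-then-KML idea described above can be sketched without ArcGIS. This is a plain-Python stand-in, not the ArcGIS fishnet tool itself: the function names, the extent, and the "Sheet r-c" naming are assumptions made for illustration, and real sheet lines would still need trimming and adjusting by hand.

```python
from xml.sax.saxutils import escape

def fishnet(west, south, east, north, cols, rows):
    """Yield (row, col, (w, s, e, n)) for a regular grid; row 0 is the north edge."""
    dx = (east - west) / cols
    dy = (north - south) / rows
    for r in range(rows):
        for c in range(cols):
            w = west + c * dx
            n = north - r * dy
            yield r, c, (w, n - dy, w + dx, n)

def cell_to_placemark(name, bbox):
    """Return one KML Placemark; KML coordinates are lon,lat,alt triples."""
    w, s, e, n = bbox
    ring = f"{w},{s},0 {e},{s},0 {e},{n},0 {w},{n},0 {w},{s},0"
    return (f"<Placemark><name>{escape(name)}</name><Polygon><outerBoundaryIs>"
            f"<LinearRing><coordinates>{ring}</coordinates></LinearRing>"
            f"</outerBoundaryIs></Polygon></Placemark>")

def grid_to_kml(cells):
    """Wrap all grid cells in a single KML document string."""
    body = "".join(cell_to_placemark(f"Sheet {r + 1}-{c + 1}", bbox)
                   for r, c, bbox in cells)
    return ('<?xml version="1.0" encoding="UTF-8"?>'
            '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
            f"{body}</Document></kml>")

# Hypothetical 2-degree extent split into 4 columns and 2 rows.
cells = list(fishnet(-90, 13, -88, 15, cols=4, rows=2))
kml = grid_to_kml(cells)
```

Writing `kml` to a `.kml` file gives one polygon per sheet, which is the form Google Earth reads.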
Increased relevance and discoverability of both analog and digital collections
For Library users:
Graphic, interactive interface to the collection instead of a text-based, confusing catalog.
Easier way to determine what items we own.
Greatly enhanced search features which can zoom to a specific area and discover what map/photo covers that area.
Indexes for map sets -- cooperatively made.
Easier way to determine collection needs -- using limited acquisition budgets in a more targeted way.
Better inventory management.
Enhanced space management and planning.
Two well-made websites for maps, each created by the US Geological Survey:
TopoView shows all versions (historic and current) of the USGS topographic maps: http://ngmdb.usgs.gov/maps/TopoView/
National Geologic Mapping Database (NGMDB) shows geologic maps for the US: http://ngmdb.usgs.gov/
When creating an index, I add metadata from a variety of sources. These include:
*Map dealers' websites
*Library utility databases (OCLC, etc.)
From these, I pull such relevant data as map names, date, publisher, projection, etc.
MOST IMPORTANTLY, if I find that someone (library, publisher) has put a scanned copy of that map online, I add a link to that scan. Thus, when the index is created and tailored to my library's holdings, a user will not only have a link to the sheets that we own, but also can see the map itself.
"What do we do with historical maps once they are on the Web? How can we extend, deepen access to, and make these resources more useful? How and why should we transform such collections into new digital instantiations? Where do scanned cartographic collections and their derivatives fit in an emerging ecosystem of digital data on the Web?"
--Matthew Knutzen, "Unbinding the Atlas: Moving the NYPL Map Collection Beyond Digitization," Journal of Map & Geography Libraries, Vol. 9, no. 1-2, 2013.
Now that we have some things scanned, what do we do with the scans? Is there anything we can make other than a list of the scans or place links in the library catalog?
Matt talks mostly about creating indexes for his scans and making those into Google Earth overlays, plus taking the scans and warping them to use as Google Earth overlays. The first part deals with what we're working on, the second part, not so much.
Matt also mentions "Building a Globally Distributed Historical Sheet Map Collection"--IMLS-funded project at Yale in 2007-8. http://imlsmap.lib.uconn.edu/index.html
I have yet to read it (as of 6/12). I probably saw it back in 2008 but walked past it.
I think (but don't know) that this was the start of Yale's project (Tanzania, Guatemala, etc.).
Also, he talks about using corner co-ordinates in an Excel sheet to make the bounding boxes for GIS, etc.
IS THIS WHAT YOU ARE DOING???
This is a good idea IF you have the bounding co-ordinates (could be that you have each sheet individually cataloged and can pull them from there).
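For the case where the corner co-ordinates are in hand, the step from spreadsheet to bounding boxes might look like the sketch below. It is an illustration under assumptions, not the method from the article: the column names (`sheet`, `west`, `south`, `east`, `north`), the sample rows, and the function `rows_to_geojson` are all hypothetical.

```python
import csv
import io
import json

# Hypothetical export of corner co-ordinates, one row per sheet (decimal degrees).
SAMPLE_CSV = (
    "sheet,west,south,east,north\n"
    "A-1,-90.0,14.0,-89.5,14.5\n"
    "A-2,-89.5,14.0,-89.0,14.5\n"
)

def rows_to_geojson(csv_text):
    """Turn rows of bounding co-ordinates into a GeoJSON FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        w, s, e, n = (float(row[k]) for k in ("west", "south", "east", "north"))
        # Closed exterior ring, counter-clockwise, as [lon, lat] pairs.
        ring = [[w, s], [e, s], [e, n], [w, n], [w, s]]
        features.append({
            "type": "Feature",
            "properties": {"sheet": row["sheet"]},
            "geometry": {"type": "Polygon", "coordinates": [ring]},
        })
    return {"type": "FeatureCollection", "features": features}

collection = rows_to_geojson(SAMPLE_CSV)
geojson_text = json.dumps(collection, indent=2)
```

GeoJSON is used here only because it needs no GIS software to write; the same rows could feed a shapefile or KML instead.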
This issue concentrates on historical maps, which is great, but I think we're working on stuff that is more widely available and used (particularly in industry).
DO YOU AGREE??
National Geologic Mapping Database http://ngmdb.usgs.gov/
Zoomed in part way.
Zoomed most of the way in.
"In view" -- list of all the maps shown on the screen
Pop-up of map chosen.
"Image preview" of map.
"Product description"--image of map, plus metadata, ordering info, etc.
"Slider" feature -- reveals corresponding topo underneath.
"Slider" feature -- reveals corresponding satellite image underneath.
Zoomed in about half way.
Use of time-slider--shows maps produced between certain dates.
JPEG of map image. [note: this is a scan of a map owned by the Colorado School of Mines--see the stamps]
"In view" -- lists all the maps of this location.
Metadata of map.
A classic problem in map and data libraries is the unit of analysis problem: do you create metadata for the set of maps, or each sheet in a set? The widespread adoption of GIS in libraries has only exacerbated this problem: Do you describe the CD-ROM, or each of the 200 data layers on the CD-ROM? By carefully analyzing the data in hand, as well as the needs of the library users, libraries can create metadata, interactive online maps, and plain-old library catalog records that increase the relevance and discoverability of both analog and digital collections. Once aggregated, the collection can be analyzed to promote data acquisition. The authors present examples from the domains of remote sensing and geologic mapping, as well as a technique to rapidly assess large corpora of materials.
How people find information has changed dramatically/is changing.
People use maps every day.
Very Large Collections -- intimidating
Libraries don't typically describe every page of a book--only the book.
But what happens when a map has 48,000 sheets?
Or when a disk has 250 layers?
Traditional catalog records rarely give you a decent picture of what a set looks like.
And usually no picture at all.
Solution -- used GIS to create access to the library's paper map & photo collections.
All California Imagery
UCSB has about 3 million images.
120k centerpoints in this new system
90k centerpoints in a legacy ADL system
200k images scanned.
"Maps"! How many maps?
Increase Relevance and Discoverability
Paradigm shift in the way people look for information
Bibliographic data (ie: library catalogs) continues to be text driven.
Library catalog information is overwhelming--too much data makes things hard to use.
Shift in user expectations: graphic search is now commonplace.
People use maps every day.
We collect a wide variety of materials.
I'm looking for metadata
wherever I can find it.
I have some to start with.
I'm also creating it.
I'm trying to get a top-down, from 10,000 feet, view of the collection.
I have no idea.
Interactive maps that deliver collections of digital content
Definitely. My biggest users are commercial--for my old airphotos. And our argument for cooperative indexing is for the wider community--a: for institutions that have materials but not necessarily local GIS skills. b: there's so much 20th c stuff, if we can't easily 'see' what we have, we can't make retention decisions.
"Now that we have some things scanned" some of the materials don't need to be so widely available physically, but neither the physical ones nor the digital ones do the users any good if they don't know they exist.
Santa Barbara Co. 1957 - 67
Santa Barbara Co. 1972 - 82
By using the time slider, I am able to determine in just a few seconds that I have complete coverage from 68-89.
Just as quickly, I was able to identify 2 periods where we lack coverage for Vandenberg and the national forest.
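What the time slider shows by eye can also be checked from a list of flight years. This is a minimal sketch, not part of the ArcGIS workflow: the function `coverage_gaps` and the sample years are hypothetical.

```python
def coverage_gaps(flight_years, start, end):
    """Return (first, last) year pairs within start..end that have no flights."""
    covered = set(flight_years)
    gaps = []
    gap_start = None
    for year in range(start, end + 1):
        if year not in covered:
            if gap_start is None:
                gap_start = year          # a new gap opens
        elif gap_start is not None:
            gaps.append((gap_start, year - 1))  # the gap closed last year
            gap_start = None
    if gap_start is not None:
        gaps.append((gap_start, end))     # gap runs to the end of the range
    return gaps

# Hypothetical flight years for one area.
years = [1957, 1958, 1960, 1961, 1965]
print(coverage_gaps(years, 1957, 1966))
```

Running the same check per area (for example, per county or per quad) would give a table of gaps to set beside the map.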
Yangtze River, 3 Gorges area. 1946. MIL Fairchild Collection. UCSB Library.
Neurotic librarian worries
Let's see about that with ArcGIS for Office.
I think geo-referencing this might be easier than
reading it as a scanned map.
the most useful units of analysis for our collection's users are frame and time.
And this still doesn't tell me HOW MUCH
of any one country we have. I still needed to do the drawer analysis.
Text-based search results in library catalog.
Static, graphic (pdf) index available via Library's catalog.
2.8 million images
350,000 sheet maps (36,000 titles)
Colorado School of Mines Map Room:
172,000 maps
6,000 books and atlases
36,000 aerial photos of Colorado
What we are not talking about today:
“The Open Geoportal is a collaboratively developed, open source, federated web application to rapidly discover, preview, and retrieve geospatial data. OpenGeoportal.org is also a collaborative effort to share resources and best practices in the areas of application development, metadata, data sharing, data licensing, and data sources in support of geospatial data repositories.”
Catalog records paint an incomplete and confusing picture
Traditional catalog records rarely give you a decent picture of what a set looks like, much less a picture of the map.
Solution -- use GIS to create access to the library's paper map & photo collections.
Standard library units of measure don't apply to maps or data:
The information in the pop-up window is not overwhelming--sheet #, name, date, library information including a link to the catalog.
Uses of end products:
Digital maps and GPS used to be niche items that only enthusiasts used. Most people were wowed by the technology and had to be trained to use them. Today, digital maps are included in all forms of communication, GPS is embedded in every mobile device, and using them is second nature.
Christopher J.J. Thiry
Mr. Atoz, the Librarian
Hand-colored, photocopied index
Map and sections through the Iron Hill, Leadville Colorado, by A.A. Blow, 1888
The Leadville Mining District, [Lake County, Colorado], by Chas. F. Saunders, 1901
from Atlas de Paris et de la Région Parisienne. 1967
Identifying gaps in collections
Determining coverage levels
Number of drawers @ Annex = 545
Drawers of maps
turned out to be an appropriate unit at this stage. I need to know how much space we're devoting to certain categories of material because we are entering a stage of zero physical growth, and some of my most unique materials are dreadfully overcrowded.
MIL has a large collection of graphic access points
...and too much text
It's a controlled vocabulary issue
We have coverage of these areas, it's just that we call it something else. Country names are neither stable nor well controlled.
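One way to handle the name mismatch is a small alias table plus normalization before joining to the shapefile. The sketch below is illustrative only: the alias entries, the sample names, and the functions `normalize` and `match_names` are assumptions, not the actual crosswalk used for this collection.

```python
import unicodedata

# Hypothetical aliases: local or older name -> name assumed to be in the shapefile.
ALIASES = {
    "ivory coast": "cote d'ivoire",
    "burma": "myanmar",
}

def normalize(name):
    """Lower-case, strip accents, straighten apostrophes, collapse spaces."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return " ".join(text.lower().replace("\u2019", "'").split())

def match_names(collection_names, shapefile_names, aliases=ALIASES):
    """Return (matched, unmatched): local name -> shapefile name, plus leftovers."""
    lookup = {normalize(n): n for n in shapefile_names}
    matched, unmatched = {}, []
    for name in collection_names:
        key = normalize(name)
        key = aliases.get(key, key)
        if key in lookup:
            matched[name] = lookup[key]
        else:
            unmatched.append(name)   # left for review by hand
    return matched, unmatched

shapefile = ["Cote d'Ivoire", "Myanmar", "Kazakhstan"]
local = ["Ivory Coast", "C\u00f4te d\u2019Ivoire", "Burma", "Kazakhstan", "Zaire"]
matched, unmatched = match_names(local, shapefile)
```

Names that fall into `unmatched` are the ones that need a decision by a person, for example former countries whose territory is now split among several polygons.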
From drawers, we derive coverage maps:
Data problem again:
States make sense:
if I don't have more maps of California than anyplace else, I'm in trouble.
Sheetmaps in collection (map legend classes): 25,000 - 50,000; 50,000 - 75,000; 75,000 - 100,000; 100,000 - 125,000; > 125,000
stacked geographies obscure underlying data
re-symbolized data obscures variation.
Height: cumulative maps that include that area
Color: number of maps of that geography
Air photo finding aids
Make more indexes & web pages
Focused effort to act cooperatively with other libraries
Explore ways to make processes easier & faster
Connect my lists of indexes to maps and the catalog
Revival of ADL Gazetteer
Preparing FrameFinder for server-side
@WAML's fall meeting
Cooperative indexing unconference
Compiling together as much sheetmap index as we can find and designing a platform (ArcGIS Online?)
Visual representations based on GIS data allow instant evaluation for:
Number of drawers @ Annex = 545
Drawers of maps
Visualizations for decision-making don't need to be fancy. Sometimes simple is better--this was the chart that inspired this whole project. All of the US topographic maps stored off-campus were 'low use.' Discards went to LSU to replace maps damaged by use after Hurricane Katrina and the Gulf Oil Spill.
Sometimes a map helps. We are SUPPOSED to have more maps of California than anywhere else, but do we?
Interactive web indexes
will make collections more accessible at the sheet level
will increase use of items within our collections
Collection analysis with GIS
very useful when data exists at the sheet/frame level
thus far, less useful when data is aggregated at the set / collection level