From Container to Context
How cataloguers can drive a fundamental and necessary change in resource description, with inspiration from Eric Lease Morgan, Cory Doctorow and David Weinberger
Metacrap
Putting the torch to seven straw-men of the meta-utopia
Cory Doctorow, 2001
1. People lie
Your mailbox is full of spam with subject lines like
"Re: The information you requested."
your clueless aunt sends you email with no subject line
half the pages on Geocities are called "Please title this page"
your boss stores all of his files on his desktop with helpful titles like "UNTITLED.DOC."
2. People are lazy
3. People are stupid
You can almost always get a bargain on a Plam Pilot at eBay
When Nielsen used log-books to gather information on the viewing habits of their sample families, the results were heavily skewed to Masterpiece Theater and Sesame Street.
Replacing the journals with set-top boxes that reported what the set was actually tuned to showed what the average American family was really watching...
4. Mission: Impossible -- know thyself
5. Schemas aren't neutral
The conceit that competing interests can come to easy accord
on a common vocabulary totally ignores the power of
organizing principles in a marketplace.
It's wishful thinking to believe that a group of people
competing to advance their agendas will be universally
pleased with any hierarchy of knowledge.
The best that we can hope for is a detente in which
everyone is equally miserable.
6. Metrics influence results
7. There's more than one way
to describe something
Reasonable people can disagree forever on how to describe something.
Arguably, your Self is the collection of associations and descriptors you ascribe to ideas.
This laziness is bottomless.
No amount of ease-of-use will end it
naked midget wrestling
America's Funniest Botched Cosmetic Surgeries
Jerry Springer presents: "My daughter dresses like a slut!"
And that's just not right.
Why catalogue?
Cutter, 1876
Find
knowing author, title, subject or category
by given author, subject or kind of literature
Show what's available
Assist in the choice
as to edition or literary or topical character
confirm found entity is the entity sought, or distinguish between similar entities
an appropriate entity
Select
meeting search criteria
FRBR, 1998
Find
Identify
Obtain
access by purchase, loan, electronically
succinct yet precise description, differentiate
versions
Bring together what belongs together
Produce reliable results
Clearly display differences,
present meaningful choices
"known-item" search, consistent description
using well defined attributes
Eversberg, 2002
works by author, editions of a work,
parts of a whole, related resources,
works about a person or another work
From Container to Context
How cataloguers can drive a fundamental and
necessary change in resource description
Kent Fitch
National Library of Australia
Is cataloging
even necessary?
Little Feat, Dixie Chicken video
Melting point of sulphur
Macbeth study guide
"Buddism" introduction
railways society mid west nineteenth century
Google Magic...
distributed, page popularity contest, voted for by you (everyone)
sophisticated natural language processing
secret and very clever algorithms
tens of thousands of servers
... all paid for by AdWords
Context
A pattern language : towns, buildings, construction
Christopher Alexander ...
NBD subjects:
Symbolism in architecture
Semiotics
City planning.
Design.
Design methods
Design paradigm
Design pattern
Design pattern (computer science)
Pattern (architecture)
Pattern language
Sense of place
Tags derived from Wikipedia
architecture(36)
design(23)
home design(14)
christopher alexander(13)
patterns(12)
reference(11)
ideas(10)
city planning(6)
nonfic(4)
etech07(2)
expertise(2)
patterns architecture(2)
urban planning(2)
arch bk(1)
book(1)
building construction(1)
city design(1)
community affairs(1)
comprehensive plans(1)
history(1)
Amazon's tags
architecture(36)
design(23)
home design(14)
christopher alexander(13)
patterns(12)
reference(11)
ideas(10)
city planning(6)
nonfic(4)
etech07(2)
expertise(2)
patterns architecture(2)
urban planning(2)
arch bk(1)
book(1)
building construction(1)
city design(1)
community affairs(1)
comprehensive plans(1)
history(1)
interior design(1)
lifestyles(1)
organization(1)
paradigm shifting(1)
pattern languages(1)
planning and zoning(1)
school(1)
structure(1)
sustainable development(1)
LibraryThing's tags
Amazon's cites & cited by
Google Scholar's cited by:
Amazon's reviews
Two problems...
Doctorow: Schemas aren't neutral
Weinberger: Control doesn't scale
The choice for libraries
Business as usual?
Stick with AACR2/RDA/LCSH
Remain a tiny, closed ecosystem
Don't try to scale
Ignore most content
Adapt and grow?
Adopt new approaches
Open to the web
Build scalable descriptive frameworks
Describe everything
Add context to everything
Help everyone participate
I'm assuming...
You're passionate about libraries and facilitating public access to information and culture
You're wondering how libraries will survive as access to information and culture is increasingly commercialised
AdWords revenue increase 08->09: $US1.8bn
Total spend on public libraries in Aus 2006: $A0.76bn
Total spend on CAUL (AUS) libraries 2008 : $A0.58bn
There's more than one way...
LCSH
Not a thesaurus:
Computer programming.
Computer algorithms.
Computational complexity
Computers and intractability : a guide to the theory of NP-completeness / Michael R. Garey, David S. Johnson
NBD subjects
widely held (31 libs) - a classic ("cited by 29338")
Derived Wikipedia tags
3-partition problem
Bin packing problem
Boolean satisfiability problem
Bottleneck traveling salesman problem
Clique cover
Clique problem
Complete coloring
Cook–Levin theorem
Cut (graph theory)
David S. Johnson
Degree-constrained spanning tree
Dominating set problem
Edge cover
Edge dominating set
Exact cover
Feedback arc set
Feedback vertex set
Graph coloring
Graph isomorphism
Graph isomorphism problem
Hamiltonian path problem
Independent set problem
Knapsack problem
L (complexity)
Linear programming
List of important publications in theoretical computer science
List of multiple discoveries
List of NP-complete problems
List of PSPACE-complete problems
Longest common subsequence problem
Matching (graph theory)
Maximum common subgraph isomorphism problem
Maximum cut
Metric dimension (graph theory)
Monochromatic triangle
NP-complete
NP-hard
Polynomial hierarchy
Post correspondence problem
PSPACE-complete
Quadratic assignment problem
Quadratic programming
Quadratic residue
Set packing
Set splitting problem
Shortest common supersequence
Spanning tree
Subgraph isomorphism problem
Subset sum problem
Travelling salesman problem
True quantified Boolean formula
Vertex cover
computer science
complexity
algorithms
mathematics
computation
computational complexity
np-complete
book
combinatorial algorithms
cse200
graph algorithms
intractability
np complete
np-hard
Amazon tags
LibraryThing tags
ai (1)
algorithms (6)
annotated (1)
college (1)
complexity (7)
complexity theory (4)
computability (2)
computational complexity (1)
computer (1)
computer science (35)
computers (1)
computing (5)
cs (3)
cstheory (1)
essential (1)
informatica (1)
intractability (2)
jensen-msr-2007 (1)
kontoret (1)
logic (3)
mathematics (20)
non-fiction (6)
np stuff (1)
np-complete (8)
textbook (4)
theoretical computer science (3)
theory (3)
theory of computability (1)
theory of computation (2)
wishlist (1)
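A minimal sketch of how tag lists like the ones above could be combined without one source overwriting another. It assumes the uncounted list is Amazon's and the counted list is LibraryThing's; the helper names parse_tag and combine are hypothetical, and giving a bare tag a count of 1 is an assumption made only for illustration.

```python
import re
from collections import defaultdict

# Matches "tag (count)" and also the unspaced form "tag(count)".
TAG_RE = re.compile(r"^(.*?)\s*\((\d+)\)$")

def parse_tag(line):
    """Return (tag, count); a bare tag gets a count of 1 (an assumption)."""
    match = TAG_RE.match(line.strip())
    if match:
        return match.group(1), int(match.group(2))
    return line.strip(), 1

def combine(sources):
    """Map each lower-cased tag to {source: count}, keeping attribution."""
    combined = defaultdict(dict)
    for source, lines in sources.items():
        for line in lines:
            tag, count = parse_tag(line)
            combined[tag.lower()][source] = count
    return dict(combined)

# A few tags copied from the slide; the split by source is assumed.
sources = {
    "LibraryThing": ["computer science (35)", "mathematics (20)",
                     "np-complete (8)", "intractability (2)", "textbook (4)"],
    "Amazon": ["computer science", "mathematics", "np-complete",
               "intractability", "cse200"],
}
combined = combine(sources)

# Tags that more than one community chose independently.
shared = sorted(tag for tag, by_source in combined.items() if len(by_source) > 1)
```

Keeping the per-source counts, rather than summing them, lets a display layer decide later how much weight each community's description should carry.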
Broadening and narrowing search is very unreliable
Hard to use (for cataloguers and searchers)
FRBR - editions/versions
Series - author, publisher
Lists
Subject trails
Citation trails
Related
Reviews
Annotations
Ratings
From: Tim Spalding <tim@librarything.com>
Date: Sat, Oct 16, 2010 at 3:44 AM
Subject: Re: [ol-discuss] Series titles: include individual ID or not?
To: Open Library -- general discussion <ol-discuss@archive.org>
FWIW, the matter has been discussed for years on LibraryThing's
"Combiners!" and Series groups, in excruciating depth. The options are
more complex than seems at first sight, because—like so much
library-related metadata—getting it right requires taking account of:
1. Degrees of truth
2. Differences of opinion
3. Awareness of different levels of hierarchy
4. Understanding who says something as an element of the something
5. Understanding that metadata for an item continues to change AFTER
the item is cataloged.
Examples of the concepts in practice:
In what sense are Tom Sawyer and Huck Finn in the same series? When did they become so?
Are all those Bond books, including the recent ones, in the same series? How about sequels to Jane Eyre? Does it change when a publisher packages the book and its faux-sequels together as a sequel?
What is the order of the Narnia books?
Does the Harvard Classics include the Odyssey? Yes. Is the Odyssey part of the series "Harvard Classics." Not so much.
LCSH
Kelley McGrath, Cataloging & Metadata Services Librarian, Ball State University, chair of Online Audiovisual Cataloguers Cataloging Policy Committee
"Facet-Based Search and Navigation with LCSH: Problems and Opportunities", Code4Lib Journal, issue 1, Dec 2007
Too many top level terms
Compound headings break facets & hierarchical navigation
and make it difficult to search for:
"Art, Buddhist"; "Cookery, Japanese" ; "Adult children of alcoholics, Writings of,"
communicable diseases (broad) in Kenya (narrow), or AIDS (narrow) in Africa (broad)
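A rough sketch of why compound headings resist faceting. The function naive_facets is hypothetical: it splits an inverted heading at its first comma, which is an assumption made for illustration and not an LCSH rule. The third example shows how quickly that assumption fails.

```python
from collections import defaultdict

def naive_facets(heading):
    """Split an inverted heading at its first comma.

    This is a deliberately simple guess, not an LCSH rule.
    """
    base, _, qualifier = heading.rstrip(",").partition(",")
    return base.strip(), qualifier.strip()

# The three headings quoted on the slide.
headings = [
    "Art, Buddhist",
    "Cookery, Japanese",
    "Adult children of alcoholics, Writings of,",
]

# Group qualifiers under the base term so "Art" can act as a facet.
index = defaultdict(list)
for heading in headings:
    base, qualifier = naive_facets(heading)
    index[base].append(qualifier)
```

"Art" and "Cookery" come out as usable facets with "Buddhist" and "Japanese" as refinements, but "Writings of" is not a refinement of "Adult children of alcoholics" in the same sense, so a mechanical split cannot be trusted across the whole vocabulary.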
Commerce is taking over bibliographic description, discovery and delivery
Google
Amazon
Apple
Shelfari (now owned by Amazon)
LibraryThing (1.2m users, 40% owned by Amazon)
GoodReads (3m users)
null0: http://www.flickr.com/photos/null0/271977303/in/photostream/
cindiann: http://www.flickr.com/photos/trucolorsfly/2970326939/
I'm interested in the social impacts of
railways in the mid-west of America
during the 19th Century.
What are useful ways of investigating this?
What should I read first?
What should I read next?
I want to study critical analysis of the
tactics used by Whitlam in his response
to the supply crisis of 1975 -
where should I start?
I need to keep track of
information about the ongoing disease resistance of genetically modified canola -
what's the best way to do this?
My 10 year old is very keen
on Egyptian art - how can I find resources to inspire her to keep exploring?
Description using a real thesaurus
Description using tags
Lists
Related resources
Reviews
Annotations
Collection and organisation of snippets
Link out, link in
Find and acquire are library goals,
not the goals of users
Eric Lease Morgan
Paul Hagon: http://www.flickr.com/photos/paulhagon/2965862244/
The fundamental processes of librarianship (collection, preservation, organization, and dissemination) need to be expanded to fit the current digital environment.
...
The next “next generation library catalog” is not about find, instead it is about use.
Eric Lease Morgan
"The Next Next-Generation Library Catalog"
http://infomotions.com/blog/2010/06/the-next-next-generation-library-catalog/
Reading level characterisation
Identification of characteristic text
Citation chaining
Related resources
"People who accessed this also accessed..."
Textual analysis
Link out
Facilitate
Automatically perform
What can cataloguers do?
Specify, build, promote, maintain environments/systems which:
Encourage and combine the
expertise of the community
Thrive on a diversity of description
librarians
subject experts
the general public
Only a tiny percentage of information
will ever be described in AACR2/RDA/
MARC/LCSH
Without librarians and what librarians understand, what we build would likely be less usable, less reliable, less diverse, less provocative.
What we build without librarians would unnecessarily constrict our understanding and imagination, rather than exuberantly expanding them.
This is especially true given that the system is being designed to a large degree by commercial entities.
Knowledge As a Network
ALCTS Midwinter Symposium
January 15, 2010
David Weinberger
Joi Ito: http://www.flickr.com/people/joi/
Support metadata services
versioning
subsets/views/layers
scopes
attribution/provenance
rating
open to reuse
open to arbitrary extension
commercially disinterested
Are easy to use
Are open
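One way to picture the metadata services listed above is to treat every piece of description as a statement that carries its own source, scope and version. The sketch below is only an illustration: the Statement class, its field names and the sample values are invented here, loosely following the Harvard Classics question in Tim Spalding's email.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Statement:
    """One assertion about a resource, with who said it (hypothetical model)."""
    resource: str           # what is being described
    prop: str               # which aspect, e.g. "series" or "subject"
    value: str              # what is claimed
    source: str             # attribution/provenance
    scope: str = "general"  # the view or layer the claim belongs to
    version: int = 1        # description keeps changing after cataloguing

def values_by_source(statements, resource, prop):
    """Group the competing values for one property by who asserted them."""
    grouped = defaultdict(list)
    for statement in statements:
        if statement.resource == resource and statement.prop == prop:
            grouped[statement.source].append(statement.value)
    return dict(grouped)

# Invented sample data: two sources disagree about series membership.
statements = [
    Statement("The Odyssey", "series", "Harvard Classics", source="publisher"),
    Statement("The Odyssey", "series", "(none)", source="reader"),
    Statement("The Odyssey", "subject", "Epic poetry", source="librarian"),
]
opinions = values_by_source(statements, "The Odyssey", "series")
```

Because disagreement is stored and not resolved, a user interface can show the publisher's view, the reader's view, or both.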
Requiring everyone to use the same vocabulary to
describe their material denudes the cognitive landscape,
enforces homogeneity in ideas.
http://openbiblio.net/
Join the Open Knowledge Foundation
Working Group on Open
Bibliographic Data
A better cooperative cataloguing environment?
Visit OpenLibrary http://openlibrary.org
Freedom is actually a bigger game than power.
Harriet Rubin
Power is about what you can control.
Freedom is about what you can unleash.
Australian Committee on Cataloguing Seminar October 2010
http://www.searchenginepeople.com/blog/google-autocomplete-fails.html
Help create an (inter)national
public digital library
"A Library Without Walls"
Robert Darnton,
Director of the Harvard University Library
http://www.nybooks.com/blogs/nyrblog/2010/oct/04/library-without-walls/
650 _0 $aAustralian constitutional crisis, 1975
651 _0 $aAustralia $xPolitics and government $y1974-1976.
653 __ $aAustralia - Politics and government - 1972-1975
653 __ $aConstitutional Crisis 1975
650 _7 $aAustralia. Political events, October 1975- November 1975. Personal observations $2precis
[ind2=7 means "Source specified in subfield $2 "]
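A minimal sketch of how the subject fields above could be split into tag, indicators and subfields. The parse_field helper and its regular expression are assumptions about this slide's display form only (underscore for a blank indicator, "$" as the subfield delimiter, optional spaces); they are not taken from any MARC library.

```python
import re

# Hypothetical pattern for the display form used on the slide, e.g.
# "650 _0 $aAustralian constitutional crisis, 1975" or "651_0$a...".
FIELD_RE = re.compile(r"^(\d{3})\s*([\d_])([\d_])\s*(\$.*)$")

def parse_field(line):
    """Return tag, indicators and (code, value) subfield pairs."""
    match = FIELD_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised field: {line!r}")
    tag, ind1, ind2, rest = match.groups()
    # Each subfield starts with "$" followed by a one-character code.
    # Assumes "$" never occurs inside a subfield value.
    subfields = [(chunk[0], chunk[1:].strip())
                 for chunk in rest.split("$") if chunk]
    return {"tag": tag, "ind1": ind1, "ind2": ind2, "subfields": subfields}

field = parse_field(
    "650_7$aAustralia. Political events, October 1975- November 1975. "
    "Personal observations $2precis"
)

# When ind2 is "7", the vocabulary is named in subfield $2.
source = dict(field["subfields"]).get("2") if field["ind2"] == "7" else None
```

Run over the five fields on the slide, this yields three different vocabularies (controlled 650/651, uncontrolled 653, and PRECIS) describing the same event, which is the inconsistency the slide is pointing at.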
constitutional crisis search misses:
The Australian constitutional crisis of 1975 : facts and law/Institute of Public Affairs (N.S.W.),
Labor and the constitution, 1972-1975 : essays and commentaries on the constitutional controversies of the Whitlam years / edited by Gareth Evans
The Whitlam government 1972-1975 [Gough Whitlam]
The facts and the law : a summary of important documents including a copy of the Australian Constitution relating to the political crisis of 11th November, 1975
The Whitlam years : Australia's Labor legislation, 1972-1975 / with foreword by Malcolm Mackerras
The Whitlam venture / Alan Reid
The leader : a political biography of Gough Whitlam / James Walter
The rise and fall of Gough Whitlam / Pat Farmer
Readings : the Whitlam dismissal / [(ed) Jacqualine Hollingworth.]
26 very relevant results with a subject of * constitutional crisis 1975; but...
Traditionally...
Scope
Provenance