Institute of Museum and Library Services Grant

  • Transcribe & Encode
  • ~300 Freedom Suits filed in St. Louis Circuit Court
  • ~1800-1864

Judicial Precedent

Create an extension to the TEI consisting of a set of additional elements, with rules described in an XML schema, for encoding legal texts within a prescribed domain

  • Translating the structure and function of a discrete set of legal/historical documents into the XML framework
  • Our initial efforts should form the basis of a flexible schema that could be used for related document sets, e.g. criminal litigation of the same era
  • The standard should form a nascent part of an incremental process that will expand slowly to other collections

What is the appropriate entity for encoding?

Case vs. Document

  • Near the end of litigation, slave owners who felt they were likely to lose would often sell the slave to a new owner, thus forcing the slave/plaintiff to re-file the litigation and begin the process anew
  • The issues in the cases would usually be identical
  • Because of the sometimes ill-defined boundaries of the cases, the legal consultants proposed using the individual documents as the basic entity to be encoded
  • This would facilitate the distinction between meta-data and information inherent in the basic textual object
  • Further developments in the schema could facilitate the appropriate grouping of documents as needed into cases or related case (or historical) groupings
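
The document-first approach described above can be illustrated with a minimal Python sketch. It is not the project's schema: the class name, field names, and identifiers are all hypothetical, chosen only to show how documents encoded individually might later be grouped into cases.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Document:
    """One encoded document, the basic entity. All field names are hypothetical."""
    doc_id: str
    doc_type: str
    # A document may be linked to more than one case, e.g. when a suit was re-filed.
    case_numbers: List[str] = field(default_factory=list)


def group_by_case(documents: List[Document]) -> Dict[str, List[str]]:
    """Build case groupings after the fact from per-document links."""
    groups = defaultdict(list)
    for doc in documents:
        for number in doc.case_numbers:
            groups[number].append(doc.doc_id)
    return dict(groups)


# Made-up identifiers, for illustration only.
docs = [
    Document("doc-001", "petition", ["case-A"]),
    Document("doc-002", "summons", ["case-A"]),
    Document("doc-003", "petition", ["case-A", "case-B"]),
]
cases = group_by_case(docs)
```

Because the case is derived from the documents rather than the other way round, an ill-defined case boundary only changes the grouping step, not the encoded documents themselves.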

Nature of Suit (limited types)

  • Freedom
  • Fur Trade
  • Native American
  • Lewis/Clark/Corps of Discovery

Authority

  • Branch of government (Legislature, Judiciary, etc.)

Court

  • Jurisdiction (County, State, Federal)

Case Number

People

  • Type of Participant/Party

Date(s) of Suit

  • Filed, Discovery, Decided, Appealed, etc.

Disposition

  • at various levels, jurisdictions

Case Citation(s)

Type

  • All Litigation
  • Sample Types (pleading, summons, discovery, etc.)
  • Can be scaled to criminal/transactional/etc.

Date(s) Associated with Document

State of Litigation Associated with Document

Source of the Document

  • Author
  • Court
  • Other

Structure

  • Pleading

All expandable to other types

  • All 500 cases imaged and transcribed
  • 480 cases have had their transcriptions proofread and edited by undergraduate students
  • 420 cases have been encoded using the legal extensions to TEI

Missouri Freedom Suits

Winny v Whitesides

FREEDOM = Living in Free Territory

Winny v Whitesides

Statutory Requirements

  • Northwest Ordinance & 1824 Missouri Statute
  • Petition to Sue for Freedom
  • Trespass, Assault & Battery, False Imprisonment

Other Freedom Suits

Julia v McKinney

Found Records

Lewis & Clark

Fur Trade

Native American

Freedom

Earlier Grants & Projects

Digitized

Dred Scott

Our Grant

National Importance

Transcribe

Encode

Combination Approach

Pleading Requirements

trespass, assault and battery, false imprisonment

manumission

born free

lived on free land

Independent membership consortium hosted by academic institutions in the US and Europe

Produces a set of guidelines which specify encoding methods for machine-readable texts

Digitize, Transcribe, Encode

Text Encoding Initiative (TEI)

St. Louis Circuit Court Historical Records Project and supplementary material in TEI XML

TEI is now the de facto standard for the encoding of electronic texts in the humanities academic community, and is used for literary documents, cultural heritage documents, and many library collections

The Guidelines are a widely used standard for text materials used in online research and teaching

Technological Goals

IMLS Grant

Develop extensions to the TEI for encoding legal documents

Legal XML

Major Grant Deliverable

to reflect legal function, genres, and roles, and employ these extensions in this collection

  • When Digital Library Services started working on the Revised Dred Scott Project in 2007, there were obstacles to converting the documents to XML, due primarily to the lack of a consistent scheme for representing the documents’ legal function
  • The team started looking for appropriate standards to encode legal documents in XML

Interconnected Texts

Create the extensions with their expansion to other domains in mind

Primary Goals

Legal Professional Groups

Combination Approach

  • Less interested in the documents per se than in the information they contain or the efficient exchange of that information (LegalXML, GJXML, MetaLex)
  • Often too broad in scope (covering all legal documents for all time, losing specificity)

Start with a specific collection of legal documents, but develop TEI extensions that would apply to a slightly broader domain than just the documents in that collection

Image

Standard can be developed incrementally as other groups implement the standard for encoding legal texts from other domains

Other Efforts at Legal XML

Library and Archival Community

Transcribe

  • Have an interest in representing the documents to some extent as artifacts
  • Often too parochial (only applying to documents in a given collection)
  • In the freedom suits, a case was not a completely discrete entity
  • Cases were often closely related to other litigation with the same plaintiff
  • Multiple cases existed sequentially with the same plaintiff(s) and/or defendant(s)
  • Appellate opinions are frequently not in existence or not recorded
  • Historical records are a better indicator of the ultimate result

Complicating Factors

Encode

A Case in General

Foundational Issues

User Interface

Full text, Boolean search capability by the end of the year

http://digital.wustl.edu/legalencodingproject/

What is our primary unit of text/textual object?

  • Case or Document?

Specify a domain for application of the TEI extensions

  • temporal, geographical, disciplinary (law, history, etc.)
  • divisions within disciplines, civil vs. criminal in law
  • A case in the larger legal sense could include non-textual, non-print objects, usually in the form of exhibits.
  • These could include technological artifacts such as recordings (audio/video), digital files, and other physical entities
  • Subsequent litigation is well-documented

Functionality

Status

Help!

This type of encoding allows us to preserve the documents as documents, but also treat them as records in a database

Once these key pieces of legal information are identified, researchers will be able to perform smart searches - not just free text searches
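
As a rough illustration of what a "smart" search might look like, the Python sketch below queries a small XML fragment by a party's legal role rather than by free text. The element and attribute names (`legalDocument`, `party`, `role`) and the document identifiers are assumptions made for this example and are not the project's actual TEI extensions; only the party names come from the cases named in this presentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, for illustration only. It is not the project's schema.
SAMPLE = """
<documents>
  <legalDocument id="doc-001" type="petition">
    <party role="plaintiff">Winny</party>
    <party role="defendant">Whitesides</party>
  </legalDocument>
  <legalDocument id="doc-002" type="summons">
    <party role="plaintiff">Julia</party>
    <party role="defendant">McKinney</party>
  </legalDocument>
</documents>
"""


def find_documents(root, role, name):
    """Return ids of documents in which `name` appears in the given role."""
    matches = []
    for doc in root.iter("legalDocument"):
        for party in doc.findall("party"):
            if party.get("role") == role and (party.text or "").strip() == name:
                matches.append(doc.get("id"))
                break  # one match per document is enough
    return matches


root = ET.fromstring(SAMPLE.strip())
as_defendant = find_documents(root, "defendant", "Whitesides")
as_plaintiff = find_documents(root, "plaintiff", "Whitesides")
```

A free-text search for "Whitesides" would return every document mentioning the name, while the role-aware query separates documents where the person is a defendant from those where the person is a plaintiff.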

http://www.digital.wustl.edu

Initial Proposal

The Case in this Corpus

The Core Object

Named Entity Recognition (NER)

Refinement of Document Model

Resource Description Framework (RDF)

  • A case in the Freedom, Fur Trade, Lewis/Clark and similar suits usually consisted of multiple filings, depositions, orders, motions and judgments, among other document types
  • All (or nearly all) material was in textual format written on paper
  • Final outcomes of the appellate process were not always known

Document Information

Users will be able to browse relationships between the people and places in the suits

  • visualization

Erika came on board - April 2010

Further refinement/expansion of basic outline

Determination of two domains

  • Case Information (Meta-data)
  • Document (Object data)

Review of cases in the collection for common types and structures, determine scope

Other potential categories identified

Not a final product

Other Potential Categories

Case Information

Scalable

Other Types of Documents

  • Legislative, Criminal, Scholarly, Work Product, etc.

Characteristics of Documents

  • Jurisdiction
  • Issuing Authority
  • Dates
  • Structure
  • Other

Moving into the Schema

  • Though basic, the outline provided a starting point for the schema
  • Initial work with the documents prompted a reconsideration of the initial decision to use the documents as the primary units
  • Document model used to create first draft of extensions to the TEI

History of Schema Design

Document Typology

Basic outline developed by legal experts (law librarians)

  • April 2010
  • Rough and unpolished
  • Designed to generate thought

Initial Meetings between various groups

  • Refinement of concepts
  • Differentiation between meta-data (case description) and object-specific data (individual litigation document)
  • The good news - the functions were mappable to modern terminology
  • Filings were not always clearly labeled or described in the docket
  • One filing could fall within multiple categories
  • Clerks of court were not standardized in their handling of litigation materials in the 1800s
  • Decided to use modern terminology absent a specific historical term of well-established meaning
  • Initial outline listed eight major types with subcategories
  • Expanded in the schema to thirty-three
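
The point that one filing could fall within multiple categories can be sketched in Python as follows. The three type names are taken from the sample types listed in this presentation; the class itself and the way types are combined are assumptions for illustration and say nothing about how the schema actually records document types.

```python
from enum import Flag, auto


class DocType(Flag):
    """A few sample document types; the real typology has many more."""
    PLEADING = auto()
    SUMMONS = auto()
    DISCOVERY = auto()


# Hypothetical filing that serves two functions at once.
filing = DocType.PLEADING | DocType.DISCOVERY

is_pleading = DocType.PLEADING in filing
is_summons = DocType.SUMMONS in filing
```

Treating the type as a combinable value rather than a single label means a filing that was not clearly labeled in the docket does not have to be forced into one category.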

Basic Categories

Sample Outline

The Outline was quite simplistic

Three classes of legal documents proposed

  • Transactional
  • Litigation
  • Other (memos, client communications, etc.)

Litigation documents were the relevant category for freedom suits

  • Mostly meta-data, i.e. information about the case of which the document was a part
  • Document typology section was the core challenge
  • Litigation filings of that era did not correspond categorically to modern labels