An Introduction to Roma and ODD

An Introduction to Roma and ODD for DiXiT Spring School 2015-04-15, Graz; CC+by Licensed, http://tinyurl.com/jc-2015-04-15-ODD
by

James Cummings

on 15 April 2015


An Introduction to Roma and ODD
Dr James Cummings
University of Oxford

@jamescummings

How does Roma work?
A PHP web application which:
Queries the TEI source for lists of elements, modules, attributes etc
Presents them in a series of forms
Generates a TEI ODD specification
Sends that ODD to OxGarage, a RESTful web service which uses
a set of XSLT transforms to create the desired output
So what did we just do?
<schemaSpec ident="myTEI" docLang="en" prefix="tei_" xml:lang="en">
  <moduleRef key="core" except="abbr add addrLine address analytic author"/>
  <moduleRef key="tei"/>
  <moduleRef key="header"/>
  <moduleRef key="textstructure"/>
  <elementSpec ident="title" module="core" mode="change">
    <attList>
      <attDef ident="level" mode="delete"/>
    </attList>
  </elementSpec>
</schemaSpec>
Modules tab
The [Modules] tab shows the modules available
Selecting a module from it shows the elements within that module, and gives you the choice to:
include all of them (and then remove some)
exclude all of them (and then put back the ones you want)
You can also change an element's attribute list, and the values its attributes permit
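As a rough sketch of how these two choices might appear in the generated ODD: the @include attribute on <moduleRef> keeps only the named elements, while @except keeps everything but the named ones. The element names listed here are only illustrative picks from the core module, and the two lines are alternatives, not meant to be used together.

```xml
<!-- Alternative 1: exclude everything, then put back the elements you want -->
<moduleRef key="core" include="p hi title name date"/>

<!-- Alternative 2: include everything, then remove some -->
<moduleRef key="core" except="abbr add addrLine address analytic author"/>
```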
Some Terminology
The TEI encoding scheme defines a set of elements

An element definition specifies:
a canonical name (<gi>) for the element, and optionally other names in other languages
a canonical description (also possibly translated) of its function
a declaration of the classes to which it belongs
a definition for each of its attributes
a definition of its content model (what can appear inside it)
usage examples and notes

Modules are used to group together sets of elements
A TEI schema specification (<schemaSpec>) is made by selecting modules or elements and (optionally) modifying their contents
A TEI document containing a schema specification is called an ODD (One Document Does it all)
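To make the terminology concrete, here is a minimal sketch of an ODD: an ordinary TEI document whose body holds a <schemaSpec>. The title, the header text and the ident value "myTEI" are placeholders chosen for illustration, not taken from any real project.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>My TEI customisation (placeholder title)</title>
      </titleStmt>
      <publicationStmt>
        <p>Unpublished example</p>
      </publicationStmt>
      <sourceDesc>
        <p>Born digital</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>Prose documentation of the project's encoding choices can go here.</p>
      <!-- The schema specification: a selection of modules -->
      <schemaSpec ident="myTEI" start="TEI">
        <moduleRef key="tei"/>
        <moduleRef key="core"/>
        <moduleRef key="header"/>
        <moduleRef key="textstructure"/>
      </schemaSpec>
    </body>
  </text>
</TEI>
```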
How do you choose?
Just choose everything: TEI All (not really a good idea)
The TEI provides a small set of predefined combinations (TEI Lite, TEI Bare...)
Or you could roll your own (but then you need to know what you're choosing)
About ODD
TEI ODD is the name of a TEI Customisation file
This documents your files' relationship with the TEI (what version, what is included/excluded, what you have changed)
From this we can generate not only schemas, but project-specific documentation.
Every use of the TEI should involve a TEI ODD Customisation
What is a Module?
A convenient way of grouping together a number of element declarations
These are usually on a related topic or specific application
Most chapters of P5 focus on elements drawn from a single module, which that chapter then defines
A TEI schema can be created by selecting modules and adding or removing elements from them as needed
Modules
analysis: Simple analytic mechanisms
certainty: Certainty and uncertainty
core: Elements common in many TEI documents
corpus: Corpus texts
dictionaries: Dictionaries
drama: Performance texts
figures: Tables, formulæ, notated music, and figures
gaiji: Character and glyph documentation
header: The TEI Header
iso-fs: Feature structures
linking: Linking, segmentation and alignment
msdescription: Manuscript Description
namesdates: Names and dates
nets: Graphs, networks, and trees
spoken: Transcribed Speech
tagdocs: Documentation of TEI modules
textcrit: Critical Apparatus
textstructure: Default text structure
transcr: Transcription of primary sources
verse: Verse structures
Verse structures

Roma: a web-based application designed to make this process much easier
http://www.tei-c.org/Roma/
Note that:
OxGarage can be used on its own
the same transforms can be run within oXygen
or Linux users can run the transformations on the command line
Our own customisation
A simple selection of elements, but let's say we also want to allow only certain values for @type on <div>.
How do we do that?

We might also want to create a new element.

Other constraints are possible: using Schematron, for example, we might insist that a <div type="prose"> contains a paragraph.
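A sketch of how such a Schematron rule could be embedded in the ODD. The ident value and the message text are invented for illustration; the scheme value "isoschematron" and the assumption that the tei: prefix is bound to the TEI namespace by the processing stylesheets should be checked against the TEI version in use.

```xml
<elementSpec ident="div" module="textstructure" mode="change"
             xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <!-- ident and message are hypothetical; adjust to your project -->
  <constraintSpec ident="proseDivHasParagraph" scheme="isoschematron">
    <constraint>
      <sch:rule context="tei:div[@type='prose']">
        <!-- The assertion fails when a prose division has no <p> child -->
        <sch:assert test="tei:p">A prose division must contain at least one paragraph.</sch:assert>
      </sch:rule>
    </constraint>
  </constraintSpec>
</elementSpec>
```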
What did we just do?
<elementSpec ident="div" module="textstructure" mode="change">
  <attList>
    <attDef ident="type" mode="change" usage="req">
      <desc>characterizes the element according to its type</desc>
      <valList type="closed" mode="replace">
        <valItem ident="prose"/>
        <valItem ident="verse"/>
        <valItem ident="drama"/>
        <valItem ident="letter"/>
        <valItem ident="other"/>
      </valList>
    </attDef>
  </attList>
</elementSpec>
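In a document validated against a schema generated from this customisation, the effect would be roughly as follows. This is an illustrative fragment, not output from any tool, and the paragraph text is a placeholder.

```xml
<body xmlns="http://www.tei-c.org/ns/1.0">
  <!-- Valid: @type is present and uses a value from the closed list -->
  <div type="letter">
    <p>Placeholder paragraph.</p>
  </div>

  <!-- Invalid: @type is required (usage="req") but missing -->
  <div>
    <p>Placeholder paragraph.</p>
  </div>

  <!-- Invalid: "chapter" is not one of the five permitted values -->
  <div type="chapter">
    <p>Placeholder paragraph.</p>
  </div>
</body>
```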
Defining a new element
When defining a new element, we need to consider:
its name and description
what attributes it can carry
what it can contain
where it can appear in a document

The TEI class system helps us answer all these questions (except the first).
TEI Class System
The TEI distinguishes over 540 elements:
Having these organised into classes aids comprehension, modularity, and modification.
Attribute class: the members share common attributes
Model class: the members can appear in the same locations (and are often semantically related)
Classes may contain other classes
An element can be a member of any number of classes, irrespective of the module it belongs to.
Attribute Classes
Attribute classes are given (usually adjectival) names beginning with att.; e.g. att.naming, att.typed
all members of att.canonical inherit the attributes @key and @ref from it;
all members of att.typed inherit @type and @subtype from it
If we want an element to carry the @type attribute, therefore, we add the element to the att.typed class, rather than define those attributes explicitly.
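A minimal sketch of changing attribute class membership in an ODD. The choice of <p> and <name> is only illustrative: it assumes <p> is not already a member of att.typed and that <name> is, which may differ between TEI versions.

```xml
<!-- Give <p> the attributes @type and @subtype by adding it to att.typed -->
<elementSpec ident="p" module="core" mode="change">
  <classes mode="change">
    <memberOf key="att.typed" mode="add"/>
  </classes>
</elementSpec>

<!-- Take @type and @subtype away from <name> by removing it from att.typed -->
<elementSpec ident="name" module="core" mode="change">
  <classes mode="change">
    <memberOf key="att.typed" mode="delete"/>
  </classes>
</elementSpec>
```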
att.global
@xml:id: a unique identifier
@xml:lang: the language of the element content
@n: a number or name for an element
@rend: how the element in question was rendered or presented in the source text
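For instance, a paragraph carrying all four of these global attributes might look like the fragment below. The attribute values and the text are made up for illustration.

```xml
<p xmlns="http://www.tei-c.org/ns/1.0"
   xml:id="p001" xml:lang="la" n="1" rend="italic">
  Placeholder Latin text, printed in italics in the source.
</p>
```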
Model Classes
Model classes contain groups of elements which are allowed in the same place. For example, if you are adding an element which is wanted wherever <bibl> is allowed, add it to the model.biblLike class.
Usually named with a Like or Part suffix:
model.pLike: members are all things that ‘behave like’ paragraphs, and are permitted in the same places as paragraphs
model.pPart.edit: elements for simple editorial intervention such as <corr>, <del>
model.pPart.data: ‘data-like’ elements such as <name>, <num>, <date>
model.pPart.msdesc: extra elements for manuscript description such as <seal> or <origPlace>
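A sketch of the <bibl> case described above: a new element that should be allowed wherever <bibl> is allowed simply declares membership of model.biblLike. The element name myBibl, its namespace and its description are hypothetical, and the content model (phrase-level content via macro.phraseSeq) is just one plausible choice.

```xml
<elementSpec ident="myBibl" ns="http://www.example.org/ns/nonTEI" mode="add"
             xmlns:rng="http://relaxng.org/ns/structure/1.0">
  <desc>a hypothetical project-specific citation</desc>
  <classes>
    <!-- Allowed wherever <bibl> and its siblings are allowed -->
    <memberOf key="model.biblLike"/>
  </classes>
  <content>
    <!-- Text mixed with phrase-level elements -->
    <rng:ref name="macro.phraseSeq"/>
  </content>
</elementSpec>
```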
Basic Model Structures
There are three generally recognized classes of element:
divisions: high-level major divisions of texts
chunks: elements such as paragraphs appearing within texts or divisions, but not within other chunks
phrase-level elements: elements such as highlighted phrases which can occur only within chunks
There are also:
inter-level elements: elements such as lists which can appear either in or between chunks
components: elements which can appear directly within texts or text divisions
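These categories can be seen side by side in a small illustrative fragment. The text content is a placeholder; the comments label each element with the category it belongs to.

```xml
<div xmlns="http://www.tei-c.org/ns/1.0">   <!-- division -->
  <p>                                       <!-- chunk -->
    A paragraph with a <hi>highlighted</hi> <!-- phrase-level, inside a chunk -->
    phrase and an embedded list:
    <list>                                  <!-- inter-level, here inside a chunk -->
      <item>first item</item>
    </list>
  </p>
  <list>                                    <!-- inter-level, here between chunks -->
    <item>another item</item>
  </list>
  <p>A second paragraph.</p>                <!-- chunk -->
</div>
```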
New elements
Questions:
What other elements is it like?
What other elements can contain it?
What can it contain?
What did we just do?
<elementSpec ident="something" ns="http://www.example.org/ns/nonTEI" mode="add">
  <desc>some description</desc>
  <classes>
    <memberOf key="model.divPart"/>
    <memberOf key="att.typed"/>
  </classes>
  <content>
    <rng:oneOrMore>
      <rng:ref name="model.pLike"/>
    </rng:oneOrMore>
  </content>
</elementSpec>
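In a document, the new element could then be used along the following lines. This is a sketch: the my: prefix and the attribute value are arbitrary choices, and only the namespace URI comes from the specification above. Membership of model.divPart lets it sit inside a <div>, att.typed gives it @type, and its content must be one or more paragraph-like elements.

```xml
<div xmlns="http://www.tei-c.org/ns/1.0"
     xmlns:my="http://www.example.org/ns/nonTEI">
  <p>An ordinary TEI paragraph.</p>
  <!-- The non-TEI element lives in its own namespace -->
  <my:something type="example">
    <p>A paragraph inside the new element.</p>
  </my:something>
</div>
```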
Defining a new element
Conclusions:
What classes do we make it a member of?
What content model do we select (which classes can appear inside it)?
The ODD advantage
We can express these constraints in our ODD meta-schema, and then generate a formal schema to enforce them using whichever schema language we like. For example:
ISO RELAX NG
W3C XML Schema
DTD Language (But DTDs should be deprecated!)

Element content is documented with either RELAX NG or the new 'Pure ODD' syntax (see chapter 22.4.4)
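As a hedged sketch of the difference, the content model "one or more members of model.pLike" can be written in either syntax. The second form is an approximation of Pure ODD and should be checked against the Guidelines for the TEI version in use.

```xml
<!-- RELAX NG syntax embedded in the ODD -->
<content xmlns:rng="http://relaxng.org/ns/structure/1.0">
  <rng:oneOrMore>
    <rng:ref name="model.pLike"/>
  </rng:oneOrMore>
</content>

<!-- Approximate Pure ODD equivalent, using TEI elements only -->
<content>
  <classRef key="model.pLike" minOccurs="1" maxOccurs="unbounded"/>
</content>
```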