Manuscript Transcription and Description with the TEI

A Prezi for an introductory TEI Workshop. Licensed Creative Commons Attribution. http://tinyurl.com/jc-msDesc
by

James Cummings

on 11 March 2016

Transcript of Manuscript Transcription and Description with the TEI

TEI for Manuscript Transcription and Description

@jamescummings
Structure of an msDesc
<msDesc xml:id="myFavouriteMS" xml:lang="en">
  <msIdentifier>
    <!-- Manuscript Identification -->
  </msIdentifier>
  <msContents>
    <!-- Intellectual Structure -->
  </msContents>
  <physDesc>
    <!-- Physical Description -->
  </physDesc>
  <history>
    <!-- Origin, Provenance, Acquisition -->
  </history>
  <additional>
    <!-- Additional Administrative Metadata -->
  </additional>
</msDesc>
Manuscripts are unique objects, sometimes (but not necessarily) of great cultural or political value.
Books, by contrast, exist in multiple copies, and can be described adequately by well-established and formalised bibliographic conventions.
For manuscripts, there are several traditions of description, often descriptive or belletristic, and little consensus.
Manuscripts can be ancient or modern, and it benefits us to use the same forms of description for both.
Similar concerns apply to other text-bearing objects, and <msDesc> is often also used for incunabula.
Why Are Manuscripts So Special?
The TEI <msDesc> element is intended for several different kinds of applications:
standalone database of library records (finding aid)
discursive text collecting many records (catalogue raisonné)
metadata component within a digital surrogate (electronic edition)
tool for ‘quantitative codicology’
Objectives of <msDesc>
<div>
  <head>The Arnamagnæan Manuscript Collection</head>
  <p>The Arnamagnæan Collection is widely recognised as one of the most significant collections of early Scandinavian manuscripts in the world...</p>
  <p>Among its more important holdings are:
    <msDesc xml:id="AM02-0101" xml:lang="en">
      <!-- ...-->
    </msDesc>
  </p>
  <p>In the following manuscript....
    <msDesc xml:id="AM04-0595" xml:lang="en">
      <!-- ...-->
    </msDesc>
  </p>
</div>
Catalogue Raisonné
Manuscript description in the TEI caters for two conflicting desires:
to preserve (or perpetuate) existing descriptive prose
to support reliable search, retrieval, and analysis of data
The <msDesc> element tries, wherever possible, to enable both of these approaches.
Having one's cake and eating it
Components of an <msDesc>
Inside <msDesc> only <msIdentifier> is required. This could be followed by <p> elements or:
<msContents>: a list of the intellectual content of the manuscript
<physDesc>: groups information concerning all physical aspects of the manuscript
<history>: provides information on the history of the manuscript: its origin, provenance, and acquisition by its current holding institution
<additional>: groups other information about the manuscript (e.g. administrative information relating to its availability, custodial history, surrogates)
<msPart>: contains, in essence, a nested <msDesc>, for composite manuscripts now regarded as constituting a single unit but made up of two or more parts which were originally physically distinct
What Counts as a Manuscript?
Identifying Manuscripts
The <msIdentifier> element follows the traditional three-part specification of a manuscript's location:
place: <country>, <region>, <settlement>
repository: <institution>, <repository>
identifier: <collection>, <idno>, <altIdentifier>

Example <msIdentifier>
More Examples
Named Manuscripts
<msIdentifier>
  <country>United States of America</country>
  <region>Texas</region>
  <settlement>Austin</settlement>
  <institution>The University of Texas at Austin</institution>
  <repository>Harry Ransom Centre</repository>
  <collection>Wilfred Owen Collected Letters</collection>
  <idno type="folio">ff504</idno>
  <altIdentifier>
    <idno>Letter no. 535 Ed. 'Wilfred Owen Collected Letters'</idno>
  </altIdentifier>
  <msName>Letter to Leslie Gunston</msName>
</msIdentifier>
<msIdentifier>
  <country>Canada</country>
  <settlement>Ottawa</settlement>
  <repository>Library and Archives Canada</repository>
  <collection>E.W.B. Morrison</collection>
  <idno>MG 30 E 81 v. 16</idno>
</msIdentifier>
<msIdentifier>
  <country>France</country>
  <settlement>Troyes</settlement>
  <repository>Bibliothèque Municipale</repository>
  <idno>50</idno>
</msIdentifier>
<msIdentifier xml:lang="da">
  <country>Danmark</country>
  <settlement>København</settlement>
  <repository>Det Arnamagnæanske Institut</repository>
  <idno>AM 45 fol.</idno>
  <msName xml:lang="la">Codex Frisianus</msName>
  <msName xml:lang="is">Fríssbók</msName>
  <altIdentifier>
    <idno type="TEI">tei-eg-23</idno>
  </altIdentifier>
</msIdentifier>
Alternative or additional names can be provided, perhaps in different languages
Structured vs Unstructured <msDesc>
After <msIdentifier> you can either have a set of structured elements or one or more <p> elements
The same is true of the other elements inside <msDesc> (except for <msIdentifier>, which must be structured)
<msDesc xml:id="myMS">
  <msIdentifier>
    <msName>My Manuscript</msName>
  </msIdentifier>
  <p>One or more paragraphs</p>
</msDesc>
<msDesc xml:id="myMS">
  <msIdentifier>
    <msName>My Manuscript</msName>
  </msIdentifier>
  <msContents>
    <p>Paragraph(s) about manuscript contents</p>
  </msContents>
  <physDesc>
    <p>Paragraph(s) about manuscript as physical object</p>
  </physDesc>
  <history>
    <p>Paragraph(s) about history of manuscript</p>
  </history>
  <additional>
    <listBibl>
      <bibl>Bibliographic records containing additional information</bibl>
    </listBibl>
  </additional>
</msDesc>

Structured <msContents>
<msContents> documents the intellectual content of the manuscript. If you don't use a set of paragraphs then it will be structured like:
<msContents>
  <summary>An extraordinary charivari of heroic deeds, improving tales, and hymns.</summary>
  <msItem>
    <!-- details of Guy of Warwick here -->
  </msItem>
  <msItem>
    <!-- other msItems for hymns here -->
  </msItem>
</msContents>
About <msItem>
<msItem> documents the identifiable items of intellectual content in a manuscript. These are often physically tied to a locus.
The <locus>, if present, must be given first, followed by any of the following:
model.biblLike: <bibl>, <biblFull>, <biblStruct>, <listBibl>, <msDesc>
model.msQuoteLike: <colophon>, <explicit>, <finalRubric>, <incipit>, <rubric>, <title>
model.quoteLike: <cit>, <quote>
model.respLike: <author>, <editor>, <funder>, <meeting>, <principal>, <respStmt>, <sponsor>
others: <decoNote>, <filiation>, <idno>, <msItem>, <msItemStruct>, <textLang>
<msItem> example
<msContents>
  <!-- first item -->
  <msItem n="1">
    <locus from="5r" to="7v">fols. 5r-7v</locus>
    <title type="supplied">An ABC</title>
  </msItem>
  <!-- second item -->
  <msItem n="2">
    <locus from="7v" to="8v">fols. 7v-8v</locus>
    <title type="uniform" xml:lang="fr">L'envoy de Chaucer a Scogan</title>
  </msItem>
  <!-- ...further items here... -->
  <msItem n="6">
    <locus from="14r" to="126v">fols. 14r-126v</locus>
    <title type="uniform">Troilus and Criseyde</title>
    <note>Bk. 1:71-Bk. 5:1701, with additional losses due to mutilation throughout</note>
  </msItem>
</msContents>
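The items above use only <locus>, <title>, and <note>. As a minimal sketch of some of the other children that <msItem> permits, the item below adds <author>, <rubric>, <incipit>, <explicit>, and <textLang>. The folio range and the transcribed wording are placeholders chosen for illustration; they are assumptions and do not describe a real manuscript.

```xml
<msItem n="1">
  <!-- the locus, when present, comes first -->
  <locus from="1r" to="12v">fols. 1r-12v</locus>
  <author>Geoffrey Chaucer</author>
  <title type="uniform">An ABC</title>
  <!-- placeholder wording: transcribe the rubric as it appears in your manuscript -->
  <rubric>Here begins the ABC</rubric>
  <!-- placeholder wording: the opening words of the text in this witness -->
  <incipit>Almighty and al merciable queene</incipit>
  <explicit><!-- the closing words of the text in this witness --></explicit>
  <textLang mainLang="enm">Middle English</textLang>
</msItem>
```

Recording incipits and explicits in their own elements, rather than in a prose note, is what makes it possible to search a catalogue for other witnesses of the same text.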
Physical Description
<physDesc> example
<objectDesc> example
The <physDesc> element records any information concerning the physicality or materiality of the manuscript. If using the structured form this might include:
The physical carrier: <objectDesc>
What it carries: <handDesc>, <scriptDesc>, <typeDesc>
Special features: <additions>, <decoDesc>, <musicNotation>
External things: <bindingDesc>, <sealDesc>, <accMat>

<physDesc>
  <objectDesc form="codex">
    <supportDesc material="perg">
      <support>Parchment.</support>
      <extent>i + 55 leaves
        <dimensions scope="all" type="leaf">
          <height unit="mm" quantity="184.15">7¼ in</height>
          <width unit="mm" quantity="136.53">5⅜ in</width>
        </dimensions>
      </extent>
    </supportDesc>
    <layoutDesc>
      <layout columns="2">In double columns.</layout>
    </layoutDesc>
  </objectDesc>
  <handDesc>
    <p>Written in more than one hand.</p>
  </handDesc>
  <decoDesc>
    <p>With a few coloured capitals.</p>
  </decoDesc>
</physDesc>
<objectDesc form="codex">
  <supportDesc material="mixed">
    <p>Early modern <material>parchment</material> and <material>paper</material>.</p>
  </supportDesc>
  <layoutDesc>
    <layout columns="1" ruledLines="25 32"/>
  </layoutDesc>
</objectDesc>
But both <supportDesc> and <layoutDesc> could be much more structured: <supportDesc> has specific elements for <support>, <extent>, <foliation>, <collation>, and <condition>, while <layoutDesc> can hold several <layout> elements, each tied to its own range of leaves.
<layoutDesc>
  <layout ruledLines="25 32" columns="1">
    <locus from="1r" to="202v"/>
    <locus from="210r" to="212v"/> Between 25 and 32 ruled lines.
  </layout>
  <layout ruledLines="34 50" columns="1">
    <locus from="203r" to="209v"/> Between 34 and 50 ruled lines.
  </layout>
</layoutDesc>
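A structured <supportDesc> can be sketched along the same lines as the <layoutDesc> above. The values below (leaf count, foliation, quire structure, condition) are invented for illustration and are assumptions, not a description of a real manuscript. The child elements are given in the order support, extent, foliation, collation, condition.

```xml
<supportDesc material="perg">
  <support>Parchment of medium quality.</support>
  <!-- hypothetical extent: flyleaves + text leaves + flyleaves -->
  <extent>ii + 120 + ii leaves</extent>
  <foliation>Foliated 1-120 in pencil in the upper right corner of each recto.</foliation>
  <collation>
    <p>Fifteen quires of eight leaves each.</p>
  </collation>
  <condition>Some water staining to the outer margins of the final quire.</condition>
</supportDesc>
```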
Description of Hands and Decoration
<handDesc>: contains multiple <handNote> elements
<decoDesc>: contains multiple <decoNote> elements
<handDesc hands="2">
  <handNote>The manuscript is written in two contemporary hands, otherwise unknown, but clearly those of practised scribes. Hand 1 writes ff.1r-22v and hand 2 ff. 23 and 24. Some scholars, notably <persName ref="AMI:VD02">Verner Dahlerup</persName> and <persName ref="AMI:HB05">Hreinn Benediktsson</persName>, have argued for a third hand on f. 24, but the evidence for this is insubstantial.</handNote>
</handDesc>
<handDesc hands="2">
  <handNote xml:id="Eirsp-1" scope="minor" script="Bookhand_IG">
    The first part of the manuscript, <locus from="1v" to="72v:4">fols 1v-72v:4</locus>, is written in a practised Icelandic Gothic bookhand. This hand is not found elsewhere.
  </handNote>
  <handNote xml:id="Eirsp-2" scope="major" script="other">
    The second part of the manuscript, <locus from="72v:4" to="194v">fols 72v:4-194</locus>, is written in a hand contemporary with the first; it can also be found in a fragment of <bibl><title>Knýtlinga saga</title>, <ref>AM 20b II fol.</ref></bibl>.
  </handNote>
</handDesc>
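The hand examples above have no decoration counterpart, so here is a minimal sketch of a <decoDesc> built from <decoNote> elements. The folio references, the type values, and the descriptions are hypothetical and only show the shape of the markup.

```xml
<decoDesc>
  <!-- the type values here are project-defined labels, not a fixed TEI list -->
  <decoNote type="initial">
    <locus from="1r" to="1r">fol. 1r</locus> A large initial in red and blue at the opening of the text.
  </decoNote>
  <decoNote type="border">
    <locus from="1r" to="1r">fol. 1r</locus> A partial border of foliage in the left margin.
  </decoNote>
</decoDesc>
```

Splitting decoration into typed notes, as with <handNote>, lets each feature be tied to its own locus and retrieved separately.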
<additions> example
<accMat> example
Another <physDesc> example
The <additions> element can be used to list or describe any additions to the manuscript, such as marginalia, scribblings, doodles, etc., which are considered to be of interest or importance.
<additions>
  <p>The text of this manuscript is not interpolated with sentences from Royal decrees promulgated in 1294, 1305 and 1314. In the margins, however, another somewhat later scribe has added the relevant paragraphs of these decrees, see pp. 8, 24, 44, 47 etc.</p>
  <p>As a humorous gesture the scribe in one opening of the manuscript, pp. 36 and 37, has prolonged the lower stems of one letter f and five letters þ and has them drizzle down the margin.</p>
</additions>
<accMat> (accompanying material) contains details of any significant additional material which may be closely associated with the manuscript being described, such as non-contemporaneous documents or fragments bound in with the manuscript at some earlier historical period.
<accMat>A copy of a tax form from 1947 is included in the envelope with the letter. It is not catalogued separately.</accMat>
<physDesc>
  <objectDesc form="folio">
    <supportDesc material="paper">
      <support>A single folio of <material>paper</material> in the collection as ff504 recto and verso</support>
    </supportDesc>
    <layoutDesc>
      <layout columns="1" writtenLines="20">Written full width as a single column, with approximately 20 lines per page</layout>
    </layoutDesc>
  </objectDesc>
  <handDesc hands="1">
    <handNote>Written in <persName ref="#WO">Wilfred Owen's</persName> hand.</handNote>
  </handDesc>
</physDesc>
Encoding a manuscript's <history>
<history> groups elements describing the full history of a manuscript or manuscript part.
<origin>: where it all began
<provenance>: everything in between
<acquisition>: how you acquired it
Although <origin> is a member of att.datable, there are also the special-purpose elements <origDate> and <origPlace> to record the manuscript's origin date and place.
Encoding <history>
<history>
  <origin>
    <p>Written in <origPlace ref="places.xml#england">England</origPlace> in the <origDate notAfter="1299" notBefore="1200">13th Cent.</origDate></p>
  </origin>
  <provenance>
    <p>On fol. 54v very faint is <q>Iste liber est fratris guillelmi de buria de <gap reason="illegible"/> Roberti ordinis fratrum Pred<ex>icatorum</ex></q>, 14th cent. (?): <q>hanauilla</q> is written at the foot of the page (15th cent.).</p>
  </provenance>
  <acquisition>
    <p>Bought from the Rev. <name type="person">W. D. Macray</name> on <date when="1863-03-17">March 17, 1863</date>, for 1 pound 10s.</p>
  </acquisition>
</history>
Additional Metadata
<additional> groups additional information, combining bibliographic information about a manuscript, or surrogate copies of it, with curatorial or administrative information.
<adminInfo>: administrative information
<surrogates>: information about surrogates (e.g. photographs, microfilms, digital images)
<listBibl>: bibliography of works concerning the manuscript
<additional> example
<additional>
  <adminInfo>
    <custodialHist>
      <custEvent type="conservation" notBefore="1961-03" notAfter="1963-02">
        <p>Conserved between March 1961 and February 1963 at Birgitte Dalls Konserveringsværksted.</p>
      </custEvent>
      <custEvent type="photography" notBefore="1988-05-01" notAfter="1988-05-30">
        <p>Photographed in May 1988 by AMI/FA.</p>
      </custEvent>
    </custodialHist>
  </adminInfo>
</additional>
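The example above covers only <adminInfo>. A sketch of the other two children of <additional>, <surrogates> and <listBibl>, might look like the following. The shelfmark, author, and title are invented placeholders and do not refer to real resources.

```xml
<additional>
  <surrogates>
    <bibl>
      <title type="gmd">microfilm</title>
      <!-- hypothetical identifier for the surrogate -->
      <idno>Example-MF-001</idno>
      <date when="1988-05">May 1988</date>
      <note>Master negative held by the repository.</note>
    </bibl>
  </surrogates>
  <listBibl>
    <bibl>
      <!-- hypothetical bibliographic entry -->
      <author>A. N. Author</author>,
      <title level="m">A Hypothetical Catalogue of Manuscripts</title>
    </bibl>
  </listBibl>
</additional>
```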
What about <msPart>?
An <msDesc> can contain <msPart>, essentially a nested <msDesc>, where originally distinct manuscripts or parts of manuscripts have been brought together to form a composite manuscript.
<msDesc>
  <msIdentifier>
    <settlement>Amiens</settlement>
    <repository>Bibliothèque Municipale</repository>
    <idno>MS 3</idno>
    <msName>Maurdramnus Bible</msName>
  </msIdentifier>
  <!-- other elements here -->
  <msPart>
    <altIdentifier>
      <idno>MS 6</idno>
    </altIdentifier>
    <!-- other information specific to this part here -->
  </msPart>
  <!-- other msParts here -->
</msDesc>
<objectDesc>
  <supportDesc>
    <collation>
      <formula>1-5.8 6.6 (catchword, f. 46, does not match following text) 7-8.8 9.10, 11.2 (through f. 82) 12-14.8 15.8(-7)</formula>
      <catchwords>Catchwords are written horizontally in center or towards the right lower margin in various manners: in red ink for quires 1-6 (which are also signed in red ink with letters of the alphabet and arabic numerals); quires 7-9 in ink of text within yellow decorated frames; quire 10 in red decorated frame; quire 12 in ink of text; quire 13 with red decorative slashes; quire 14 added in cursive hand.</catchwords>
    </collation>
  </supportDesc>
</objectDesc>
Collation Formula and Catchwords
http://tinyurl.com/jc-msDesc
What is the TEI?
(The Text Encoding Initiative)
An international consortium of institutions, projects and individual members
A community of users and volunteers
A freely available manual containing a set of regularly maintained and updated recommendations: 'The Guidelines'
Definitions, examples, and discussion of over 540 markup distinctions for textual, image facsimile, genetic editing etc.
A mechanism for producing customized schemas for validating your project's digital texts
A set of free and openly licensed, customizable tools and stylesheets for transformations to many formats (e.g. HTML, Word, PDF, Databases, RDF/LinkedData, Slides, ePub, etc.)
A simple consensus-based way of organizing and structuring textual (and other) resources
A format for documenting your interpretation and understanding of a text (and how text functions)
An archival, well-understood, format for long-term preservation of digital data and metadata
Whatever you make it! It is a community-driven standard
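One item in the list above is a mechanism for producing customized schemas. In the TEI this is done with an ODD file, and a minimal sketch of the relevant part for a manuscript project might look like the following. The identifier msWorkshop and the choice of modules are assumptions made for illustration; which modules a project needs is its own decision.

```xml
<schemaSpec ident="msWorkshop" start="TEI">
  <!-- the four modules that almost every TEI schema includes -->
  <moduleRef key="tei"/>
  <moduleRef key="core"/>
  <moduleRef key="header"/>
  <moduleRef key="textstructure"/>
  <!-- manuscript description: msDesc and its children -->
  <moduleRef key="msdescription"/>
  <!-- transcription of primary sources: am, ex, subst, damage, facsimile, sourceDoc -->
  <moduleRef key="transcr"/>
  <!-- non-standard characters and glyphs: g, charDecl -->
  <moduleRef key="gaiji"/>
</schemaSpec>
```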
Markup makes explicit the distinctions we want to make when processing a string of bytes
Markup is a way of naming and characterizing the parts of a text in a formalized way
Markup provides additional levels of annotation on data
It's (usually) more useful to mark up what we think things are than what they look like
Defining Markup
Markup makes explicit to a machine what is implicit to a person
<TEI>
  <teiHeader>
    <!-- required -->
  </teiHeader>
  <text>
    <front>
      <!-- optional -->
    </front>
    <body>
      <pb n="f.1r"/>
      <div n="1">
        <head><!-- a heading --></head>
        <p>paragraphs</p>
      </div>
      <div n="2">...</div>
    </body>
    <back>
      <!-- optional -->
    </back>
  </text>
</TEI>

TEI Text Structure
<TEI>
  <teiHeader>
    <!-- required -->
  </teiHeader>
  <facsimile>
    <!-- optional -->
  </facsimile>
  <sourceDoc>
    <!-- optional -->
  </sourceDoc>
  <text>
    <!-- required if no facsimile or sourceDoc-->
  </text>
</TEI>

TEI Basic Structure
About XML
XML is structured data represented as strings of text
XML looks like HTML, except that:
XML is extensible
XML must be well-formed
XML can be validated
XML is application-, platform-, and vendor-independent
XML empowers the content provider and facilitates data integration and migration
It is one of the best plain text long-term preservation formats for textual data that we have
XML Markup
<element attribute="value">
  Text or child elements here
</element>
<element> Text </element>
<element attribute="value"/>
XML Terminology
<?xml version="1.0" ?>
<root xmlns="http://namespace/">
  <element attribute="value">
    content
    <childElement type="empty"/>
    content
  </element>
  <!-- comment -->
</root>
XML declaration
root element
namespace
element
attribute and value
an 'empty' child element
content
comment
Transcription: A special kind of reading
What is the goal of transcription?
To make primary sources accessible
...and comprehensible
which may mean adding or using additional material
Hence all transcription is selective, interpretative, and imaginative
Transcription Phenomena
In transcription for a digital edition, some textual phenomena commonly attract editorial attention:
original layout information
abbreviations or other arcana
‘evident’ errors which invite correction or conjecture
scribal additions, deletions, substitutions, restorations
non-standard orthography (etc.) which invites normalisation
irrelevant or non-transcribable material
passages which are damaged or illegible
TEI Transcription Methods
<teiHeader>: provides metadata for the whole thing, at various levels, notably including a <msDesc>
<text>: contains a structured reading of a document's intellectual content ... its ‘text’
<facsimile>: organizes a set of page images representing an individual physical document
<sourceDoc>: a non-interpretative transcription of a physical document, e.g. for a dossier génétique
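To show where the <msDesc> sits when it accompanies a transcription, here is a minimal sketch of a <teiHeader> that carries one inside <sourceDesc>. The title, publication statement, and manuscript identifier are placeholders assumed for illustration.

```xml
<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>Transcription of an example manuscript</title>
    </titleStmt>
    <publicationStmt>
      <p>Unpublished workshop example.</p>
    </publicationStmt>
    <sourceDesc>
      <!-- the manuscript description documents the source of the transcription -->
      <msDesc xml:id="exampleMS">
        <msIdentifier>
          <settlement>Example settlement</settlement>
          <repository>Example repository</repository>
          <idno>MS Example 1</idno>
        </msIdentifier>
      </msDesc>
    </sourceDesc>
  </fileDesc>
</teiHeader>
```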
Should a transcription encode a 'text' or a 'document'?
What's This?
‘agreable’ is struck-through, ‘pleasing’ is written above it, in the interlinear space?
‘agreable’ is deleted and replaced by ‘pleasing’?
Originally, the text read ‘agreable’, but at some subsequent stage this word was deleted; the word ‘pleasing’ was added in the same context?
Your answer decides the encoding!
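The three readings of ‘agreable’ and ‘pleasing’ can be sketched as three different encodings. These fragments are one possible rendering of each reading, not the only one; in particular, using the seq attribute to order the two acts in the third reading is an assumption about how a project might choose to record stages.

```xml
<!-- Reading 1: record only what is seen on the page -->
<del rend="strikethrough">agreable</del><add place="above">pleasing</add>

<!-- Reading 2: interpret the two marks as a single act of substitution -->
<subst><del rend="strikethrough">agreable</del><add place="above">pleasing</add></subst>

<!-- Reading 3: two separate acts at different moments, ordered with seq -->
<del rend="strikethrough" seq="1">agreable</del><add place="above" seq="2">pleasing</add>
```

The characters on the page are the same in all three; what changes is how much editorial interpretation the markup commits to.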
Character Encoding
The <g> element (character or glyph) represents a glyph, or a non-standard character. It can be used to:
distinguish allographical forms of a letter
represent non-standard characters
track scribal variation

Abbreviations, &c.
In Western MSS, we commonly distinguish:
Suspensions: the first letter or letters of the word are written, generally followed by a point: for example ‘e.g.’ for ‘exempli gratia’
Contractions: both first and last letters are written, sometimes with some mark of abbreviation such as superscript strokes, or points: e.g. ‘Mr’ for ‘Mister’
Brevigraphs: special signs such as the Tironian nota used for ‘et’, the letter p with a barred tail used for ‘per’, the letter c with a circumflex used for ‘cum’ etc.
Superscripts: superscript letters (vowels or consonants) used to indicate various kinds of contraction: e.g. ‘w’ followed by superscript ‘ch’ for ‘which’.
How to Read an Abbreviation
An abbreviation may be viewed in two different ways:
as a particular sequence of letters or marks upon the page: thus, a ‘p with a bar through the descender’, a ‘superscript hook’, a ‘macron’
as another way of representing the letter or letters it is believed to be standing for: thus, ‘per’, ‘re’, ‘n’
The <g> element then references metadata in the <teiHeader> which describes and documents the character or glyph.
The TEI proposes two levels of encoding:
the whole of an abbreviated word and the whole of its expansion may be marked using <abbr> and <expan> respectively
abbreviation signs or characters and the ‘invisible’ characters they imply may be marked using <am> and <ex> respectively
Or these two levels can be used in conjunction with each other
Compare ev(er)y (per)sone
Let's say we have the phrase "every persone" written in a late Middle English manuscript, except that the 'er' of 'every' and the 'per' of 'persone' are each written as an abbreviation. How do we encode that?
ev<choice><am><g ref="#abbr-er"/></am><ex>er</ex></choice>y
<choice><am><g ref="#abbr-per"/></am><ex>per</ex></choice>sone

<choice><abbr>ev<am><g ref="#abbr-er"/></am>y</abbr><expan>ev<ex>er</ex>y</expan></choice>
<choice><abbr><am><g ref="#abbr-per"/></am>sone</abbr><expan><ex>per</ex>sone</expan></choice>
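The ref="#abbr-er" and ref="#abbr-per" pointers on <g> need targets in the header. A minimal sketch of such declarations, placed in a <charDecl> inside <encodingDesc>, is given below. The glyph descriptions are assumptions about what these abbreviation marks look like in a given manuscript, and the mapping for 'per' uses the Unicode character U+A751 (p with stroke through descender) as one possible standard equivalent.

```xml
<encodingDesc>
  <charDecl>
    <glyph xml:id="abbr-er">
      <!-- assumed description: adjust to the form actually found in the source -->
      <desc>Superscript hook-shaped abbreviation mark standing for 'er'</desc>
    </glyph>
    <glyph xml:id="abbr-per">
      <desc>Letter p with a bar through the descender, standing for 'per'</desc>
      <mapping type="standard">&#xA751;</mapping>
    </glyph>
  </charDecl>
</encodingDesc>
```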
<ex> and <am>
Using these elements, from the 'transcr' module, a transcriber may indicate the status of the individual letters or signs within both the abbreviation and the expansion.
<ex> (editorial expansion) contains a sequence of letters added by an editor or transcriber when expanding an abbreviation.
<am> (abbreviation marker) contains a sequence of letters or signs present in an abbreviation which are omitted or replaced in the expanded form of the abbreviation.
Handling Abbreviations: An Example
<p>
<lb/>Cours chacune piece <expan>pour</expan>
<lb/><expan>cinquante</expan> soubz <expan>tournois</expan><pc>.</pc>
</p>
Note the silent expansions:
<abbr>po&#xFFFD;</abbr>
po<ex>u</ex>r
<abbr>po<am>&#xFFFD;</am></abbr>
<expan>po<ex>u</ex>r</expan>
po<choice><am>&#xFFFD;</am><ex>ur</ex></choice>
<choice><abbr>po<am>&#xFFFD;</am></abbr><expan>po<ex>u</ex>r</expan></choice>
A glance at <choice>
<choice> (groups alternative editorial encodings)
Abbreviation: <abbr> (abbreviated form), <expan> (expanded form)
Errors: <sic> (apparent error), <corr> (corrected error)
Regularization: <orig> (original form), <reg> (regularized form)
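The error and regularization pairs work exactly like the abbreviation pair. A short sketch, with invented wording and a hypothetical editor identifier #ed:

```xml
<p>
  <!-- an apparent slip in the source, with the editor's correction -->
  <choice><sic>teh</sic><corr resp="#ed">the</corr></choice>
  <!-- an original spelling, with a regularized modern form -->
  <choice><orig>boke</orig><reg>book</reg></choice>
</p>
```

Because both alternatives are kept, a stylesheet can produce either a diplomatic or a normalized reading text from the same file.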
Additions, deletions, substitutions and modifications
Alterations made to the text, whether by the scribe or in some later hand, can be encoded using <add> (addition) or <del> (deletion).
Where the addition and deletion are regarded as a single substitution, they can be grouped together using the <subst> (substitution) element.
<add> (addition) or <del> (deletion) are used for evident alterations in the source
a combined addition and deletion may be marked using <subst> (substitution)
<mod> (modification) represents any kind of general modification without any semantic interpretation
Semi-legible text: <unclear>
Use <unclear> if the text is partly illegible, i.e. it can be read but without perfect confidence. The @reason attribute here states the cause of the uncertainty in transcription.
I <subst><add place="above">might</add><del><unclear reason="overinking" cert="medium" resp="#LDB">should</unclear></del></subst> have
Supplied and damaged text
Use the <supplied> element if the transcriber has provided a reading not actually visible in the text, whether because of damage or scribal error: @reason here indicates why the text has been supplied.
...Dragging the worst among<supplied reason="authorialError">s</supplied>t us...
Use the <damage> element to record the existence of physical damage to the document, whether or not the damaged text is readable:
<l>The Moving Finger wri<damage agent="water" group="1">tes; and</damage> having writ,</l>
<l>Moves <damage agent="water" group="1"><supplied>on: nor all your</supplied></damage> Piety nor Wit</l>
Using attributes to clarify who did what when
The author (WJ) wrote ‘One must have lived... ’
The author added the word ‘But’ before ‘One’
An editor (FB) corrected ‘One’ to ‘one’
<add place="supra" hand="#WJ" cert="medium">But</add> <choice><sic>One</sic><corr resp="#FB" cert="high">one</corr></choice> must have lived ...

<!-- elsewhere in teiHeader-->
<respStmt xml:id="FB">
  <resp>editorial changes</resp>
  <name>Fredson Bowers</name>
</respStmt>
<respStmt xml:id="WJ">
  <resp>authorial changes</resp>
  <name>William James</name>
</respStmt>
Authorial changes of mind
The author writes ‘For I hate this my body’
The word ‘my’ is deleted
The author writes ‘stet’ in the margin
The <restore> element can be used to indicate that a deletion has been reversed:
<l>[...] For I hate this <restore hand="#dhl" type="marginalStetNote"><del>my</del></restore> body [...]</l>
Note: we have not encoded the <metamark> ‘stet’, but rather its effect.
Text omitted from or supplied in the transcription
<gap> indicates a point where material has been omitted in a transcription, whether for editorial reasons described in the TEI header, as part of sampling practice, or because the material is illegible or inaudible.
<supplied> signifies text supplied by the transcriber or editor for any reason, typically because the original cannot be read owing to physical damage or loss.
<div>
  <head>Lectio x.</head>
  <p>Hic itaque paterfamilias ad excolendam <gap extent="20" unit="words" reason="not transcribed" resp="#DC"/> congregare non desistit.</p>
</div>
<p>Oblatus est <supplied reason="omitted" resp="#DC">quia ipse voluit</supplied>.</p>
<damage>, <space>, and <unclear>
Revelabunt caeli iniquitatem Judae et <damage agent="rubbing"/> consurget et <space quantity="7" unit="minims"/> manifestum erit peccatum ipsius in die furoris do<unclear agent="rubbing" resp="#JC">mini</unclear> cum eis qui dixerunt domino deo recede a nobis scientiam viarum tuarum nolumus
Original layout information
The TEI privileges the logical view, but does permit the physical view to ‘show through’ as empty milestone elements:
<gb/>: the start of a new gathering or quire
<pb/>: the start of a new page
<cb/>: the start of a new column
<lb/>: the start of a new written line
These are primarily useful to establish a reference system. The <fw> (forme work) element can be used to mark ‘paratextual’ features such as running heads, foliation etc.
The <handShift/> element can be used to mark changes of hand or writing in a document.
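A minimal sketch of the milestone elements working together on one two-column page follows. The quire and folio numbers, the running head, and the line text are invented placeholders.

```xml
<body>
  <div>
    <gb n="2"/>
    <pb n="12v"/>
    <!-- running head recorded as forme work, not as part of the text -->
    <fw type="header" place="top">Liber primus</fw>
    <cb n="a"/>
    <p>
      <lb n="1"/>first written line of column a
      <lb n="2"/>second written line of column a
    </p>
    <cb n="b"/>
    <p>
      <lb n="1"/>first written line of column b
    </p>
  </div>
</body>
```

Each milestone marks the point where a new physical unit begins, so a reference such as "fol. 12v, column a, line 2" can be resolved without the physical layout dictating the logical structure of divisions and paragraphs.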
Changes of scribe
A special kind of milestone, <handShift/>, can be used to mark the beginning of a sequence of text written in a new hand, or the beginning of a scribal stint.
<l>When wolde the cat dwelle in his ynne</l>
<pb n="f.23v"/>
<handShift medium="greenish-ink" new="#h1"/>
<l>And if the cattes skynne be slyk <handShift medium="black-ink" new="#h2"/> and gaye</l>
<handNotes>
  <handNote xml:id="h1" script="copperplate">Carefully written with regular descenders</handNote>
  <handNote xml:id="h2" medium="pencil">Unschooled scrawl</handNote>
</handNotes>
Main phrase-level editorial elements
A summary list of some of the more important phrase-level transcription elements might include:
Core module: <abbr>, <add>, <choice>, <corr>, <del>, <expan>, <gap>, <orig>, <reg>, <sic>, <unclear>
Transcription module: <am>, <damage>, <ex>, <metamark>, <mod>, <redo>, <restore>, <retrace>, <space>, <subst>, <supplied>, <surplus>, <transpose>, <undo>
Genetic Editing
<sourceDoc> (a sibling of <text>) contains a non-interpretative transcription or other representation of a single source document potentially forming part of a dossier génétique or collection of sources.
An embedded transcription is one in which words and other written traces are encoded as subcomponents of elements representing the physical surfaces carrying them rather than independently of them.
A <sourceDoc>, like <facsimile>, usually contains one or more <surface> elements with <zone> or <line> elements.
<line> is a specialisation of <zone> that contains the transcription of a topographic line in the source document
Some editorial markup (such as <add> and <del>) is available in <line>, but <line> is not meant to be used for interpretative judgements like <persName>
<sourceDoc>
  <surface>
    <zone facs="#postcard-back_zone4" rend="printed">
      <line><choice><abbr><g ref="#RN_logo">RN</g></abbr><expan>Reinthal &amp; Newman</expan></choice></line>
    </zone>
    <zone facs="#postcard-back_zone5" rend="printed">
      <line>THIS SPACE MAY BE USED FOR</line>
      <line>CORRESPONDENCE</line>
    </zone>
    <zone facs="#postcard-back_zone6" rend="printed">
      <line>PRINTED IN AMERICA</line>
    </zone>
  </surface>
</sourceDoc>
<sourceDoc> Example
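The facs attributes in the <sourceDoc> example point at zones that would be defined in a companion <facsimile>. A minimal sketch of such a facsimile is given below. The image file name and all coordinates are hypothetical; only the three zone identifiers are taken from the example.

```xml
<facsimile>
  <!-- coordinate space for the back of the postcard; the values are illustrative -->
  <surface xml:id="postcard-back" ulx="0" uly="0" lrx="1400" lry="900">
    <graphic url="postcard-back.jpg"/>
    <zone xml:id="postcard-back_zone4" ulx="620" uly="40" lrx="780" lry="140"/>
    <zone xml:id="postcard-back_zone5" ulx="60" uly="180" lrx="640" lry="260"/>
    <zone xml:id="postcard-back_zone6" ulx="60" uly="820" lrx="420" lry="870"/>
  </surface>
</facsimile>
```

Each zone is a rectangle on the page image, so software can highlight the part of the image that corresponds to a transcribed line.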
Medieval vs Modern MSS
http://www.tei-c.org/