Translation and Corpora

Claudia Koch-McQuillan

on 19 February 2017

Transcript of Translation and Corpora

Corpora

A corpus is a collection of naturally occurring, authentic texts, stored in an electronic database and accessed by computer. Corpora can consist of both written and spoken texts.

"A corpus is a remarkable thing"

Let's have a look!
Corpus of Contemporary American English: http://corpus.byu.edu/coca/
British National Corpus: http://www.natcorp.ox.ac.uk/

Some key concepts

KWIC: key word in context
POS: part of speech
Lemma: the form of a word as presented in a dictionary (e.g. "go")
Token: each individual occurrence of a word <-> Type: each unique word; therefore, the more types relative to tokens, the greater the lexical variety of a text (type-token ratio)
Concordance: list of the occurrences of a word in a corpus, each with its immediate context
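The type-token ratio defined above can be worked through on a short text. The following is a minimal sketch only: the tokenizer (lowercase, letters and apostrophes) and the function names are assumptions made for illustration, not part of any tool named in this presentation.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text):
    """Return the number of types (unique words) divided by the number of tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


sample = "the cat sat on the mat and the dog sat on the rug"
tokens = tokenize(sample)
# 13 tokens, 8 types
print(len(tokens), len(set(tokens)), round(type_token_ratio(sample), 2))
```

The ratio falls as a text gets longer, because frequent words keep recurring, so it is best compared across texts of similar length.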
Concordancer: corpus analysis software

IntelliText: http://smlc09.leeds.ac.uk/itweb/htdocs/Query.html#

Free online concordancing tools:
AntConc (http://www.antlab.sci.waseda.ac.jp/antconc_index.html)
SCP & CLOC (http://www.textworld.com/scp/)
Lextutor/Québec University (http://www.lextutor.ca/concordancers/concord_e.html)

Corpus analysis

Frequency of word occurrence (word lists)
Concordance/co-text (KWIC) (not context), i.e. the lexical environment of a word
Clusters, i.e. frequently associated words
Colligations, i.e. grammatical associations
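
Three of the analyses listed above (word lists, KWIC concordance lines and clusters) need only plain, untagged text. The sketch below shows roughly what a concordancer does for each. It is an illustration under stated assumptions: the tokenizer, the function names, the sample text and the choice to treat clusters as two-word sequences are all hypothetical, and sentence boundaries are ignored.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())


def word_list(tokens, top=5):
    """Frequency list: the most frequent words with their counts."""
    return Counter(tokens).most_common(top)


def kwic(tokens, keyword, width=3):
    """One line per occurrence of keyword, with `width` words of co-text on each side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left:>25} [{tok}] {right}")
    return lines


def clusters(tokens, n=2, top=5):
    """Most frequent sequences of n adjacent words."""
    ngrams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in ngrams).most_common(top)


text = ("The translator checks the corpus. The corpus shows how the word is used. "
        "A parallel corpus shows the word in two languages.")
tokens = tokenize(text)

print(word_list(tokens, 3))
for line in kwic(tokens, "corpus"):
    print(line)
print(clusters(tokens, 2, 3))
```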
-> necessity of tagging corpora for analysis

Applications

Linguistic/grammatical investigations (e.g. collocations, prepositions, phrasal verbs)
Subject and genre knowledge
Pragmatic use of language
Investigation of register (e.g. academic)
Machine translation

Bilingual corpora and searches

Parallel and comparable corpora
Multilingual corpora
Interpreting corpora
Corpora and concordancing in TEnT/CAT tools
Alignment

Time-consuming, self-directed activity
-> identify problem carefully
-> select relevant corpora/texts
-> process and store results appropriately

Applications

"The purpose of a language corpus is to provide language workers with evidence of how language is really used, evidence that can then be used to inform and substantiate individual theories about what words might or should mean. Traditional grammars and dictionaries tell us what a word ought to mean, but only experience can tell us what a word is used to mean." http://www.natcorp.ox.ac.uk/using/index.xml

More online corpora and tools

Australian National Corpus: http://www.ausnc.org.au/corpora/ace
Cambridge English Corpus (subscription only): http://www.cambridge.org/au/elt/
Corpus of American Soap Operas: http://corpus2.byu.edu/soap/
Michigan Corpus of Academic Spoken English: http://micase.elicorpora.info/
Oxford English Corpus (subscription only, but tutorial videos are helpful):
Time Magazine Corpus: http://corpus.byu.edu/time/
Vienna-Oxford International Corpus of English: http://ota.ahds.ac.uk/desc/2542
Oxford Collocations Dictionary: http://5yiso.appspot.com/
KAIST Korean Corpus: http://semanticweb.kaist.ac.kr/home/index.php/KAIST_Corpus
Leipzig Corpora Collection: http://corpora.informatik.uni-leipzig.de/
Google Ngram Viewer: http://books.google.com/ngrams
Lextutor Complete Lexical Tutor: http://www.lextutor.ca/

References

Cutting, J. (2002). Pragmatics and Discourse: A resource book for students (2nd ed.). Abingdon (UK)/New York (USA): Routledge.
Gilmore, A. (2009). Using online corpora to develop students' writing skills. ELT Journal, 63(4), 363-372.
Maher, A., Waller, S. & Kerans, M.E. (2008). Acquiring or enhancing a translation specialism: the monolingual corpus-guided approach. The Journal of Specialised Translation, 10, 56-76.
Olohan, M. (2004). Introducing Corpora in Translation Studies. Abingdon (UK)/New York (USA): Routledge.
Sinclair, J. (2004). Developing Linguistic Corpora: a Guide to Good Practice. http://www.ahds.ac.uk/guides/linguistic-corpora/chapter1.htm
Zanettin, F., Bernardini, S. & Stewart, D. (2003). Corpora in Translator Education. Manchester: St. Jerome Publishing.
Zanettin, F. (2012). Translation-Driven Corpora. Manchester: St. Jerome Publishing.

(Cutting, 2002) (Sinclair, 2004)

Extract terminology
Compare word lists from different texts or corpora -> highlight characteristic lexical choices, e.g. of a particular author or translator
Analyse lexical variety and lexical density
-> Use stop word lists
-> Validate lists statistically

Translator/interpreter training and practice
Investigation of translation universals, e.g. simplification, explicitation, standardisation
Investigation of translation regularities, e.g. in a particular language combination/direction or of particular translators

Challenges
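
The word-list comparison mentioned earlier (compare word lists, use stop word lists, validate statistically) can be sketched as follows. This is an illustration under assumptions, not the method of any particular tool: the stop word set is a tiny made-up sample, the two texts are invented, and the log-likelihood statistic is swapped in here as one common choice for checking whether a frequency difference is likely to be due to chance.

```python
import math
import re
from collections import Counter

# Illustrative stop words only; real stop word lists are much longer.
STOP_WORDS = {"the", "a", "of", "and", "to", "in", "is"}


def content_counts(text):
    """Count word tokens after removing stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


def log_likelihood(a, b, total_a, total_b):
    """Log-likelihood score for a word seen a times in text A and b times in text B."""
    expected_a = total_a * (a + b) / (total_a + total_b)
    expected_b = total_b * (a + b) / (total_a + total_b)
    score = 0.0
    if a:
        score += a * math.log(a / expected_a)
    if b:
        score += b * math.log(b / expected_b)
    return 2 * score


def key_words(text_a, text_b, top=5):
    """Rank words by how strongly their frequency differs between the two texts."""
    counts_a, counts_b = content_counts(text_a), content_counts(text_b)
    total_a, total_b = sum(counts_a.values()), sum(counts_b.values())
    scored = []
    for word in set(counts_a) | set(counts_b):
        a, b = counts_a[word], counts_b[word]
        scored.append((log_likelihood(a, b, total_a, total_b), word, a, b))
    return sorted(scored, reverse=True)[:top]


legal = "The contract is a law contract and the clause of the party contract"
poetry = "The poem is a verse poem and the rhyme of the verse poem"
for score, word, a, b in key_words(legal, poetry, top=4):
    print(f"{word:<10} A={a} B={b} LL={score:.2f}")
```

Scores above 3.84 are conventionally read as significant at p < 0.05, but texts this short are far too small to support real conclusions.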