
You also might be interested in

Postgraduate course CALM: Computer-Assisted Language Mediation (UGent)

  • More information: Thursday 18 February, 12:50-13:20 pm (faculteit Economie en Bedrijfskunde, Sint-Pietersplein 7, lokaal 0.013)
  • http://www.flw.ugent.be/najemaster

Essential links

Structure of the course

Week 1: general introduction + R + use case 1

Week 2: machine learning + use case 2

Week 3-6: Python tutorial

Week 7-8: in-class preparation of one of the two case studies

Week 9: presentation of the results

Specific aims

  • Acquiring basic terminology in the field of DH in general, and in distributional semantics and/or sentiment analysis in particular.
  • Acquiring hands-on experience in a selected set of computational tools.
  • Being able to critically reflect on strengths and weaknesses of these tools.
  • Being able to conduct a small-scale case study using these tools, and present the results in a paper and presentation.

Main aim in this course

  • Learning how to retrieve, analyze and visualize linguistic patterns
  • for research purposes in Translation Studies (distributional semantics)
  • for end-user purposes in organizations (sentiment analysis)

Course evaluation

As a consequence

Conduct one of the two use cases and present it in a(n):

  • Oral presentation (deadline: week 9)
  • Written assignment (deadline: to be determined)

Online collection of DH-definitions: http://whatisdigitalhumanities.com/

An online guide to DH: http://sites.library.northwestern.edu/dh/

Online companion to DH: http://digitalhumanities.org:3030/companion/view?docId=blackwell/9781405103213/9781405103213.xml

Structured collection of Digital Research Tools: http://dirtdirectory.org/

  • DH is a very broad, dynamic field: it attracts many scholars from different backgrounds.
  • DH is interdisciplinary in nature: it uses computational tools from other scientific fields.
  • DH is empirical in nature: it uses observable data as point of departure.

Added value of introducing digital methods to humanities?

  • Computers are faster, more reliable and better at some of the tasks that are inextricably linked to doing research in the humanities, such as:
  • Finding linguistic phenomena (characters, words, constructions) in large collections of texts (= corpora).
  • Adding linguistically relevant information in corpora (POS tagging, lemmatization...).
  • Counting instances and visualizing patterns in the data
  • Nevertheless, humans outperform computers in asking the relevant questions, selecting the appropriate tools and interpreting the visualized output.
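The division of labour described above can be illustrated with a minimal Python sketch. It assumes a tiny corpus of three sentences invented purely for illustration, and a hypothetical helper `count_forms`; only the standard library is used. The computer finds and counts the word forms, while choosing which forms to compare and interpreting the counts remains the researcher's task.

```python
import re
from collections import Counter

# Toy corpus, invented for illustration only (not real research data).
corpus = [
    "The candle burned all night.",
    "She burnt the toast and burned her hand.",
    "The toast was burnt.",
]

def count_forms(texts, forms):
    """Count how often each word form in `forms` occurs in `texts`."""
    wanted = {form.lower() for form in forms}
    counts = Counter()
    for text in texts:
        # Lowercase the text and split it into alphabetic tokens.
        for token in re.findall(r"[a-z]+", text.lower()):
            if token in wanted:
                counts[token] += 1
    return counts

counts = count_forms(corpus, ["burned", "burnt"])
print(counts)
```

On a real corpus the tokenization step would need more care (hyphens, apostrophes, non-English characters), which is one of the choices a human still has to make.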

A more specific definition

Key references

Berry, D. M. (2012). Understanding Digital Humanities. Palgrave Macmillan.

Burdick, A., J. Drucker, P. Lunenfeld, T. Presner & J. Schnapp (2012). Digital_Humanities. MIT Press.

Jones, S. E. (2013). The Emergence of the Digital Humanities. Routledge.

Digital Humanities is a field of research:

  • which extensively uses digital methods like presentation software, database storage, programming languages, text analysis tools, dynamic visualizations...
  • ... in order to gather, organize, analyze, teach and present scholarly research in the humanities (e.g. literature, linguistics, philosophy and history) ...

  • ... with the ultimate goal of asking and answering new questions and viewing old questions differently.

A definition

A Gentle Introduction to Digital Humanities

What sits at the intersection of computational methods and the traditional pursuits of the humanities.

Overview of the Google corpus compilation project

Let's get a feel for some really cool research in DH

Evolution of grammar

http://whatisdigitalhumanities.com/

Track the regularization of irregular verbs in US and GB:

  • Chided vs. chid: chided:eng_gb_2012,chid:eng_gb_2012,chided:eng_us_2012,chid:eng_us_2012
  • Burnt vs. burned: burnt:eng_gb_2012,burned:eng_gb_2012,burnt:eng_us_2012,burned:eng_us_2012
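The query strings above follow a simple pattern: comma-separated items, each pairing a word form with a corpus tag after a colon. As a minimal sketch of how such a string breaks down, the helper below (a hypothetical name, `parse_ngram_query`, not part of any Google tool) splits a query into (word, corpus) pairs.

```python
def parse_ngram_query(query):
    """Split 'word:corpus,word:corpus,...' into (word, corpus) pairs."""
    pairs = []
    for item in query.split(","):
        # partition() splits on the first colon only.
        word, _, corpus = item.strip().partition(":")
        pairs.append((word, corpus))
    return pairs

# Query string taken from the slide above.
query = "burnt:eng_gb_2012,burned:eng_gb_2012,burnt:eng_us_2012,burned:eng_us_2012"
pairs = parse_ngram_query(query)
for word, corpus in pairs:
    print(f"{word:<8} {corpus}")
```

Reading the pairs this way makes the design of the comparison explicit: two competing forms, each looked up in the British and in the American corpus.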

Collective memory

Some really interesting digital tools (we won't be presenting)

By Gert De Sutter

  • You can easily find out when (and for how long) a given artist, politician, writer became popular.
  • Elvis Presley, Jimi Hendrix, Procol Harum, Marvin Gaye
  • Lenin, Stalin, Brezhnev, Andropov, Gorbachev, Yeltsin
  • You can even do that for 'years', as a quantification of interest in the present and societal forgetting (e.g., 1900, 1950, 1980)
  • Culturomics: the quantitative study of human culture as reflected in (diachronic) language use.
  • Building on extreme corpora consisting of digitized books (< Google Books project),
  • Scanned, OCR'ed and provided with metadata.
  • 5 M books (~4% of the complete population), amounting to 500 Bn. tokens, in different languages (not Dutch, though :-( ), from 1500-2000.
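The figures above imply two further numbers that are easy to work out. The sketch below uses only the values given on the slide (5 M books, roughly 4%, 500 Bn. tokens); the variable names are illustrative, and the results are rough estimates because the 4% share is itself approximate.

```python
# Values taken from the slide; the 4% share is approximate.
books_scanned = 5_000_000
share_percent = 4
tokens = 500_000_000_000

# If 5 M books are about 4% of all books, the full population is about:
estimated_total_books = books_scanned * 100 // share_percent

# Average number of tokens per scanned book:
avg_tokens_per_book = tokens // books_scanned

print(f"Estimated books ever published: {estimated_total_books:,}")
print(f"Average tokens per book: {avg_tokens_per_book:,}")
```

Under these assumptions the corpus samples from a population of roughly 125 million books, at an average of about 100,000 tokens per book.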

Michel et al. (2011), "Quantitative Analysis of Culture Using Millions of Digitized Books", Science 331. [DOI: 10.1126/science.1199644]

History of censorship

  • You can easily find out which artist, politician, writer was censored during a given period in a given country.
  • Compare Picasso and Chagall in the English and German parts of the Google corpus (Picasso:eng_2012, Picasso:ger_2012)

Some really interesting digital tools we will be presenting

  • Presentation software: Prezi, LaTeX
  • Web crawling: Site Crawler
  • Text analysis: Serendip
  • Mapping software: Neatline
  • Data visualization software: Cytoscape

  • And many many more:
  • http://dirtdirectory.org/
  • http://sites.library.northwestern.edu/dh/tools-resources/

Historical studies

  • Historical epidemiology: e.g., influenza (pandemic).
  • History of warfare: e.g., Taliban.
  • History of nations: e.g., Belgium.
  • History of food: e.g., hamburger, pizza, steak, pasta, sushi.
  • History of science: e.g., Darwin, Galileo, Freud, Einstein.
  • Advanced text editing: Sublime Text, Notepad++
  • Scripting languages: Python
  • Machine learning: Weka
  • Statistical analysis and data visualization: R

Foci in culturomics

  • Cultural change: which concepts are gaining attention at which point in time (e.g., slavery)?
  • Application: history of {diseases, science, food, war}, detection of censorship, prominence in collective memory.

  • Linguistic change: what is the evolution of the words used for those concepts (e.g. the Great War vs. World War I)?
  • Application: evolution of grammar, lexicography

Just do it yourself!

  • http://www.culturomics.org
  • http://ngrams.googlelabs.com