Computational Linguistics

by Thang Tran on 24 August 2013

Transcript of Computational Linguistics

Rules that have been agreed upon
Recursion
Multiple Languages
Dialects/"style"
Standards
Ability to create new names and words
Certain languages are more "popular" or well-known
Possible to know more than one language (in each category)
Language is based around creating objects
The language tells objects how they work
Used to instruct/command a computer
Bits/gates/operators
Language is interpreted differently depending on the computer
Phonetics
Phonology
Pragmatics
Native Speakers
Used as communication
Phonemic Inventories
Mostly internalized in a universal manner (UG)
Computer Language vs. Human Language
Computational Linguistics
What is Computational Linguistics?
Natural Language Processing
Corpus Linguistics
Definition: a method of deriving an array of abstract rules that a particular language follows or uses to relate to another language.
Machine translation: a sub-field of computational linguistics that explores the use of software to translate one natural language, either written or spoken, into another.
"Natural Language Processing (NLP) is an area of research and application that explores how computers can be used to understand and manipulate natural language text or speech to do useful things."
Chowdhury, Gobinda G. "Natural Language Processing." Annual Review of Information Science and Technology 37.1 (2003): 51-89. Print.
Think: How is this done?
Demonstration!
The microphone takes in sound and converts it into electrical signals
Signal processing converts the signals to binary and then to a sound wave file for the computer
The computer converts the sound to phonetics and then matches the phonetics to the actual words
Sound wave - haj gæleksi! - Hi Galaxy!
Electrical Engineering
Signal Processing
Linguistics
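The demonstration steps above can be sketched in a few lines. This is only a toy illustration, not a real speech recognizer: a sine wave stands in for the microphone's electrical signal, and the `LEXICON` table and function names are hypothetical.

```python
import math

def sample_and_quantize(freq_hz, sample_rate, n_samples, bits=8):
    """Sample a sine wave (a stand-in for the electrical signal) and
    quantize each sample to an unsigned integer of the given bit depth."""
    levels = 2 ** bits - 1
    samples = []
    for n in range(n_samples):
        amplitude = math.sin(2 * math.pi * freq_hz * n / sample_rate)  # in [-1, 1]
        samples.append(round((amplitude + 1) / 2 * levels))  # in [0, levels]
    return samples

# Hypothetical pronunciation lexicon: phonetic form -> written word.
LEXICON = {"haj": "Hi", "gæleksi": "Galaxy"}

def phonetics_to_words(transcription):
    """Match each space-separated phonetic form to a word.
    Unknown forms are returned in angle brackets."""
    return [LEXICON.get(form, f"<{form}>") for form in transcription.split()]

# Signal processing stage: analog-like values become binary numbers.
digital = sample_and_quantize(freq_hz=440, sample_rate=8000, n_samples=4)
binary = [format(s, "08b") for s in digital]

# Linguistics stage: phonetic forms are matched to actual words.
words = phonetics_to_words("haj gæleksi")
```

A real system would derive the phonetic forms from the digitized audio; here that step is skipped and the transcription is supplied by hand.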
Difference Between Computer Languages and Natural Languages
Not spoken: no phonology or phonetics
Little to no morphology
- Originally done by hand but now largely automated

- Those who follow these methods believe that the best analysis comes from "real world" contexts (natural data, minimal experimental interference)

- Within the field there are different views about the value of annotating text (this includes structural mark-up, POS tagging, and parsing), though annotations have the advantage of allowing linguists to work with and analyze corpora released by other linguists.


Semantics and Syntax
Syntax: consistent structure so the compiler can understand
Semantics: Telling the computer what we really want it to do (logic of the program)
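A small sketch of the distinction, using Python as the computer language. The helper `is_valid_syntax` and the two averaging functions are made up for illustration: the first checks structure only, while the second pair shows code that is structurally fine but differs in what it tells the computer to do.

```python
def is_valid_syntax(source):
    """Return True if Python's compiler accepts the structure of the source."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

# Syntax: the compiler can only understand a consistent structure.
good_syntax = is_valid_syntax("total = 2 + 3")
bad_syntax = is_valid_syntax("total = 2 +")

def average_wrong(values):
    # Valid syntax, but the logic is not what we want:
    # it always divides by 2 instead of by the number of values.
    return sum(values) / 2

def average(values):
    # Intended semantics: divide by the number of values.
    return sum(values) / len(values)
```

Both averaging functions compile without complaint; only the second one carries the meaning the programmer intended.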
Applications of Natural Language Processing
Anything getting computers to use natural language
The 3A Perspective
Annotation: Applying a scheme to a text (includes previously discussed processes)
Abstraction: Mapping of terms in the scheme to terms in a theoretically motivated dataset (usually linguist-directed, may also contain rule-learning for parsers)
Analysis: Statistically probing, generalizing, and manipulating from the data set (may include optimization of rule-bases or knowledge discovery methods)
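The three stages can be illustrated with a toy pipeline. This is a sketch under strong assumptions: the tagging scheme `TAGS` and the category mapping `CATEGORIES` are invented here, and real corpus work uses far richer schemes and statistics.

```python
from collections import Counter

# Hypothetical tagging scheme: word -> part-of-speech tag.
TAGS = {"the": "DET", "a": "DET", "dog": "NOUN", "cat": "NOUN",
        "chased": "VERB", "saw": "VERB"}

# Hypothetical abstraction: fine-grained tags -> broader categories.
CATEGORIES = {"DET": "function", "NOUN": "content", "VERB": "content"}

def annotate(text):
    """Annotation: apply the tagging scheme to each token of the text."""
    return [(tok, TAGS.get(tok, "UNK")) for tok in text.lower().split()]

def abstract(annotated):
    """Abstraction: map each tag to a term in the broader category set."""
    return [(tok, CATEGORIES.get(tag, "unknown")) for tok, tag in annotated]

def analyse(abstracted):
    """Analysis: count how often each category occurs in the data set."""
    return Counter(cat for _, cat in abstracted)

annotated = annotate("The dog chased a cat")
counts = analyse(abstract(annotated))
```

Here `counts` records three content words and two function words for the sample sentence.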
Approaches to Machine Translation
The Four Major Approaches
Rule-based
Statistical
Example-based
Hybrid MT
Computational Complexity of Natural Language
Computational Complexity: a mathematical characterization of the difficulty of a mathematical problem which describes the resources required by a computing machine to solve the problem.
Recursion can make algorithms considerably more complex and more expensive to run, so programmers often avoid it where a simpler alternative exists.

However, recursion is commonplace in human language. This presents a challenge for computers, which would ideally process language with non-recursive algorithms wherever possible.
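As a rough illustration, the sketch below measures how deeply clauses are embedded in a sentence, once with a recursive function and once with a non-recursive loop. Square brackets marking the embedded clauses are an assumption made for this example, not a standard notation.

```python
def depth_recursive(tokens, i=0):
    """Return (maximum nesting depth, index just past the matching ']').
    Each '[' triggers a recursive call for the embedded clause."""
    deepest = 0
    while i < len(tokens):
        if tokens[i] == "[":
            inner, i = depth_recursive(tokens, i + 1)
            deepest = max(deepest, inner + 1)
        elif tokens[i] == "]":
            return deepest, i + 1
        else:
            i += 1
    return deepest, i

def depth_iterative(tokens):
    """Same result without recursion: keep a running count of open clauses."""
    current = deepest = 0
    for tok in tokens:
        if tok == "[":
            current += 1
            deepest = max(deepest, current)
        elif tok == "]":
            current -= 1
    return deepest

sentence = "the cat [ that the dog [ that the boy owns ] chased ] ran"
tokens = sentence.split()
recursive_depth, _ = depth_recursive(tokens)
iterative_depth = depth_iterative(tokens)
```

Both functions report the same depth for the sample sentence; the iterative version simply trades the call stack for a counter.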
Rule-based MT: The use of grammars and dictionaries incorporating the morphological, semantic and syntactic regularities of the source and target languages within an MT (machine translation) system to guide the translation between the two.
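A minimal sketch of the rule-based idea, assuming a three-word English-to-Spanish dictionary and a single reordering rule. The `LEXICON` entries and the function name are hypothetical, and the dictionary hard-codes feminine forms to keep the example small.

```python
# Hypothetical bilingual dictionary: source word -> (target word, part of speech).
LEXICON = {
    "the": ("la", "DET"),
    "red": ("roja", "ADJ"),
    "house": ("casa", "NOUN"),
}

def translate_rule_based(sentence):
    """Look each word up in the dictionary, then apply one syntactic rule:
    an adjective followed by a noun is output as noun followed by adjective.
    Words missing from LEXICON raise a KeyError in this sketch."""
    tagged = [LEXICON[w] for w in sentence.lower().split()]
    out = []
    i = 0
    while i < len(tagged):
        is_adj_noun = (i + 1 < len(tagged)
                       and tagged[i][1] == "ADJ"
                       and tagged[i + 1][1] == "NOUN")
        if is_adj_noun:
            out += [tagged[i + 1][0], tagged[i][0]]  # swap the pair
            i += 2
        else:
            out.append(tagged[i][0])
            i += 1
    return " ".join(out)

translation = translate_rule_based("the red house")
```

The dictionary supplies the words and the rule supplies the word order, which is the division of labour the definition describes.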
Statistical MT: An MT system which uses various statistical models to analyze bilingual text corpora and uses them as the basis for translation.
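A toy version of the statistical idea: estimate how often each target word is paired with a source word and pick the most frequent one. The aligned word pairs below are invented, and the sketch assumes the word alignment is already known, which real systems have to learn from the corpus.

```python
from collections import Counter, defaultdict

# Hypothetical word-aligned pairs drawn from a tiny bilingual corpus.
ALIGNED_PAIRS = [
    ("house", "casa"), ("house", "casa"), ("house", "hogar"),
    ("red", "roja"), ("red", "roja"), ("red", "rojo"),
]

def train(pairs):
    """Estimate P(target | source) by relative frequency."""
    counts = defaultdict(Counter)
    for src, tgt in pairs:
        counts[src][tgt] += 1
    return {
        src: {tgt: c / sum(tgt_counts.values()) for tgt, c in tgt_counts.items()}
        for src, tgt_counts in counts.items()
    }

def translate_word(model, word):
    """Choose the target word with the highest estimated probability."""
    options = model[word]
    return max(options, key=options.get)

model = train(ALIGNED_PAIRS)
best_for_house = translate_word(model, "house")
```

With this data the model prefers "casa" for "house" because it occurs in two of the three aligned pairs.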
Example-based MT: Very similar to Statistical MT systems in that it uses bilingual corpora in order to translate. But while statistical MT establishes a model based on the corpora, Example-based MT translates by analogy.
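Translation by analogy can be sketched as: find the stored example closest to the input, then swap in the word that differs. The examples, the `WORD_PAIRS` table and the similarity measure (word overlap) are all assumptions made for this illustration.

```python
# Hypothetical stored examples: (source sentence, target sentence).
EXAMPLES = [
    ("how much is the red umbrella", "cuánto cuesta el paraguas rojo"),
    ("where is the station", "dónde está la estación"),
]

# Hypothetical word correspondences used for substitution.
WORD_PAIRS = {"umbrella": "paraguas", "hat": "sombrero"}

def similarity(a, b):
    """Jaccard overlap between two lists of words."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)

def translate_by_analogy(sentence):
    """Retrieve the most similar example and substitute differing words."""
    words = sentence.lower().split()
    src, tgt = max(EXAMPLES, key=lambda ex: similarity(words, ex[0].split()))
    src_words = src.split()
    result = tgt
    if len(words) == len(src_words):
        for new, old in zip(words, src_words):
            if new != old and old in WORD_PAIRS and new in WORD_PAIRS:
                result = result.replace(WORD_PAIRS[old], WORD_PAIRS[new])
    return result

translation = translate_by_analogy("how much is the red hat")
```

No model is trained here; the input is translated purely by reusing and adapting the nearest stored example.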
Hybrid MT: Combines rule-based and statistical MT, attempting to draw on the strengths of both for a more comprehensive translation program.
There are two variations of Hybrid MT:
RPPS, which translates based on rules and then uses statistical analysis to correct any anomalies.
Statistics guided by rules, where the models that output a translation are edited by a body of rules.
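The first variation (rules first, statistics afterwards) might look roughly like the sketch below. Everything here is an assumption for illustration: the rule lexicon deliberately outputs context-free forms, and bigram counts from a three-line target-language corpus are used to correct them.

```python
from collections import Counter

# Hypothetical rule lexicon: source word -> (default target word, part of speech).
RULE_LEXICON = {"the": ("el", "DET"), "red": ("rojo", "ADJ"), "house": ("casa", "NOUN")}
# Hypothetical alternative forms the statistical stage may choose between.
VARIANTS = {"el": ["el", "la"], "rojo": ["rojo", "roja"]}
# Hypothetical target-language corpus used for the statistics.
TARGET_CORPUS = ["la casa roja", "la casa grande", "el coche rojo"]

def rule_stage(sentence):
    """Dictionary lookup plus one rule: adjective + noun becomes noun + adjective."""
    tagged = [RULE_LEXICON[w] for w in sentence.lower().split()]
    i = 0
    while i + 1 < len(tagged):
        if tagged[i][1] == "ADJ" and tagged[i + 1][1] == "NOUN":
            tagged[i], tagged[i + 1] = tagged[i + 1], tagged[i]
            i += 2
        else:
            i += 1
    return [word for word, _ in tagged]

def bigram_counts(corpus):
    """Count adjacent word pairs in the target-language corpus."""
    counts = Counter()
    for line in corpus:
        toks = line.split()
        counts.update(zip(toks, toks[1:]))
    return counts

def statistical_stage(tokens, counts):
    """Replace each word by the variant seen most often next to its neighbours."""
    out = list(tokens)
    for i, tok in enumerate(out):
        def score(cand):
            left = counts[(out[i - 1], cand)] if i > 0 else 0
            right = counts[(cand, out[i + 1])] if i + 1 < len(out) else 0
            return left + right
        out[i] = max(VARIANTS.get(tok, [tok]), key=score)
    return out

rule_output = rule_stage("the red house")
corrected = statistical_stage(rule_output, bigram_counts(TARGET_CORPUS))
```

The rule stage alone produces an ungrammatical draft, and the statistical stage repairs the agreement errors using evidence from the corpus.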
Bibliography
Habash, N., Dorr, B., and Traum, D. (2003). Hybrid Natural Language Generation from Lexical Conceptual Structures. Machine Translation 18, 81–127.

Levison, M., and Lessard, G. (1992). A System for Natural Language Sentence Generation. Computers and the Humanities 26, 43–58.
