Transcript of Computational Linguistics
Ability to create new names and words
Certain languages are more "popular" or well-known
Possible to know more than one language (in each category)
Language is based around creating objects
The language tells objects how they work
Used to instruct/command a computer
Language is interpreted differently depending on the computer
Used as communication
Mostly internalized in a universal manner (UG)
What is Computational Linguistics?
Natural Language Processing
Definition: a method of deriving an array of abstract rules that a particular language follows or uses to relate to another language.
Machine translation: a sub-field of computational linguistics that explores the use of software to translate one natural language, either written or spoken, into another.
"Natural Language Processing (NLP) is an area of research and application that explores how computers can be used to understand and manipulate natural language text or speech to do useful things."
Chowdhury, Gobinda G. "Natural Language Processing." Annual Review of Information Science and Technology 37.1 (2003): 51-89. Print.
Think: How is this done?
The microphone takes in sound and converts it into electrical signals
Signal processing converts the signals to binary, which is stored as a sound wave file for the computer
The computer converts the sound wave to phonetics, then matches the phonetics to the actual words
Sound wave - haj gæleksi! - Hi Galaxy!
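The last step above, matching phonetics to words, can be sketched as a dictionary lookup. This is only a toy illustration: the pronunciation table and the function name phonetics_to_words are assumptions made for the example, and a real recognizer would use statistical or neural models rather than exact lookup.

```python
# Toy pronunciation dictionary: phonetic transcription -> written word.
# The two entries are illustrative assumptions, not a real lexicon.
PRONUNCIATIONS = {
    "haj": "Hi",
    "gæleksi": "Galaxy",
}

def phonetics_to_words(transcription: str) -> str:
    """Map each space-separated phonetic token to a written word."""
    words = []
    for token in transcription.split():
        # Unknown tokens are kept in angle brackets rather than guessed.
        words.append(PRONUNCIATIONS.get(token, f"<{token}>"))
    return " ".join(words)

print(phonetics_to_words("haj gæleksi"))  # Hi Galaxy
```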
Difference Between Computer Languages and Natural Languages
Not spoken: no phonology or phonetics
Little to no morphology
- Originally done by hand but now largely automated
- Those who follow these methods believe that the best analysis comes from "real world" contexts (natural data, minimal experimental interference)
- Within the field there are different views about the value of annotating text (this includes structural mark-up, POS tagging, or parsing), though annotation has the advantage of letting linguists work with and analyze corpora released by other linguists.
Semantics and Syntax
Syntax: consistent structure so the compiler can understand
Semantics: Telling the computer what we really want it to do (logic of the program)
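A small Python sketch of the difference: the first snippet fails because its structure cannot be parsed (syntax), while the second parses fine but computes the wrong thing (semantics). The function names are hypothetical and chosen only for this example.

```python
def check_syntax(source: str) -> bool:
    """Return True if Python can parse the source text."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

# Broken syntax: the closing parenthesis is missing.
print(check_syntax("print('hello'"))   # False
# Correct syntax: the parser accepts it.
print(check_syntax("print('hello')"))  # True

def average_buggy(values):
    # Valid syntax, wrong semantics: divides by a fixed 2, not the count.
    return sum(values) / 2

def average(values):
    # Valid syntax and the intended semantics.
    return sum(values) / len(values)

print(average_buggy([3, 3, 3]))  # 4.5
print(average([3, 3, 3]))        # 3.0
```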
The 3A Perspective
Applications of Natural Language Processing
Anything getting computers to use natural language
Annotation: applying a scheme to a text.
(includes previously discussed processes)
Abstraction: mapping of terms in the scheme to terms in a theoretically motivated dataset
(usually linguist-directed, may also contain rule-learning for parsers)
Analysis: statistically probing, generalizing, and manipulating from the data set
(may include optimization of rule-bases or knowledge discovery methods)
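The three stages above (usually named annotation, abstraction, and analysis) can be sketched on a one-sentence corpus. Everything here is a toy assumption: the tag set, the mapping to "content" and "function" classes, and the function names are invented for illustration.

```python
from collections import Counter

# Stage 1: apply a scheme to a text (toy lookup tagger, assumed tag set).
TAGS = {"the": "DET", "dog": "NOUN", "cat": "NOUN", "chased": "VERB"}

def annotate(text):
    return [(word, TAGS.get(word, "UNK")) for word in text.lower().split()]

# Stage 2: map terms in the scheme to terms in a (hypothetical) dataset.
CLASSES = {"DET": "function", "NOUN": "content", "VERB": "content", "UNK": "unknown"}

def abstract(tagged):
    return [CLASSES[tag] for _, tag in tagged]

# Stage 3: statistically probe the resulting data set.
def analyze(classes):
    return Counter(classes)

tagged = annotate("The dog chased the cat")
counts = analyze(abstract(tagged))
print(counts["content"], counts["function"])  # 3 2
```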
Approaches to Machine Translation
The Four Major Approaches
Computational Complexity: a mathematical characterization of the difficulty of a mathematical problem which describes the resources required by a computing machine to solve the problem.
Recursion can sharply increase the time and memory an algorithm needs, for example when the nesting is deep or the same sub-problem is solved repeatedly. For that reason, computer scientists often rewrite recursive procedures in iterative form.
However, recursion is commonplace in human language. This presents a challenge for computers, which must handle recursive structures while keeping the cost of processing them manageable.
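As a rough illustration of recursion in language, the sketch below builds nested noun phrases from a recursive rule and then builds the same phrases with a loop. The rule and both function names are assumptions for the example; real grammars are far larger.

```python
def noun_phrase(depth: int) -> str:
    """Recursive rule: NP -> 'the dog' | 'the dog that saw' NP."""
    if depth == 0:
        return "the dog"
    return "the dog that saw " + noun_phrase(depth - 1)

def noun_phrase_iterative(depth: int) -> str:
    """The same phrases built with a join instead of recursive calls."""
    return " that saw ".join(["the dog"] * (depth + 1))

print(noun_phrase(2))
# the dog that saw the dog that saw the dog
print(noun_phrase(2) == noun_phrase_iterative(2))  # True
```

The recursive version uses one call-stack frame per level of embedding, while the iterative version does not, which is one reason a non-recursive formulation can be cheaper.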
Rule-based MT: the use of grammars and dictionaries that capture the morphological, semantic, and syntactic regularities of the source and target languages within an MT (machine translation) system to guide the translation between the two.
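A minimal sketch of the rule-guided idea, assuming a three-word English-to-Spanish dictionary and a single reordering rule. The lexicon, the rule, and the function name are illustrative assumptions; the toy ignores agreement and raises KeyError on unknown words.

```python
# Toy bilingual dictionary: English word -> (Spanish word, part of speech).
LEXICON = {
    "the": ("la", "DET"),
    "red": ("roja", "ADJ"),
    "house": ("casa", "NOUN"),
}

def translate_rule_based(sentence: str) -> str:
    tokens = [LEXICON[word] for word in sentence.lower().split()]
    output = []
    i = 0
    while i < len(tokens):
        # Syntactic rule: English ADJ NOUN becomes Spanish NOUN ADJ.
        if (i + 1 < len(tokens)
                and tokens[i][1] == "ADJ"
                and tokens[i + 1][1] == "NOUN"):
            output.extend([tokens[i + 1][0], tokens[i][0]])
            i += 2
        else:
            output.append(tokens[i][0])
            i += 1
    return " ".join(output)

print(translate_rule_based("the red house"))  # la casa roja
```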
Statistical MT: an MT system that uses various statistical models to analyze bilingual text corpora and uses them as the basis for translation.
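One very simple statistic that can be drawn from a bilingual corpus is how often a source word and a target word appear in the same sentence pair. The sketch below scores word pairs with a Dice coefficient over a four-pair toy corpus. The corpus, the scoring choice, and the function names are assumptions for illustration, not the models a production system would use.

```python
from collections import Counter
from itertools import product

# Toy bilingual corpus of (English, Spanish) sentence pairs.
CORPUS = [
    ("the house", "la casa"),
    ("the cat", "el gato"),
    ("a house", "una casa"),
    ("a cat", "un gato"),
]

def train(corpus):
    """Count word occurrences and cross-language co-occurrences."""
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in corpus:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        pair_count.update(product(src_words, tgt_words))
    return src_count, tgt_count, pair_count

def best_translation(word, model):
    """Pick the target word with the highest Dice score for `word`."""
    src_count, tgt_count, pair_count = model

    def dice(target):
        return 2 * pair_count[(word, target)] / (src_count[word] + tgt_count[target])

    return max(tgt_count, key=dice)

model = train(CORPUS)
print(best_translation("house", model))  # casa
print(best_translation("cat", model))    # gato
```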
Example-based MT: very similar to statistical MT systems in that it uses bilingual corpora in order to translate. But while statistical MT establishes a model based on the corpora, example-based MT translates by analogy.
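Translation by analogy can be sketched as: find a stored example that differs from the input by one word, then swap in the translation of that word. The stored examples, the word pairs, and the function name below are hypothetical and kept deliberately tiny.

```python
# Stored example translations (English, Spanish).
EXAMPLES = [
    ("i have a book", "tengo un libro"),
    ("she reads a book", "ella lee un libro"),
]
# Known word correspondences used to adapt an example.
WORD_PAIRS = {"book": "libro", "dog": "perro"}

def translate_by_analogy(sentence: str):
    words = sentence.lower().split()
    for source, target in EXAMPLES:
        source_words = source.split()
        if len(source_words) != len(words):
            continue
        diffs = [(a, b) for a, b in zip(source_words, words) if a != b]
        if not diffs:
            return target  # exact match with a stored example
        if len(diffs) == 1:
            old, new = diffs[0]
            if old in WORD_PAIRS and new in WORD_PAIRS:
                # Reuse the example, replacing only the word that differs.
                return target.replace(WORD_PAIRS[old], WORD_PAIRS[new])
    return None  # no sufficiently similar example

print(translate_by_analogy("i have a dog"))  # tengo un perro
```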
Hybrid MT: combines rule-based and statistical MT, attempting to draw on the strengths of both for a more comprehensive translation program.
There are two variations of Hybrid MT.
RPPS, which translates based on rules and then uses statistical analysis to correct any anomalies.
Statistics guided by rules, where the models that output a translation are edited by a body of rules.
Habash, N., Dorr, B., and Traum, D. (2003). Hybrid Natural Language Generation from Lexical Conceptual Structures. Machine Translation 18, 81–127.
Levison, M., and Lessard, G. (1992). A System for Natural Language Sentence Generation. Computers and the Humanities 26, 43–58.