Improving the Usability of Existing LRs & Speeding up the Cr

by

Bartholomäus Wloka

on 20 May 2016

Transcript of Improving the Usability of Existing LRs & Speeding up the Cr

What kind of resources are needed?
depends on the kind of MT
depends on the domain
depends on the final application
How do we get these resources?
identification
collection
categorization
evaluation
Identification
Collection
crawling
cleaning of data
alignment at sentence level
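
The cleaning and sentence-alignment steps can be illustrated with a minimal Python sketch. It is not the pipeline used in the talk: the function names `clean_line` and `align_one_to_one` and the `max_ratio` threshold are assumptions made for illustration, and the alignment is a naive position-based pairing with a length-ratio filter rather than a full alignment algorithm.

```python
import re
import unicodedata


def clean_line(text: str) -> str:
    """Normalize Unicode, strip HTML-like tags, and collapse whitespace."""
    text = unicodedata.normalize("NFC", text)
    text = re.sub(r"<[^>]+>", " ", text)      # drop leftover markup from crawling
    return re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace


def align_one_to_one(src_sents, tgt_sents, max_ratio=2.0):
    """Pair sentences by position; drop pairs whose length ratio is implausible.

    Assumes both lists are already sentence-split and in the same order.
    max_ratio is a hypothetical threshold, not a value from the talk.
    """
    pairs = []
    for s, t in zip(src_sents, tgt_sents):
        s, t = clean_line(s), clean_line(t)
        if not s or not t:
            continue  # skip pairs where one side is empty after cleaning
        ratio = max(len(s), len(t)) / min(len(s), len(t))
        if ratio <= max_ratio:
            pairs.append((s, t))
    return pairs
```

Real crawled data usually needs an aligner that handles 1:2 and 2:1 sentence matches; the length-ratio filter here only removes the most obvious mismatches.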
Data
Machine Translation
Improving the Usability of Existing LRs & Speeding up the Creation of New LRs
Bartholomäus Wloka
Center for Translation Studies, University of Vienna

Many variables
& many definitions.

Let's define a common ground!
Translation
Localization?
Evaluation?
Quality?
Quantity?
Open Data
high availability
metadata
technical description
example description
affordable
Categorization
domain
topic modeling
terminology
lexical resource
Evaluation
BLEU score
human expert (sample)
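
To make the BLEU score concrete, here is a minimal sketch of a single-reference, sentence-level variant without smoothing. It assumes whitespace tokenization and is meant only to show the idea (modified n-gram precision combined with a brevity penalty); the function names are illustrative, and a real evaluation would normally use an established toolkit on a whole test set.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Unsmoothed sentence-level BLEU against one reference translation."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU 0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: punish candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)
```

Because a single missing 4-gram drives the unsmoothed score to zero on short sentences, BLEU is more informative at corpus level, which is one reason to pair it with a human expert check on a sample.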
How do we improve the usability of existing resources?
improving visibility
metadata
search techniques
Visibility
Language Resource Catalogue
reviewed by experts
pre-selected
basic set of information/metadata
Metadata
set of fields tailored to LR needs
title
type (corpus, terminology, lexicon, ...)
creator
languages
availability (research/commercial, price)
URL
domain
format (plain text, XML, ...)
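
The field list above could be captured as a simple record type. The following Python sketch is one possible encoding, not the catalogue's actual schema: the class name, attribute names, the split of availability into a status plus an optional price, and the example values are all assumptions for illustration.

```python
from dataclasses import dataclass, fields
from typing import Optional, Tuple


@dataclass(frozen=True)
class LanguageResource:
    """One catalogue entry; attribute names are hypothetical."""
    title: str
    resource_type: str            # e.g. "corpus", "terminology", "lexicon"
    creator: str
    languages: Tuple[str, ...]    # e.g. ("de", "en")
    availability: str             # e.g. "research" or "commercial"
    url: str
    domain: str
    data_format: str              # e.g. "plain text", "XML"
    price: Optional[float] = None  # currency is not specified in the slides


def missing_fields(record: LanguageResource):
    """Return the names of mandatory fields that are empty (price is optional)."""
    return [f.name for f in fields(record)
            if f.name != "price" and not getattr(record, f.name)]


# Entirely made-up example entry.
example = LanguageResource(
    title="Example Parallel Corpus",
    resource_type="corpus",
    creator="Example Lab",
    languages=("de", "en"),
    availability="research",
    url="https://example.org/corpus",
    domain="legal",
    data_format="XML",
)
```

A check such as `missing_fields` is one way to enforce the "basic set of information/metadata" before an entry is accepted into a reviewed catalogue.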
Search Techniques
metadata as basis
flexible, depending on the need of the community
~100 resources
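
With metadata as the basis, search over a catalogue of this size can be as simple as filtering records by field values. The sketch below assumes records are plain dictionaries keyed by the metadata fields; the `search` and `matches` helpers and the sample entries are hypothetical and only illustrate the idea.

```python
def matches(record, criteria):
    """True if the record satisfies every field=value criterion.

    List-like fields (e.g. languages) match by exact membership;
    other fields match by case-insensitive string equality.
    """
    for field, wanted in criteria.items():
        value = record.get(field)
        if isinstance(value, (list, tuple, set)):
            if wanted not in value:
                return False
        elif value is None or str(value).lower() != str(wanted).lower():
            return False
    return True


def search(catalogue, **criteria):
    """Return all records in the catalogue that match the given criteria."""
    return [r for r in catalogue if matches(r, criteria)]


# Made-up sample entries, for illustration only.
CATALOGUE = [
    {"title": "Example Legal Corpus", "type": "corpus",
     "languages": ["de", "en"], "domain": "legal", "format": "XML"},
    {"title": "Example Medical Termbase", "type": "terminology",
     "languages": ["de", "fr"], "domain": "medical", "format": "plain text"},
]

german_corpora = search(CATALOGUE, languages="de", type="corpus")
```

For roughly 100 resources a linear scan like this is sufficient; new criteria can be added without changing the code, which keeps the search flexible as community needs change.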