Inducing Cross-Lingual Semantic Representations of Words, Phrases, Sentences and Events

Jan 11, 2013, 6301 views

Cross-lingual representations of linguistic units (e.g., words or phrases) can facilitate the transfer of annotation from resource-rich to resource-poor languages and have many potential multilingual applications (e.g., machine translation and cross-lingual information retrieval). In this talk, I will discuss our ongoing work, which aims to induce cross-lingual representations relying primarily on the monolingual unannotated texts that are readily available for many languages. From the learning standpoint, our approaches maximize the likelihood of monolingual unannotated texts but also use a form of regularization that favors agreement on a smaller collection of parallel data (i.e., sentences along with their translations). I will address the induction of different types of cross-lingual representations (clusters and distributed representations) for different types of units (words, phrases, and predicate-argument structures). We show that these models induce linguistically plausible semantic representations, and that cross-lingual induction both helps to induce better representations for individual languages and benefits various cross-lingual applications. Specifically, I will consider direct transfer of a classifier for a document classification task from one language to another, and show preliminary results in the context of low-resource machine translation.
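
The learning setup described in the abstract (monolingual likelihood plus a regularizer that favors agreement on parallel data) can be written schematically as a single joint objective. The formula below is only a sketch under assumed notation, not the formulation presented in the talk: X_1 and X_2 stand for the monolingual corpora of two languages, \theta_1 and \theta_2 for the parameters of the two monolingual models, P for the smaller set of parallel sentence pairs, r_\theta for the representation a model assigns to a sentence, \Delta for some measure of disagreement between two representations, and \lambda \ge 0 for the regularization weight. All of these symbols are hypothetical.

```latex
% Assumed notation (not from the talk): X_l = monolingual corpus of language l,
% P = parallel sentence pairs, r = induced representation, \Delta = disagreement.
\[
  \max_{\theta_1,\,\theta_2}\;
  \sum_{\ell \in \{1,2\}} \log p\!\left(X_\ell ;\, \theta_\ell\right)
  \;-\;
  \lambda \sum_{(s,\,t) \in P}
  \Delta\!\left( r_{\theta_1}(s),\; r_{\theta_2}(t) \right)
\]
```

In this sketch, setting \lambda = 0 makes the two languages independent of each other, while larger values of \lambda push the representations of a sentence and its translation toward agreement.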

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International license.