Inducing Cross-Lingual Semantic Representations of Words, Phrases, Sentences and Events
published: Jan. 11, 2013, recorded: December 2012, views: 6246
Cross-lingual representations of linguistic units (e.g., words or phrases) can facilitate the transfer of annotation from resource-rich to resource-poor languages and have many potential multilingual applications (e.g., machine translation and cross-lingual information retrieval). In this talk, I will discuss our ongoing work, which aims to induce cross-lingual representations relying primarily on monolingual unannotated texts that are readily available for many languages. From the learning standpoint, our approaches maximize the likelihood of monolingual unannotated texts but also use a form of regularization that favors agreement on a smaller collection of parallel data (i.e., sentences along with their translations). I will address the induction of different types of cross-lingual representations (clusters and distributed representations) for different types of units (words, phrases and predicate-argument structures). We show that these models induce linguistically plausible semantic representations and that cross-lingual induction both helps to induce better representations for individual languages and benefits various cross-lingual applications. Specifically, I will consider direct transfer of a classifier for a document classification task from one language to another, and show preliminary results in the context of low-resource machine translation.
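The learning setup described above (monolingual likelihood plus a regularizer that favors agreement on parallel data) can be illustrated with a toy objective. This is only a sketch under stated assumptions, not the models presented in the talk: the squared-difference penalty, the alignment format, and all names (`disagreement`, `joint_objective`, `lam`) are hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def disagreement(
    post_src: Sequence[Sequence[float]],
    post_tgt: Sequence[Sequence[float]],
    alignments: List[Tuple[int, int]],
) -> float:
    """Sum of squared differences between cluster posteriors of aligned words.

    post_src[i] and post_tgt[j] are distributions over the same set of
    cross-lingual clusters; alignments lists (i, j) pairs of aligned words
    from the parallel data. (Assumed penalty form, chosen for simplicity.)
    """
    total = 0.0
    for i, j in alignments:
        total += sum((p - q) ** 2 for p, q in zip(post_src[i], post_tgt[j]))
    return total


def joint_objective(
    loglik_src: float,
    loglik_tgt: float,
    post_src: Sequence[Sequence[float]],
    post_tgt: Sequence[Sequence[float]],
    alignments: List[Tuple[int, int]],
    lam: float = 1.0,
) -> float:
    """Monolingual log-likelihoods minus a weighted agreement penalty."""
    penalty = disagreement(post_src, post_tgt, alignments)
    return loglik_src + loglik_tgt - lam * penalty


# Toy usage: two aligned word pairs, two clusters.
src = [[1.0, 0.0], [0.5, 0.5]]
tgt = [[1.0, 0.0], [0.0, 1.0]]
pairs = [(0, 0), (1, 1)]
score = joint_objective(-10.0, -12.0, src, tgt, pairs, lam=2.0)
```

In this sketch, a larger `lam` pushes the two languages toward assigning aligned words to the same clusters, while the likelihood terms are driven by the much larger monolingual corpora.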
Download slides: nipsworkshops2012_titov_semantic_representations_01.pdf (6.1 MB)