Learning to Translate: statistical and computational analysis

author: Marco Turchi, Department of Engineering Mathematics, University of Bristol
published: July 1, 2009,   recorded: May 2009,   views: 2876

Description

This talk presents an extensive experimental study of the learning capabilities of Moses, a statistical machine translation system. High-performance computing is used to obtain very accurate learning curves, and the projected performance of the system is extrapolated under different conditions. Our experiments suggest:

1. The representation power of the system is not currently a limitation to its performance,
2. The inference of its models from finite sets of i.i.d. data is responsible for current performance limitations,
3. It is unlikely that increasing dataset sizes will result in significant improvements (at least in the traditional i.i.d. setting),
4. It is unlikely that novel statistical estimation methods will result in significant improvements.
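
To make the idea of extrapolating a learning curve concrete, here is a minimal sketch. It assumes, purely for illustration, that translation quality grows logarithmically with training set size; this abstract does not state the functional form used in the talk. The data points are synthetic, and the names `fit_log_curve` and `predict` are hypothetical helpers, not part of Moses.

```python
import math


def fit_log_curve(sizes, scores):
    """Ordinary least-squares fit of score = a + b * ln(size)."""
    xs = [math.log(s) for s in sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict(a, b, size):
    """Extrapolated score at a given training set size."""
    return a + b * math.log(size)


# Synthetic points generated from an assumed log law; NOT results from the talk.
sizes = [10_000, 20_000, 40_000, 80_000, 160_000]
scores = [0.05 + 0.02 * math.log(s) for s in sizes]

a, b = fit_log_curve(sizes, scores)

# Under a log law, every doubling of the data adds the same constant b * ln(2).
gain_per_doubling = predict(a, b, 2 * sizes[-1]) - predict(a, b, sizes[-1])
print(f"a={a:.4f} b={b:.4f} gain per doubling={gain_per_doubling:.4f}")
```

Under this assumed form, each doubling of the training data buys only a fixed increment of score, which is one way to read the claim that larger datasets alone are unlikely to bring significant improvements.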

The current performance wall is mostly a consequence of Zipf's law, and this should be taken into account when designing a statistical machine translation system. A few possible research directions are discussed as a result of this investigation, most notably the integration of linguistic rules into the model inference phase, and the development of active learning procedures.
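
The role of Zipf's law can be illustrated with a small sketch. It assumes an idealised Zipf distribution with exponent 1 over a fixed vocabulary and i.i.d. sampling of tokens; the vocabulary size, the sample sizes, and the function names are hypothetical choices for illustration, not values from the talk. The code computes the expected fraction of vocabulary types that appear at least once in a sample of a given size.

```python
def zipf_probs(vocab_size, exponent=1.0):
    """Probability of each rank 1..vocab_size under an idealised Zipf law."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, vocab_size + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def expected_type_coverage(probs, num_tokens):
    """Expected fraction of types seen at least once in num_tokens i.i.d. draws."""
    seen = sum(1.0 - (1.0 - p) ** num_tokens for p in probs)
    return seen / len(probs)


# Hypothetical vocabulary of 50,000 types (an assumption, not from the talk).
probs = zipf_probs(50_000)

for num_tokens in (10_000, 100_000, 1_000_000):
    coverage = expected_type_coverage(probs, num_tokens)
    print(f"{num_tokens:>9} tokens: expected type coverage {coverage:.3f}")
```

In this model the type at rank r has probability proportional to 1/r, so the number of tokens needed before it is expected to occur once grows linearly with its rank. Rare words and phrases therefore stay poorly observed even as the sample grows, which is consistent with the performance wall described above.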
