Structured Prediction Problems in Natural Language Processing
introducer: William Cohen, Carnegie Mellon University
Description
Modeling language at the syntactic or semantic level is a key problem in natural language processing, and involves a challenging set of structured prediction problems. In this talk I'll describe work on machine learning approaches for syntax and semantics, with a particular focus on lexicalized grammar formalisms such as dependency grammars, tree adjoining grammars, and categorial grammars. In addition, I'll describe applications to machine translation, and natural language interfaces.
| Slides | |
| 0:00 | Structured Prediction Problems in Natural Language Processing |
| 4:06 | Structured Prediction Problems |
| 5:23 | Examples of Structured Prediction Problems |
| 6:28 | Syntactic Structures (1) |
| 7:03 | Syntactic Structures (2) |
| 7:42 | Structured Prediction Models for Parsing (1) |
| 8:43 | Structured Prediction Models for Parsing (2) |
| 9:00 | Structured Prediction Models for Parsing (3) |
| 9:23 | Structured Prediction Models for Parsing (4) |
| 9:59 | Overview |
| 10:12 | Context-Free Grammars (CFGs) for Language |
| 11:32 | Motivation for Parsing: Grammatical Relations |
| 12:23 | Syntactic Structures and Grammatical Relations (1) |
| 13:19 | Syntactic Structures and Grammatical Relations (2) |
| 13:33 | Syntactic Structures and Grammatical Relations (3) |
| 13:36 | Probabilistic Context-Free Grammars (PCFGs) |
| 15:29 | Overview |
| 15:33 | Models: Key Points |
| 16:27 | Conditional Random Fields |
| 16:58 | The Building Blocks for CRFs: Feature Vectors |
| 18:08 | Conditional Random Fields |
| 19:03 | Generalizing CRFs to Parsing |
| 19:34 | A TAG-Style Formalism |
| 20:41 | A Combination Operation: Sister Adjunction (1) |
| 21:32 | A Combination Operation: Sister Adjunction (2) |
| 21:41 | A Combination Operation: Sister Adjunction (3) |
| 22:12 | The Decomposition into Spines and Adjunctions |
| 22:36 | Advantages of TAG |
| 23:59 | The Contrast with Context-free Grammars |
| 24:18 | Features on Adjunctions (1) |
| 25:18 | Features on Adjunctions (2) |
| 25:21 | Features on Adjunctions (3) |
| 25:24 | Features on Adjunctions (4) |
| 25:29 | Features on Adjunctions (5) |
| 25:31 | Features on Adjunctions (6) |
| 25:59 | Features on Adjunctions (7) |
| 26:00 | A TAG-Based Model |
| 26:50 | Experiments |
| 27:41 | Probabilistic Context-Free Grammars (PCFGs) |
| 28:05 | PCFGs with Parent Annotations |
| 28:49 | Lexicalized PCFGs |
| 29:39 | PCFGs with Latent Variables |
| 30:26 | A Comparison to PCFGs (1) |
| 30:51 | A Comparison to PCFGs (2) |
| 31:19 | A Comparison to PCFGs (3) |
| 31:35 | Overview |
| 31:51 | An Application: Translation |
| 32:33 | Phrase-based Translation (Och et al., 1999) (1) |
| 33:12 | Phrase-based Translation (Och et al., 1999) (2) |
| 33:34 | Phrase-based Translation (Och et al., 1999) (3) |
| 34:42 | Translation as Parsing (1) |
| 35:31 | Translation as Parsing (2) |
| 35:51 | Translation as Parsing (3) |
| 36:15 | Translation as Parsing (4) |
| 36:22 | Properties of Translation as Parsing |
| 36:28 | Translation as Parsing (4) |
| 36:31 | Properties of Translation as Parsing |
| 37:25 | Overview |
| 37:28 | Inference |
| 37:45 | Inference: Key Points |
| 39:10 | Dependency Structures |
| 40:53 | Efficient Parsing Algorithms (Eisner 1997, 2000) |
| 42:00 | TAG Parses and Dependency Structures |
| 42:40 | Applying Eisner’s Algorithms to our Formalism |
| 43:20 | Coarse-to-fine Dynamic Programming |
| 44:02 | Three Simple Dependencies In Every Adjunction |
| 45:15 | Effect of the Beam (Validation Data) |
| 45:58 | Overview |
| 46:04 | Conditional Random Fields |
| 46:20 | Efficiency is a Key Problem |
| 47:13 | Parameter Estimation: the Structured Perceptron |
| 49:02 | Parameter Estimation: Averaging |
| 50:02 | Properties of the Perceptron |
| 51:33 | Averaged Perceptron Convergence |
| 52:04 | Regularized Log-Likelihood for Training CRFs |
| 53:17 | The Dual |
| 55:16 | Dual Coordinate Descent |
| 56:30 | An Exponentiated Gradient (EG) Algorithm |
| 58:06 | Properties of the EG Algorithm |
| 58:43 | An Exponentiated Gradient (EG) Algorithm |
| 58:48 | Properties of the EG Algorithm |
| 60:45 | Comparison to L-BFGS on a Parsing Problem (Objective Value) |
| 61:14 | Comparison to L-BFGS on a Parsing Problem (Accuracy) |
| 61:35 | Conclusions |
Lecture rating
| People found this lecture: | ||
| Worth seeing | ||
| because it is: | ||
| Valuable and informative | ||
| Well presented | ||
| Easily understandable | ||
| Acceptably recorded | ||
| You need to login to cast your vote. | ||
Report a problem or upload files
If you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.
Related content
SEE ALSO:
Link this page
Would you like to put a link to this lecture on your homepage?Go ahead! Copy the HTML snippet !



