PASCAL Challenges Workshop 2

A Simpler, Intuitive Approach to Morpheme Induction

author: Samarth Keshava, Yale University
coauthor: Emily Pitler, Yale University

Description

We present a simple, psychologically plausible algorithm for unsupervised learning of morphemes. The algorithm is best suited to Indo-European languages with concatenative morphology, English in particular. We describe the two approaches that work together to detect morphemes: 1) finding words that appear as substrings of other words, and 2) detecting changes in transitional probabilities. The algorithm yields particularly good results given its simplicity and conciseness: evaluated on a set of 532 human-segmented English words, the 252-line program achieved an F-score of 80.92% (precision: 82.84%, recall: 79.10%).
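The two signals named in the description can be illustrated with a minimal sketch. This is not the authors' 252-line program: the function names, the toy word list, and the restriction to word-initial substrings (so that the leftover is a candidate suffix) are all assumptions made for illustration.

```python
from collections import defaultdict


def build_prefix_counts(words):
    """Count how many distinct words start with each prefix.

    This is a flattened stand-in for a forward letter tree (an assumption;
    the talk's actual tree construction is not given in the description).
    """
    counts = defaultdict(int)
    for word in set(words):
        for i in range(len(word) + 1):
            counts[word[:i]] += 1
    return counts


def transitional_probability(counts, prefix, next_char):
    """Estimate P(next_char | prefix) from the prefix counts."""
    total = counts.get(prefix, 0)
    if total == 0:
        return 0.0
    return counts.get(prefix + next_char, 0) / total


def suffix_candidates(words):
    """Signal 1: when a word is the start of a longer word,
    the remainder of the longer word is a candidate suffix."""
    vocab = set(words)
    candidates = defaultdict(int)
    for word in vocab:
        for i in range(1, len(word)):
            if word[:i] in vocab:
                candidates[word[i:]] += 1
    return candidates


# Hypothetical toy vocabulary, chosen only to make the effect visible.
words = ["walk", "walks", "walked", "walking",
         "talk", "talks", "talked", "talking"]

counts = build_prefix_counts(words)
candidates = suffix_candidates(words)

# Signal 2: the probability is high inside the stem and drops at its end.
inside_stem = transitional_probability(counts, "wal", "k")    # 1.0
after_stem = transitional_probability(counts, "walk", "s")    # 0.25
```

On this toy list, "s", "ed", and "ing" each surface twice as candidate suffixes, and the transitional probability falls from 1.0 inside "walk" to 0.25 just after it. A drop of that kind is what the second approach treats as evidence of a morpheme boundary; how the real system scores and prunes candidates is not specified in the description.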

Slides
0:00 Step 2: Scoring morphemes (1)
1:11 Step 1: Build the trees (2)
1:27 Step 2: Scoring morphemes (1)
1:28 Step 2: Scoring morphemes (2)
2:37 Step 3: Pruning
3:43 Top English Morphemes (1)
4:03 Top English Morphemes (2)
4:15 Top English Morphemes (3)
5:03 Step 4: Segmenting Words
6:24 Results (1)
6:48 Results (2)
7:02 Step 2: Scoring morphemes (1)
7:19 Results (2)
7:42 Results (3)
8:21 Simple and Effective
9:12 Thank you for listening.
