How Do Infants Bootstrap into Spoken Language?: Models and Challenges

author:Emmanuel Dupoux, Laboratoire de Sciences Cognitives et Psycholinguistique, Ecole Normale Superieure
published: Aug. 26, 2009,   recorded: June 2009,   views: 255

Slides
0:00 Early language acquisition: data & models (1)
1:26 Early language acquisition: data & models (2)
2:13 Human Language
3:01 Problems for a concatenative view (1)
3:16 Problems for a concatenative view (3)
3:17 Problems for a concatenative view (2)
3:25 Problems for a concatenative view (4)
4:30 Problems for a concatenative view (5)
4:32 Problems for a concatenative view (6)
5:02 Early language acquisition timeline (1)
5:03 Early language acquisition timeline (2)
6:58 Early language acquisition timeline (3)
9:20 How do we know? (1)
10:59 How do we know? (2)
11:16 How do we know? (3)
11:37 Summary
12:11 Probabilistic models of speech recognition (1)
13:10 Probabilistic models of speech recognition (2)
14:04 Successive State Splitting
14:46 Optimized State Splitting
16:52 Background: Phonemes and Allophonic Rules
22:21 How to reduce the number of allophones? Idea #1: complementary distributions
24:11 Effect of phonotactics
25:51 Number of allophones
26:37 Limits of KL
26:47 Idea #2. The linguistic/articulatory filters
28:37 Implementation of the filters
29:13 Tests on French
29:58 French (1)
30:24 French (2)
30:37 Limits of the linguistic/articulatory filters
31:50 Idea #3: use the lexicon (1)
33:20 Idea #3: use the lexicon (2)
34:04 Idea #3: use the lexicon (3)
34:25 But isn’t it cheating? (1)
34:31 But isn’t it cheating? (2)
37:48 But isn’t it cheating? (3)
38:51 Summary
40:20 Is this psychologically plausible? testing 12 month old American infants
41:13 The effect of linguistic naturality
41:42 Work in progress
41:51 CSJ Corpus: 400 hours of annotated spontaneous speech
42:09 Ling. filter 1: acoustic distance
42:10 Ling. filter 2: coarticulation model (1)
42:43 More bootstrapping problems (2)
43:27 Take home message
44:23 Thank you

Description

Human infants spontaneously and effortlessly learn the language(s) spoken in their environment, despite the extraordinary complexity of the task. Here, I will present an overview of the early phases of language acquisition and focus on one area where a modeling approach is currently being pursued using tools from signal processing and automatic speech recognition: the unsupervised acquisition of phonetic categories. During their first year of life, infants construct a detailed representation of the phonemes of their native language and lose the ability to distinguish nonnative phonemic contrasts. Unsupervised statistical clustering is not sufficient: it does not converge on the inventory of phonemes, but rather on contextual allophonic units or subunits. I present an information-theoretic algorithm that groups together allophonic variants based on three sources of information that can be acquired independently: the statistical distribution of their contexts, the phonetic plausibility of the grouping, and the existence of lexical minimal pairs. This algorithm is tested on several natural speech corpora. We find that these three sources of information are probably not language specific. What is presumably unique to language is the way in which they are combined to optimize the emergence of linguistic categories.
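
The first of these cues, the statistical distribution of contexts, can be illustrated with a small sketch. The abstract does not give the exact formula used in the talk, so what follows is an assumption: it compares two segments by a symmetric Kullback-Leibler divergence over smoothed distributions of the segment that follows them. The toy corpus, the symbols `r1` and `r2`, and the function names are all hypothetical. Two segments that never share a following context (complementary distribution) score high and become candidates for merging into one phoneme, while two segments that occur in the same contexts score near zero.

```python
from collections import Counter
from math import log

def context_distribution(corpus, segment, contexts, alpha=1.0):
    """Add-alpha smoothed distribution over the segment following `segment`."""
    counts = Counter()
    for word in corpus:
        for current, following in zip(word, word[1:]):
            if current == segment:
                counts[following] += 1
    total = sum(counts.values()) + alpha * len(contexts)
    return {c: (counts[c] + alpha) / total for c in contexts}

def symmetric_kl(p, q):
    """Symmetric KL divergence between two distributions on the same support."""
    return sum(p[c] * log(p[c] / q[c]) + q[c] * log(q[c] / p[c]) for c in p)

# Hypothetical toy corpus: each word is a list of segment symbols.
# "r1" only occurs before voiceless stops, "r2" only before voiced stops,
# whereas "a" and "i" occur in exactly the same following contexts.
corpus = [
    ["p", "a", "r1", "t"], ["k", "i", "r1", "p"],
    ["m", "a", "r2", "d"], ["l", "i", "r2", "b"],
    ["p", "a", "t"], ["p", "i", "t"],
    ["m", "a", "d"], ["m", "i", "d"],
]
contexts = sorted({s for word in corpus for s in word})

def divergence(seg_a, seg_b):
    """Context divergence between two segments in the toy corpus."""
    return symmetric_kl(
        context_distribution(corpus, seg_a, contexts),
        context_distribution(corpus, seg_b, contexts),
    )

kl_r = divergence("r1", "r2")   # complementary contexts: high divergence
kl_v = divergence("a", "i")     # shared contexts: divergence close to zero
print(f"r1 vs r2: {kl_r:.3f}   a vs i: {kl_v:.3f}")
```

A high divergence alone does not show that two segments are allophones of one phoneme, since unrelated segments can also have disjoint contexts. That is presumably why the abstract pairs this cue with phonetic plausibility and lexical minimal pairs.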

Emmanuel Dupoux is the director of the Laboratoire de Sciences Cognitives et Psycholinguistique in Paris. He conducts research on the early phases of language and social acquisition in human infants, using a mix of behavioral and brain-imaging techniques as well as computational modeling. He teaches at the Ecole des Hautes Etudes en Sciences Sociales, where he has set up an interdisciplinary graduate program in Cognitive Science.
