Machine Learning Summer School 2006 - Taipei
Pascal

Predictive Methods for Text Mining

author: Tong Zhang, Yahoo! Research, Yahoo!

Description

I will give a general overview of prediction methods in text mining applications, including text categorization, information extraction, summarization, and question answering. I will then discuss more advanced issues encountered in real applications, such as classification with structured and complex outputs, the use of unlabeled data, modeling of link structures, collective inference and community effects, and transfer learning in changing environments.

Slides
0:05 Predictive Methods for Text Mining
1:25 Motivation for Text Mining
3:01 Structured Data-mining
4:00 Structured Data Example
4:29 Unstructured Text-mining
6:25 Some Problems in Predictive Text-mining
9:21 The Machine Learning Approach
10:17 Supervised learning
11:20 Outline of the Tutorial
12:27 Electronic Text
13:23 An Example of XML Document
13:52 Text Processing for Predictive Modeling
14:58 Tokenization
16:17 Issues in Tokenization
20:22 Simple English Tokenization Procedure
22:11 Lemmatization and Stemming
25:46 Document Level Feature Representation
29:45 Vector Space Document Model
38:49 Removal of Stopwords
41:29 Term Weighting
45:05 Term Weighting in Document Retrieval
48:08 Token Statistics: Zipf’s Law
50:55 Summary of Document Level Feature Generation
53:09 Example Feature Vector for Email Spam Detection
55:17 Text Categorization
56:28 Text Categorization Applications
58:20 Electronic Spam Detection
61:10 Taxonomy Classification
62:57 Basic Text Categorization Framework
65:38 Probability Calibration
67:18 Comments on Probability Calibration
69:34 Common Classification Methods
69:58 Document Similarity in Vector Space Model
71:50 Nearest Neighbor Method
72:57 Centroid Method
76:56 Example Feature Vector for Email Spam Detection
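
The outline above runs from tokenization and stopword removal through term weighting and vector-space similarity to the centroid method. The following is a minimal sketch of that pipeline, not the lecturer's implementation: the stopword list, the smoothed IDF formula `log(N / df) + 1`, the helper names, and the four toy training messages are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stopword list (an assumption, not from the lecture).
STOPWORDS = {"the", "a", "an", "to", "of", "and", "is", "for", "you"}


def tokenize(text):
    """Lowercase, split on non-alphanumeric characters, drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower())
            if t not in STOPWORDS]


def fit_idf(token_lists):
    """Inverse document frequency per term: log(N / df) + 1 (assumed variant)."""
    n = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}


def tfidf(tokens, idf):
    """Sparse TF-IDF vector as a dict; terms unseen in training are ignored."""
    tf = Counter(t for t in tokens if t in idf)
    return {t: count * idf[t] for t, count in tf.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    return dot / (norm_u * norm_v)


def fit_centroids(vectors, labels):
    """Average the document vectors of each class into one centroid."""
    counts = Counter(labels)
    sums = {}
    for vec, label in zip(vectors, labels):
        acc = sums.setdefault(label, Counter())
        for t, w in vec.items():
            acc[t] += w
    return {label: {t: w / counts[label] for t, w in acc.items()}
            for label, acc in sums.items()}


def classify(text, centroids, idf):
    """Assign the label whose centroid is most similar to the document."""
    vec = tfidf(tokenize(text), idf)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Toy training data (made up for this sketch).
train = [
    ("win free money now", "spam"),
    ("free prize claim your money", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project meeting notes attached", "ham"),
]
token_lists = [tokenize(text) for text, _ in train]
labels = [label for _, label in train]
idf = fit_idf(token_lists)
centroids = fit_centroids([tfidf(tokens, idf) for tokens in token_lists], labels)

print(classify("claim free money", centroids, idf))
print(classify("monday project meeting", centroids, idf))
```

Sparse dictionaries are used because a document contains only a small fraction of the vocabulary, so most vector entries are zero. A real system would use a proper stemmer or lemmatizer and a much larger training set.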

Videos
Part 1 1:25:29
Part 2 0:57:54