Machine learning for the semantic web

Chapter list

Document Classification into large taxonomies 00:00
How to Represent Text? ...from Characters to Logic 00:00
DMoz (Open Directory Project) 00:08
Outline 00:45
Some Initial thoughts 01:03
Classification of a query into DMoz 02:24
Quick example: Why representation matters? 02:48
Classification of a document into DMoz 03:59
Visual & Contextual Search 04:52
Contextualized search 05:06
Example: searching 10:04
Context sensitive search with 10:05
News reporting bias 10:05
News Reporting Bias example 10:26
Experimental setup 10:54
Prediction of news source 11:25
Detecting News Reporting 11:39
Easy decision problems require simple data representation 12:36
Harder decision problems require better data representation - 1 12:37
Harder decision problems require better data representation - 2 12:39
Harder decision problems require better data representation - 3 12:40
How we represent Text? 12:41
How we process data? 12:42
News Visualization 12:43
Topic landscape of the query “Clinton” from Reuters news 1996-1997 12:48
What we do with data? 15:48
Visualization of social relationships between “Clinton” 15:49
Topic Trends Tracking of the documents including “Clinton” 16:21
Key paradigms 17:15
WW2 query “Pearl Harbor” into NYTimes archive 18:18
WW2 query “Belgrade” into NYTimes archive 18:55
How different research areas approach text? 20:53
WW2 query “Normandy” into NYTimes archive 20:59
Levels of text representations - 9 21:23
Language model level 21:39
Context aware auto-complete 22:23
Context-aware prediction for document authoring - 1 22:29
Context-aware prediction for document authoring - 2 22:32
How do we represent text? 23:28
Context-aware prediction for document authoring - 3 23:31
Levels of text representations - 2 23:34
Levels of text representations - 10 24:15
Full-parsing level 24:17
Text Enrichment 24:46
Text enrichment with http://Enrycher.ijs.si 24:48
Levels of text representations - 1 25:15
Levels of text representations - 3 29:16
Character level representation 29:20
Good and bad sides of character level representation (n-grams) 30:08
Language identification 30:31
Graph 33:34
Knowledge based summarization 33:36
Summarization via semantic graphs 33:37
Detailed Summarization 33:38
Example of automatic summary 33:38
Character level normalization 35:09
Levels of text representations - 4 35:11
Word level 36:08
Key semantic word Properties 36:22
Levels of text representations - 12 37:10
Collaborative tagging 37:48
Example: flickr.com tagging 37:49
Example: del.icio.us tagging 37:50
Stop-words 37:52
Levels of text representations - 14 38:07
Stemming and lemmatization 38:20
Template / frames level 38:22
Examples of simple templates 38:39
Stemming 39:32
Levels of text representations - 5 40:10
Phrase level 40:14
Google n-gram corpus 41:11
Examples of Google n-grams 42:32
Levels of text representations - 6 43:01
Part-of-Speech 43:06
Part-of-Speech Tags 43:07
Part-of-Speech examples 43:09
Levels of text representations - 7 43:55
Taxonomies/thesaurus level 43:59
WordNet – database of lexical 44:04
WordNet relations 44:39
Levels of text representations - 8 46:07
Vector-space model level 46:24
Cyc’s front-end: “Cyc Analytic Environment” – querying 47:03
Bag-of-Words document representation 47:11
Bag-of-Words Words 47:58
Example document and its vector representation 49:08
Cyc’s front-end: “Cyc Analytic Environment” – justification 49:32
Similarity between BoW 50:44
Document Categorization Task 52:36
Algorithms for learning document classifiers 53:19
Example learning algorithm: Perceptron 54:01
Measuring success – Model quality estimation 54:52
Document Clustering Task 54:55
K-Means clustering algorithm 55:24
Example of hierarchical clustering (bisecting k-means) 55:37
Latent Semantic Indexing 55:42
LSI Example 55:43
Further references ... 57:49
References to some Text-Mining books 57:50
Books on Semantic Technologies 58:08
References to the main conferences 58:14
Videos on Text and Semantic Technologies 58:33