The role of hierarchies in exploratory data mining

author: Raghu Ramakrishnan, Yahoo! Research
published: Oct. 10, 2008, recorded: September 2008, views: 115

Slides

0:00 Hierarchies in Data Mining
1:39 About this Talk
3:02 Background: The Multidimensional Data Model Cube Space
3:09 Star Schema
4:06 Dimension Hierarchies
4:30 Multidimensional Data Model
5:27 Multidimensional Data
7:00 Cube Space
7:37 OLAP Over Imprecise Data
7:53 Multidimensional Data
8:38 OLAP Over Imprecise Data
9:06 Imprecise Data
10:07 Querying Imprecise Facts
10:49 Allocation (1)
10:56 Allocation (2)
11:33 Allocation (3)
11:51 Allocation Policies
12:43 Motivating Example
13:45 Desideratum I: Consistency
14:00 Desideratum II: Faithfulness (1)
15:12 Desideratum II: Faithfulness (2)
16:00 Query Semantics
16:21 Dealing with Data Sparsity
16:58 Motivating Application
19:16 Estimation in the "Tail"
20:42 Sampling of Webpages
21:52 Imputation of Impression Volume
22:23 Exploiting Taxonomy Structure
23:07 Imputation of Impression Volume (1)
24:13 Imputation of Impression Volume (2)
24:26 Imputing xij
24:32 Imputation: Summary
24:45 Dealing with Data Sparsity
24:58 Yahoo! Home Page
26:26 Novel Aspects
29:20 Bellwether Analysis: Global Aggregates from Local Regions
29:32 Motivating Example
31:25 Key Ideas
31:53 Motivating Example
32:20 A Straightforward Approach
33:10 Using Regional Features
34:38 Basic Bellwether Problem
35:42 Experiment on a Mail Order Dataset (1)
36:32 Experiment on a Mail Order Dataset (2)
36:50 Basic Bellwether Computation
37:11 Subset-Based Bellwether Prediction
37:34 Characteristics of Bellwether Trees & Cubes
37:45 Efficiency Comparison
37:52 Scalability
37:54 Exploratory Mining: Prediction Cubes
38:28 The Idea
39:21 Example (1/7): Regular OLAP
39:33 Example (2/7): Regular OLAP
40:16 Example (3/7): Decision Analysis (1)
41:20 Example (3/7): Decision Analysis (2)
41:25 Example (4/7): Prediction Cubes
42:29 Example (5/7): Model-Similarity
42:57 Example (6/7): Predictiveness
43:00 Example (7/7): Prediction Cube
43:28 Efficient Computation
43:46 Bottom-Up Data Cube Computation
43:50 Functions on Sets
43:51 Scoring Function
43:57 Machine-Learning Models
44:17 Probability-Based Ensemble
44:49 Efficiency Comparison
45:37 Conclusions
45:38 Related Work: Building Models on OLAP Results
45:52 Related Work (Contd.)

Description

In a broad range of data mining tasks, the fundamental challenge is to efficiently explore a very large space of alternatives. The difficulty is two-fold: first, the size of the space raises computational challenges, and second, it can introduce data sparsity issues even in the presence of very large datasets. In this talk, we'll consider how the use of hierarchies (e.g., taxonomies, or the OLAP multi-dimensional model) can help mitigate the problem.
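
As a rough illustration of the sparsity point, the sketch below rolls sparse leaf-level counts up a small taxonomy and estimates a rate at the most specific level that has enough observations. It is not taken from the talk: the taxonomy, the counts, the `min_trials` threshold, and the function names `roll_up` and `backoff_rate` are all assumptions made for this example.

```python
# Hypothetical taxonomy (child -> parent); None marks the root.
PARENT = {
    "Madison": "Wisconsin",
    "Milwaukee": "Wisconsin",
    "Wisconsin": "USA",
    "USA": None,
}


def roll_up(leaf_counts, parent):
    """Add each node's (successes, trials) to itself and to all its ancestors."""
    totals = {}
    for node, (succ, trials) in leaf_counts.items():
        while node is not None:
            s, t = totals.get(node, (0, 0))
            totals[node] = (s + succ, t + trials)
            node = parent.get(node)
    return totals


def backoff_rate(node, totals, parent, min_trials=100):
    """Return (level, rate) for the most specific level with enough trials.

    Walks from `node` toward the root and stops at the first level whose
    aggregated trial count reaches `min_trials` (an assumed threshold).
    """
    while node is not None:
        succ, trials = totals.get(node, (0, 0))
        if trials >= min_trials:
            return node, succ / trials
        node = parent.get(node)
    return None, None


# Made-up observations: one sparse leaf, one well-populated leaf.
leaf_counts = {"Madison": (1, 20), "Milwaukee": (30, 200)}
totals = roll_up(leaf_counts, PARENT)

sparse_level, sparse_rate = backoff_rate("Madison", totals, PARENT)
dense_level, dense_rate = backoff_rate("Milwaukee", totals, PARENT)
```

Here the sparse leaf falls back to its parent's aggregate, while the well-populated leaf keeps its own estimate. Simple back-off is only one way to use a hierarchy against sparsity; the outline's allocation and imputation sections suggest the talk covers more refined approaches.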
