event thumbnail image
The 13th International Conference on Knowledge Discovery and Data Mining

Exploiting Duality in Summarization with Deterministic Guarantees

author: Panagiotis Karras, The Hong Kong University of Science and Technology

Description

Summarization is an important task in data mining. A major challenge over the past years has been the efficient construction of fixed-space synopses that provide a deterministic quality guarantee, often expressed in terms of a maximum-error metric. Histograms and several hierarchical techniques have been proposed for this problem. However, their time and/or space complexities remain impractically high and depend not only on the data set size n, but also on the space budget B. These handicaps stem from a requirement to tabulate all allocations of synopsis space to different regions of the data. In this paper we develop an alternative methodology that dispels these deficiencies, thanks to a fruitful application of the solution to the dual problem: given a maximum allowed error, determine the minimum-space synopsis that achieves it. These complexity advantages offer both a spaceefficiency and a scalability that previous approaches lacked. We verify the benefits of our approach in practice by experimentation.

You might be experiencing some problems with Your Video player.
Slides
0:00 The Impact of Duality on Data Synopsis Problems
0:19 Introduction
2:41 Outline
3:26 Histograms-part01
8:00 Histograms-part02
10:35 Histograms-part03
12:30 Restricted Haar Wavelet Synopses-part01
14:34 Restricted Haar Wavelet Synopses-part02
15:42 Unrestricted Haar and Haar+ Synopses-part01
17:05 Unrestricted Haar and Haar+ Synopses-part02
17:48 Experiments: Histograms, Time vs. n
17:58 Experiments: Histograms, Time vs. B
18:11 Experiments: Haar Wavelets, Time vs. n
18:38 Experiments: Haar Wavelets, Time vs. B
19:08 Conclusions
19:51 Related Work
20:04 Thank you! Questions?

Lecture rating

People found this lecture:
Worth seeing
because it is:
 Valuable and informative
Well presented
Easily understandable
Acceptably recorded
You need to login to cast your vote.

Report a problem or upload files

If you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.

Link this page

Would you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !

Write your own review or comment: