Workshops

Large Scale Learning - Challenge

author: Vojtech Franc, Fraunhofer Institute
author: Sören Sonnenburg, Fraunhofer FIRST

Description

With the exceptional increase in computing power, storage capacity and network bandwidth over the past decades, ever-growing datasets are collected in fields such as bioinformatics (splice sites, gene boundaries, etc.), IT security (network traffic) and text classification (spam vs. non-spam), to name but a few. While this growth in data size leaves computational methods as the only viable way of dealing with the data, it poses new challenges to ML methods. This workshop is concerned with the scalability and efficiency of existing ML approaches with respect to computational, memory or communication resources, whose demands result, for example, from high algorithmic complexity, from the size or dimensionality of the dataset, and from the trade-off between distributed resolution and communication costs.

Slides
0:00 Schedule
0:41 Large Scale Learning - Challenge
0:43 Outline
1:04 Large Scale Problems
2:12 Our Motivation
3:43 We Need a Fair Comparison!
4:42 Competition
6:01 Setup and Evaluation Criteria
6:39 Evaluation: Time vs. Test Error
7:43 Dataset Size vs. Time
8:07 Dataset Size vs. Test Error
8:53 Adjusted Goals and Evaluation for SVMs
9:23 Adjusted Evaluation for SVMs
9:41 Time Line
10:15 Statistics - 1
12:00 Statistics - 2
13:38 Datasets
17:50 Performance
21:06 Preliminary Results - Wild Track
23:20 Preliminary Results - Alpha: Time vs. Error
24:20 Preliminary Results - Alpha: Size vs. Error
25:15 Preliminary Results - Beta: Time vs. Error
25:58 Preliminary Results - Beta: Size vs. Error
26:12 Preliminary Results - Gamma: Time vs. Error
27:13 Preliminary Results - Gamma: Size vs. Error
27:18 Preliminary Results - Delta: Time vs. Error
27:21 Preliminary Results - Epsilon: Size vs. Error
27:23 Preliminary Results - Zeta: Size vs. Error
27:25 Preliminary Results - DNA: Time vs. Error
28:37 Preliminary Results - Webspam: Time vs. Error
29:12 Preliminary Results - Webspam: Size vs. Error
29:28 Preliminary Results - FD: Time vs. Error
29:51 Preliminary Results - OCR: Time vs. Error
30:18 Preliminary Results - OCR: Size vs. Error
30:44 Preliminary Results - Linear SVM Track
32:12 Conclusions
35:01 Winners
36:33 Future
