Large Scale Learning at Twitter

author: Aleksander Kolcz, Twitter, Inc.
chairman: Marko Grobelnik, Artificial Intelligence Laboratory, Jožef Stefan Institute
published: Aug. 13, 2012,   recorded: May 2012,   views: 1926

Slides

0:00 Large Scale (Machine) Learning at Twitter
3:52 Earthquake Los Angeles, July 29, 2008
4:51 Japan Replies (1)
5:43 Japan Replies (2)
6:13 What is Twitter
7:03 #egypt, #tunesia, #libya, #syria ...
7:44 #ocws, #occupywallstreet, ...
8:13 The Scale of Twitter
8:48 Large scale infrastructure of information delivery
9:38 Support for user interaction
10:06 Problems we are trying to solve: Relevance
11:36 Problems we are trying to solve: Who to follow
12:23 Problems we are trying to solve: Content recommendation
12:54 (other) problems we are trying to solve
13:18 Recommendation/Personalization
13:50 Is this BIG data ?
14:47 Challenges
15:25 What type of machine learning?
16:20 ML for social networks (1)
18:15 ML for social networks (2)
19:04 ML for social networks (2)
19:22 Are there wheels not to reinvent?
20:28 Analytics Ecosystem
21:19 Maximizing the use of Hadoop
22:17 AVOID: "janky" analysis of messy data
23:43 Leveraging off-line tools
24:08 Large scale learning frameworks
25:23 Our extensions
26:28 Build/reuse/integrate
27:06 MapReduce
28:25 Java programming
29:14 PigML
29:35 Training a model in Pig (1)
29:53 Training a model in Pig (2)
30:05 Training a model in Pig (3)
30:28 Applying a model in Pig (1)
30:41 Applying a model in Pig (2)
31:00 Model training UDF internals
32:08 Supervised classification in a nutshell (1)
33:17 Supervised classification in a nutshell (2)
35:21 Ensembles
35:48 Classifier Training / Making Predictions
36:59 Ensembles: continued
37:17 Further advantages of parallelism
37:27 Example: tweet sentiment detection
39:14 Diminishing returns ...
40:42 Iterative algorithms
42:09 Topic modeling
42:41 Example: modeling topic distribution
43:15 Mahout/PigML integration
44:06 LDA applications
44:12 Anchoring LDA
45:12 ML/Data Mining we contribute to
45:46 ML outside of Twitter
46:28 Publications mentioning ...
46:58 Quick search
47:17 Spam/spammer modeling
48:59 Example: normal interactions
49:38 Example: spammy interactions
50:20 ML @ Twitter
50:46 Thank you

Description

Twitter represents a large, complex network of users with diverse and continuously evolving interests. Discussions and interactions range from very small to very large groups of people, and most of them occur in public. Interests are both long- and short-term and are expressed through the content users generate as well as through the Twitter follow graph, i.e., who is following whose content. Understanding user interests is crucial to providing a good Twitter experience by helping users connect to others and find relevant information and interesting information sources. The manner in which information spreads over the network, and in which communication attempts are made, can also help identify spammers and other service abuse. Understanding users and their preferences is also a very challenging problem due to the very large volume of information, the fast rate of change, and the short length of tweets. Large-scale machine learning, as well as graph and text mining, has been helping us tackle these problems and create new opportunities to better understand our users. In the talk I will describe a number of challenging modeling problems addressed by the Twitter team, as well as our approach to creating frameworks and infrastructure that make learning at scale possible.
