Scalable Collaborative Filtering Algorithms for Mining Social Networks

author: Edward Chang, Google
published: Dec. 20, 2008, recorded: December 2008

Slides
0:00 Scalable Collaborative Filtering for Mining Social Networks
0:17 Collaborators
0:26 Confucius, a Q&A System
1:21 OpenSocial
1:51 Web 1.0
2:06 Web 2.0 --- Web with People
2:32 Confucius, a Q&A system
2:41 Query - Yellowstone
2:49 Query - Yellowstone - results
3:03 Query - Yosemite
3:21 Query - Beijing
4:14 Key ML Subroutines of Confucius
5:50 Naive User Evaluation
6:11 Shortcomings
6:46 Link-based User Credential Ranking: HITS
8:16 Q&A/Blog/BBS Search
10:05 Data Mining Impact & Opportunities
10:10 Web 2.0 --- Web with People
10:22 + Social Platforms
10:38 social networks example (1)
10:54 social networks example (2)
10:57 What users are interested in?
11:19 example - blog
11:23 example - pictures
11:26 Open Social APIs
11:40 Open Social map use (1)
11:59 Open Social map use (2)
12:05 Open Social map use (3)
12:13 Open Social
12:33 Personalized Search Example (1)
12:58 Personalized Search Example (2)
13:00 Personalized Search Example (3)
13:01 Personalized Search Example (4)
13:10 Personalized Search Example (5)
13:13 Personalized Recommendation
13:35 Recommendation Systems
13:52 Outline (1)
14:23 Outline (2)
14:25 Task: Targeting Ads at SNS Users
14:47 Mining Profiles, Friends & Activities for Relevance
15:22 Consider also User Influence
15:54 Outline (3)
15:59 Collaborative Filtering
16:18 Some Queries
16:56 FIM-based Recommendation
17:37 FIM Preliminaries
19:08 Preprocessing
20:14 Parallel Projection
21:27 Example of Projection (1)
21:56 Example of Projection (2)
21:58 Example of Projection (3)
22:01 Recursive Projections [H. Li, et al. ACM RS]
22:50 Projection using MapReduce
23:36 Outline (4)
23:57 Collaborative Filtering (1)
24:26 Collaborative Filtering (2)
25:07 Notations
25:48 Probabilistic Latent Semantic Analysis (PLSA)
26:38 Example of Latent Analysis
28:16 Baseline Models
28:47 CCF Model [Chen, et al. KDD 08]
30:07 Empirical Study
30:30 Community Recommendation
31:22 Results
32:32 Gibbs Sampling MapReduce Speedup
33:31 Extensions
33:38 …Extensions
33:53 Outline (5)
34:09 Distributed Computing Perspectives
35:23 MapReduce framework
37:05 Comparison between Parallel Computing Frameworks
39:11 Conclusions…
39:38 …Conclusions (1)
40:19 …Conclusions (2)
41:03 - questions
41:12 - questions
42:15 - questions
44:32 - questions
46:02 - questions
46:29 - questions

Description

Social networking sites such as Orkut, MySpace, Hi5, and Facebook attract billions of visits a day, surpassing the page views of Web search. These sites provide applications for individuals to establish communities, to upload and share documents, photos, and videos, and to interact with other users. Take Orkut as an example: it hosts millions of communities, with hundreds of communities created and tens of thousands of blogs and photos uploaded each hour. To help users find relevant information, it is essential to provide effective collaborative filtering tools that perform recommendations such as friend, community, and ad matching. In this talk, I will first describe the computational and storage challenges that this information explosion poses for traditional collaborative filtering algorithms. To deal with huge social graphs that expand continuously, an effective algorithm should be designed to 1) run on thousands of parallel machines to share storage and speed up computation, 2) perform incremental retraining and updates to attain online performance, and 3) fuse information from multiple sources to alleviate information sparseness. In the second part of the talk, I will present algorithms we recently developed, including parallel spectral clustering [1], parallel FP-Growth [2], parallel combinational collaborative filtering [3], parallel LDA, and parallel Support Vector Machines [4].
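
As a rough illustration of what a community recommendation computed in a map/reduce style might look like, the following toy sketch counts how often two communities share a member and recommends the communities that co-occur most with those a user has already joined. It is not the algorithm presented in the talk: the function names (map_pairs, reduce_counts, recommend), the co-occurrence scoring, and the sample data are all assumptions made for illustration, and everything runs in a single process rather than on parallel machines.

```python
from collections import Counter
from itertools import combinations


def map_pairs(memberships):
    """Map step (hypothetical): emit ((a, b), 1) for each pair of
    communities that the same user has joined."""
    for communities in memberships:
        for a, b in combinations(sorted(set(communities)), 2):
            yield (a, b), 1


def reduce_counts(pairs):
    """Reduce step (hypothetical): sum the emitted values per community pair."""
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return counts


def recommend(counts, joined, top_n=3):
    """Score each community the user has not joined by its total
    co-occurrence with the communities the user has joined."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a in joined and b not in joined:
            scores[b] += n
        elif b in joined and a not in joined:
            scores[a] += n
    return [community for community, _ in scores.most_common(top_n)]


# Made-up sample data: one list of joined communities per user.
memberships = [
    ["hiking", "photography", "yosemite"],
    ["hiking", "yosemite"],
    ["photography", "cooking"],
]
counts = reduce_counts(map_pairs(memberships))
suggestions = recommend(counts, {"hiking"})
```

Because each user's list is processed independently in the map step and the reduce step only sums per-key counts, this shape of computation is the kind that can be spread across many machines, which is the first of the three design requirements listed in the description.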