Using Rank Propagation and Probabilistic Counting for Link-based Spam Detection

Published on Feb 25, 2007
7981 Views

Chapter list

00:01 Using Rank Propagation and Probabilistic Counting for Link-based Spam Detection
00:26 Content
00:54 What is on the Web?
01:32 Web spam (keywords + links)
02:20 Web spam (mostly keywords)
02:44 Search engine?
03:14 Fake search engine
03:24 Problem: “normal” pages that are spam
03:56 Link farms
05:23 Motivation
06:01 Metrics
07:18 Test collection
08:11 Degree-based measures
08:16 Degree
09:01 Degree
09:14 Edge reciprocity
10:09 Assortativity
11:08 Automatic classifier
12:48 PageRank
12:50 PageRank
13:00 Maximum PageRank in the Host
13:25 Variance of PageRank
14:15 Variance of PageRank of in-neighbors
14:39 Automatic classifier
15:09 TrustRank
15:18 TrustRank
16:05 TrustRank score
16:55 TrustRank / PageRank
17:13 Automatic classifier
17:36 Truncated PageRank
17:42 Path-based formula for PageRank
18:11 General functional ranking
18:49 Truncated PageRank
19:48 Truncated PageRank (T=2) / PageRank
20:14 Max. change of Truncated PageRank
20:31 Automatic classifier
20:51 Counting supporters
20:55 Idea: count “supporters” at different distances
21:37 High and low-ranked pages are different
22:09 Probabilistic count
22:50 Hosts at distance 4
23:07 Automatic classifier
23:16 Conclusions
23:21 Summary of classifiers
25:07 Top 10 metrics
25:12 Conclusions