First ACM International Conference on Web Search and Data Mining - WSDM 2008

Can Social Bookmarks Improve Web Search?

author: Paul Heymann, Stanford University

Description

Social bookmarking is a recent phenomenon that has the potential to give us a great deal of data about pages on the web. One major question is whether that data can be used to augment systems like web search. To answer this question, over the past year we have gathered what we believe to be the largest dataset from a social bookmarking site yet analyzed by academic researchers. Our dataset represents about forty million bookmarks from the social bookmarking site del.icio.us. We contribute a characterization of posts to del.icio.us: how many bookmarks exist (about 115 million), how fast it is growing, and how active the URLs being posted are (quite active). We also contribute a characterization of tags used by bookmarkers. We found that certain tags tend to gravitate towards certain domains, and vice versa. We also found that tags occur in over 50 percent of the pages that they annotate, and that in only 20 percent of cases do they not occur in the page text, backlink page text, or forward link page text of the pages they annotate. We conclude that social bookmarking can provide search data not currently provided by other sources, though it may currently lack the size and distribution of tags necessary to make a significant impact.
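
The tag-overlap figures in the description (tags found in the page text, or missing from the page, backlink, and forward link text altogether) can be illustrated with a small sketch. This is not the authors' code: the `TaggedPage` structure, the `tag_coverage` function, and the whitespace tokenization are all assumptions made for illustration, and the talk does not say how matching was actually done.

```python
from dataclasses import dataclass


@dataclass
class TaggedPage:
    """One bookmarked URL with its tags and nearby text (hypothetical structure)."""
    tags: list[str]
    page_text: str
    backlink_text: str = ""      # text of pages that link to this URL
    forward_link_text: str = ""  # text of pages this URL links to


def _words(text: str) -> set[str]:
    # Assumption: naive lowercase whitespace tokenization, no stemming.
    return set(text.lower().split())


def tag_coverage(pages: list[TaggedPage]) -> dict[str, float]:
    """Fraction of (tag, page) pairs found in the page text, and found nowhere."""
    total = in_page = in_any = 0
    for page in pages:
        page_words = _words(page.page_text)
        nearby_words = (
            page_words
            | _words(page.backlink_text)
            | _words(page.forward_link_text)
        )
        for tag in page.tags:
            total += 1
            tag = tag.lower()
            in_page += tag in page_words
            in_any += tag in nearby_words
    if total == 0:
        return {"in_page": 0.0, "in_none": 0.0}
    return {"in_page": in_page / total, "in_none": (total - in_any) / total}


# Made-up example data, not drawn from the del.icio.us dataset.
sample = [
    TaggedPage(tags=["python", "tutorial"], page_text="A Python guide",
               backlink_text="great tutorial here"),
    TaggedPage(tags=["toread"], page_text="Search engines explained"),
]
coverage = tag_coverage(sample)
```

On a real crawl, `in_page` would correspond to the "over 50 percent" figure and `in_none` to the "20 percent" figure, subject to whatever matching rules the study actually used.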

Slides
0:00 Can Social Bookmarking Improve Web Search?
0:10 - Introduction
0:12 YouTube
0:50 Amazon
1:16 Del.icio.us
1:59 - Problem Statement
2:00 Problem Statement
2:06 Subproblems
2:48 Tags versus Other Content
3:22 - Data Gathering Methodology
3:23 Del.icio.us Posts
4:03 Realtime Web Crawling
4:55 - Analysis
4:57 Size and Growth
8:04 URL Indexing and Age - 1
10:54 URL Indexing and Age - 2
12:46 Tagging Caveats ("The Tagging 6")
15:15 Tags versus Other Content
15:36 Tagging Caveats ("The Tagging 6")
16:45 - Conclusions
16:46 Conclusions - 1
17:22 Conclusions - 2
17:35 Conclusions - 3
18:04 - Questions
