Web mining

author: Ricardo Baeza-Yates, NTENT, Inc.
published: Nov. 16, 2010, recorded: September 2010, views: 430

Description

The Web continues to grow and evolve rapidly, changing our daily lives. This activity represents the collaborative work of the millions of institutions and people that contribute content to the Web, as well as the one billion people that use it. This ocean of hyperlinked data holds both explicit and implicit information and knowledge. Web mining is the task of analyzing this data and extracting information and knowledge for many different purposes. The data comes in three main flavors: content (text, images, etc.), structure (hyperlinks) and usage (navigation, queries, etc.), which call for different techniques such as text, graph or log mining. Each flavor reflects the wisdom of some group of people, which can be used to make the Web better; user-generated tags on Web 2.0 sites are one example. The tutorial covers (a) the main concepts behind Web mining, the different kinds of data found on the Web and typical applications; (b) the mining process: data collection, data cleaning, data warehousing and data analysis, including crawling in the case of content mining and privacy issues in the case of usage mining; (c) the main techniques used for the different data types; and (d) use cases of the three types (content, structure and usage mining), ranging from Web site design to search engines.
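
As a rough illustration of the three data flavors named above, the following Python sketch computes one very simple statistic for each: term counts for content, in-link counts for structure, and query frequencies for usage. It is not taken from the tutorial. The page texts, link pairs, query log and function names are all hypothetical, and real Web mining would rely on crawled data and far richer techniques.

```python
from collections import Counter
import re

# Hypothetical toy data, invented purely for illustration.
pages = {
    "a.html": "web mining extracts knowledge from web data",
    "b.html": "search engines rank web pages",
}
links = [("a.html", "b.html"), ("b.html", "a.html"), ("c.html", "a.html")]
query_log = ["web mining", "search engines", "Web Mining "]


def term_counts(pages):
    """Content mining (text): count how often each word occurs across pages."""
    counts = Counter()
    for text in pages.values():
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts


def in_degree(links):
    """Structure mining (graph): count incoming hyperlinks per target page."""
    return Counter(target for _source, target in links)


def query_frequency(log):
    """Usage mining (logs): count how often each normalized query was issued."""
    return Counter(query.strip().lower() for query in log)


if __name__ == "__main__":
    print(term_counts(pages).most_common(3))
    print(in_degree(links).most_common())
    print(query_frequency(query_log).most_common())
```

Each function stands in for a whole family of techniques: word counts for text mining, in-link counts for graph mining, and query counts for log mining.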
