The Blogosphere at a glance: Content-based structures made simple
published: Aug. 4, 2011, recorded: July 2011, views: 2694
Report a problem or upload filesIf you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.
A network representation based on a basic wordoverlap similarity measure between blogs is introduced. The simplicity of the representation renders it computationally tractable, transparent and insensitive to representation-dependent artifacts. Using Swedish blog data, we demonstrate that the representation, in spite of its simplicity, manages to capture important structural properties of the content in the blogosphere. First, blogs that treat similar subjects are organized in distinct network clusters. Second, the network is hierarchically organized as clusters in turn form higher-order clusters: a compound structure reminiscent of a blog taxonomy.
Link this pageWould you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !