Learning to Link with Wikipedia

author: David Milne, Computer Science Department, University of Waikato
published: Nov. 19, 2008, recorded: October 2008, views: 5358


Slides: cikm08_milne_ltlww_01.ppt (5.1 MB)

Reviews and comments:

Comment1 Sayali, May 1, 2009 at 8:15 a.m.:

I am unable to hear the sound with "Standalone WM Player".

Comment2 Amir Hossein Jadidinejad, June 21, 2009 at 12:06 a.m.:

Great lecture!
Thanks, David.

Comment3 Tony Souter, December 28, 2009 at 4:36 a.m.:

Thanks; nicely set up here, with the ability to click through your vid.

WP overall is massively (internally) overlinked. Most wikilink decisions require human judgment as to reader utility and relevance. I do hope this algorithm won't lead to a worsening of overlinking.


Comment4 David Milne, December 29, 2009 at 4:27 a.m.:


I tend to agree with you; Wikipedia has a lot of pointless links. I definitely think that any linking algorithm that is run on Wikipedia should be closely supervised. I would only suggest using this in an interactive setting, where the algorithm suggests links and a person must explicitly verify or correct them (i.e. they don't get included until someone says they should).

That said, this kind of stuff could actually help to fix the overlinking problem you have identified. It could easily look at existing links and indicate those that don't fit the model it has learned. That model can be built from any set of Wikipedia articles, e.g. a carefully crafted set of articles that have been linked very sparingly.

Also, the broader applications are for other documents (e.g. news stories, blogs, etc), not Wikipedia itself.

Comment5 Ramona, February 25, 2016 at 1:32 p.m.:

This is very useful. I was looking for something like this, and now I found it by mistake.
