Classifying Political Orientation on Twitter: It's Not Easy!
published: April 3, 2014, recorded: July 2013, views: 1767
Numerous papers have reported great success at inferring the political orientation of Twitter users. This paper has some unfortunate news to deliver: while past work has been sound and often methodologically novel, we have discovered that reported accuracies have been systematically overoptimistic because of the way in which validation datasets have been collected, with reported accuracy levels nearly 30% higher than can be expected in populations of general Twitter users. Using careful and novel data collection and annotation techniques, we collected three different sets of Twitter users, each characterizing a different degree of political engagement on Twitter, from politicians (highly politically vocal) to “normal” users (those who rarely discuss politics). Applying standard techniques for inferring political orientation, we show that methods that previously reported greater than 90% inference accuracy actually achieve barely 65% accuracy on normal users. We also show that classifiers cannot be used to classify users outside the narrow range of political orientation on which they were trained. Although sobering, our results quantify and call attention to overlooked problems in the latent attribute inference literature that, no doubt, extend beyond political orientation inference: the way in which datasets are assembled and the transferability of classifiers.