Processing social media data: can we circumvent the Tower of Babel?

author: Nikola Ljubešić, Department of Knowledge Technologies, Jožef Stefan Institute
published: Aug. 1, 2017, recorded: July 2017, views: 942



Social media are a rich and diverse source of information for many areas of research. However, the linguistic and cultural diversity of their users poses a series of processing challenges. Standard language technologies produce a much higher error rate on social media texts than on standard texts. Furthermore, researchers often need additional information about users, such as their sociodemographic characteristics. In the first part of my talk I will present a series of technology adaptations for processing varying language production. In the second part I will give an overview of some experiments on language-independent user profiling, such as user type identification and gender prediction.

See Also:

Download slides: solomon_ljubesic_tower_of_babel_01.pdf (589.9 KB)
