The web of data: how are we doing so far?
published: May 3, 2021, recorded: April 2021, views: 13
Throughout its history, the web has shaped our understanding of, and interactions with, data. In the age of AI, this is mostly the data that it helps create, find, organise, and utilise through its myriad interconnected applications and user communities. This data takes many forms: digital traces we leave behind while online, user-generated content, datasets published by scientists and governments, or labels produced on crowdsourcing platforms to train machine learning algorithms. The web of data was supposed to bring it all together through links, metadata, shared vocabularies, and standardised technologies. We’ve come a long way since the first linked open data cloud was published in 2007 with 12 datasets. But we’ve also encountered roadblocks that we have yet to overcome. Huge investments have been made in opening different streams of data to developers, yet publishers struggle to show evidence of the impact of these investments and to become sustainable. Finding and making sense of data online is as critical as it has ever been, especially as more and more jobs come to rely on it. Lots of data turns out to be flawed, eroding our social bonds and trust in institutions. Despite a rise in knowledge graphs, data silos are more common than ever, and governments are building virtual borders around data flows in the name of digital sovereignty. In this talk I will present recent research that provides insights into the state of the web of data today. Just as the web stands for a melange of services and platforms, from search to shopping to social networks, the web of data is a concept with multiple facets; we need to unpack these facets to understand how much progress we’ve made and where the challenges really lie.
Data on the web is not just about the standards and protocols promoted by the linked open data cloud; in a wider interpretation, it also covers the (sparsely linked) graph of web tables embedded in documents, millions of online datasets in various formats, and charts that present data in accessible ways. The web of data is a mechanism to publish and reuse data, a social network, a marketplace, and a platform to help train the AIs of this world, all affected to a greater or lesser degree by technopolitics. I will discuss these different interpretations, supported by studies of open data portals, data communities, and crowdsourced datasets, and take a deep dive into technical, user experience, innovation, and policy questions and their impact on present and future developments in this space.