Bias in the Web

Presented by

Ricardo Baeza-Yates, CTO, NTENT; ACM Fellow; IEEE Fellow

About this talk

The Web is the most powerful communication medium and the largest public data repository that humankind has created. Its content ranges from great reference sources such as Wikipedia to ugly fake news. Indeed, social (digital) media is just an amplifying mirror of ourselves. Hence, the main challenge for search engines and other websites that rely on web data is to assess the quality of such data. However, because all people have their own biases, web content and our web interactions are tainted with many biases. Data bias includes redundancy and spam, while interaction bias includes activity and presentation bias. In addition, algorithms sometimes add bias, particularly in the context of search and recommendation systems. As bias generates bias, we stress the importance of de-biasing data as well as using context and other techniques, such as explore & exploit, to break the filter bubble. The main goal of this talk is to make people aware of the different biases that affect all of us on the Web. Awareness is the first step toward fighting and reducing the vicious cycle of bias.

Ricardo Baeza-Yates's areas of expertise are web search and data mining, information retrieval, data science, and algorithms. Since 2016 he has been CTO of NTENT, a semantic search technology company based in California, USA. Before that, he was VP of Research at Yahoo Labs, based first in Barcelona, Spain, and later in Sunnyvale, California, from January 2006 to February 2016. He is also a part-time Professor at DTIC of the Universitat Pompeu Fabra in Barcelona, Spain, as well as at DCC of the Universidad de Chile in Santiago.
