Optimizing Performance of ML Models Through a Bayesian Lens with Tripadvisor

Presented by

Narendra Mukherjee (Machine Learning Scientist @ Tripadvisor)

About this talk

Talk Abstract: Do you encounter missing values in your model features, but don’t give them much thought? I have two goals in this talk: 1) use my work with sort algorithms at Tripadvisor to show how ad-hoc imputation of missing values severely hurts the performance of real-world ML models, and 2) cast the missing value problem as a probabilistic model that one can solve through Bayesian inference. I will end by showing that the most widely used missing value imputation technique in the statistics community (Multiple Imputation by Chained Equations, or MICE, which scikit-learn implements in its IterativeImputer) can be better understood as approximate Bayesian inference in a simple probabilistic model.

This talk should appeal to data and ML researchers of all skill levels. For beginning data practitioners, part 1 of my talk will demonstrate why it is important to think carefully about missing values during feature engineering and how to examine their role in a model’s predictive performance. For more experienced attendees, part 2 of my talk will try to build a bridge between the statistical literature on missing value imputation and the world of the machine learning practitioner through a Bayesian lens.

Speaker Bio: Narendra is a longtime Bayesian interested in the connections between statistics, causal inference and machine learning. Currently, he is a Machine Learning Scientist at Tripadvisor, based at their global headquarters in Needham, MA. His work at Tripadvisor spans the entire range of customer-centric ML problems, from recommendation engines to building probabilistic models of user-generated content creation. To learn more about Narendra, see his webpage at: https://narendramukherjee.github.io

Disclaimer: All views, thoughts, and opinions expressed in the webinar belong solely to the panelists, and not to the panelists’ employer, organization, committee, other group or individual.
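The contrast the abstract draws between ad-hoc imputation and scikit-learn's IterativeImputer can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the talk: the synthetic two-feature dataset, the 30% missingness rate, the -1 sentinel used as the "ad-hoc" fill value, and the choice of RMSE on the masked entries as the comparison metric.

```python
import numpy as np
# IterativeImputer is still experimental in scikit-learn and must be enabled explicitly.
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer, SimpleImputer

# Hypothetical synthetic data: x2 is strongly correlated with x1.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = 2.0 * x1 + 0.1 * rng.normal(size=n)
X_true = np.column_stack([x1, x2])

# Remove roughly 30% of x2 completely at random (an assumption of this sketch).
mask = rng.random(n) < 0.3
X_missing = X_true.copy()
X_missing[mask, 1] = np.nan


def rmse_on_missing(X_imputed):
    """RMSE between imputed and true values, on the masked entries only."""
    diff = X_imputed[mask, 1] - X_true[mask, 1]
    return float(np.sqrt(np.mean(diff ** 2)))


# Ad-hoc approach: replace every missing value with a fixed sentinel.
adhoc = SimpleImputer(strategy="constant", fill_value=-1.0)
rmse_adhoc = rmse_on_missing(adhoc.fit_transform(X_missing))

# Chained-equations approach: model each feature from the others.
iterative = IterativeImputer(random_state=0)
rmse_iterative = rmse_on_missing(iterative.fit_transform(X_missing))

print(f"constant fill RMSE: {rmse_adhoc:.3f}")
print(f"iterative RMSE:     {rmse_iterative:.3f}")
```

With its default settings, IterativeImputer returns a single imputation from a BayesianRidge regressor. Setting `sample_posterior=True` and refitting with different `random_state` values yields multiple imputations, which is closer in spirit to MICE as used in the statistics literature.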
Dataiku

59,775 subscribers, 285 talks
Everyday AI, Extraordinary People
Dataiku is the Universal AI Platform, uniting the technology, teams, and operations needed for companies to build intelligence into their daily operations, from modern analytics to generative AI. Together, they design, develop and deploy new AI capabilities, at all scales and in all industries. Organizations that use Dataiku enable their people to be extraordinary, creating the AI that will power their company into the future. More than 700 companies worldwide use Dataiku, driving diverse use cases from predictive maintenance and supply chain optimization, to quality control in precision engineering, to marketing optimization, generative AI customer proof, and everything in between.