Building data pipelines that drive highly predictive, resilient models

Presented by

Anindya Datta, Founder and CEO, Mobilewalla

About this talk

Predictive performance and resilience are equally important requirements for operationalizing models. Users would rather have resilient models with “good” performance than models that show high predictivity in training and testing but underperform when deployed and require frequent tuning. Of the many factors that affect performance and resilience, training data is well known to have an outsized impact on predictivity, and data drift is a major influence on resilience.

In this webinar we will discuss two strategies to improve the accuracy and resilience of models:

- Enriching first-party data to create effective training data sets with adequate breadth, depth, and scale
- Using the concept of data stability to reduce the impact of data drift and produce resilient models from scratch

In discussing these strategies, we will address the following questions:

- Why machine learning models often underperform in production
- How data enrichment can improve model performance
- The concept of data stability, and how it can be used proactively
- Why feature transformation and selection are critical to model resilience

More from this channel

Data Science Central is the industry's online resource for data practitioners. From Statistics to Analytics to Machine Learning to AI, Data Science Central provides a community experience that includes a rich editorial platform, social interaction, forum-based support, plus the latest information on technology, tools, trends, and careers.