Testing and Monitoring Machine Learning Pipelines at Slack

Presented by

Josh Wills, Director of Data Engineering, Slack

About this talk

At Slack, we build a variety of offline models for search ranking, anomaly detection, channel recommendations, and conversion prediction. As we have encountered different failure modes in the model development process, we have implemented tests and alerts to detect the ones we see most often. In this talk, I will walk through our framework for testing and monitoring model-building pipelines and highlight some of the more unusual kinds of errors that we handle via our automated detection and remediation system.
