Best of 2019 - Leveraging Streaming and Batch Data Sets for ML Applications

Presented by

Jorge Villamariona and Ojas Mulay from Qubole

About this talk

Data Engineering is fast emerging as the most critical function in Analytics and Machine Learning programs. The ability to build and manage data pipelines for streaming and batch data sets is critical for the downstream success of your ML applications. In this webinar, you will learn how to use Qubole’s cloud-native platform to acquire and transform data sets for data science and analytics, make data sets available to different users, and fully leverage your data lake throughout your organization. Our experts will also walk through a real-world example of how to use Apache Spark and Airflow, as well as Notebooks, to build an end-to-end solution.

Attendees will learn how to:
+ Ingest data to/from a cloud storage data lake
+ Perform interactive data analysis and build AI/ML models
+ Transform data sets with Spark and build interactive dashboards
+ Seamlessly interact with other data sources
+ Deploy an end-to-end data pipeline using Apache Airflow

More from this channel

Tune in to hear open data lake platform leaders and engineers discuss continuous data engineering on data lakes for machine learning, streaming analytics, ad-hoc analytics, and data exploration in the cloud. The interactive talks are designed for data engineers, data analysts, and data scientists who want to learn about the challenges and solutions for use cases seen in data-driven organizations. Learn more about Qubole: