How to Process Tons of Data for Cheap with Spark + Kubernetes

Presented by

Gus Cavanaugh, Solution Engineer @ Dataiku

About this talk

This webinar has something for everyone, whether technical or not. On the business side, you’ll get an overview of how to better manage your infrastructure spend while providing the compute your analysts and data scientists need, as well as a practical demonstration of how autoscaling your data processing infrastructure provides the horsepower you need without breaking the bank. On the technical side, if you like Spark for processing big data and Kubernetes for scaling and managing containers but you haven’t run Spark on Kubernetes yet, this is the webinar for you.

In this one-hour session, you’ll learn:

- Why Kubernetes is a great scheduler for Spark jobs
- How to quickly spin up a managed Kubernetes cluster on AWS and run your first Spark job from your environment
- How Dataiku lets data scientists spin up Kubernetes clusters and run Spark jobs with just a few mouse clicks

Dataiku is the world’s leading platform for Everyday AI, systemizing the use of data for exceptional business results. Organizations that use Dataiku elevate their people (whether technical and working in code or on the business side and low- or no-code) to extraordinary, arming them with the ability to make better day-to-day decisions with data. More than 450 companies worldwide use Dataiku to systemize their use of data and AI, driving diverse use cases from fraud detection to customer churn prevention, predictive maintenance to supply chain optimization, and everything in between.