Right Tool for the Job: Running Apache Spark at Scale in the Cloud

Presented by

Ashwin Chandra Putta, Sr. Product Manager at Qubole

About this talk

Apache Spark is a powerful open source engine used for processing complex, memory-intensive workloads. However, running Apache Spark in the cloud can be complex and challenging. Qubole has re-engineered Apache Spark, optimising its performance and efficiency while reducing administrative overhead. Today, Qubole runs some of the world’s largest Apache Spark clusters in the cloud. In this webinar, we’ll take a deeper look at the use cases for Apache Spark, including ETL and machine learning, and compare Apache Spark on Qubole with open source Apache Spark.

We’ll cover:
- Why Apache Spark is essential for big data processing
- How to deploy Spark at scale in the cloud and enable all data users
- The enhancements made to Qubole Spark
- A live demo and real-world examples of Apache Spark on Qubole

More from this channel

Tune in to hear open data lake platform leaders and engineers discuss everything from continuous data engineering on data lakes to machine learning, streaming analytics, ad-hoc analytics and data exploration in the cloud. The interactive talks are designed for data engineers, data analysts and data scientists who want to learn about the challenges and solutions for use cases seen in data-driven organizations. Learn more about Qubole: http://bit.ly/AboutQubole