Fix Spark Performance Issues Without Thinking Too Hard

Presented by

Heidi Carson and Alex Pierce

About this talk

This talk presents the results of analyzing thousands of Spark jobs running on many multi-tenant production clusters. We will cover the common issues we have seen, the symptoms of each, and how you can address them without thinking too hard. Pepperdata big data performance management gathers trillions of performance data points from hundreds of production clusters running Spark, across a variety of industries, applications, and workload types. Drawing on the behavior and performance of thousands of Spark applications, and on use case data from the Pepperdata Big Data Performance report, Heidi and Alex will discuss key performance insights. Topics include best and worst practices, gotchas, machine learning, and tuning recommendations.