Drive Cloud Performance on Amazon EMR with Autoscaling

Presented by

Alex Pierce

About this talk

Autoscaling automatically increases or decreases the computational resources delivered to a cloud workload based on need. This typically means adding or removing active servers (instances) that serve your workload within an infrastructure. The promise of autoscaling is that workloads receive exactly the cloud computational resources they require at any given time, and you pay only for the server resources you need, when you need them.

Autoscaling helps applications perform their best when demand changes, but performance varies depending on the application. While some applications are constant and predictable, others are bound by CPU or memory, or are “spiky” in nature. Autoscaling automatically addresses these variables to ensure optimal application performance.

Amazon EMR, Azure HDInsight, and Google Cloud Dataproc all provide autoscaling for big data and Hadoop, each with a different approach. While autoscaling provides the elasticity that customers require for their big data workloads, it can also lead to runaway waste, excessive cost, and management complexity. Estimating the right number of cluster nodes for a workload is difficult; user-initiated cluster scaling requires manual intervention, and mistakes are often costly and disruptive.

Join Pepperdata Field Engineer Alex Pierce for this discussion about the operational challenges associated with maintaining optimal big data performance in the cloud, what milestones to set, and recommendations on how to create a successful cloud migration framework.

Learn the following:
– Autoscaling types
– Autoscaling strengths and weaknesses
– When to use autoscaling and what autoscaling does well
– Is traditional autoscaling limiting your success?
– What is optimized cloud autoscaling?
– What does cloud autoscaling success look like?

Pepperdata is the Big Data performance company. Fortune 1000 enterprises depend on Pepperdata to manage and optimize the performance of Hadoop and Spark applications and infrastructure. Developers and IT Operations use Pepperdata solutions to diagnose and solve performance problems in production, increase infrastructure efficiencies, and maintain critical SLAs. Pepperdata automatically correlates performance issues between applications and operations, accelerates time to production, and increases infrastructure ROI. Pepperdata works with customer Big Data systems on-premises and in the cloud.