Autoscaling Big Data Operations in the Cloud

Presented by

Kirk Lewis

About this talk

The ability to scale the number of nodes in your cluster up and down on the fly is among the major features that make cloud deployments attractive. Estimating the right number of cluster nodes for a workload is difficult; user-initiated cluster scaling requires manual intervention, and mistakes are often costly and disruptive.

Autoscaling enables applications to perform their best when demand changes. But the definition of performance varies depending on the app: some are CPU-bound, others memory-bound; some are "spiky" in nature, while others are constant and predictable. Autoscaling automatically accounts for these variables to keep application performance optimal. Amazon EMR, Azure HDInsight, and Google Cloud Dataproc all provide autoscaling for big data and Hadoop, but each takes a different approach.

Pepperdata field engineer Kirk Lewis will discuss the operational challenges associated with maintaining optimal big data performance, which milestones to set, and recommendations for creating a successful cloud migration framework. Topics include:

– Types of scaling
– What does autoscaling do well? When should you use it?
– Does traditional autoscaling limit your success?
– What is optimized cloud autoscaling?
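To make the idea concrete, here is a minimal sketch of the threshold-based autoscaling that cloud services like the ones above commonly implement. Everything here is hypothetical — the function name, thresholds, and sizing rules are illustrative assumptions, not any vendor's actual API or policy:

```python
def desired_nodes(current_nodes, cpu_util, mem_util,
                  scale_out_threshold=0.75, scale_in_threshold=0.30,
                  min_nodes=2, max_nodes=20):
    """Hypothetical policy: pick a target node count from utilization.

    The cluster is sized by its busiest resource, so a CPU-bound and a
    memory-bound workload are handled by the same rule.
    """
    pressure = max(cpu_util, mem_util)  # busiest resource drives scaling
    if pressure > scale_out_threshold:
        # Scale out aggressively (~25%) so spiky workloads catch up fast.
        target = current_nodes + max(1, current_nodes // 4)
    elif pressure < scale_in_threshold:
        # Scale in one node at a time to avoid thrashing.
        target = current_nodes - 1
    else:
        target = current_nodes
    # Clamp to the configured cluster size limits.
    return min(max_nodes, max(min_nodes, target))

print(desired_nodes(8, cpu_util=0.90, mem_util=0.40))  # scales out to 10
print(desired_nodes(8, cpu_util=0.10, mem_util=0.20))  # scales in to 7
```

Real services layer more on top of this — cooldown periods, graceful decommissioning of nodes holding data, and separate rules per instance group — which is part of why, as the talk argues, each provider's approach differs.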

Pepperdata is the Big Data performance company. Fortune 1000 enterprises depend on Pepperdata to manage and optimize the performance of Hadoop and Spark applications and infrastructure. Developers and IT operations teams use Pepperdata solutions to diagnose and solve performance problems in production, increase infrastructure efficiency, and maintain critical SLAs. Pepperdata automatically correlates performance issues between applications and operations, accelerates time to production, and increases infrastructure ROI. Pepperdata works with customer Big Data systems on-premises and in the cloud.