Autoscaling Big Data Operations in the Cloud

Presented by

Kirk Lewis

About this talk

The ability to scale the number of nodes in your cluster up and down on the fly is among the major features that make cloud deployments attractive. Estimating the right number of cluster nodes for a workload is difficult; user-initiated cluster scaling requires manual intervention, and mistakes are often costly and disruptive.

Autoscaling enables applications to perform their best when demand changes. But the definition of performance varies depending on the app: some are CPU-bound, others memory-bound; some are "spiky" in nature, while others are constant and predictable. Autoscaling automatically accounts for these variables to keep application performance optimal. Amazon EMR, Azure HDInsight, and Google Cloud Dataproc all provide autoscaling for big data and Hadoop, but each takes a different approach.

Pepperdata field engineer Kirk Lewis will discuss the operational challenges associated with maintaining optimal big data performance, which milestones to set, and recommendations for creating a successful cloud migration framework. Topics include:

– Types of scaling
– What does autoscaling do well? When should you use it?
– Does traditional autoscaling limit your success?
– What is optimized cloud autoscaling?
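To make the idea concrete, here is a minimal sketch of the threshold-based autoscaling that cloud services like the ones above commonly implement. Everything here is hypothetical — the function name, thresholds, and sizing rules are illustrative assumptions, not any vendor's actual API or policy:

```python
def desired_nodes(current_nodes, cpu_util, mem_util,
                  scale_out_threshold=0.75, scale_in_threshold=0.30,
                  min_nodes=2, max_nodes=20):
    """Hypothetical policy: pick a target node count from utilization.

    The cluster is sized by its busiest resource, so a CPU-bound and a
    memory-bound workload are handled by the same rule.
    """
    pressure = max(cpu_util, mem_util)  # busiest resource drives scaling
    if pressure > scale_out_threshold:
        # Scale out aggressively (~25%) so spiky workloads catch up fast.
        target = current_nodes + max(1, current_nodes // 4)
    elif pressure < scale_in_threshold:
        # Scale in one node at a time to avoid thrashing.
        target = current_nodes - 1
    else:
        target = current_nodes
    # Clamp to the configured cluster size limits.
    return min(max_nodes, max(min_nodes, target))

print(desired_nodes(8, cpu_util=0.90, mem_util=0.40))  # scales out to 10
print(desired_nodes(8, cpu_util=0.10, mem_util=0.20))  # scales in to 7
```

Real services layer more on top of this — cooldown periods, graceful decommissioning of nodes holding data, and separate rules per instance group — which is part of why, as the talk argues, each provider's approach differs.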

Pepperdata is the Big Data performance company. Fortune 1000 enterprises depend on Pepperdata to manage and optimize the performance of Hadoop and Spark applications and infrastructure. Developers and IT operations teams use Pepperdata solutions to diagnose and solve performance problems in production, increase infrastructure efficiency, and maintain critical SLAs. Pepperdata automatically correlates performance issues between applications and operations, accelerates time to production, and increases infrastructure ROI. Pepperdata works with customer Big Data systems on-premises and in the cloud.