Use Machine Learning to Get the Most out of Your Big Data Clusters

Logo
Presented by

Dr. Eric Chu, VP Data Analytics, Unravel Data

About this talk

Enterprises across all sectors have invested heavily in big data infrastructure (Hadoop, Impala, Spark, Kafka, etc.) to turn data into insights into business value. Clusters are getting bigger, more complex and employing more and more data scientists and engineers. As a result, it is increasingly challenging for Data Ops teams to operate and maintain these clusters to meet business requirements and performance SLAs. For instance, a single SQL query may fail or take a long time to complete for various reasons, such as SQL-level inefficiencies, data skew, missing and stale statistics, pool-level resource configurations, such that a resource-hogging query could impact the entire application stack on that cluster. A critical capability to scale application performance is to do cluster-wide tuning. Examples include: tune the default application configurations so that all applications would benefit from that change, tune the pool-level resource allocations, identify wide-impact issues like slow nodes and too many small files, and many others. Cluster-level tuning requires considering more factors, and has a risk of significantly worsening cluster performance; however, it is often done via trial and error with educated guesswork, if attempted at all. We employ machine learning and AI techniques to make cluster-level tuning easier, more data-driven, and more accurate. Recorded at Spark AI Summit in San Francisco, this talk will describe our methodology to learn from various sources of data such as the workload, the cluster and pool resources, metastore, etc., and provide recommendations for cluster defaults for application and pool resource configurations. We will also present a case study where a customer applied unravel tuning recommendations and achieved 114% increase in the number of applications running per day while using 47% fewer vCore-Hours and 15% fewer containers.

Related topics:

More from this channel

Upcoming talks (0)
On-demand talks (85)
Subscribers (5785)
At Unravel, we see an urgent need to help every business understand and optimize the performance of their applications, while managing data operations with greater insight, intelligence, and automation. For these businesses, Unravel is the AI-powered data operations company. We offer novel solutions that leverage AI, machine learning, and advanced analytics to help you fully operationalize the way you drive predictable performance in your modern data applications and pipelines.