Proven Approaches to Hive Query Tuning

Presented by

Kirk Lewis, Pepperdata Field Engineer

About this talk

Apache Hive is a powerful tool frequently used to analyze data while handling ad-hoc queries and regular ETL workloads. Although Hive is one of the more mature solutions in the Hadoop ecosystem, developers, data scientists, and IT operators still struggle to avoid common inefficiencies when running it at scale. Inefficient queries can mean missed SLAs, a negative impact on other users, and slow database resources. Poorly tuned platforms or poorly sized queues can cause even efficient queries to suffer.

This webinar discusses proven approaches to Hive query tuning that improve query speed and reduce cost. Learn how to understand the detailed performance characteristics of query workloads and the infrastructure-wide issues that impact these workloads.

Pepperdata Field Engineer Kirk Lewis will discuss:
- Finding problem queries
- Pinpointing delayed queries, expensive queries, and queries that waste CPU and memory
- Improving query utilization and performance with database and infrastructure metrics
- Ensuring your infrastructure is not adversely impacting query performance
Pepperdata Capacity Optimizer delivers 30-47% greater cost savings for data-intensive workloads, eliminating the need for manual tuning by optimizing CPU and memory in real time with no application changes. Pepperdata pays for itself, immediately decreasing instance hours/waste, increasing utilization, and freeing developers from manual tuning to focus on innovation.