Five Mistakes to Avoid When Using Spark

Presented by

Alex Pierce, Pepperdata Field Engineer

About this talk

Learn how to avoid common mistakes when managing Spark in a cluster environment and improve its usability.

Apache Spark plays a critical role in the adoption and evolution of Big Data technologies because it gives enterprises more sophisticated ways to leverage Big Data than Hadoop does. The amount of data analyzed and processed through the framework is massive and continues to push the boundaries of the engine. Drawing on experience across dozens of production deployments, Pepperdata Field Engineer Alexander Pierce explores issues observed in cluster environments running Apache Spark and offers guidelines on how to avoid common mistakes. Attendees can use these observations to improve the usability and supportability of Spark and avoid such issues in their own projects. Topics include:
– Serialization
– Partition sizes
– Executor resource sizing
– DAG management
– Shading
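As a rough sketch of how three of these topics surface in practice, a `spark-submit` invocation can set serialization, executor sizing, and shuffle partition counts explicitly rather than relying on defaults. The class name, jar, and resource values below are illustrative assumptions, not recommendations from the talk:

```shell
# Illustrative spark-submit flags touching three of the talk's topics:
# executor resource sizing, serialization, and partition counts.
# All names and values here are example assumptions; tune them for
# your own cluster and workload.
spark-submit \
  --class com.example.MyJob \
  --executor-memory 4g \
  --executor-cores 4 \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.sql.shuffle.partitions=200 \
  my-job.jar
```

Switching `spark.serializer` to Kryo typically reduces serialization overhead versus the default Java serialization, and `spark.sql.shuffle.partitions` (default 200) is a common starting point for sizing shuffle partitions to the data volume.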

Pepperdata is the Big Data performance company. Fortune 1000 enterprises depend on Pepperdata to manage and optimize the performance of Hadoop and Spark applications and infrastructure. Developers and IT Operations use Pepperdata solutions to diagnose and solve performance problems in production, increase infrastructure efficiency, and maintain critical SLAs. Pepperdata automatically correlates performance issues between applications and operations, accelerates time to production, and increases infrastructure ROI. Pepperdata works with customer Big Data systems on-premises and in the cloud.