Keeping the Trains Running: Effective Troubleshooting for Hadoop

When something goes wrong on your Hadoop cluster – a missed job, a sudden performance slowdown, or a massive spike in IO – are you able to pinpoint the exact cause of the issue? Often, diagnosis takes hours or days, and sometimes the cause is never discovered. Join us for this webinar to see how Pepperdata reduces troubleshooting times by 90% and can prevent most performance problems from ever happening in the first place.
Recorded May 4 2016 67 mins
Presented by
Sean Suchter, CEO of Pepperdata and Dez Blanchfield, Data Scientist at the Bloor Group

  • How Pepperdata Helps Developers Fix Code and Apps Oct 11 2017 6:00 pm UTC 60 mins
    Vinod Nair, Director of Product Management
    In this webinar, intended for software engineers, developers, architects, and technical leads who develop Spark applications, Vinod Nair gives an introduction to how Pepperdata helps developers in Big Data environments.
  • Top Considerations When Choosing a Big Data Management and Performance Solution Recorded: Sep 20 2017 54 mins
    Kirk Lewis
    The growing adoption of Hadoop and Spark has increased the demand for Big Data management solutions that operate at scale and meet business requirements. However, Big Data organizations realize quickly that scaling from small pilot projects to large-scale production clusters involves a steep learning curve. Despite tremendous progress, there remain critically important areas, including multi-tenancy, performance optimization, and workflow monitoring, where the DevOps team still needs management help. In this webinar, field engineer Kirk Lewis discusses the top considerations when choosing a big data management and performance solution.
  • HDFS on Kubernetes: Lessons Learned Recorded: Sep 19 2017 46 mins
    Kimoon Kim, Pepperdata Software Engineer
    HDFS on Kubernetes: Lessons Learned is a webinar presentation intended for software engineers, developers, and technical leads who develop Spark applications and are interested in running Spark on Kubernetes. Pepperdata has been exploring Kubernetes as a potential Big Data platform with several other companies as part of a joint open source project.

    In this webinar, Kimoon Kim will show you how to: 

    –Run Spark applications natively on Kubernetes
    –Enable Spark on Kubernetes to read and write data securely on HDFS protected by Kerberos
  • Making OpenTSDB Perform at Massive Scale - Pepperdata Meetup Replay Recorded: Aug 29 2017 33 mins
    Simon King
    OpenTSDB is an open-source time series database built on top of HBase. Thanks to HBase, OpenTSDB scales very nicely to accommodate large amounts of data in terms of bytes or data points -- at Pepperdata we ingest hundreds of billions of data points per day. Where OpenTSDB struggles to scale is in the number of distinct time series. Pepperdata stores time series data on all the hardware and processes across many Hadoop clusters: billions of discrete series per day. Speaker Simon King will discuss some of OpenTSDB's strengths and weaknesses, and some of the techniques Pepperdata uses to work around its limitations. Originally presented at Galvanize in San Francisco on 8/21/17.
  • Overcoming Performance Challenges of Building Spark Applications for AWS Recorded: Aug 16 2017 39 mins
    Vinod Nair, Director of Product Management at Pepperdata
    Overcoming Performance Challenges of Building Spark Applications for AWS is a webinar presentation intended for software engineers, developers, and technical leads who develop Spark applications for EMR or EC2 clusters.

    In this webinar, Vinod Nair will show you how to:

    –Identify which portion of your application consumes the most resources
    –Identify the bottlenecks slowing down your applications
    –Test your applications against development or production workloads
    –Significantly reduce troubleshooting issues due to ambient cluster conditions

    This webinar is followed by a live Q & A. A replay of this webinar will be available within 24 hours at https://www.pepperdata.com/resources/webinars/.
  • Kerberized HDFS – and How Spark on Yarn Accesses It Recorded: Aug 9 2017 50 mins
    Kimoon Kim
    Following up on his recent presentation, HDFS on Kubernetes: Lessons Learned, Senior Software Engineer Kimoon Kim presents on Kerberized HDFS and how Spark on YARN accesses it.
  • Creatively Visualizing Spark Data Recorded: Jul 18 2017 26 mins
    Christina Holland
    Pepperdata tech talk by Pepperdata software engineer Christina Holland on creatively visualizing Spark data and designing new ways to see new pipelines.
  • Production Spark Series Part 4: Spark Streaming Delivers Critical Patient Care Recorded: Jun 22 2017 58 mins
    Charles Boicey, Chief Innovation Officer, Clearsense
    Clearsense is a pioneer in healthcare data science solutions, using Spark Streaming to provide real-time updates to health care providers for critical health care needs. Clinicians can make timely decisions by assessing a patient's risk for Code Blue, sepsis, and other conditions, based on analysis of streaming physiological monitoring data, streaming diagnostic data, and the patient's historical record. Additionally, this technology is used to monitor operational and financial processes for efficiency and cost savings. This talk discusses the architecture needed and the challenges associated with providing real-time SLAs along with 100% uptime expectations in a multi-tenant Hadoop cluster.
  • Spark Summit 2017 - Connect Code to Resource Consumption to Scale Production Recorded: Jun 6 2017 26 mins
    Vinod Nair, Director of Product Management
    Apache Spark is a dynamic execution engine that can take relatively simple Scala code and create complex and optimized execution plans. In this talk, we will describe how user code translates into Spark drivers, executors, stages, tasks, transformations, and shuffles. We will also discuss various sources of information on how Spark applications use hardware resources, and show how application developers can use this information to write more efficient code. We will show how Pepperdata’s products can clearly identify such usages and tie them to specific lines of code. We will show how Spark application owners can quickly identify the root causes of such common problems as job slowdowns, inadequate memory configuration, and Java garbage collection issues.
  • Spark Summit 2017 – Spark Summit Bay Area Apache Spark Meetup Recorded: Jun 5 2017 98 mins
    Sean Suchter, Pepperdata Founder and CTO
    Bay Area Apache Spark Meetup at the 10th Spark Summit featuring tech-talks about using Apache Spark at scale from Pepperdata’s CTO Sean Suchter, RISELab’s Dan Crankshaw, and Databricks’ Spark committers and contributors.
  • HDFS on Kubernetes: Lessons Learned Recorded: Jun 2 2017 36 mins
    Kimoon Kim, Engineer, Pepperdata
    There is growing interest in running Spark natively on Kubernetes (see https://github.com/apache-spark-on-k8s/spark). Spark applications often access data in HDFS, and Spark supports HDFS locality by scheduling tasks on nodes that have the task input data on their local disks. When running Spark on Kubernetes, if the HDFS daemons run outside Kubernetes, applications will slow down while accessing the data remotely.

    In this webinar, we will demonstrate how to run HDFS inside Kubernetes to speed up Spark. In particular, we will show:

    - The Spark scheduler can still provide HDFS data locality on Kubernetes by discovering the mapping from Kubernetes containers to physical nodes to HDFS datanode daemons.
  • Production Spark Series Part 3: Tuning Apache Spark Jobs Recorded: May 30 2017 40 mins
    Simon King, Engineer, Pepperdata
    A Spark application that worked well in a development environment or with sample data may not behave as expected when run against a much larger dataset in a production environment. Pepperdata Application Profiler, based on the open source Dr. Elephant, can help you tune your Spark application based on current dataset characteristics and the cluster execution environment. Application Profiler uses a set of heuristics to provide actionable recommendations to help you quickly tune your applications.

    Occasionally an application will fail (or execute too slowly) due to circumstances outside your control: a busy cluster, another misbehaving YARN application, bad luck, or bad "cluster weather". We'll discuss ways to use Pepperdata's Cluster Analyzer to quickly determine when an application failure may not be your fault and how to diagnose and fix symptoms that you can affect.
  • Production Spark Series Part 2: Connecting Your Code to Spark Internals Recorded: May 9 2017 39 mins
    Sean Suchter, CTO/Co-Founder, Pepperdata
    Spark is a dynamic execution engine that can take relatively simple Scala code and create complex and optimized execution plans. In this talk, we will describe how user code translates into Spark drivers, executors, stages, tasks, transformations, and shuffles. We will describe how this is critical to the design of Spark and how this tight interplay allows very efficient execution. Users and operators who are aware of the concepts will become more effective at their interactions with Spark.
  • Big Data for Big Data: Machine Learning Models of Hadoop Cluster Behavior Recorded: Apr 10 2017 37 mins
    Sean Suchter, CTO/Co-Founder, Pepperdata and Shekhar Gupta, Software Engineer, Pepperdata
    Learn how to use machine learning to improve cluster performance.

    This talk describes the use of very fine-grained performance data from many Hadoop clusters to build a model predicting excessive swapping events.

    The performance of batch processing systems such as YARN is generally determined by throughput, which measures the amount of workload (tasks) completed in a given time window. For a given cluster size, throughput can be increased by running as much workload as possible on each host, to utilize all the free resources available on that host. Because each node is running a complex combination of different tasks/containers, the performance characteristics of the cluster change dynamically. As a result, there is always a danger of overutilizing host memory, which can result in extreme swapping or thrashing. The impact of thrashing can be severe; it can actually reduce throughput instead of increasing it.

    By using very fine-grained (5 second) data from many production clusters running very different workloads, we have trained a generalized model that very rapidly detects the onset of thrashing, within seconds from the first symptom. This detection has proven fast enough to enable effective mitigation of the negative symptom of thrashing, allowing the hosts to continuously provide high throughput.

    To build this system we used hand-labeling of bad events combined with large scale data processing using Hadoop, HBase, Spark, and iPython for experimentation. We will discuss the methods used as well as the novel findings about Big Data cluster performance.
  • Production Spark Webinar Series - Part 1: Best Practices for Spark in Production Recorded: Mar 7 2017 59 mins
    Chad Carson, Co-Founder and Ed Colonna, VP of Marketing
    Join us for Part 1 of our Production Spark Webinar Series. This first installment gathers Spark experts and practitioners from varying backgrounds to discuss the top trends, challenges, and use cases for production Spark applications. Our expert panel will discuss several key considerations when running Spark in production and take questions directly from the audience.

    Our distinguished panel of industry experts is as follows:

    Dr. Babak Behzad, Senior Software Engineer, SAP/Altiscale
    Charles Boicey, Chief Innovation Officer, Clearsense
    Richard Williamson, Principal Engineer, Silicon Valley Data Science
    Andrew Ray, Principal Data Engineer, Silicon Valley Data Science
    Sean Suchter, CTO and Co-Founder, Pepperdata
  • Philips Wellcentive Cuts Hadoop Troubleshooting from Months to Hours Recorded: Dec 6 2016 48 mins
    Geovanie Marquez, Hadoop Architect at Philips Wellcentive
    Philips Wellcentive, a SaaS health management and data analytics company, relies on a nightly MapReduce job to process and analyze data for its entire patient population, from birth to the current day. The job assesses a number of different characteristics and powers the analytics that physician organizations need to deliver better services. When this job began to fail repeatedly, the Hadoop team spent months trying to identify the root cause using existing monitoring tools, but was unable to come up with an explanation for the job failures and slowdowns.

    Join our webinar to hear more about why existing Hadoop monitoring tools were insufficient to diagnose the root cause of Philips Wellcentive’s problems and how Pepperdata helped them to significantly improve their Big Data operations. The webinar will cover the different approaches that Philips Wellcentive took to rectify their missed SLAs, and how Pepperdata ultimately helped them quickly troubleshoot their performance problems and ensure their jobs complete on time.
  • Effectively Manage Multi-tenant Hadoop for the Enterprise Recorded: Nov 14 2016 39 mins
    Sean Suchter, CTO of Pepperdata
    As the Hadoop market matures and new applications and use cases for Big Data emerge, organizations are dealing with more complex environments than ever before. In days past, deployments often focused on single, batch-oriented workloads, and if you wanted to run multiple workloads at the same time, you needed to split your clusters. With Hadoop 2 and YARN, organizations are able to run multiple workloads on the same cluster. But in multi-tenant environments, resource contention can become a daily problem, and low-priority, ad-hoc jobs can sometimes monopolize hardware resources that are needed for high-priority workloads.

    Pepperdata is the first and only software that guarantees service levels in multi-tenant Hadoop environments. We have helped dozens of companies of all industries and all sizes to effectively manage and scale their multi-tenant environments, guaranteeing service levels and improving overall cluster performance.

    Join us for this webcast to hear best practices for running multi-tenant environments and how you can improve visibility, performance, and overall management of your big data environment.
  • Improve Amazon EMR Performance up to 4X Recorded: Oct 13 2016 36 mins
    Vinod Nair, Product Manager at Pepperdata
    Are you currently running Amazon EMR but lacking the visibility and measurement of how your cluster is performing? Pepperdata for Amazon EMR enables users of Amazon Elastic MapReduce to run jobs up to four times faster and simultaneously reduce costs. Users can see over 300 metrics even after the cluster has been terminated, so users have a historical view of performance.

    Register for our webinar to learn how Amazon EMR can help streamline your big data projects, and how Pepperdata can help you get the most value from your investment.
  • Ensure Quality of Service in Multi-tenant Hadoop Environments Recorded: Aug 3 2016 44 mins
    Sean Suchter of Pepperdata and Andy Oram of O'Reilly Media
    This webcast will show you how Pepperdata can help your organization guarantee quality of service in multi-tenant Hadoop environments by eliminating resource contention and guaranteeing service levels for high-priority jobs. Run HBase, MapReduce, Spark, Hive, and more all on a single cluster without worrying about jobs stomping on each other. We'll show you how Pepperdata automates cluster optimization to reduce time and cost, and keep your Hadoop humming happily.
DevOps for Big Data
Pepperdata is the DevOps for Big Data company. Leading enterprise companies depend on Pepperdata to manage and improve the performance of Hadoop and Spark. Developers and operators use Pepperdata products and services to diagnose and solve performance problems in production and increase cluster utilization. The Pepperdata product suite improves communication of performance issues between Dev and Ops, shortens time to production, and increases cluster ROI. Pepperdata products and services work with customer Big Data systems both on-premises and in the cloud.

  • Title: Keeping the Trains Running: Effective Troubleshooting for Hadoop
  • Live at: May 4 2016 5:15 pm
  • Presented by: Sean Suchter, CEO of Pepperdata and Dez Blanchfield, Data Scientist at the Bloor Group