Putting AI to Work on Apache Spark

Apache Spark simplifies AI, but why not use AI to simplify Spark performance and operations management? An AI-driven approach can drastically reduce the time Spark application developers and operations teams spend troubleshooting problems.

This talk will discuss algorithms that run real-time streaming pipelines as well as build ML models in batch to enable Spark users to automatically solve problems such as: (i) fixing a failed Spark application, (ii) auto-tuning SLA-bound Spark streaming pipelines, (iii) identifying the best broadcast joins and caching for SparkSQL queries and tables, and (iv) picking cost-effective machine types and container sizes to run Spark workloads on the AWS, Azure, and Google clouds.
Recorded Jun 13 2019 40 mins
Presented by
Shivnath Babu, CTO & Co-Founder at Unravel Data

  • APM & Operational Intelligence for Azure Databricks Recorded: Oct 3 2019 41 mins
    Abha Jain, Director of Products, Unravel Data; Ron Abellera, Global Blackbelt, Microsoft
    According to Ovum research, over half of big data workloads will be running in the cloud by the end of this year (2019). Microsoft Azure provides a number of options for powering your modern data estate with the flexibility and scalability of the cloud. AI-driven, intelligent DataOps is critical to gaining visibility into modern data operations. In this webinar, we will focus on:

    Advantages of running modern data platforms in the cloud
    The importance of visibility into your cloud data infrastructure
    Demonstration of Unravel for Azure Databricks to manage DataOps on Azure

    Try Unravel risk-free with a 60-day license and up to $15K in free Azure for starting a proof of concept. Contact: hello@unraveldata.com
  • Migrating Big Data Workloads to the Cloud with Unravel Recorded: Sep 25 2019 7 mins
    Abha Jain, Director of Products, Unravel Data
    Migrating Big Data Workloads to the Cloud with Unravel
  • Unravel for Azure Cloud Migration Recorded: Sep 10 2019 13 mins
    Abha Jain, Director of Products, Unravel Data
    As you’re migrating your Spark and Hadoop applications to Microsoft Azure, Unravel helps ensure you won’t be flying blind. With data-driven intelligence and recommendations for optimizing compute, memory, and storage resources, Unravel makes your transition a smooth one. Abha Jain, Director of Products at Unravel, demonstrates how.
  • Unravel for Azure Databricks overview demo Recorded: Sep 10 2019 6 mins
    Abha Jain, Director of Products, Unravel Data
    Director of Products Abha Jain provides a demo of Unravel's support for Azure Databricks.
  • Unravel for Cloud Migration Recorded: Sep 5 2019 18 mins
    Abha Jain, Director of Products, Unravel Data
    As you’re migrating your Spark and Hadoop applications to the cloud, Unravel helps ensure you won’t be flying blind. With data-driven intelligence and recommendations for optimizing compute, memory, and storage resources, Unravel makes your transition a smooth one. Abha Jain, Director of Products at Unravel, demonstrates how.
  • How to Optimize Spark Data Pipelines on Azure Databricks Recorded: Aug 14 2019 33 mins
    Aengus Rooney, Head of Solution Engineering - International, Unravel Data
    Join Unravel expert Aengus Rooney to develop an understanding of the performance dynamics of modern data pipelines and applications. In this session, you will learn about uncovering and understanding the key datasets, metrics, and best practices needed to develop mastery with Spark performance management on Azure Databricks.
  • Transforming the Business of Healthcare with Data Operations Recorded: Aug 13 2019 6 mins
    Charles Boicy, CIO, Clearsense & Kunal Agarwal, CEO, Unravel Data
    Unravel and Clearsense executives discuss the potential life-and-death challenges of big data in healthcare.
  • Clearsense on keeping their big data promises in Healthcare Recorded: Aug 13 2019 2 mins
    Charles Boicy, CIO, Clearsense
    Clearsense CIO Charles Boicy explains why you'd be out of your mind to monitor your big data environment without Unravel.
  • DataOps Done Right: How to Optimize DataOps for the Cloud Recorded: Aug 6 2019 64 mins
    George Demarest, Senior Director of Product Marketing, Unravel; Wayne W. Eckerson, Eckerson Group
    Modern applications are powered by data that must first run through a gamut of software, systems, and technologies before being consumed by business users. DataOps represents an emerging discipline for designing, managing, and monitoring the flow of data from source to target. DataOps provides a level of rigor required to manage dozens or hundreds of data pipelines that potentially serve mission-critical applications with stringent service level agreements.

    Today, companies want to run some or all of their data pipelines in the cloud or spanning cloud and non-cloud platforms. But how does that work in theory and in practice? How does a DataOps team manage the processes, technologies, and data when pipelines cross multiple environments? What does DataOps for the cloud look like? This webcast will define DataOps, explore best practices, and discuss how DataOps can build and manage data pipelines in the cloud.
  • Understanding DataOps and Its Impact on Application Quality Recorded: Jul 23 2019 59 mins
    George Demarest, Senior Director of Product Marketing, Unravel; Chris Riley, Editor, Sweetcode.io
    Modern-day applications are data-driven and data-rich. The infrastructure your backends run on is a critical aspect of your environment, and it requires unique monitoring tools and techniques. Learn what DataOps is and how critical good DataOps is to the integrity of your application. Intelligent APM for your data is critical to the success of modern applications. In this webinar you will learn:

    The power of APM tailored for Data Operations
    The importance of visibility into your data infrastructure
    How AIOps makes data ops actionable
  • Use Machine Learning to Get the Most out of Your Big Data Clusters Recorded: Jul 9 2019 43 mins
    Dr. Eric Chu, VP Data Analytics, Unravel Data
    Enterprises across all sectors have invested heavily in big data infrastructure (Hadoop, Impala, Spark, Kafka, etc.) to turn data into insights and business value. Clusters are getting bigger and more complex, and are serving more and more data scientists and engineers.

    As a result, it is increasingly challenging for Data Ops teams to operate and maintain these clusters to meet business requirements and performance SLAs. For instance, a single SQL query may fail or take a long time to complete for various reasons, such as SQL-level inefficiencies, data skew, missing or stale statistics, and pool-level resource configurations. A single resource-hogging query can in turn impact the entire application stack on that cluster.

    A critical capability for scaling application performance is cluster-wide tuning. Examples include tuning the default application configurations so that all applications benefit from the change, tuning pool-level resource allocations, and identifying wide-impact issues such as slow nodes and too many small files.

    Cluster-level tuning requires considering more factors and carries a risk of significantly worsening cluster performance, yet it is often done via trial and error with educated guesswork, if attempted at all. We employ machine learning and AI techniques to make cluster-level tuning easier, more data-driven, and more accurate.

    Recorded at Spark AI Summit in San Francisco, this talk will describe our methodology to learn from various sources of data such as the workload, the cluster and pool resources, metastore, etc., and provide recommendations for cluster defaults for application and pool resource configurations. We will also present a case study where a customer applied Unravel tuning recommendations and achieved a 114% increase in the number of applications running per day while using 47% fewer vCore-Hours and 15% fewer containers.
  • Using AI Powered Automation For High Performance Data Pipelines in The Cloud Recorded: Jul 4 2019 57 mins
    Alejandro Fernandez, Senior Software Engineer, Unravel
    With cloud becoming the deployment platform of choice for data pipelines, many IT organizations must now come to grips with what that means for planning, budgeting, migrating and operating big data in the cloud. Trying to make accurate, informed decisions about deploying data pipelines to the cloud is getting trickier and goes well beyond to-do lists and spreadsheets. IT organizations need a data-driven approach that neither buries them in semi-relevant detail nor oversimplifies the process. Join us for this informative webinar, where we’ll explore:

    • Assessing, planning, executing and validating a successful migration of data workloads to the cloud.
    • Mapping resource requirements for data pipelines, from physical servers in the data center to the ideal cloud server instance types.
    • Baselining application performance and dependencies, and selecting candidates as initial migration targets.
    • How Unravel applies full stack visibility, analytics and AI-powered automation to help data teams address these challenges.
    • Key considerations for maximizing the business and operational impact of workload migration.
  • Migrating & Scaling Data Pipelines with AI on Amazon EMR, Redshift & Athena Recorded: Jul 3 2019 17 mins
    Kunal Agarwal, CEO & Shivnath Babu, CTO, Unravel Data
    Recorded at the 2019 AWS Summit in Santa Clara, Unravel CEO Kunal Agarwal and CTO Shivnath Babu talk about migrating and scaling data pipelines on AWS.
  • Planets Align: The Irresistible Forces Pulling Big Data to the Cloud Recorded: Jun 28 2019 62 mins
    Eric Kavanagh, Analyst, Bloom Research; George Demarest, Unravel Data
    Business leaders and technologists have become increasingly sophisticated and successful in collecting, monetizing, and using their data to create real business value through modern data applications. As early successes are followed by more ambitious goals for their big data programs, many organizations are feeling the gravitational pull of the cloud as a primary deployment platform for their new data pipelines.

    As these planets align, cloud powerhouses like Amazon AWS, Microsoft Azure, and Google Cloud Platform are ramping up a rich collection of cloud services including Spark, Kafka, SQL/NoSQL databases, ML/AI, and many more.

    Register for this DM Radio Deep Dive to hear industry analyst Eric Kavanagh explain why the cloud is forcing an evolution of thinking and investment in big data programs. He'll be joined by George Demarest of Unravel Data, who will discuss the key role that AI, machine learning, expert automation, and market forces are playing in the planetary shift towards big data in the cloud.
  • How to Build Reliable Modern Data Pipelines Using AI and DataOps Recorded: Jun 28 2019 62 mins
    Dr. Eric Chu, VP Data Analytics, Unravel Data
    You Will Learn:
    -The role of DataOps in supporting modern data applications
    -A DataOps framework for building and managing data pipelines
    -The role of testing and monitoring in DataOps
    -How AI is needed to manage and monitor complex data pipelines and environments
    -How modern performance management software can reduce the risk of running modern data applications
  • Doing DevOps for Big Data? What you need to know about AIOps Recorded: Jun 28 2019 57 mins
    Bala Venkatrao, VP Products, Unravel Data
    AIOps has the promise to create hyper-efficiency within DevOps teams as they struggle with the diversity, complexity, and rate of change across the entire stack.
  • Data Applications in the Cloud: Optimization and Migration Recorded: Jun 27 2019 40 mins
    Grant Liu, VP of Solution Engineering, Unravel Data
    The movement to utilize data to drive more effective business outcomes continues to accelerate. But with this acceleration comes an explosion of complex platforms to collect, process, store, and analyze this data. Ensuring these platforms are utilized optimally is a tremendous challenge for businesses.

    Join Grant Liu, VP of Solution Engineering at Unravel Data, as he takes you through an AI/ML-based approach to Application Performance Management applied to data applications on any infrastructure, whether it be cloud, on-premises, or a combination of the two.
  • Unravel Spark Troubleshooting demo Recorded: Jun 26 2019 4 mins
    Marlon Braendli, Unravel Solution Specialist, Unravel
    In this video, Unravel solution specialist Marlon Braendli gives a brief demo and overview of Unravel, focusing primarily on Spark troubleshooting and performance tuning.
  • Unravel automated Root Cause Analysis demo Recorded: Jun 26 2019 4 mins
    Abha Jain, Director of Products, Unravel Data
    Unravel Director of Products Abha Jain provides a quick 3-minute demo of Unravel RCA capabilities.
  • Demo: Moving Big Data Pipelines to the Cloud: Plan, Migrate, Validate & Manage Recorded: Jun 25 2019 42 mins
    Dave Berry, Senior Solutions Engineer - International, Unravel Data
    The movement to utilize data to drive more effective business outcomes continues to accelerate. But with this acceleration comes an explosion of complex platforms to collect, process, store, and analyze this data. Ensuring these platforms are utilized optimally is a tremendous challenge for businesses.

    Join Dave Berry, Senior Solutions Engineer at Unravel Data, as he takes you through an AI/ML-based approach to Application Performance Management applied to data applications on any infrastructure, whether it be cloud, on-premises, or a combination of the two.
AI-powered performance management for your modern data applications.
At Unravel, we see an urgent need to help every business understand and optimize the performance of their applications, while managing data operations with greater insight, intelligence, and automation.

For these businesses, Unravel is the AI-powered data operations company. We offer novel solutions that leverage AI, machine learning, and advanced analytics to help you fully operationalize the way you drive predictable performance in your modern data applications and pipelines.

  • Title: Putting AI to Work on Apache Spark
  • Live at: Jun 13 2019 1:00 pm
  • Presented by: Shivnath Babu, CTO & Co-Founder at Unravel Data