
Best Practices Optimizing your big data costs with Amazon EMR

Data is a core part of every business. As data volumes increase, so do the costs of processing them. Whether you are running your Apache Spark, Hive, or Presto workloads on-premises or on AWS, Amazon EMR can help you save money. In this session, we’ll discuss several best practices and new features that enable you to cut your operating costs when processing vast amounts of data using Amazon EMR.

Hear from Unravel Data on how you can use Unravel APM, a full-stack solution for big data workloads running on Amazon EMR, to get visibility and reporting on your Amazon EMR cluster resource utilization and cost savings.

Learn about the best practices and new features such as Managed Scaling and improved Apache Spark performance to help you optimize and reduce your Amazon EMR costs.
Watch a demo on how Unravel APM, a full-stack monitoring, tuning, and troubleshooting solution for big data workloads running on Amazon EMR, can help you optimize your Amazon EMR cluster costs.
Recorded Jul 22 2020 52 mins
Presented by
Kunal Agarwal, CEO Unravel Data & Roy Hasson, Sr. Manager Data Lakes, AWS

  • Operationalize Your Insights - The Self-Service Data Roadmap, Session 4 of 4 May 13 2021 5:00 pm UTC 60 mins
    Sandeep Uttamchandani, CDO & VP Engineering, Unravel Data
    In this webinar, Unravel CDO and VP Engineering Sandeep Uttamchandani describes the fourth and final step for any large, data-driven project: the Operationalize phase. You've found your data (Discover phase), readied it for processing (Prep phase), and built out your processing logic and machine learning model(s) (Build phase). Now you need to Operationalize all your work to date as a live project, in production.

    Sandeep Uttamchandani is a leader in the fields of data, AI, and machine learning. This webinar is the fourth talk from his new O'Reilly animal book, The Self-Service Data Roadmap. The book shows how to start, implement, and complete large data science projects, up to and including the creation of a complete, self-service data science platform for your organization.
  • Reasons Why Big Data Cloud Migrations Fail - and Ways to Succeed Apr 29 2021 5:00 pm UTC 60 mins
    Chris Santiago, Global Solution Engineering Director, Unravel Data
    Organizations are moving big data from on-premises to the cloud, using best-of-breed technologies like Databricks, Amazon EMR, Azure HDI, and Cloudera, to name a few. However, many cloud migrations fail. Why? And, how can you overcome the barriers and succeed? Join Chris Santiago, Director of Solution Engineering, as he describes the biggest pain points and how you can avoid them, and make your move to the cloud a success. He will cover:

    The elements you must include in a successful cloud migration plan
    How to find the right strategy for your cloud migration
    Successful models for big data deployments in the cloud
    How Unravel customers are making solid plans, meeting their goals, and saving time and money
  • Build Your Insights & ML Models - The Self-Service Data Roadmap, Session 3 of 4 Apr 15 2021 5:00 pm UTC 60 mins
    Sandeep Uttamchandani, CDO & VP Engineering, Unravel Data
    In this webinar, Unravel CDO and VP Engineering Sandeep Uttamchandani describes the third step for any large, data-driven project: the Build phase. You've found your data, in the Discover phase, and readied it for processing, in the Prep phase. Now you need to Build the logic that will actually process the data and the machine learning models that the data will be run through.

    Sandeep Uttamchandani is a leader in the fields of data, AI, and machine learning. This webinar is the third talk from his new O'Reilly animal book, The Self-Service Data Roadmap. The book shows how to start, implement, and complete large data science projects, up to and including the creation of a complete, self-service data science platform for your organization.
  • Mastering Databricks Environments with Unravel Recorded: Apr 1 2021 38 mins
    Chris Santiago, Global Solution Engineering Director, Unravel Data
    Databricks is a great solution for customers looking to unlock the powerful use cases that Spark enables, with the high performance of Databricks and the convenience of a managed service. Databricks is available in AWS, Microsoft Azure, and GCP clouds.
    If you are already a Databricks customer, you want to get the most out of your investment - and if you're considering Databricks, you'll be wondering how hard it is to move to the platform, and how to optimize your investment once you get there.
    Unravel has a powerful platform that supports Spark on-premises, cloud migration, and Databricks operations in the cloud. Please join Chris Santiago, Global Director of Solutions Engineering at Unravel Data, to learn how to maximize the opportunities that Databricks can bring you.
  • Prepare Your Data - The Self-Service Data Roadmap, Session 2 of 4 Recorded: Mar 23 2021 60 mins
    Sandeep Uttamchandani, CDO & VP Engineering, Unravel Data
    In this webinar, Unravel CDO and VP Engineering Sandeep Uttamchandani describes the second step for any large, data-driven project: the Prep phase. Having found the data you need in the Discover phase, it's time to get your data ready. You must structure, clean, enrich, and validate static data, and ensure that "live," updated or streamed data events are continually ready for processing.

    Sandeep Uttamchandani is a leader in the fields of data, AI, and machine learning. This webinar is the second talk from his new O'Reilly animal book, The Self-Service Data Roadmap. The book shows how to start, implement, and complete large data science projects, up to and including the creation of a complete, self-service data science platform for your organization.
  • Getting the Best Performance & Reliability Out of Kafka & Spark Applications Recorded: Mar 11 2021 55 mins
    Chris Santiago, Global Solution Engineering Director, Unravel Data
    Kafka & Spark data pipelines are ubiquitous in the modern data stack. Developing Spark and Kafka applications has become simpler over the years, but operating them in production environments remains challenging, to say the least.

    Join Chris Santiago of Unravel Data to learn how to troubleshoot the root cause of why these real-time applications lag or fail. He will share how Unravel provides a single pane of glass to easily see and fix problems such as poor data partitioning, load imbalance, resource exhaustion, or suboptimal configurations. Chris will also share how you can automatically tune and optimize these applications for cost and performance.
  • Going Beyond Observability for Spark Applications & Databricks Environments Recorded: Mar 4 2021 60 mins
    Chris Santiago, Global Solution Engineering Director, Unravel
    Join Chris Santiago, Solution Engineering Director at Unravel Data, as he takes you through Unravel’s approach to getting better and finer-grained visibility into Spark applications and how to tune and optimize them for resource efficiency.

    He will take you through:
    An overview of out-of-the-box tools like Ganglia and their overall lack of visibility into Databricks jobs
    How Unravel helps you gain finer-grained visibility, observability, and monitoring for Spark data pipelines
    How Unravel can recommend better configurations and tuning of Spark applications
  • Discover Your Datasets - The Self-Service Data Roadmap, Session 1 of 4 Recorded: Feb 18 2021 61 mins
    Sandeep Uttamchandani, CDO & VP Engineering, Unravel Data
    In this session, Unravel CDO and VP Engineering Sandeep Uttamchandani describes the start of any large, data-driven project: the Discover phase. You must identify the insights you want to generate from the project, and then you must discover: that is, identify the current data assets you have, and the new data assets you will need, to generate those insights. Sandeep expertly guides you through this process, and shows you how to invest the right amount of time and effort to get the job done.

    Sandeep Uttamchandani is a leader in the fields of data, AI, and machine learning. This webinar is the first talk from his new O'Reilly animal book, The Self-Service Data Roadmap. The book shows how to start, implement, and complete large data science projects, up to and including the creation of a complete, self-service data science platform for your organization.
  • Moving Big Data and Streaming Data Workloads to Google Cloud Platform Recorded: Feb 11 2021 61 mins
    Chris Santiago, Global Solution Engineering Director, & Floyd Smith, Product Marketing Director Unravel Data
    Cloud migration may be the biggest challenge, and the biggest opportunity, facing IT departments today - especially if you use big data and streaming data technologies, such as Cloudera, Hadoop, Spark, and Kafka. In this 55-minute webinar, Unravel Data product marketer Floyd Smith and Solutions Engineering Director Chris Santiago describe how to move workloads to Google Dataproc, BigQuery, and other destinations on GCP, fast and at the lowest possible cost.
  • High-Performance, Cost-Effective Move to Azure Recorded: Jan 28 2021 61 mins
    Mick Nolen, Senior Solutions Engineer, & Floyd Smith, Product Marketing Director Unravel Data
    Cloud migration may be the biggest challenge, and the biggest opportunity, facing IT departments today - especially if you use big data and streaming data technologies, such as Cloudera, Hadoop, Spark, and Kafka. In this 55-minute webinar, Unravel Data product marketer Floyd Smith and Solutions Engineering Director Chris Santiago describe how to move workloads to Azure HDInsight, Databricks, and other destinations on Azure, fast and at the lowest possible cost.
  • Why You Need DataOps in Your Organization Recorded: Dec 15 2020 61 mins
    Kunal Agarwal, CEO Unravel Data
    DataOps is the hot new trend in IT, following on from the rapid rise of DevOps over the last decade. The growth of AI, machine learning, and the move to cloud all contribute to the growing importance of DataOps. Kunal Agarwal, Unravel Data CEO, will take you through the rise of DataOps and show you how to implement a data culture in your organization.
  • Moving Big Data and Streaming Data Workloads to AWS Recorded: Dec 3 2020 56 mins
    Chris Santiago, Global Solution Engineering Director, & Floyd Smith, Product Marketing Director Unravel Data
    Cloud migration may be the biggest challenge, and the biggest opportunity, facing IT departments today - especially if you use big data and streaming data technologies, such as Cloudera, Hadoop, Spark, and Kafka. In this 55-minute webinar, Unravel Data product marketer Floyd Smith and Solutions Engineering Director Chris Santiago describe how to move workloads to Amazon EMR, Databricks, and other destinations on AWS, fast and at the lowest possible cost.
  • Cost Optimization on Microsoft Azure Recorded: Nov 18 2020 54 mins
    Chris Santiago, Solution Engineering Director, Unravel
    Do you use big data and streaming services - such as Azure HDInsight, Databricks, and Kafka/EventHubs? Do you have on-premises big data that you want to move to Azure? Keeping costs down in Microsoft Azure is difficult, but vital. Join Chris Santiago of Unravel Data and explore how to reduce, manage, and allocate streaming data and big data costs in Azure.
  • Cost-Effective, High-Performance Move to Cloud Recorded: Nov 5 2020 55 mins
    Chris Santiago, Global Solution Engineering Director, & Floyd Smith, Product Marketing Director Unravel Data
    The move to cloud may be the biggest challenge, and opportunity, facing IT departments today. In this 45-minute webinar, Unravel Data product marketer Floyd Smith and Solutions Engineering Director Chris Santiago describe how to move workloads to the cloud quickly, cost-effectively, and with high performance for the newly cloud-based workloads. Tune in to find out the best way to de-risk your cloud migration projects with data-driven insights.
  • Reasons why your Big Data Cloud Migration Fails and Ways to Overcome Recorded: Oct 13 2020 51 mins
    Chris Santiago, Global Solution Engineering Director, Unravel Data
    The cloud brings many opportunities to help implement big data across your enterprise, and organizations are taking advantage by migrating big data workloads to the cloud using best-of-breed technologies like Databricks, Cloudera, Amazon EMR, and Azure HDI, to name a few. However, as powerful as these technologies are, most organizations that attempt to use them fail. Join Chris Santiago, Director of Solution Engineering, as he shares the top reasons why big data cloud migrations fail and ways to overcome them. He will cover:
    The top considerations and underestimated efforts you need to be aware of
    The importance of getting the right strategy, right fit and right use cases for cloud migration
    The most common cloud models that will work for you
    How Unravel can help optimize resources to help mitigate the risks of migration
  • Why Enhanced Visibility Matters for your Databricks Environment Recorded: Oct 6 2020 43 mins
    Mick Nolen, Senior Solution Engineer, Unravel Data
    Databricks has become a popular computing framework for big data as organizations increase their investments in moving data applications to the cloud. With that journey comes the promise of better collaboration, processing, and scaling of applications in the cloud. However, customers are finding unexpected costs eating into their cloud budget, as monitoring and observability tools like Ganglia, Grafana, and the Databricks console tell only part of the story for chargeback/showback reports.
    Join Mick Nolen, Senior Solutions Engineer at Unravel Data, as he takes you through Unravel’s approach to getting better and finer-grained visibility with Databricks on AWS or Azure. He will take you through:
    An overview of out-of-the-box solutions for monitoring Databricks
    Why enhanced visibility matters with Databricks environments
    How Unravel helps you gain finer-grained visibility into Databricks pipelines
  • Automatically Reduce Your AWS Bill with Unravel Recorded: Sep 1 2020 39 mins
    Chris Santiago, Global Solution Engineering Manager, Unravel Data
    We know data is core to your business. As data use cases increase, so do the costs of processing and storing that data. Unravel helps you save money by identifying inefficient usage of Amazon EMR, and then recommending how to fix it. Join us to learn how you can save beyond auto-scaling.
    - Right-size your environment
    - Get recommendations for the right EC2 machines based on your workload
    - Automatically reduce wasted cluster usage by your Spark, Presto, and Hive apps
  • Effective Migration & Cost Management for Databricks on Azure and AWS Recorded: Aug 27 2020 49 mins
    Chris Santiago, Global Solution Engineering Manager, Unravel Data
    Databricks has become very popular as a computing framework for big data. However, customers are finding unexpected costs eating into their cloud budget, especially those planning migrations from Hadoop. Furthermore, lack of visibility into root causes and general inefficiency are costing organizations thousands, if not millions, in operating their Databricks environments.

    Join Unravel to discuss top cost management techniques in Databricks and new features to effectively help manage costs on Databricks, including:
    Best practices
    Cost analytics to provide assurance and forecasting for optimizing Databricks workloads as they scale
    Accurate, detailed chargeback reporting of the cost of running data apps on Databricks
  • CDO Sessions: Transforming DataOps in Banking Recorded: Jul 9 2020 48 mins
    Sandeep Uttamchandani, CDO & VP Engineering, Unravel Data & Matteo Pelati, Executive Director, DBS
    Join Unravel’s CDO & VP of Engineering, Sandeep Uttamchandani and Matteo Pelati, Executive Director, Head of Technology - Data Platform at DBS Bank as they discuss:

    How their tactical/strategic focus areas are evolving in these challenging times
    Cloud big data migration strategy, do's and don'ts
    Practical advice they can share for other leaders in the big data community
    How Unravel has helped DBS optimize their big data
AI-powered performance management for your modern data applications.
At Unravel, we see an urgent need to help every business understand and optimize the performance of their applications, while managing data operations with greater insight, intelligence, and automation.

For these businesses, Unravel is the AI-powered data operations company. We offer novel solutions that leverage AI, machine learning, and advanced analytics to help you fully operationalize the way you drive predictable performance in your modern data applications and pipelines.

  • Title: Best Practices Optimizing your big data costs with Amazon EMR
  • Live at: Jul 22 2020 5:00 pm
  • Presented by: Kunal Agarwal, CEO Unravel Data & Roy Hasson, Sr. Manager Data Lakes, AWS