Running Hadoop and Spark on Docker: Challenges and Lessons Learned
Watch this on-demand webinar to learn how to run Hadoop and Spark on Docker in an enterprise deployment.
Today, most applications can be “Dockerized”. However, there are unique challenges when deploying a Big Data framework such as Spark or Hadoop on Docker containers in a large-scale production environment.
In this webinar, we discussed:
-Practical tips on how to deploy multi-node Hadoop and Spark workloads using Docker containers
-Techniques for multi-host networking, secure isolation, QoS controls, and high availability with containers
-Best practices to achieve optimal I/O performance for Hadoop and Spark using Docker
-How a container-based deployment can deliver greater agility, cost savings, and ROI for your Big Data initiative
Don’t miss this webinar on how to "Dockerize" your Big Data applications in a reliable, secure, and high-performance environment.
Recorded: Aug 18 2016, 62 mins
Sam Charrington, Founder & Industry Analyst, TWIML; Nanda Vijaydev, Lead Data Scientist, HPE
Join this webinar with Hewlett Packard Enterprise and TWIML – a community and podcast focused on AI and Machine Learning (ML) – as we explore the use of Kubernetes for AI / ML deployments in the enterprise.
As organizations mature in their use of AI and ML, they need to build repeatable, efficient, and sustainable processes for model development and deployment. Containers and Kubernetes provide essential building blocks to help operationalize these processes and support ML Operations (MLOps).
In this webinar, we will discuss:
-The challenges of moving from pilot to production with Machine Learning and Deep Learning, at enterprise scale
-How containers and Kubernetes can help address these challenges, as a foundation for MLOps
-Lessons learned and best practices from enterprises who have successfully leveraged Kubernetes for AI / ML
Register today and you'll receive the accompanying TWIML e-book on deploying Kubernetes for Machine Learning, Deep Learning, and AI.
Mike Gualtieri, VP, Principal Analyst, Forrester Research; Matheen Raza, Product Marketing, HPE
Join this webinar with HPE and guest speaker Forrester Research to learn about the key technological and organizational challenges that impact the success of machine learning (ML) projects.
As enterprises operationalize their ML models, they often face problems in the "last mile" of model deployment. The emerging field of ML Ops aims to deliver agility and speed to the ML lifecycle – similar to what DevOps processes have done for the software development lifecycle.
In this webinar, we'll discuss:
-Forrester's latest findings on the state of machine learning in the enterprise
-The 7 key requirements for successful ML Ops deployment
-Organizational best practices to improve productivity of data science teams
Register today and you'll receive the accompanying research report too!
Carl Olofson, Research Vice President, IDC; Anil Gadre, VP, GM Data Fabric, HPE
There are a multitude of distributed, scale-out data management technologies in the marketplace, but the key is finding a solution that supports a wide range of analytic applications across diverse data types without time-consuming data movement and transformation.
A new IDC report details how the MapR Data Platform provides a single platform environment for collecting, curating, and analyzing data.
In this webinar, we'll share these findings and show how MapR provides:
-A 5-year ROI of 567% and an 11-month payback period
-Analytics improvements and team efficiencies
-Improved cost of operations and higher productivity
Register today to learn how you can improve the value of your data. You'll receive the full report too!
Mike Leone, ESG, Sr Analyst for Data Platforms, Analytics, and AI; Matt Hausmann, BlueData (HPE), Product Marketing
Organizations are using advanced analytics, machine learning, and deep learning for competitive advantage and have turned to GPUs to boost performance and speed innovation. But these valuable compute resources can take months to deploy and are often underutilized.
This webinar will discuss how to enable a cloud-like experience to drive maximum ROI from GPU-accelerated compute for enterprise AI and data science.
Learn how you can:
-Easily share GPU resources to improve utilization and data science productivity.
-Accelerate model development with the performance of bare metal.
-Shorten time-to-value by provisioning a GPU environment in minutes instead of months.
Register for this webinar and receive a new benchmark paper from analyst firm Enterprise Strategy Group (ESG) which details how to get more value from your GPUs.
See an overview of the container-based BlueData software platform from Hewlett Packard Enterprise. Learn how you can spin up instant machine learning, deep learning, and analytics environments – while ensuring enterprise-grade security and performance. Provide your data science teams with on-demand access to the tools and data they need – whether on-premises, in the public cloud, or in a hybrid cloud architecture.
Mike Leone, Sr. Analyst, Enterprise Strategy Group; Victor Ghadban, Field CTO AI / ML, BlueData (HPE)
Join our webinar with an industry analyst from Enterprise Strategy Group (ESG), and learn how you can accelerate your AI initiative.
Enterprises in all industries are recognizing the game-changing business impact of Artificial Intelligence (AI) and Machine Learning (ML).
To be successful in your AI initiative, your data science teams need the ability to quickly build and deploy ML models in large-scale distributed environments. But this is easier said than done.
In this webinar, we'll share industry research from ESG and discuss how to:
- Address the skills gap in data science and AI / ML
- Utilize containers to accelerate your AI / ML deployment
- Run AI / ML workloads either on-premises or in the cloud
- Achieve faster time-to-value for your AI initiative
Nanda Vijaydev, Sr. Director, Solutions, BlueData; John Spooner, Director of Solution Engineering, H2O.ai
Watch this webinar to learn how you can accelerate your deployment of H2O and AI / ML in Financial Services.
Keeping pace with new technologies for data science, machine learning, and deep learning can be overwhelming. And it can be challenging to deploy and manage these tools – including H2O and many others – for data science teams in large-scale distributed environments.
This webinar will discuss how to deploy H2O and other ML / DL tools in Financial Services. Learn about:
-Example use cases for AI / ML / DL in Financial Services
-Using H2O and other ML / DL tools with containers
-Overcoming deployment challenges for distributed environments
-How to ensure enterprise-grade security, high performance, and faster time-to-value
Join this webinar to learn how you can accelerate innovation using AI / ML / DL in Healthcare and Life Sciences.
Healthcare professionals and researchers have access to immense volumes of data from a variety of sources. Early adopters of Machine Learning (ML) and Deep Learning (DL) are uncovering new insights from this data to improve patient care and transform the industry with AI-driven innovations.
But it can be challenging to deploy and manage these tools – including TensorFlow and many others – for data science teams in large-scale distributed environments.
In this webinar, we'll discuss:
- Example AI use cases – including precision medicine, drug discovery, and claims management
- Data access, data security, and other key requirements for implementing AI in Healthcare and Life Sciences
- How to overcome deployment challenges for distributed ML / DL environments using containers
- How to ensure enterprise-grade security, high performance, and faster time-to-value for ML / DL
Tom Phelan, Chief Architect, BlueData; Nanda Vijaydev, Director, Solutions, BlueData
Join this webinar to learn how you can accelerate your deployment of TensorFlow and AI / ML in Financial Services.
Keeping pace with new technologies for data science, machine learning, and deep learning can be overwhelming. And it can be challenging to deploy and manage these tools – including TensorFlow and many others – for data science teams in large-scale distributed environments.
This webinar will discuss how to deploy TensorFlow and other ML / DL tools in the Banking, Insurance, and Capital Markets industries. Learn about:
-Example use cases for AI / ML / DL in Financial Services – with an enterprise case study
-Using TensorFlow and other ML / DL tools with GPUs and containers
-Overcoming deployment challenges for distributed environments – including operationalization
-How to ensure enterprise-grade security, high performance, and faster time-to-value
Join this webinar to learn about deploying H2O in large-scale distributed environments using containers.
Artificial intelligence and machine learning are now a top priority for most enterprises. But it can be challenging to implement multi-node AI / ML environments for data science teams in large-scale enterprise deployments.
Together, BlueData and H2O.ai deliver a game-changing solution for AI / ML in the enterprise. In this webinar, discover how you can:
-Quickly spin up containerized H2O and Driverless AI environments, whether in dev/test or production
-Ensure seamless support for H2O running on CPUs or GPUs, and provide a secure connection to your data lake
-Operationalize your distributed machine learning pipelines and deliver faster time-to-value for your AI initiative
Find out how to run AI / ML on containers while ensuring enterprise-grade security, performance, and scalability.
Lynn Calvo, AVP Emerging Data Technology, GM Financial; Nick Chang, Head of Customer Success, BlueData
Watch this on-demand webinar for a case study with GM Financial on deploying Machine Learning and Deep Learning applications using a flexible container-based architecture.
GM Financial, the wholly-owned captive finance subsidiary of General Motors, is a global enterprise in a highly regulated industry. Learn about their journey in implementing Machine Learning, Deep Learning, and Natural Language Processing – including how they’ve kept up with the blistering pace of change, while delivering immediate value and managing costs.
In this webinar, GM Financial will discuss some of their challenges, technology choices, and initial successes:
- Addressing a wide range of Machine Learning use cases, from credit risk analysis to improving customer experience
- Implementing multiple different tools (including TensorFlow™, Apache Spark™, Apache Kafka®, and Cloudera®) for different business needs
- Deploying a multi-tenant hybrid cloud environment with containers, automation, and GPU-enabled infrastructure
Don’t miss this webinar! Gain insights from an enterprise case study, and get perspective on Kubernetes® and other game-changing technology developments.
Tom Phelan, Chief Architect, BlueData; Yaser Najafi, Big Data Solutions Engineer, BlueData
Watch this on-demand webinar to learn about using Kubernetes with stateful applications for AI and Big Data workloads.
Kubernetes is now the de facto standard for container orchestration. And while it was originally designed for stateless applications and microservices, it's gaining ground in support for stateful applications as well.
But distributed stateful applications – including analytics, data science, machine learning, and deep learning workloads – are still complex and challenging to deploy with Kubernetes.
In this webinar, we'll discuss considerations for running stateful applications on Kubernetes:
-Unique requirements for multi-service stateful workloads including Hadoop, Spark, Kafka, and TensorFlow
-Persistent Volumes, StatefulSets, Operators, Helm, and other Kubernetes capabilities for stateful applications
-Technical gaps in Kubernetes deployment patterns and tooling, including security and networking
-Options and strategies to deploy distributed stateful applications in containerized environments
Learn about a new open source project focused on deploying and managing stateful applications with Kubernetes.
Radhika Rangarajan, Director, Data Analytics and AI, Intel; Nanda Vijaydev, Director, Solutions, BlueData
Watch this on-demand webinar to learn how you can accelerate your AI initiative and deliver faster time-to-value with machine learning.
AI has moved into the mainstream. Innovators in every industry are adopting machine learning for AI and digital transformation, with a wide range of different use cases. But these technologies are difficult to implement for large-scale distributed environments with enterprise requirements.
This webinar discusses:
-The game-changing business impact of AI and machine learning (ML) in the enterprise
-Example use cases: from fraud detection to medical diagnosis to autonomous driving
-The challenges of building and deploying distributed ML pipelines and how to overcome them
-A new turnkey solution to accelerate enterprise AI initiatives and large-scale ML deployments
Find out how to get up and running quickly with a multi-node sandbox environment for TensorFlow and other popular ML tools.
Tom Phelan, Chief Architect, BlueData; Nanda Vijaydev, Director - Solutions, BlueData
Watch this on-demand webinar to learn about deploying deep learning applications with GPUs in a containerized multi-tenant environment.
Keeping pace with new technologies for data science and machine learning can be overwhelming. There are a plethora of open source options, and it's a challenge to get these tools up and running easily and consistently in a large-scale distributed environment.
This webinar will discuss how to deploy TensorFlow and Spark clusters running on Docker containers, with a shared pool of GPU resources. Learn about:
*Quota management of GPU resources for greater efficiency
*Isolating GPUs to specific clusters to avoid resource conflict
*Attaching and detaching GPU resources from clusters
*Transient use of GPUs for the duration of the job
Find out how you can spin up (and tear down) GPU-enabled TensorFlow and Spark clusters on-demand, with just a few mouse clicks.
Nick Chang, Head of Customer Success, BlueData; Yaser Najafi, Big Data Solutions Engineer, BlueData
Watch this on-demand webinar to learn about use cases for Big-Data-as-a-Service (BDaaS) – to jumpstart your journey with Hadoop, Spark, and other Big Data tools.
Enterprises in all industries are embracing digital transformation and data-driven insights for competitive advantage. But embarking on this Big Data journey is a complex undertaking, and deployments tend to happen in fits and starts. BDaaS can help simplify Big Data deployments and ensure faster time-to-value.
In this webinar, you'll hear about a range of different BDaaS deployment use cases:
-Sandbox: Provide data science teams with a sandbox for experimentation and prototyping, including on-demand clusters and easy access to existing data.
-Staging: Accelerate Hadoop / Spark deployments, de-risk upgrades to new versions, and quickly set up testing and staging environments prior to rollout.
-Multi-cluster: Run multiple clusters on shared infrastructure. Set quotas and resource guarantees, with logical separation and secure multi-tenancy.
-Multi-cloud: Leverage the portability of Docker containers to deploy workloads on-premises, in the public cloud, or in hybrid and multi-cloud architectures.
Watch this on-demand webinar to learn how separating compute from storage for Big Data delivers greater efficiency and cost savings.
Historically, Big Data deployments dictated the co-location of compute and storage on the same physical server. Data locality (i.e. moving computation to the data) was one of the fundamental architectural concepts of Hadoop.
But this assumption has changed – due to the evolution of modern infrastructure, new Big Data processing frameworks, and cloud computing. By decoupling compute from storage, you can improve agility and reduce costs for your Big Data deployment.
In this webinar, we discussed how:
- Changes introduced in Hadoop 3.0 demonstrate that the traditional Hadoop deployment model is changing
- New projects by the open source community and Hadoop distribution vendors provide further evidence of this trend
- By separating analytical processing from data storage, you can eliminate the cost and risks of data duplication
- Scaling compute and storage independently can lead to higher utilization and cost efficiency for Big Data workloads
Learn how the traditional Big Data architecture is changing, and what this means for your organization.
Darren Darnell, Jim Foppe, and Mike Steimel (Panera Bread); Nanda Vijaydev (BlueData)
Watch this on-demand webinar to learn how Panera Bread uses Big Data analytics to drive their business, with #1 ranked customer loyalty.
Panera Bread – with over 2,000 locations and 25 million customers in its loyalty program – relies on analytics to fine-tune its menu, operations, marketing, and more. Find out how they solve key business challenges using Hadoop and next generation Big Data technologies, including real-time data to analyze consumer behavior.
In this webinar, Panera Bread discussed how they:
-Use a data-driven approach to improve customer acquisition, customer retention, and operational efficiency
-Spin up instant clusters for rapid prototyping and exploratory analytics, with real-time streaming platforms like Kafka
-Operationalize their data science and data pipelines in a hybrid deployment model, both on-premises and in the cloud
Don’t miss this case study webinar. Discover your own recipe for success with Big Data analytics and data science!
Faster Time-to-Value for AI / ML and Big Data Analytics
Hewlett Packard Enterprise (HPE) – which recently acquired BlueData and MapR – is transforming how enterprises deploy AI / Machine Learning (ML) and Big Data analytics. HPE’s container-based software platform makes it easier, faster, and more cost-effective for enterprises to innovate with AI / ML and Big Data technologies – either on-premises, in the public cloud, or in a hybrid architecture. With HPE, our customers can spin up containerized environments within minutes, providing their data scientists with on-demand access to the applications, data, and infrastructure they need.
Running Hadoop and Spark on Docker: Challenges and Lessons Learned
Tom Phelan, Chief Architect, BlueData; Anant Chintamaneni, VP of Products, BlueData
62 mins