Databricks’ mission is to accelerate innovation for its customers by unifying Data Science, Engineering and Business. Founded by the team who created Apache Spark™, Databricks provides a Unified Analytics Platform for data science teams to collaborate with data engineering and lines of business to build data products. Users achieve faster time-to-value with Databricks by creating analytic workflows that go from ETL and interactive exploration to production. The company also makes it easier for its users to focus on their data by providing a fully managed, scalable, and secure cloud infrastructure that reduces operational complexity and total cost of ownership.
Brooke Wenig, Data Science Solutions Consultant at Databricks, and Siddarth Murching, Software Engineer at Databricks
Deep Learning has shown tremendous success, and as a rule, more data makes for better models. Eventually, however, we hit a limit on how much data a single machine can process. This necessitates a new way of training neural networks: in a distributed manner.
In this webinar, we walk through how to use TensorFlow™ and Horovod (an open-source library from Uber to simplify distributed model training) on Databricks to build a more effective recommendation system at scale. We will cover:
- The new Databricks Runtime for ML, shipped with pre-installed libraries such as Keras, TensorFlow, Horovod, and XGBoost to enable data scientists to get started with distributed Machine Learning more quickly
- The newly-released HorovodEstimator API for distributed, multi-GPU training of deep learning models against data in Apache Spark™
- How to make predictions at scale with deep learning pipelines
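The data-parallel pattern behind Horovod can be sketched in plain Python. This is a toy illustration, not the Horovod or HorovodEstimator API: each simulated worker computes gradients on its own data shard, and an allreduce-style average combines them before every identical weight update.

```python
import numpy as np

def local_gradient(w, X, y):
    # Gradient of mean squared error for a linear model on one worker's shard.
    return 2 * X.T @ (X @ w - y) / len(y)

def allreduce_mean(gradients):
    # Stand-in for Horovod's ring-allreduce: average gradients across workers.
    return np.mean(gradients, axis=0)

rng = np.random.default_rng(0)
true_w = np.array([3.0, -2.0])

# Split the training data into shards, one per simulated worker.
shards = []
for _ in range(4):
    X = rng.normal(size=(256, 2))
    shards.append((X, X @ true_w))

w = np.zeros(2)
for step in range(200):
    grads = [local_gradient(w, X, y) for X, y in shards]
    w -= 0.1 * allreduce_mean(grads)  # every worker applies the same averaged update

print(np.round(w, 2))  # converges toward true_w
```

Because every worker applies the same averaged gradient, all model replicas stay in sync, which is the property that lets Horovod scale training across GPUs and machines.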
Real-time analytics are crucial to many use cases. Apache Spark™ provides the framework and high-volume analytics needed to deliver answers from your streaming data. Join us in this webinar and see a demonstration of how to build IoT and Clickstream Analytics Notebooks in Azure Databricks. These Notebooks will use Python and SQL code to capture data from Azure Event Hubs and Azure IoT Hub, parse the data, and make it available to machine learning models. See how your organization can start taking advantage of your streaming data.
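The parse step at the heart of such a pipeline can be sketched in plain Python (the message shape and field names here are hypothetical, and the notebooks themselves use Spark's streaming APIs rather than this standalone code):

```python
import json

# Hypothetical device telemetry payload as it might arrive from an IoT hub.
raw = '{"deviceId": "sensor-42", "temperature": 21.7, "ts": "2018-05-01T12:00:00Z"}'

event = json.loads(raw)

# Flatten into the columns a downstream SQL query or ML model would consume.
row = (event["deviceId"], float(event["temperature"]), event["ts"])
print(row)
```

In the streaming setting, Spark applies the same kind of schema-driven parsing to every micro-batch of events as it arrives.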
With GDPR enforcement rapidly approaching on May 25, many companies are still trying to figure out how to comply with one of the regulation’s biggest pain points - data subject requests (DSRs). Under GDPR, data subjects (individuals) in the EU have the right to request information on what personal data is collected, how it is being used, and to have that data changed or erased.
For many organizations that rely on data lakes to store their big data, sifting through millions of files to locate and modify records for a DSR is, at a minimum, a massive effort, and doing so within the prescribed timelines is nearly impossible.
Fortunately, there’s a path forward. Through an optimized approach to data management, Databricks, powered by Apache Spark™, makes it easy to quickly find, edit, and erase data submerged deep within your data lake without disrupting your data pipelines.
Join this webinar to learn:
• The GDPR requirements of data subject requests
• The challenges big data and data lakes create for organizations
• How Databricks Delta, a powerful new offering within the Databricks Unified Analytics Platform, improves data lake management and makes it possible to quickly find and surgically remove or modify individual records
• Best practices for GDPR data governance
• Live demo on how to easily fulfill data requests with Databricks
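Conceptually, fulfilling an erasure DSR is a predicate delete over the lake: remove every record that matches the data subject's identifier. A toy illustration of that idea in plain Python, with hypothetical record fields (Databricks Delta expresses the same operation declaratively over lake files rather than in-memory lists):

```python
# Hypothetical user records as they might sit scattered across lake files.
records = [
    {"user_id": "u1", "email": "a@example.com", "purchases": 3},
    {"user_id": "u2", "email": "b@example.com", "purchases": 1},
    {"user_id": "u1", "email": "a@example.com", "purchases": 7},
]

def erase_subject(rows, user_id):
    # A DSR erasure is a predicate delete: keep everything that does
    # NOT match the data subject's identifier.
    return [r for r in rows if r["user_id"] != user_id]

remaining = erase_subject(records, "u1")
print(len(remaining))  # 1
```

The hard part at data-lake scale is not the predicate itself but locating and rewriting the affected files efficiently, which is the management problem the webinar addresses.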
Sandy is going to highlight some key aspects of the new Spark-as-a-Service offering in Azure from Databricks, leveraging the power of Databricks notebooks to showcase loading and cleaning data in SQL and Scala, exploration, and everything through to putting a model into production.
Azure Databricks is an Apache Spark™-based platform, providing the scale, collaborative workspace, and integration with your Azure environment that make it the best place to run your ML and AI workloads on Azure. This webinar will include an in-depth demo of key AI and ML use cases.
With 170+ global networks, Viacom is focused on providing an amazing audience experience to its billions of viewers around the world. Core to this strategy is leveraging big data and advanced analytics to offer the right content to the right audience and deliver it flawlessly on any device. To make this possible, Viacom set out to build a real-time, scalable data analytics platform on Apache Spark™.
Join this webinar to learn how Viacom overcame the complexities of Spark with Databricks and AWS to build an end-to-end scalable self-service insights platform that delivers on a wide range of analytics use cases.
This webinar will cover:
- The challenges Viacom faced building a scalable, real-time data insights and AI platform
- How they overcame these challenges with Spark, AWS and Databricks
- How they leverage a unified analytics platform for data pipelines, analytics and machine learning to reduce video start delays and improve content delivery with stream analytics at scale
- What it takes to create a data-driven culture with self-service analytics that meet the needs of business users, data analysts and data scientists
Learn the basics of Apache Spark™ on Azure Databricks. Designed by Databricks, in collaboration with Microsoft, Azure Databricks combines the best of Databricks and Azure to help customers accelerate innovation with one-click set up, streamlined workflows and an interactive workspace that enables collaboration between data scientists, data engineers, and business analysts.
This webinar will cover the following topics:
· RDDs, DataFrames, Datasets, and other fundamentals of Apache Spark.
· How to quickly set up Azure Databricks, relieving you of DataOps duties.
· How to use the Databricks interactive notebooks, which provide a collaborative space for your entire analytics team, and how you can schedule notebooks, immediately putting your work into production.
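The transformation/action distinction at the heart of the RDD and DataFrame fundamentals can be sketched with lazy generators in plain Python. This is a toy analogy, not the Spark API: "transformations" only build a pipeline, and nothing is computed until an "action" asks for a result.

```python
data = range(1, 6)

# "Transformations" are lazy: these generator expressions describe a pipeline
# but process nothing yet, much like rdd.map(...).filter(...).
squared = (x * x for x in data)
evens = (x for x in squared if x % 2 == 0)

# The "action" (here, sum) finally pulls data through the whole pipeline,
# analogous to rdd.reduce(...) or df.collect().
total = sum(evens)
print(total)  # 4 + 16 = 20
```

Laziness is what lets Spark see the whole pipeline before running it, so it can optimize and distribute the work across a cluster.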
Building multiple ETL pipelines is complex and time-consuming, making it an expensive endeavor. As the number of data sources and the volume of the data increase, ETL time also increases, negatively impacting when an enterprise can derive value from the data.
Join Prakash Chockalingam, Product Manager and data engineering expert at Databricks, to learn how to avoid the common pitfalls of data engineering and how the Databricks Unified Analytics Platform can ensure performance and reliability at scale to lower total cost of ownership (TCO).
In this webinar, you will learn how Databricks can help to:
- Remove infrastructure configuration complexity to reduce DevOps efforts
- Optimize your ETL data pipelines for performance without compromising reliability
- Unify data engineering and data science to accelerate innovation for the business
Data scientists and data engineers need a secure and scalable platform to collaborate on analytics. Register for this webinar and see how Azure Databricks provides a platform that enables teams to accelerate innovation, providing:
- A collaborative workspace to experiment with models and datasets, and then put jobs into action instantly.
- An automated infrastructure that enables you to autoscale compute and storage independently.
The live demo portion of the webinar will show how Azure Databricks can bring in streaming data, run it through a machine learning model, and then output the results to Power BI for visualization.
The upcoming Spark 2.3 release marks a big step forward in speed, unification, and API support.
Reynold Xin and Jules Damji from Databricks will walk through how you can benefit from the upcoming improvements:
- New DataSource APIs that enable developers to more easily read and write data for Continuous Processing in Structured Streaming.
- PySpark support for vectorization, giving Python developers the ability to run native Python code fast.
- Improved performance by taking advantage of NVMe SSDs.
- Native Kubernetes support, marrying the best of container orchestration and distributed data processing.
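The vectorization improvement refers to Pandas UDFs: instead of invoking a Python function once per row, Spark 2.3 hands the UDF whole batches of values at once via Apache Arrow. The core idea is sketched below with NumPy arrays rather than a live SparkSession (the function names are illustrative):

```python
import numpy as np

def scalar_udf(x):
    # Pre-2.3 style: called once per row, paying Python overhead on every call.
    return x * 2 + 1

def vectorized_udf(batch):
    # Pandas-UDF style: called once per batch; the arithmetic runs in C.
    return batch * 2 + 1

values = np.arange(5)
per_row = np.array([scalar_udf(v) for v in values])
per_batch = vectorized_udf(values)
print(per_batch.tolist())  # [1, 3, 5, 7, 9]
```

Both paths produce identical results; the batch form simply amortizes the Python call overhead across thousands of rows, which is where the speedup comes from.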
Enterprise data science teams are driving big innovations in machine learning, but this has put them under increased pressure to deliver more models, more frequently, and more rapidly.
In this webinar, Forrester VP & Principal Analyst, Mike Gualtieri, will share data on the top trends in machine learning and lay out what data science teams need to do in order to maximize their output.
Chris Robison, Head of Data Science at Overstock.com, and Craig Kelly, Group Product Manager at Overstock.com, will showcase how they utilized big data and machine learning to:
- Create a one-to-one personalized shopping experience.
- Decrease the cost of moving models to production by nearly 50%.
- Stand up new models 5x faster than before.