  • Databricks Product Demonstration
    Jason Pohl | Recorded: Mar 15 2017 | 45 mins
    This is a live demonstration of the Databricks virtual analytics platform.
  • Apache® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
    Richard Garris and Jules S. Damji | Recorded: Mar 9 2017 | 61 mins
    Apache Spark has rapidly become a key tool for data scientists to explore, understand and transform massive datasets and to build and train advanced machine learning models. The question then becomes, how do I deploy these models to a production environment? How do I embed what I have learned into customer-facing data applications?

    In this webinar, we will discuss best practices from Databricks on how our customers productionize machine learning models, do a deep dive with actual customer case studies, and show live tutorials of a few example architectures and code in Python, Scala, Java and SQL.
  • How Smartsheet operationalized Apache Spark with Databricks
    Francis Lau, Senior Director, Product Intelligence at Smartsheet | Recorded: Feb 23 2017 | 61 mins
    Apache Spark is red hot, but without the necessary skill sets it can be a challenge to operationalize, making it difficult to build a robust production data pipeline that business users and data scientists across your company can use to unearth insights.

    Smartsheet is the world’s leading SaaS platform for managing and automating collaborative work. With over 90,000 companies and millions of users, it helps teams get work done ranging from managing simple task lists to orchestrating the largest sporting events and construction projects.

    In this webinar, you will learn how Smartsheet uses Databricks to overcome the complexities of Spark to build their own analysis platform that enables self-service insights at will, scale, and speed to better understand their customers’ diverse use cases. They will share valuable patterns and lessons learned in both technical and adoption areas to show how they achieved this, including:

    - How to build a robust metadata-driven data pipeline that processes application and business systems data to provide a 360-degree view of customers and to drive smarter business systems integrations.
    - How to provide an intuitive and valuable “pyramid” of datasets usable by both technical and business users.
    - Their roll-out approach and the materials used for company-wide adoption, allowing users to go from zero to insights with Spark and Databricks in minutes.
  • Apache® Spark™ - The Unified Engine for All Workloads
    Tony Baer, Principal Analyst at Ovum | Recorded: Jan 12 2017 | 63 mins
    The Apache® Spark™ compute engine has gone viral – not only is it the most active Apache big data open source project, but it is also the fastest growing big data analytics workload, on and off Hadoop. The major reason behind Spark’s popularity with developers and enterprises is its flexibility to support a wide range of workloads including SQL query, machine learning, streaming, and graph analysis.

    This webinar features Ovum analyst Tony Baer, who will explain the real-world benefits to practitioners and enterprises when they build a technology stack based on a unified approach with Apache Spark.

    This webinar will cover:
    - Findings around the growth of Spark and diverse applications using machine learning and streaming.
    - The advantages of using Spark to unify all workloads, rather than stitching together many specialized engines like Presto, Storm, MapReduce, Pig, and others.
    - Use case examples that illustrate the flexibility of Spark in supporting various workloads.
  • Apache® Spark™ MLlib 2.x: Migrating ML Workloads to DataFrames
    Joseph K. Bradley and Jules S. Damji | Recorded: Dec 8 2016 | 61 mins
    In the Apache® Spark™ 2.x releases, Machine Learning (ML) is focusing on DataFrame-based APIs. This webinar is aimed at helping users take full advantage of the new APIs. Topics will include migrating workloads from RDDs to DataFrames, ML persistence for saving and loading models, and the roadmap ahead.
  • How to Evaluate Cloud-based Apache® Spark™ Platforms
    Nik Rouda - ESG Global | Recorded: Nov 16 2016 | 62 mins
    Since its release, Apache Spark has quickly become the fastest-growing big data processing engine. But few companies have the domain expertise and resources to build their own Spark-based infrastructure, often resulting in a mix of tools that are complex to stand up and time-consuming to maintain.

    There are several cloud-based platforms available that allow you to harness the power of Spark while reaping the advantages of the cloud. This webinar features ESG Global senior analyst Nik Rouda, who will share research and best practices to help decision makers evaluate the most popular cloud-based Apache Spark solutions and understand the differences between them.
  • Databricks for Data Engineers
    Prakash Chockalingam | Recorded: Oct 26 2016 | 49 mins
    Apache Spark has become an indispensable tool for data engineering teams. Its performance and flexibility have made ETL one of Spark’s most popular use cases. In this webinar, Prakash Chockalingam, a seasoned data engineer and PM, will discuss how Databricks allows data engineering teams to overcome common obstacles while building production-quality data pipelines with Spark. Specifically, you will learn:

    - Obstacles faced by data engineering teams while building ETL pipelines;
    - How Databricks simplifies Spark development;
    - A demonstration of key Databricks functionalities geared towards making data engineers more productive.
  • How Edmunds.com Leverages Apache® Spark™ on Databricks to Improve Customer Conve
    Shaun Elliott, Christian Lugo | Recorded: Oct 19 2016 | 60 mins
    Edmunds.com is a leading online car information and shopping marketplace serving nearly 20 million visitors to its website each month. With a 10x growth in data to 100s of TBs in the past four years, their engineering team was looking for ways to increase consumer engagement and conversion by improving the data integrity of Edmunds' website.

    Databricks simplifies the management of their Apache Spark infrastructure while accelerating data exploration at scale. Now they can quickly analyze large datasets to determine the best sources for car data on their website.

    In this webinar, you will learn:

    - Why Edmunds.com moved from MapReduce to Databricks for ad hoc data exploration.
    - How Databricks democratized data access across teams to improve decision making and feature innovation.
    - Best practices for doing ETL and building a robust data pipeline with Databricks.
  • How Omega Point Helps Investors Optimize their Portfolios with Apache® Spark™ on
    Omer Cedar, CEO, Omega Point and Eran Cedar, CTO, Omega Point | Recorded: Aug 18 2016 | 56 mins
    Omega Point uses big data analytics to enable investment professionals to reduce risk while increasing their returns. Databricks enables Omega Point to uncover performance drivers of investment portfolios using massive volumes of market data. Join us to learn how Omega Point built a next-generation investment analytics platform to isolate critical market signals from noise with a big data architecture built with Apache Spark on Databricks.
  • Databricks' Data Pipelines: Journey and Lessons Learned
    Burak Yavuz | Recorded: Aug 4 2016 | 58 mins
    With components like Spark SQL, MLlib, and Streaming, Spark is a unified engine for building data applications. In this talk, we will take a look at how we use Spark on our own Databricks platform.

    In this webinar, we will discuss the role and importance of ETL and the common features of an ETL pipeline. We will then show how the same ETL fundamentals are applied and (more importantly) simplified within Databricks' data pipelines. With Apache Spark as the foundation, we can simplify our ETL processes using one framework. With Databricks, you can develop your pipeline code in notebooks, create Jobs to productionize your notebooks, and use REST APIs to turn all of this into a continuous integration workflow. We will provide tips and tricks for doing ETL with Spark and share lessons learned from our pipeline.
