Introduction to Sparkling Water: Productionalizing H2O Models with Apache Spark

Spark is a powerful and robust open-source, general-purpose computation platform. It is an invaluable tool for users who want to munge, wrangle, clean and transform data before training a model. Spark Pipelines are also powerful constructs but have little support for easily plugging in advanced third-party machine learning libraries.

At the same time, many novice and advanced data scientists are leveraging the H2O machine learning platform, a highly distributable and tunable machine learning library. The H2O platform provides the MOJO format (Model Object, Optimized), which makes it easy to deploy trained models with a focus on scoring speed, traceability, exchangeability and backward compatibility.

In this webinar, Edgar will introduce H2O Sparkling Water, the glue between Spark and the H2O ML platform, which lets users seamlessly incorporate advanced data science libraries into their Spark environments. We will demonstrate the creation of Spark pipelines that integrate H2O ML models, and their deployment using Scala or Python. We will use H2O’s AutoML algorithm for automatic model selection and ensembling, and show how to load the resulting production-grade model into a Spark pipeline for deployment.
Recorded Feb 20 2020 55 mins
Presented by
Edgar Orendain, Software Engineer, H2O.ai

  • Enhancing Spark with H2O's Random Grid Search and AutoML using Sparkling Water May 7 2020 7:00 pm UTC 60 mins
    Jakub Háva, Team Lead and Senior Software Engineer, H2O.ai
    Learn more about how you can integrate large-scale data preprocessing with machine learning using Sparkling Water. Sparkling Water enables distributed training of H2O-3 models on Apache Spark clusters. It also allows trained H2O-3 and Driverless AI models to be used inside Apache Spark. We will demonstrate model training together with hyper-parameter tuning (Cartesian and random grid search with a time constraint) of various algorithms, as well as AutoML, which trains a meta-model combining different algorithms, hyper-parameter search and stacking (an ensemble method), all using the Spark Pipeline API. We will also demonstrate how target encoding can be used with the Sparkling Water API.

    What will users learn:
    - How to use H2O's GridSearch in a Sparkling Water environment
    - How to use AutoML in a Sparkling Water environment
    - How to put the trained models into production
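
    The difference between the Cartesian and random grid search mentioned above can be sketched in plain Python. This is a minimal illustration, not the Sparkling Water API: the function names, the hyperparameter names in `space`, and the `max_models` / `max_runtime_secs` budgets are all assumptions chosen for the example.

```python
import itertools
import random
import time


def cartesian_grid(space):
    """Yield every combination of the hyperparameter values in `space`."""
    names = list(space)
    for values in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def random_grid(space, max_models, max_runtime_secs, seed=0):
    """Yield combinations in random order until a model or time budget is hit."""
    rng = random.Random(seed)
    combos = list(cartesian_grid(space))
    rng.shuffle(combos)
    start = time.monotonic()
    for index, combo in enumerate(combos):
        if index >= max_models or time.monotonic() - start > max_runtime_secs:
            break
        # A real search would train and score a model for `combo` here.
        yield combo


# Hypothetical search space: 3 * 2 * 2 = 12 combinations in total.
space = {"max_depth": [3, 5, 7], "learn_rate": [0.01, 0.1], "ntrees": [50, 100]}

all_combos = list(cartesian_grid(space))
sampled = list(random_grid(space, max_models=4, max_runtime_secs=60))
```

    A Cartesian search visits every combination, so its cost grows multiplicatively with each added hyperparameter. A random search draws from the same space but stops when its budget runs out, which is what makes a time constraint practical.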
  • Key Terms and Ideas in Responsible AI Apr 23 2020 6:00 pm UTC 60 mins
    Benjamin Cox, Product Marketing Manager at H2O.ai
    As fields like explainable AI and ethical AI have continued to develop in academia and industry, we have seen a litany of new methodologies that can be applied to improve our ability to trust and understand our machine learning and deep learning models. As a result, we’ve seen several buzzwords emerge, such as responsible AI, explainable AI (XAI), machine learning interpretability (MLI), and ethical AI.

    In this webinar, we will explore and define these newish terms as H2O.ai sees them, in hopes of fostering discussions between machine learning practitioners and researchers and all the diverse types of professionals (e.g., social scientists, lawyers, risk specialists) it takes to make machine learning projects successful. We’ll close by discussing responsible machine learning as an umbrella term and by asking for your feedback.

    What you'll learn:
    - New methodologies to improve our ability to trust and understand our machine learning and deep learning models
    - New terms and ideas emerging out of the explainable AI and ethical AI fields
    - The concept of Responsible AI as an umbrella term for these new terms and ideas

  • Using Domain Traffic to Identify Malicious Behavior in Cybersecurity Apr 14 2020 3:00 am UTC 75 mins
    Ashrith Barthur
    In this talk, we focus on cybersecurity and build an AI solution that identifies malicious domains being accessed from your organizational network. Why is this important? Domains are fundamental to malicious behavior: they enable data exfiltration, command and control, and PII theft. Identifying and blocking malicious domains therefore becomes the first step in breaking the kill chain.

    Here we explain the fundamental design and approach of modeling malicious behavior in domains, and an application that is capable of classifying malicious domains. We also show how one can provide a system for SOC operators to review the output and make a quick decision.

    Speaker's Bio:

    Ashrith Barthur:

    Ashrith Barthur is the security scientist designing anomaly detection algorithms at H2O.ai. He recently graduated from the Center of Education and Research in Information Assurance and Security (CERIAS) at Purdue University with a Ph.D. in information security. He specialized in anomaly detection on networks under the guidance of Dr. William S. Cleveland. He tries to break into anything that has an operating system, and sometimes into things that don’t. He has been christened “The Only Human Network Packet Sniffer” by his advisors. When he is not working, he swims and bikes long distances.
  • Your AI Transformation Part 2: Real World Use Cases Apr 9 2020 6:00 pm UTC 60 mins
    Ingrid Burton, CMO, H2O.ai and Benjamin Cox, Product Marketing Manager, H2O.ai
    AI is unlocking new potential for every enterprise. Organizations are using AI and machine learning technology to inform business decisions, predict potential issues, and provide more efficient, customized customer experiences. We will walk through some real use cases we have been a part of across a variety of industries such as finance, retail, and healthcare as well as the impact AI brought to these businesses.

    A few of the company use cases we will discuss:
    - AT&T
    - PwC
    - Bill.com
    - G5

    H2O.ai is a visionary leader in AI and machine learning and is on a mission to democratize AI for everyone. We believe that every company can become an AI company, not just the AI Superpowers. We are empowering companies with our leading AI and Machine Learning platforms, our expertise, experience and training to embark on their own AI journey to become AI companies themselves. All companies in all industries can participate in this AI Transformation.

    Tune into this webinar to learn how companies are transforming their business with the power of AI and where to start.
  • H2O.ai in the Cloud Apr 2 2020 6:00 pm UTC 60 mins
    Vinod Iyengar, VP of Customer Success and Product, H2O.ai and Rafael Coss, Community Maker, H2O.ai
    From the earliest days, H2O.ai has been a visionary Silicon Valley open source software company that created and reimagined what is possible. Our vision has been to democratize AI for everyone, not just a select few. We believe in making every company an AI company and in giving organizations access to innovative technologies as they embark on an AI transformation.

    With H2O Driverless AI, we believe we have set the standard in automatic machine learning by giving developers a simple way to deploy models on a cloud-agnostic platform with rich explainability and visualizations, strong NLP and time series capabilities and exemplary customer support.

    Join this webinar where we’ll focus on H2O.ai offerings in the cloud and how you can get started on your AI transformation journey today.

  • The 5 Key AI Takeaways for Today's C-Suite Recorded: Mar 25 2020 61 mins
    Parul Pandey
    This discussion will explore real-world examples and how to democratize AI in your organization.

    1. Build a data science culture
    2. Ask the right questions
    3. Connect to the community
    4. Technology considerations
    5. Trust in AI
  • Why Auto ML is Key to Your AI Transformation Recorded: Mar 19 2020 63 mins
    Rafael Coss, Community Maker, H2O.ai
    Automatic Machine Learning platforms are critical as companies embark on an AI transformation. Data scientists are in short supply for all but the largest technology companies. H2O Driverless AI is considered one of the most visionary and leading automatic machine learning platforms in the market today. With Driverless AI, expert and novice data scientists, data engineers, domain scientists, mathematicians, and statisticians in all businesses can develop highly accurate models that are ready to deploy.

    H2O Driverless AI delivers a robust set of capabilities, including:
    - Automatic feature engineering
    - Automatic pipeline generation for model scoring
    - Automatic visualization
    - Bring your own recipe
    - Machine learning interpretability
    - Model selection and deployment
    - Natural Language Processing
    - Time-series

    Learn and watch how you can use this platform to train and deploy models to get results faster.
  • Your AI Transformation Recorded: Mar 12 2020 62 mins
    Ingrid Burton, CMO, H2O.ai and Benjamin Cox, Product Marketing Manager, H2O.ai
    AI is unlocking new potential for every enterprise. Organizations are using AI and machine learning technology to inform business decisions, predict potential issues, and provide more efficient, customized customer experiences. The results can enable a competitive edge for the business.

    H2O.ai is a visionary leader in AI and machine learning and is on a mission to democratize AI for everyone. We believe that every company can become an AI company, not just the AI Superpowers. We are empowering companies with our leading AI and Machine Learning platforms, our expertise, experience and training to embark on their own AI journey to become AI companies themselves. All companies in all industries can participate in this AI Transformation.

    Tune into this webinar to learn how companies are transforming their business with the power of AI and where to start.
  • Solving Real-World Problems with Machine Learning Recorded: Mar 10 2020 60 mins
    Ashrith Barthur, Sandip Sharma
    Description:

    For much of the 21st century, we have seen machine learning being used by virtually every Fortune 500 company. But do we know how it is used? Is it a tool? Is it a template? Is it a design, a concept or a way of thinking? Or is it all of these coming together to solve problems in the world, one at a time?

    In this webinar, we showcase how we are solving the problem of identifying false positives in money laundering alerts and optimizing those alerts with machine learning. Although machine learning is the kernel of the entire solution, it takes a back seat here: we focus on how a real-world problem is solved step by step, from end to end.

    Speaker's Bio:


    Sandip Sharma:

    Sandip is an entrepreneur and technology leader with a balance of work experience in both the financial services industry and government. With 20+ years of experience in business IT, Sandip thrives on developing and implementing innovative AI/ML solutions on emerging digital technologies for Whole-of-Government and the financial services industry. He has a Master's degree in Business IT – Financial Services from Singapore Management University (SMU).
  • Towards Responsible AI Recorded: Mar 5 2020 56 mins
    Benjamin Cox, Product Marketing Manager, H2O.ai and Patrick Hall, Sr. Director of Product, H2O.ai
    AI and machine learning are front and center in the news on a daily basis. The initial response to the need to "explain" or understand a model has centered on the concept of Explainable AI, the technological answer to understanding and trusting a model, which uses advanced techniques such as LIME, Shapley values, disparate impact analysis and more.

    H2O.ai has been innovating in the area of explainable AI for the last three years. However, over the last year, it has become clear that technology-driven Explainable AI is not enough.

    Companies, researchers and regulators would agree that Responsible AI encompasses not just the ability to understand and trust a model, but includes the ability to address ethics in AI, regulation in AI, and the human side of how we move forward with AI, well, in a responsible way.

    Tune into this webinar to learn about the factors that make up Responsible AI and how H2O.ai can help.
  • What's New in H2O Driverless AI Recorded: Mar 4 2020 60 mins
    Sairaam Varadarajan
    H2O Driverless AI employs the techniques of expert data scientists in an easy-to-use platform that helps scale your data science efforts. Driverless AI empowers data scientists to work on projects faster, using automation and state-of-the-art computing power from GPUs to accomplish in minutes tasks that used to take months. In this webinar, we'll highlight what's new in Driverless AI.
  • What's New in H2O Driverless AI Recorded: Feb 27 2020 55 mins
    Arno Candel, CTO at H2O.ai
    H2O Driverless AI employs the techniques of expert data scientists in an easy-to-use platform that helps scale your data science efforts. Driverless AI empowers data scientists to work on projects faster, using automation and state-of-the-art computing power from GPUs to accomplish in minutes tasks that used to take months. In this webinar, we'll highlight what's new in Driverless AI.

    Arno's bio:
    Arno Candel is the Chief Technology Officer at H2O.ai. He is the main committer of H2O-3 and Driverless AI and has been designing and implementing high-performance machine-learning algorithms since 2012. Previously, he spent a decade in supercomputing at ETH and SLAC and collaborated with CERN on next-generation particle accelerators.

    Arno holds a PhD and Masters summa cum laude in Physics from ETH Zurich, Switzerland. He was named “2014 Big Data All-Star” by Fortune Magazine and featured by ETH GLOBE in 2015. Follow him on Twitter: @ArnoCandel.
  • An Introduction to Machine Learning for All of Us Recorded: Feb 20 2020 37 mins
    Rafael Coss, Director of Technical Marketing, H2O.ai
    A Beginner's Guide to Automatic Machine Learning

    Every company can be an artificial intelligence (AI) company. Machine learning is a subset of AI that has greatly expanded the applications and adoption of AI, but it has often required special skills. In this session, learn the basics of ML and how automatic ML is making these tools accessible to a wider community of people.

  • Winning Solutions for Analytics: Reducing Lower Body Injuries in the NFL Recorded: Feb 13 2020 56 mins
    John Miller, Customer Data Scientist, H2O.ai
    It’s a great thing when someone hands you a well-defined machine learning problem: nice clean data, a scoring metric, and a representative test set. But the reality is often quite different. Data science teams must decide where to focus and how to apply machine learning in the best way. And when it’s time to report findings, it takes strong communication skills to be heard and get a decision.

    In this webinar, John will talk about how he applied these considerations to win two analytics challenges on Kaggle sponsored by the NFL: NFL 1st and Future - Analytics and NFL Punt Analytics Competition. Analytics challenges supply data and ask participants to provide recommendations and findings. Unlike a typical Kaggle machine learning competition, there is no objective metric or score. Reports are evaluated by a panel of judges on how well they address the issue.

    After this webinar you will leave with:
    - Methods to identify and prioritize opportunities for analysis
    - How to apply machine learning in the context of an analytics problem
    - Tips on communicating with a business audience
    - Techniques to optimize the readability of Jupyter notebooks
  • H2O Driverless AI for CDS: Early Detection of Sepsis in the ICU Recorded: Feb 6 2020 60 mins
    Niki Athanasiadou MRes, PhD, Customer Data Scientist, H2O.ai
    Clinical decision support (CDS) systems are patient-focused alerts, reminders and clinical guidelines that help healthcare providers improve patient outcomes and enhance healthcare workflows. AI-backed CDS offers the opportunity for more ‘intelligent’ systems that can detect risk of disease more accurately and at an earlier time, when interventions might be more effective.

    In the use case presented in this webinar, we will use H2O.ai’s award-winning automatic machine learning platform, H2O Driverless AI, to detect patient-specific risk of sepsis in the Intensive Care Unit (ICU) six hours before it is actually diagnosed. As features, we will use patient-specific vital signs, laboratory tests and basic demographic information, all typically available in the ICU. Finally, by delving into the advanced model explainability capabilities available within Driverless AI, we will demonstrate how Driverless AI offers insights into possible paths for intervention that trained medical personnel can take advantage of.

    In this webinar, you will learn:
    - How to prepare ICU time-dependent data for machine learning.
    - How to handle imbalanced patient data to train accurate models for medical use.
    - How to leverage machine learning explainability techniques for CDS.
  • Time Series and AutoML with H2O Driverless AI (Séries Temporelles et AutoML avec H2O Driverless AI) Recorded: Jan 28 2020 58 mins
    Badr Chentouf, Senior Solution Engineer, H2O.ai
    This webinar will present an introduction to using Driverless AI, the automatic machine learning platform that lets data scientists of all levels accelerate their data science projects.

    In this webinar, we will focus on time series, a problem common across industries: predicting consumption, sales, failures and so on over a given time horizon. We will see how Driverless AI addresses this problem and improves accuracy and speed through AutoML techniques.
  • Fairness in AI and Machine Learning Recorded: Jan 23 2020 48 mins
    Navdeep Gill, H2O.ai
    This webinar introduces methods that can uncover discrimination in your data and predictive models, including the adverse impact ratio (AIR), false positive and false negative rates, marginal effects, and standardized mean difference. Once discrimination is identified in a model, new models with less discrimination can usually be found, typically by more judicious feature selection or by tweaking hyperparameters. Mitigating discrimination in ML is important for both consumers and operators of ML. Consumers of ML deserve equitable decisions and predictions and operators of ML want to avoid reputational and regulatory damages.

    If you are a data scientist or analyst working on decisions that affect people's lives, then this presentation is for you!
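
    Three of the discrimination measures named above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not code from the webinar: the function names and the toy data are made up, and the standardized mean difference is scaled here by the standard deviation of all scores, although other denominators are also in use.

```python
from statistics import mean, pstdev


def adverse_impact_ratio(protected_decisions, reference_decisions):
    """Favorable-outcome rate of the protected group over that of the reference group."""
    return mean(protected_decisions) / mean(reference_decisions)


def standardized_mean_difference(protected_scores, reference_scores):
    """Difference in mean score, scaled by the standard deviation of all scores."""
    pooled = list(protected_scores) + list(reference_scores)
    return (mean(protected_scores) - mean(reference_scores)) / pstdev(pooled)


def false_positive_rate(y_true, y_pred):
    """Share of actual negatives (0) that were predicted positive (1)."""
    predictions_for_negatives = [p for t, p in zip(y_true, y_pred) if t == 0]
    return sum(predictions_for_negatives) / len(predictions_for_negatives)


def false_negative_rate(y_true, y_pred):
    """Share of actual positives (1) that were predicted negative (0)."""
    predictions_for_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(1 - p for p in predictions_for_positives) / len(predictions_for_positives)


# Toy data (hypothetical): 1 = favorable decision, 0 = unfavorable decision.
air = adverse_impact_ratio([1, 0, 0, 1, 0], [1, 1, 0, 1, 1])
smd = standardized_mean_difference([0.2, 0.4], [0.6, 0.8])
fpr = false_positive_rate([0, 0, 1, 1, 0, 0], [1, 0, 1, 0, 1, 0])
fnr = false_negative_rate([0, 0, 1, 1, 0, 0], [1, 0, 1, 0, 1, 0])
```

    An adverse impact ratio well below 1 means the protected group receives favorable outcomes less often than the reference group. Computing the error rates separately for each group and comparing them shows whether the model's mistakes fall unevenly.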
  • Productionalizing H2O Driverless AI Models Recorded: Jan 16 2020 57 mins
    Nicholas Png, H2O.ai
    Training a good machine learning model is an extremely difficult process. Good data science practitioners must first determine whether the data they have is useful at all. Next, they must decide whether to cleanse or munge the data to put it into the proper format for the machine learning algorithm they plan to use. Then they might need to create new features from the original data that provide a better signal for predicting the target value, and consider which hyperparameters to use when training the algorithm, to name a few steps.

    However, this is only the first step in creating a useful model. The next step, and one that is arguably just as important, is productionalizing the model. In many cases, companies have strict rules about how a model must behave or what kind of infrastructure a model must run on in production. As an example, some companies require only Java models, and data scientists who produced the model in R or Python must then pass their code to a data engineer, who will take a month or two to translate the model from the original language to Java. This kind of restriction is often the major barrier to entry when it comes to pushing new machine learning models to production.
    Join our webinar to learn about common approaches to productionalizing models, and how to apply these practices to models produced by H2O Driverless AI.

    Join our webinar to learn:
    • Some common challenges associated with productionalizing models in different infrastructures
    • Good practices when productionalizing models, specifically related to models produced by Driverless AI
    • Some generic examples of how to productionalize a model
    • Time permitting: a live coding exercise to productionalize a Driverless AI Mojo
  • Responsible Machine Learning with H2O Driverless AI Recorded: Jan 9 2020 63 mins
    Navdeep Gill, H2O.ai
    Usage of AI and machine learning models is likely to become more commonplace as larger swaths of the economy embrace automation and data-driven decision-making. While these predictive systems can be quite accurate, they have been treated as inscrutable black boxes in the past, that produce only numeric predictions with no accompanying explanations. Unfortunately, recent studies and recent events have drawn attention to mathematical and sociological flaws in prominent weak AI and ML systems, but practitioners usually don’t have the right tools to pry open machine learning black-boxes and debug them.

    This presentation introduces several new approaches that increase transparency, accountability, and trustworthiness in machine learning models. If you are a data scientist or analyst and you want to explain a machine learning model to your customers or managers (or if you have concerns about documentation, validation, or regulatory requirements), then this presentation is for you!
Democratize AI
H2O.ai is the maker of H2O, the world's best machine learning platform and Driverless AI, which automates machine learning. H2O is used by over 200,000 data scientists and more than 18,000 organizations globally. H2O Driverless AI does auto feature engineering and can achieve 40x speed-ups on GPUs.

  • Title: Introduction to Sparkling Water: Productionalizing H2O Models with Apache Spark
  • Live at: Feb 20 2020 7:00 pm
  • Presented by: Edgar Orendain, Software Engineer, H2O.ai