Enhancing Spark with H2O's Random Grid Search and AutoML using Sparkling Water

Learn how you can integrate large-scale data preprocessing with machine learning using Sparkling Water. Sparkling Water enables training H2O-3 models in a distributed manner on Apache Spark clusters. It also allows trained H2O-3 and Driverless AI models to be used inside Apache Spark. We will demonstrate model training together with hyper-parameter tuning (Cartesian and Random Grid Search with a time constraint) of various algorithms, as well as AutoML, which combines different algorithms, hyper-parameter search, and stacking (an ensemble method) to train a meta model, all using the Spark Pipeline API. We will also demonstrate how target encoding can be used with the Sparkling Water API.

What you will learn:
- How to use H2O's Grid Search in a Sparkling Water environment
- How to use AutoML in a Sparkling Water environment
- How to put the trained models into production
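
The difference between Cartesian and random grid search under a time constraint, as mentioned in the abstract, can be sketched in plain Python. This is only an illustration and does not call Sparkling Water or H2O: the helper functions, the toy scoring function, and the example values are hypothetical, and the names `max_depth`, `learn_rate`, `ntrees`, `max_runtime_secs`, and `max_models` are borrowed here purely as labels.

```python
import itertools
import random
import time


def cartesian_grid(space):
    """Yield every combination of hyper-parameter values (Cartesian search)."""
    keys = list(space)
    for values in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


def random_grid_search(space, score_fn, max_runtime_secs, max_models=None, seed=42):
    """Evaluate combinations in random order until a budget is exhausted.

    Stops when the time budget or the optional model-count budget is hit.
    Returns (score, params) pairs sorted by score, lowest (best) first.
    """
    rng = random.Random(seed)
    candidates = list(cartesian_grid(space))
    rng.shuffle(candidates)  # random order instead of exhaustive sweep order
    deadline = time.monotonic() + max_runtime_secs
    results = []
    for params in candidates:
        if time.monotonic() >= deadline:
            break  # time constraint reached
        if max_models is not None and len(results) >= max_models:
            break  # model-count constraint reached
        results.append((score_fn(params), params))
    results.sort(key=lambda pair: pair[0])
    return results


def toy_score(params):
    """Hypothetical stand-in for a validation error; lower is better."""
    return abs(params["max_depth"] - 5) + 10 * abs(params["learn_rate"] - 0.1)


# Illustrative search space (values are assumptions, not from the webinar).
space = {
    "max_depth": [3, 5, 7, 9],
    "learn_rate": [0.01, 0.1, 0.3],
    "ntrees": [50, 100],
}

all_combos = list(cartesian_grid(space))  # exhaustive: 4 * 3 * 2 combinations
results = random_grid_search(space, toy_score, max_runtime_secs=5.0, max_models=10)
best_score, best_params = results[0]
print(len(all_combos), len(results), best_params)
```

A Cartesian search trains one model per combination, so its cost grows multiplicatively with each added parameter; the random variant trades completeness for a predictable budget.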
Recorded May 7 2020 51 mins
Presented by
Jakub Háva, Team Lead and Senior Software Engineer, H2O.ai

  • How to be Successful Right Away in Your New Data Science Role Oct 6 2020 6:00 pm UTC 60 mins
    Tom Ott, Senior Customer Solutions Engineer at H2O.ai
    Congratulations on your new role as a Data Scientist! A rewarding career as a Data Scientist goes beyond just coding in Python and R. It’s not a Venn diagram; rather, it’s your first step as an analytic professional in an ever-changing environment. Success as a Data Scientist is less about getting the best AUC and more about creatively solving business problems.

    In this webinar, you will learn:
    - Are Data Scientists Born or Made?
    - Which is better? Coding or No Code?
    - Open Source vs Closed Source
    - Aligning your work to a business problem

    Tom Ott, Senior Customer Solutions Engineer at H2O.ai
  • How to Detect Fraud Quicker with AI Oct 1 2020 6:00 pm UTC 60 mins
    Ashrith Barthur, Principal Security Scientist at H2O.ai
    Electronic fraud is prevalent in almost every walk of life these days. Given the direction in which society is moving, monetary instruments are only going to become more digital, and transactions more electronic. In this almost-exponential growth, fraudsters have a leg up, because the legacy systems fighting fraud are old and have not accounted for newer fraudulent behaviors, while newer systems built on ML models can be accurate but slow. To catch fraud in an acceptable time, systems have to be fast and quickly adaptable to changing fraud methods.

    In this talk we discuss different methods that can make your AI systems faster and more valuable for identifying fraud while maintaining a high level of accuracy.

    Some of the methods we will discuss are:
    - Different ways of implementing the models
    - Variations in hyper-parameters of models
    - Highly accurate, valuable features that are modifiable

    At the end of this webinar you will understand how to:
    - Build better features for fraud
    - Build models and model implementations that speed up the decision

    - Ashrith Barthur, Principal Security Scientist at H2O.ai
  • Building Machine Learning Models at Scale with Sparkling Water Recorded: Sep 17 2020 49 mins
    Elena Boiarskaia, Senior Solutions Engineer at H2O.ai
    H2O-3 is an open source, in-memory, distributed machine learning platform that is optimized to build machine learning models on big data and easily deploy them in an enterprise environment with a MOJO. Spark is a powerful distributed cluster-computing framework for running large-scale data processing workloads. Sparkling Water combines the best of both worlds, by seamlessly integrating the H2O-3 ML library to run on top of Spark for building fast and accurate predictive models on big data at scale.

    In this webinar, you will learn about:
    - Leveraging the power of H2O-3 and Spark to build scalable machine learning models
    - Embedding Sparkling Water models inside SparkML pipelines
    - End-to-end Sparkling Water use cases from data preparation to model deployment
  • Enterprise Architect Guide to H2O Open Source for Model Building and Deployment Recorded: Sep 10 2020 59 mins
    Gregory Keys PhD, Senior Solutions Architect at H2O.ai
    The H2O Open Source Machine Learning Platform (H2O-3) empowers data scientists to train ML models at massive scale (GBs to TBs of training data) using familiar languages and IDEs on existing or greenfield distributed compute environments. Data scientists export a standardized scoring artifact of a trained model and dev-ops teams deploy these as low latency prediction software to diverse production systems (Rest endpoint, RDBMS, Kafka, etc) all via existing SDLC processes. Numerous top companies across verticals have leveraged the power and simplicity of H2O-3 to become innovative AI companies while adhering to strict enterprise security and governance provided by the platform. Let’s learn how.

    In this webinar, you will learn:
    - The value of H2O Open Source from the data scientist’s perspective
    - The value of H2O Open Source from the dev-ops and business owner perspective
    - Technical architecture of H2O Open Source ML Platform
    - Enterprise choices and best practices in implementing H2O Open Source on distributed compute for model building
    - Enterprise choices and best practices in deploying trained models to diverse production software environments
    - Enterprise security and governance controls of H2O Open Source

    Presenter: Gregory Keys PhD, Senior Solutions Architect at H2O.ai
  • How Expert Data Science Teams Use AutoML to Increase Scalability and Efficiency Recorded: Sep 3 2020 61 mins
    Travis Couture, Solutions Engineer at H2O.ai
    AutoML is helping lower the barrier to entry and accelerate the time it takes to build and deploy machine learning models. H2O's Driverless AI and Open Source library both feature AutoML capabilities that range from hyperparameter selection to advanced feature engineering and model ensembling. However, it is important to note that AutoML is not intended to replace expert data science teams but can instead be used by such teams to augment their scalability and efficiency while still allowing them to maintain control over the end to end machine learning lifecycle.

    In this webinar, you will learn:
    - Evolution of AutoML
    - How expert data science teams can leverage AutoML
    - How H2O Driverless AI allows expert Data Science teams to maintain control when using AutoML

    Travis Couture, Solutions Engineer at H2O.ai
  • AI Transformation: Raise a Forest of AI Apps, Not Just a Tree Recorded: Aug 27 2020 60 mins
    Vinod Iyengar, VP Customer Success and Product at H2O.ai and Rafael Coss, Community and Partner Maker at H2O.ai
    AI is a central part of every CIO’s strategy, and in 2020 we’re seeing every enterprise looking to use data and AI to run its business. An AI-driven enterprise is not only building machine learning models but infusing and integrating AI across every application in the enterprise. This requires internal and external applications to be AI-ready and agile to the demands of the enterprise and evolving machine learning models.

    In this webinar, you will learn:

    - The different stages of maturity as it relates to data science and analytics in an organization and consequently their AI maturity itself.
    - How an enterprise can move from just using data science to improve operational efficiency to actually creating new lines of business or revenue generation opportunities using their data assets.
    - How all of this can lead to more data, better models and ultimately help you build a really defensible moat against your competitors while continuing to delight your own customers to improve your margins.
    - What components are required to build this AI-aware organization and how to get there as soon as possible.
    - A demo of a modern AI application and a review of some of the key elements that make such applications native and agile.

    Vinod Iyengar, VP Customer Success and Product at H2O.ai
    Rafael Coss, Community and Partner Maker at H2O.ai
  • Responsible Automation: Towards Interpretable & Fair AutoML Recorded: Aug 13 2020 61 mins
    Erin LeDell, Chief Machine Learning Scientist at H2O.ai
    Automatic Machine Learning (AutoML) is a subfield of machine learning which aims to automate the training & tuning of machine learning models. One of the main goals of an AutoML tool is to train the “best” model possible in the least amount of computation time, with zero/minimal configuration by the user. AutoML tools reduce the expertise required for practitioners to train powerful machine learning models, which has expanded and accelerated the application of machine learning to problems in both academic research and industry. AutoML greatly speeds up the workflow and efficiency of even the most experienced data scientist.

    As automation and use of machine learning increases, in particular with the proliferation of open source AutoML tools, there’s an increased risk in misuse of, or harm by, machine learning models used in real world applications. In order to reduce the risk of harmful models being deployed, machine learning tools, and especially AutoML tools, can offer easy-to-use or automated interpretability and algorithmic fairness methods that can be used to evaluate and probe machine learning models. Interpretability and fairness methods should always be applied to machine learning models before they are deployed into production where they can make or influence important decisions affecting people’s lives.

    In this session, you will learn about:

    - Automated Machine Learning and open source H2O AutoML
    - Interpretability methods for H2O models
    - Algorithmic fairness (disparate impact) for H2O models
    - Demo using U.S. Home Mortgage Disclosure Act (HMDA) data
  • From GLM to GBM: The Future of AI in Lending and Insurance Recorded: Aug 6 2020 62 mins
    Patrick Hall, Advisory Consultant at H2O.ai and Michael Proksch, Senior Director, Customer Data Science at H2O.ai
    Insurance and credit lending are highly regulated industries that have relied heavily on mathematical modeling for decades. In order to provide explainable results for their models, data scientists and statisticians in both industries relied heavily on generalized linear models (GLMs). However, new machine learning algorithms like GBMs are not only more sophisticated estimators of risk, but due to a Nobel-laureate breakthrough known as Shapley values, they are now seemingly just as interpretable as traditional GLMs. More nuanced risk estimation means fewer payouts and write-offs for policy and credit issuers, but it also means a broader group of customers can participate in mainstream insurance and credit markets.

    In this webinar, you will learn about:
    - The Advantages of Machine Learning vs. Linear Models
    - Why you should think about the AI shift in perspective
    - How to move to new ML Methods (e.g. GBM)

    Patrick Hall, Advisory Consultant at H2O.ai
    Michael Proksch, Senior Director, Customer Data Science at H2O.ai
  • Further Exploration into Model Explainability with H2O Driverless AI 1.9 Recorded: Jul 30 2020 29 mins
    Benjamin Cox, Director of Product Marketing at H2O.ai
    With the latest release of H2O Driverless AI (1.9.0), we have added a litany of new features to enhance the user experience and empower companies to build models in the most responsible and transparent manner. With the addition of multiple fairness metrics, such as Disparate Impact Analysis, and leading-edge explainable modeling methods, such as Explainable Neural Networks (XNN) and GA2M, Driverless AI users are equipped to further explore model explainability techniques within the platform.

    In this webinar, you will learn about:
    - Disparate Impact Analysis and Standard Mean Difference
    - Exporting Decision tree model rules as txt & kernel explainer for Shapley Values
    - XNNs & GA2M

    Benjamin Cox, Director of Product Marketing at H2O.ai
  • State of The Art NLP Models in H2O Driverless AI 1.9 Recorded: Jul 23 2020 30 mins
    SRK, Kaggle Grandmaster/Data Scientist at H2O, Max Jeblick, Data Scientist, H2O, and Trushant Kalyanpur, Data Scientist, H2O
    H2O Driverless AI brings the best practices of the world’s leading data scientists to your team to build high-quality production-ready models in hours, not weeks or months. Driverless AI users can now use state-of-the-art contextual pretrained language models for their text-related datasets. Advanced or novice data scientists can build models like BERT, DistilBERT, XLNet, and RoBERTa with the power of full Driverless AI automation.

    In this webinar, you will learn about:
    - NLP features in Driverless AI 1.9
    - Demo of how to use BERT-like models as modeling algorithms or for feature transformation
    - Custom BERT recipes for domain specific problems

    Sudalai Rajkumar (SRK), Kaggle Grandmaster and Data Scientist at H2O.ai
    Maximilian Jeblick, Kaggle Master and Data Scientist at H2O.ai
    Trushant Kalyanpur, Data Scientist at H2O.ai
  • More Use Cases and More Value with Automated Computer Vision Modeling Recorded: Jul 16 2020 32 mins
    Dan Darnell, VP of Product Marketing at H2O.ai and Yauhen Babakhin, Kaggle Competitions Grandmaster, Data Scientist at H2O.ai
    H2O Driverless AI brings the best practices of the world’s leading data scientists to your team to build high-quality production-ready models in hours, not weeks or months. Now, Driverless AI helps you solve more use cases with more data types using automatic machine learning (AutoML) for classification and regression with images. Users can now include images with other data types in a broader dataset or build models with images alone. Advanced or novice data scientists can build image-based models using state-of-the-art techniques, including TensorFlow CNNs, all with the power of full Driverless AI automation.

    In this webinar, you will learn about:
    - Visual AI features in Driverless AI 1.9
    - Image modeling use cases, both with images combined with other data types and with images alone
    - The Visual AI roadmap for Driverless AI
    - How to deploy image models as low latency MOJOs

    Dan Darnell, VP of Product Marketing at H2O.ai
    Yauhen Babakhin, Kaggle Competitions Grandmaster and Data Scientist at H2O.ai
  • Accelerate Your Enterprise AI on Snowflake with H2O.ai Recorded: Jul 14 2020 59 mins
    Yves Laurent, H2O.ai, Eric Gudgion, H2O.ai, Chris Pouliot, Snowflake and Isaac Kunen, Snowflake
    Organizations are looking to accelerate the adoption of machine learning (ML) by quickly and easily building and deploying models into production. The ML pipeline, however, can be complex and fraught with barriers that keep businesses from taking advantage of predictive capabilities. A new approach is needed to bring ML technology to the environment where users are most comfortable working with their data. That’s why Snowflake and H2O.ai provide users with a seamless experience for building and deploying ML models that can be used for scoring data and predictive insights.

    This webinar will cover how the Snowflake cloud data platform can be extended with H2O Driverless AI from within the customer’s Snowflake account. By using SQL commands in Snowflake, users can build and deploy models as a REST service to be used for scoring data and making predictions.

    What you will learn:
    - How Snowflake external functions are used with Driverless AI
    - How you can build and deploy models from within Snowflake
    - How to make predictions on data from your Snowflake account
    - How to integrate predictions in your business applications

    - Yves Laurent, Dir. Partnerships and Alliances, H2O.ai
    - Eric Gudgion, Sr. Solutions Architect, H2O.ai
    - Chris Pouliot, VP Data Science & Analytics, Snowflake
    - Isaac Kunen, Sr. Product Manager, Snowflake
  • Automatic Model Documentation with H2O Recorded: Jun 30 2020 46 mins
    Lauren DiPerna, Data Scientist at H2O.ai
    For many companies, model documentation is a requirement for any model to be used in the business. For other companies, model documentation is part of a data science team’s best practices. Model documentation includes how a model was created, training and test data characteristics, what alternatives were considered, how the model was evaluated, and information on model performance.

    Collecting and documenting this information can take a data scientist days to complete for each model. The model document needs to be comprehensive and consistent across various projects. The process of creating this documentation is tedious for the data scientist and wasteful for the business because the data scientist could be using that time to build additional models and create more value. Inconsistent or inaccurate model documentation can be an issue for model validation, governance, and regulatory compliance.

    Join us on Tuesday, June 30th, to learn how to create comprehensive, high-quality model documentation in minutes that saves time, increases productivity, and improves model governance.
  • What is AutoML? Recorded: Jun 25 2020 53 mins
    Rafael Coss, Community and Partner Maker, H2O.ai
    Our world is changing rapidly, and that implies many organizations will need to adapt quickly. AI is unlocking new potential for every enterprise. Organizations are using AI and machine learning technology to inform business decisions, predict potential issues, and provide more efficient, customized customer experiences. The results can enable a competitive edge for the business.

    AutoML or Automatic Machine Learning makes it easy to train and evaluate machine learning models. The automation of repetitive tasks allows people to focus on the data and the business problems they are trying to solve.

    Join us on Thursday, June 25th, to get practical tips and see AutoML in action with a real-world example. We’ll demonstrate how AutoML can augment your Data Scientists, supercharging your team and giving your organization the AI edge in record time.
  • Accelerate ROI by using H2O.ai with Pega Customer Decision Hub Recorded: Jun 23 2020 59 mins
    Vince Jeffs, Sr. Dir. Product Strategy for AI & Decisioning at Pegasystems and David Perona, Digital Transformation at H2O.ai
    How do you generate business value quickly with AI? By deploying your models into production and using them to identify opportunities and drive the conversation with the customer. Supercharge your results by pairing the market-leading AI platform, H2O.ai, with the market-leading Real-Time Interaction Management solution, Pega Customer Decision Hub.

    This presentation covers the end-to-end process from model training within Driverless AI to deploying the model within Pega CDH and using it to drive intelligent interactions.

    What you will learn:
    - How to build a customer churn model in Driverless AI
    - How to deploy the churn model in Driverless AI
    - How to import the Driverless AI model into Pega Prediction Studio
    - How to design a decision strategy within Pega CDH using the Driverless AI model
    - Real world examples of ROI using this approach

    Vince Jeffs, Senior Director of Product Strategy for AI & Decisioning at Pegasystems
    David Perona, Head of Digital Transformation and Decision Management at H2O.ai
  • Getting the Most Out of Your Machine Learning with Model Ops Recorded: Jun 18 2020 55 mins
    Dan Darnell, VP of Product Marketing at H2O.ai
    Machine learning models can now be quickly built using automatic machine learning technology. However, they are not generating economic value fast enough. This is due to deficiencies in the model deployment process.

    Join Dan Darnell, VP of Product Marketing at H2O.ai, on June 18th at 11am PT to understand more about these challenges and how Model Ops creates a set of practices that will allow your AI projects to scale, govern and generate value quickly.
  • Trends in Advanced Analytics and Data Science Recorded: Jun 11 2020 60 mins
    Lishuai Jing, GRUNDFOS | Dan Darnell, H2O.ai | Pragyansmita Nayak, Hitachi Vantara Federal | Paul Kowalczyk, Solvay
    Stay up-to-date on the latest tools and best practices that industry experts recommend in order to get the most value out of your advanced analytics and data science strategy.

    You'll come away with:
    - A better knowledge of the technology on offer to help scale your organization's approach to advanced analytics and data science
    - Key factors to consider when adopting an advanced analytics solution
    - Best practices for implementing a data science program and advanced analytics strategy that works for you
    - And more!

    Moderator: Lishuai Jing, Senior Data Scientist at GRUNDFOS
    Panelist: Dan Darnell, VP of Product Marketing at H2O.ai
    Panelist: Pragyansmita Nayak, Chief Data Scientist at Hitachi Vantara Federal
    Panelist: Paul Kowalczyk, Senior Data Scientist at Solvay
  • Accelerate Your Model Training with H2O-3 Recorded: Jun 4 2020 62 mins
    Megan Kurka, Customer Data Scientist at H2O.ai
    In this webinar, we introduce H2O-3, the #1 open source machine learning platform for the enterprise and how to use it to develop models for a variety of use cases. H2O-3 makes it possible for anyone to easily apply machine learning and predictive analytics to solve today’s most challenging business problems.

    We’ll walk you through demos and highlight features and capabilities of H2O-3 for typical data science workflows.

    What you will learn:
    - Overview of H2O-3 software
    - H2O-3 for data exploration
    - H2O-3 for feature engineering
    - AutoML in H2O-3
    - Model Interpretability in H2O-3
    - Live demos applied to real-world datasets

    Megan Kurka, Customer Data Scientist at H2O.ai
  • Deploying Distributed AI and Machine Learning in Financial Services Recorded: May 28 2020 58 mins
    Dmitry Baev, Vice President of Solutions Engineering at H2O.ai
    The Financial Services industry has the potential to benefit from advanced application of Machine Learning and Artificial Intelligence by leveraging their vast data reserves in order to transform their businesses. However, keeping pace with new technologies for data science and Machine Learning can be overwhelming. Financial Services industry regulations can make it even more challenging to deploy and manage Machine Learning applications in large-scale distributed environments.

    In this talk we will cover how to leverage your existing Big Data investment to deliver leading edge data science using H2O. We will look at real customer use cases for AI and ML in Financial Services and discuss how to overcome deployment challenges in distributed Big Data environments in order to deliver transformational business results and faster time-to-value.

    Join this talk to learn how Financial Services organizations are extracting real business value with AI and ML.

    - How to train Machine Learning models at scale using distributed Big Data platforms
    - How to apply Automated Machine Learning (AutoML) to accelerate your model development pipelines
    - How to deploy Machine Learning models into production environments
    - AI in Financial Services success stories

    Dmitry Baev, Vice President of Solutions Engineering at H2O.ai
  • Not Just Another Black-Box - Extending Driverless AI Recorded: May 28 2020 47 mins
    Lena Rampula, Data Science Engineer, H2O.ai
    H2O Driverless AI is an automated machine learning platform - performing state of the art feature engineering and model training. Driverless AI will allow you to scale your data science efforts, making it faster to find an optimal solution to a variety of business use cases.

    At H2O.ai, we empower data scientists to use their domain expertise to extend Driverless AI by adding custom functions for feature engineering, models and scorers using simple Python code snippets. These "recipes" allow you to build your machine learning solution, using ingredients from Driverless AI and your own IP.

    In this webinar, we will introduce Driverless AI and show how data scientists can extend it, using recipes from the H2O.ai repository or using their own code.
Democratize AI
H2O.ai is the maker of H2O, the world's best machine learning platform and Driverless AI, which automates machine learning. H2O is used by over 200,000 data scientists and more than 18,000 organizations globally. H2O Driverless AI does auto feature engineering and can achieve 40x speed-ups on GPUs.

  • Title: Enhancing Spark with H2O's Random Grid Search and AutoML using Sparkling Water
  • Live at: May 7 2020 7:00 pm
  • Presented by: Jakub Háva, Team Lead and Senior Software Engineer, H2O.ai