Yochay Ettun (cnvrg.io), Michael Balint (NVIDIA)
Developing, experimenting with, and deploying ML models at scale requires substantial tooling, scripting, tracking, versioning, and monitoring.
Data scientists want to do data science, but MLOps and DevOps tasks slow them down.
They lack the user-friendly tools needed to track experiments, attach resources, manage datasets, and launch multiple ML pipelines.
In this webinar, cnvrg.io CEO Yochay Ettun will host a special guest from NVIDIA, Michael Balint, Sr. Product Manager for NVIDIA DGX systems, to discuss how to optimize the use of any NVIDIA DGX system or NVIDIA GPU asset, whether on-prem or in the cloud, with the cnvrg.io machine learning platform.
We will show best practices for reaching high utilization of NVIDIA DGX systems while meta-scheduling across multiple heterogeneous Kubernetes, OpenShift, and Linux server clusters.
In addition, we will introduce the concept of production flows, which automate hundreds of models from the data hub to deployment. We will wrap up with a real-life demo of flows, exercising many experiments across DGX platforms.
What you will learn:
- Creating a data science flow: from data to deployment, while attaching different NVIDIA DGX Kubernetes clusters to each step of the flow
- The concept of a meta-scheduler: scheduling experiments across dispersed resources or other schedulers to achieve high utilization at scale
- How the NVIDIA DGX ecosystem with cnvrg.io makes GPU assets easy to consume with one click, bypassing the complexity of MLOps
- How to leverage NGC containers in ML pipelines
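
To make the meta-scheduler idea above concrete, here is a minimal Python sketch. It is not cnvrg.io's implementation or API: the `Cluster` and `MetaScheduler` classes, the cluster names, and the best-fit placement rule are all hypothetical and chosen only for illustration. It shows one scheduler sitting above several clusters and placing each experiment on the cluster that fits it most tightly, which is one simple way to keep utilization high.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Cluster:
    """Hypothetical stand-in for a DGX, Kubernetes, or cloud GPU cluster."""
    name: str
    total_gpus: int
    used_gpus: int = 0

    @property
    def free_gpus(self) -> int:
        return self.total_gpus - self.used_gpus


@dataclass
class MetaScheduler:
    """Hypothetical scheduler that places experiments across many clusters."""
    clusters: List[Cluster]
    placements: Dict[str, str] = field(default_factory=dict)

    def submit(self, experiment: str, gpus: int) -> Optional[str]:
        # Keep only the clusters that can satisfy the GPU request.
        candidates = [c for c in self.clusters if c.free_gpus >= gpus]
        if not candidates:
            return None  # nothing fits; a real system would queue the job
        # Best fit (an assumed policy): leave the fewest GPUs idle.
        target = min(candidates, key=lambda c: c.free_gpus - gpus)
        target.used_gpus += gpus
        self.placements[experiment] = target.name
        return target.name


scheduler = MetaScheduler([
    Cluster("dgx-onprem-a", total_gpus=8),
    Cluster("dgx-onprem-b", total_gpus=16),
    Cluster("cloud-gpu", total_gpus=4),
])
for name, gpus in [("exp1", 4), ("exp2", 8), ("exp3", 16), ("exp4", 1)]:
    print(name, "->", scheduler.submit(name, gpus))
```

In a real deployment each cluster would run its own scheduler (for example Kubernetes), and the meta-scheduler would hand jobs to it rather than track GPU counts itself.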