Inference in Large Language Models (LLMs) is often bottlenecked by the sequential nature of auto-regressive decoding, which leaves most operations bound by accelerator memory bandwidth rather than compute. Speculative decoding has been proposed to address this, but its adoption is hindered by the difficulty of acquiring and maintaining a separate draft model.
Recorded at SambaNova Systems, this seminar covers Medusa, an efficient method that accelerates LLM inference by adding extra decoding heads to the base model, each predicting a token several positions ahead in parallel.
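To make the core idea concrete, below is a minimal PyTorch sketch of such extra decoding heads, not the speakers' or the Medusa authors' actual implementation. The residual-block head design follows the general description in the Medusa paper; the `base_model` stand-in, the layer sizes, and the head count are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MedusaHead(nn.Module):
    """One extra decoding head: a residual feed-forward block followed by a
    vocabulary projection, predicting a token several steps ahead."""
    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, hidden_size)
        self.act = nn.SiLU()
        self.lm_head = nn.Linear(hidden_size, vocab_size, bias=False)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # Residual connection keeps the head close to the base model's features.
        return self.lm_head(hidden + self.act(self.proj(hidden)))

class MedusaModel(nn.Module):
    """Wraps a base LLM (assumed here to return last-layer hidden states) and
    adds `num_heads` Medusa heads, so one forward pass yields logit vectors
    for several future positions at once."""
    def __init__(self, base_model: nn.Module, hidden_size: int,
                 vocab_size: int, num_heads: int = 4):
        super().__init__()
        self.base_model = base_model
        self.heads = nn.ModuleList(
            MedusaHead(hidden_size, vocab_size) for _ in range(num_heads)
        )

    def forward(self, input_ids: torch.Tensor) -> list[torch.Tensor]:
        hidden = self.base_model(input_ids)   # (batch, seq_len, hidden_size)
        last = hidden[:, -1, :]               # features at the final position
        # Head k proposes the token k+1 steps beyond the one the base model's
        # own lm_head (not shown) would emit next.
        return [head(last) for head in self.heads]
```

In the full method, the candidate tokens proposed by these heads are then verified against the base model so that accepted tokens match what ordinary auto-regressive decoding would have produced; the sketch above covers only the parallel-prediction half of that loop.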