Accelerating Generative AI – Options for Conquering the Dataflow Bottlenecks

Presented by

M. Baldi, AMD; R. Davis, NVIDIA; D. Eggleston, Microchip; D. McIntyre, Samsung; A. Rodriguez, Intel; Joe White, Dell

About this talk

Workloads using generative artificial intelligence trained on large language models are frequently throttled by insufficient resources (e.g., memory, storage, compute, or network dataflow bottlenecks). If not identified and addressed, these dataflow bottlenecks can constrain Gen AI application performance well below optimal levels. Given the compelling uses across natural language processing (NLP), video analytics, document resource development, image processing, image generation, and text generation, being able to run these workloads efficiently has become critical to many IT and industry segments. The resources that contribute to generative AI performance and efficiency include CPUs, DPUs, GPUs, FPGAs, plus memory and storage controllers.

This webinar, with a broad cross-section of industry veterans, provides insight into the following:

• Defining the Gen AI dataflow bottlenecks
• Tools and methods for identifying acceleration options
• Matchmaking the right xPU solution to the target Gen AI workload(s)
• Optimizing the network to support acceleration options
• Moving data closer to processing, or processing closer to data
• The role of the software stack in determining Gen AI performance