Fine-Tuning Large Language Models: Empowering AI for Specialized Applications

Presented by

Binitha MT, Software Principal Engineer, Dell Technologies; Subhankar, Software Senior Engineer, ABB

About this talk

With the emergence of conversational AI tools like ChatGPT and Google Bard, the world has been introduced to new possibilities enabled by Large Language Models (LLMs). A large language model is a type of artificial intelligence model that uses deep learning techniques and is trained on massive data sets with large-scale compute infrastructure. However, training LLMs from scratch is a complex task that requires substantial computational resources and infrastructure. Fine-tuning LLMs on domain-specific data has therefore emerged as a crucial technique for improving their performance in specialized tasks and industries. In this talk we give an overview of the basic concepts of LLMs and their pre-training process, highlighting the transfer learning paradigm that forms the basis of fine-tuning. We will look into the preparatory steps required for successful fine-tuning, including dataset acquisition, cleaning, and structuring. Furthermore, we will discuss how the fine-tuning process works: it adapts the pre-trained LLM’s parameters to domain-specific language patterns, contextual nuances, and task requirements. We explore architectural considerations, such as selecting an appropriate model size, in relation to the computational resources available in the domain and the complexity of the target task. We evaluate different fine-tuning approaches, ranging from traditional fine-tuning to more advanced techniques like adapter-based architectures. We also cover techniques to prevent overfitting, including data augmentation, regularization, and transfer learning from related domains. Lastly, we address the ethical aspects of fine-tuning LLMs, highlighting potential challenges related to bias, fairness, and unintended consequences. The audience will gain an overall understanding of LLMs and learn how to apply them to their own domain-specific data.
More from this channel

SNIA is a not-for-profit global organization made up of corporations, universities, startups, and individuals. The members collaborate to develop and promote vendor-neutral architectures, standards, and education for management, movement, and security for technologies related to handling and optimizing data. SNIA focuses on the transport, storage, acceleration, format, protection, and optimization of infrastructure for data.