Fine-Tuning LLMs on a Single Consumer Graphics Card | by Naser Tamimi | Jan, 2024

GENERATIVE AI

Learnings from fine-tuning a large language model on a single consumer GPU

When we think about Large Language Models or any other generative models, the first hardware that comes to mind is the GPU. Without GPUs, many advancements in Generative AI, machine learning, deep learning, and data science would’ve been impossible. If 15 years ago gamers were the ones enthusiastic about the latest GPU technologies, today data scientists and machine learning engineers have joined them and follow the news in this field too. Usually, though, gamers and ML users follow two different kinds of GPUs and graphics cards.

Gaming users usually use consumer graphics cards (such as NVIDIA GeForce RTX Series GPUs), while ML and AI developers usually follow news about Data Center and Cloud Computing GPUs (such as V100, A100, or H100). Gaming graphics cards usually have much less GPU memory (at most 24GB as of January 2024) compared to Data Center GPUs (usually in the range of 40GB to 80GB). Price is another significant difference. While most consumer graphics cards cost up to $3000, most Data Center graphics cards start from that price and can easily reach tens of thousands of dollars.

Since many people, including myself, might have a consumer graphics card for gaming or daily use, they might be interested to see if they can use the same graphics card for training, fine-tuning, or inference of LLMs. In 2020, I wrote a comprehensive article about whether we can use consumer graphics cards for data science projects (link to the article). At that time, the models were mostly small ML or Deep Learning models, and even a graphics card with 6GB of memory could handle many training projects. In this article, however, I’m going to use such a graphics card for large language models with billions of parameters.
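To get a feel for why billions of parameters are a challenge for a consumer card, here is a minimal back-of-the-envelope sketch. It only multiplies the parameter count by the bytes used per parameter, so it covers the model weights alone; gradients, optimizer states, and activations add more on top during training. The model sizes and the helper name `weight_memory_gib` are illustrative assumptions, not figures from this article, and the 24GB card is treated as 24 GiB for simplicity.

```python
def weight_memory_gib(n_params: float, bytes_per_param: int) -> float:
    """Memory needed just to hold the model weights, in GiB."""
    return n_params * bytes_per_param / 1024**3


GPU_MEMORY_GIB = 24  # assumption: a 24GB consumer card, treated as 24 GiB

# Hypothetical model sizes, chosen only for illustration.
for n_params in (1e9, 7e9, 13e9):
    for label, nbytes in (("fp32", 4), ("fp16", 2)):
        needed = weight_memory_gib(n_params, nbytes)
        verdict = "fits" if needed < GPU_MEMORY_GIB else "does not fit"
        print(f"{n_params / 1e9:>4.0f}B params, {label}: "
              f"{needed:6.2f} GiB for weights -> {verdict}")
```

Under these assumptions, a 7-billion-parameter model in 32-bit precision already exceeds 24 GiB for the weights alone, while the same model in 16-bit precision needs roughly half of the card.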

For this article, I used my GeForce RTX 3090 card, which has 24GB of GPU memory. For your reference, data center graphics cards such as the A100 and H100 have 40GB and 80GB of memory respectively. Also, a typical AWS EC2 p4d.24xlarge instance has 8 GPUs (A100) with a total of 320GB of GPU memory. As you can see the difference between a simple consumer…