Training GPT from Scratch with NanoGPT pt.2
What to expect?
After pt. 1 of our workshop on training GPT from scratch with nanoGPT, we're excited to continue the series! To recap what we learned, we will go over the code base developed by Andrej Karpathy and show how it can be run for various text-generation tasks. Throughout this talk, you will learn the mechanics behind GPT and gain deeper insight into how it can be used and modified for your specific text-generation needs. You will come away with:

- a clear understanding of GPT
- knowledge of how to run nanoGPT for your own purposes
- the ability to track your model training using Weights and Biases

As a live exercise, we will explore:

- how to use nanoGPT with LoRA or adapters to perform efficient fine-tuning
- how to reduce the number of parameters you need to train by injecting custom layers
- how to train a small subset of the total model parameters (around 1%), which lets you work with bigger models and still achieve very impressive results

Register to learn how to train GPT from scratch with nanoGPT, or any other LLM.
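To make the "around 1%" figure concrete, here is a rough back-of-the-envelope sketch of where it comes from. The numbers below are assumptions for illustration (a GPT-2-small-sized model of roughly 124M parameters, LoRA adapters of rank 16 injected into four square attention projections per layer), not the exact configuration used in the workshop:

```python
# Hypothetical illustration: estimate what fraction of a GPT-2-small-sized
# model becomes trainable when LoRA adapters replace full fine-tuning.
# Each adapted weight W (n_embd x n_embd) is frozen, and two small trainable
# factors A (n_embd x rank) and B (rank x n_embd) are added alongside it.

def lora_params(n_layer: int, n_embd: int, rank: int, n_matrices: int = 4) -> int:
    # New trainable parameters per adapted matrix: 2 * n_embd * rank.
    return n_layer * n_matrices * 2 * n_embd * rank

total = 124_000_000  # rough parameter count of GPT-2 small (assumption)
trainable = lora_params(n_layer=12, n_embd=768, rank=16)
print(trainable, f"{trainable / total:.2%}")  # about 1.2M params, under 1%
```

With these assumptions, only about 1.2M of the 124M parameters need gradients, which is why fine-tuning much larger models becomes feasible on modest hardware.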