{"id":5647923,"date":"2023-05-19T11:01:21","date_gmt":"2023-05-19T15:01:21","guid":{"rendered":"https:\/\/lightning.ai\/pages\/?p=5647923"},"modified":"2023-07-21T09:41:58","modified_gmt":"2023-07-21T13:41:58","slug":"how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset","status":"publish","type":"post","link":"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/","title":{"rendered":"How To Finetune GPT Like Large Language Models on a Custom Dataset"},"content":{"rendered":"<div class=\"takeaways card-glow p-4 my-4\"><h3 class=\"w-100 d-block\">Takeaways<\/h3> Learn how to finetune large language models (LLMs) on a custom dataset. We will be using <a href=\"https:\/\/github.com\/Lightning-AI\/lit-gpt\">Lit-GPT<\/a>, an optimized collection of open-source LLMs for finetuning and inference. It supports &#8211; LLaMA 2, Falcon, StableLM, Vicuna, LongChat, and a couple of other top performing open source large language models.<\/div>\n<p>The AI community&#8217;s effort has led to the development of many high-quality open-source LLMs, including but not limited to LLaMA 2, Falcon, StableLM, and Pythia. You can finetune these models on a custom instruction dataset to adapt to your specific task, such as training a chatbot to answer financial questions.<\/p>\n<p>Lightning AI recently launched Lit-GPT, the second LLM implementation in the Lit-* series after <a href=\"https:\/\/github.com\/Lightning-AI\/lit-llama\">Lit-LLaMA<\/a>. 
The goal of the Lit-* series is to provide the AI\/ML community with clean, solid, and optimized implementations of large language models, with pretraining and finetuning support using <a href=\"https:\/\/lightning.ai\/pages\/community\/tutorial\/lora-llm\/\">LoRA<\/a> and <a href=\"https:\/\/lightning.ai\/pages\/community\/article\/understanding-llama-adapters\/\">Adapter<\/a>.<\/p>\n<p>We will guide you through the process step by step, from installation and model download to data preparation and finetuning. If you have already completed a step or are confident about it, feel free to skip it.<\/p>\n<h2>Installing Lit-GPT \ud83e\udd9c<\/h2>\n<p>The Lit-GPT repository is available in the Lightning AI GitHub organization <a href=\"https:\/\/github.com\/Lightning-AI\/Lit-Parrot\">here<\/a>. To get started, clone the repository and install its dependencies.<\/p>\n<pre class=\"hljs collapse-false\">git clone https:\/\/github.com\/Lightning-AI\/lit-gpt\r\ncd lit-gpt\r\n\r\n<\/pre>\n<p>We are using <a href=\"https:\/\/github.com\/HazyResearch\/flash-attention\">FlashAttention<\/a>, a fast and memory-efficient implementation of attention, which at the time of writing is only available in the PyTorch 2.1 nightly builds.<\/p>\n<pre class=\"hljs collapse-false\"># for cuda\r\npip install --index-url https:\/\/download.pytorch.org\/whl\/nightly\/cu118 --pre 'torch&gt;=2.1.0dev'<\/pre>\n<pre class=\"hljs collapse-false\"># for cpu\r\npip install --index-url https:\/\/download.pytorch.org\/whl\/nightly\/cpu --pre 'torch&gt;=2.1.0dev'\r\n\r\n<\/pre>\n<p>Finally, install the dependencies using <code>pip install -r requirements.txt<\/code>.<\/p>\n<h2>Downloading the model weights<\/h2>\n<p>To use or finetune a model, we need pre-trained weights. Thanks to the efforts of open-source teams, there are plenty of openly licensed weights that we can use for commercial purposes. 
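<\/p>\n<p>Before grabbing any weights, it can be worth confirming that the nightly PyTorch build was actually picked up (a quick sanity check on our side, not part of the official setup):<\/p>

```python
from importlib import metadata

# Lit-GPT's FlashAttention path expects PyTorch >= 2.1,
# which is a nightly build at the time of writing.
try:
    version = metadata.version("torch")
except metadata.PackageNotFoundError:
    version = None
print("torch:", version or "not installed")
```

<p>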
Lit-GPT supports various LLMs including Llama 2, Falcon, Vicuna, and RedPajama-INCITE. You can check all the supported models <a href=\"https:\/\/github.com\/Lightning-AI\/lit-gpt\/tree\/main\/tutorials\" rel=\"noopener noreferrer\">here<\/a>. We use the RedPajama-INCITE 3B parameter weights in this tutorial. You can find the instructions to download other supported weights like LLaMA 2 and Falcon in this <a href=\"https:\/\/github.com\/Lightning-AI\/lit-gpt\/tree\/main\/tutorials\" rel=\"noopener noreferrer\">tutorial section<\/a>.<\/p>\n<pre class=\"hljs collapse-false language-python\">\r\n# download the model weights\r\npython scripts\/download.py --repo_id togethercomputer\/RedPajama-INCITE-Base-3B-v1<\/pre>\n<pre class=\"hljs collapse-false language-python\"># convert the weights to Lit-GPT format\r\npython scripts\/convert_hf_checkpoint.py --checkpoint_dir checkpoints\/togethercomputer\/RedPajama-INCITE-Base-3B-v1\r\n\r\n<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-5647924 size-large\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled-1024x359.png\" alt=\"\" width=\"1024\" height=\"359\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled-1024x359.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled-300x105.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled-1536x538.png 1536w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled.png 1866w, 
https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled-300x105@2x.png 600w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/p>\n<p>You will see the <code>gpt_neox<\/code> layers being mapped to the Lit-GPT layers in the terminal. After this step, you can find the downloaded weights in the <code>checkpoints\/togethercomputer\/RedPajama-INCITE-Base-3B-v1<\/code> folder.<\/p>\n<h2>Prepare the dataset<\/h2>\n<p>In this tutorial, we will use the <a href=\"https:\/\/www.databricks.com\/blog\/2023\/04\/12\/dolly-first-open-commercially-viable-instruction-tuned-llm\">Dolly 2.0 instruction dataset<\/a> by Databricks for finetuning. Finetuning involves two main steps: first, we process the dataset into the Lit-GPT format, and then we run the finetuning script on the processed dataset.<\/p>\n<p>Instruction datasets typically have three keys: instruction, input (optional context for the given instruction), and the expected response from the LLM. Below is a sample of instruction data:<\/p>\n<pre class=\"hljs collapse-false\">[\r\n  {\r\n    \"instruction\": \"Arrange the given numbers in ascending order.\",\r\n    \"input\": \"2, 4, 0, 8, 3\",\r\n    \"output\": \"0, 2, 3, 4, 8\"\r\n  },\r\n...\r\n]<\/pre>\n<p class=\"collapse-false\">The <a href=\"https:\/\/huggingface.co\/datasets\/databricks\/databricks-dolly-15k\/resolve\/main\/databricks-dolly-15k.jsonl\">Dolly 2.0 dataset<\/a> comes in <a href=\"https:\/\/jsonlines.org\/\">JSON Lines<\/a> format, which is, plainly speaking, a text file with one JSON record per row. It is a convenient format when processing one record at a time. The Dolly dataset contains the following keys:<\/p>\n<pre class=\"hljs collapse-false\">{\r\n  \"instruction\": \"When did Virgin Australia start operating?\",\r\n  \"context\": \"Virgin Australia, the trading name of Virgin Australia Airlines Pty Ltd, is an Australian-based airline. It is the largest airline by fleet size to use the Virgin brand. 
It commenced services on 31 August 2000 as Virgin Blue, with two aircraft on a single route. It suddenly found itself as a major airline in Australia's domestic market after the collapse of Ansett Australia in September 2001. The airline has since grown to directly serve 32 cities in Australia, from hubs in Brisbane, Melbourne and Sydney.\",\r\n  \"response\": \"Virgin Australia commenced services on 31 August 2000 as Virgin Blue, with two aircraft on a single route.\",\r\n  \"category\": \"closed_qa\"\r\n}<\/pre>\n<p>We need to rename <code>context<\/code> to <code>input<\/code> and <code>response<\/code> to <code>output<\/code>, and we are all set to process our data.<\/p>\n<pre class=\"hljs collapse-false language-python\">import json\r\n\r\n# file_path points to the downloaded databricks-dolly-15k.jsonl file\r\nwith open(file_path, \"r\") as file:\r\n    data = file.readlines()\r\n    data = [json.loads(line) for line in data]\r\nfor item in data:\r\n    item[\"input\"] = item.pop(\"context\")\r\n    item[\"output\"] = item.pop(\"response\")\r\n\r\n<\/pre>\n<p>We can modify the existing <a href=\"https:\/\/github.com\/Lightning-AI\/lit-parrot\/blob\/main\/scripts\/prepare_alpaca.py\">Alpaca script<\/a> for our data preparation. This script downloads data from <a href=\"https:\/\/github.com\/tloen\/alpaca-lora\">tloen\u2019s Alpaca-lora<\/a> project and saves the processed data. It includes a <code>prepare<\/code> function that loads the raw instruction dataset, creates prompts, and tokenizes them using the model tokenizer provided in the <code>checkpoint_dir<\/code>. The tokenized data is split into training and test sets based on the <code>test_split_size<\/code> provided and saved to the <code>destination_path<\/code>.<\/p>\n<p>To modify the Alpaca script, open it from <a href=\"https:\/\/github.com\/Lightning-AI\/lit-gpt\/blob\/main\/scripts\/prepare_alpaca.py#LL22C1-L61C55\">here<\/a> and edit the <code>prepare<\/code> function. 
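<\/p>\n<p>For context, inside <code>prepare<\/code> each record is turned into an Alpaca-style prompt before tokenization. The template looks roughly like this (a sketch of the script's <code>generate_prompt<\/code> helper; the exact wording lives in the script, and the field names follow our renamed keys):<\/p>

```python
def generate_prompt(example: dict) -> str:
    """Build an Alpaca-style prompt from one instruction record (sketch)."""
    if example.get("input"):
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Input:\n{example['input']}\n\n### Response:"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{example['instruction']}\n\n### Response:"
    )

prompt = generate_prompt(
    {"instruction": "Arrange the given numbers in ascending order.", "input": "2, 4, 0"}
)
print(prompt)
```

<p>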
This is how our final function would look after mapping the keys appropriately.<\/p>\n<pre class=\"hljs collapse-false language-python\">DATA_FILE = \"https:\/\/huggingface.co\/datasets\/databricks\/databricks-dolly-15k\/resolve\/main\/databricks-dolly-15k.jsonl\"\r\nDATA_FILE_NAME = \"dolly_data_cleaned_archive.json\"<\/pre>\n<pre class=\"hljs collapse-false language-python\">def prepare(\r\n    destination_path: Path = Path(\"data\/dolly\"),\r\n    checkpoint_dir: Path = Path(\"checkpoints\/togethercomputer\/RedPajama-INCITE-Base-3B-v1\"),\r\n    test_split_size: int = 2000,\r\n    max_seq_length: int = 256,\r\n    seed: int = 42,\r\n    mask_inputs: bool = False, # as in alpaca-lora\r\n    data_file_name: str = DATA_FILE_NAME,\r\n) -&gt; None:\r\n    \"\"\"Prepare the Dolly dataset for instruction tuning.\r\n    The output is a training and test dataset saved as `train.pt` and `test.pt`,\r\n    which stores the preprocessed and tokenized prompts and labels.\r\n    \"\"\"\r\n    destination_path.mkdir(parents=True, exist_ok=True)\r\n    file_path = destination_path \/ data_file_name\r\n    download(file_path)\r\n    tokenizer = Tokenizer(checkpoint_dir \/ \"tokenizer.json\", checkpoint_dir \/ \"tokenizer_config.json\")\r\n    with open(file_path, \"r\") as file:\r\n        data = file.readlines()\r\n        data = [json.loads(line) for line in data]\r\n    for item in data:\r\n        item[\"input\"] = item.pop(\"context\")\r\n        item[\"output\"] = item.pop(\"response\")\r\n    # Partition the dataset into train and test\r\n    train_split_size = len(data) - test_split_size\r\n    train_set, test_set = random_split(\r\n        data, lengths=(train_split_size, test_split_size), generator=torch.Generator().manual_seed(seed)\r\n    )\r\n    train_set, test_set = list(train_set), list(test_set)\r\n    print(f\"train has {len(train_set):,} samples\")\r\n    print(f\"val has {len(test_set):,} samples\")\r\n    print(\"Processing train split ...\")\r\n    
train_set = [prepare_sample(sample, tokenizer, max_seq_length, mask_inputs) for sample in tqdm(train_set)]\r\n    torch.save(train_set, file_path.parent \/ \"train.pt\")\r\n    print(\"Processing test split ...\")\r\n    test_set = [prepare_sample(sample, tokenizer, max_seq_length, mask_inputs) for sample in tqdm(test_set)]\r\n    torch.save(test_set, file_path.parent \/ \"test.pt\")\r\n\r\n<\/pre>\n<p>Finally, let\u2019s run our modified script (saved here as <code>scripts\/prepare_yourscript.py<\/code>) by providing the data path and the model checkpoint directory.<\/p>\n<pre class=\"hljs collapse-false\">python scripts\/prepare_yourscript.py \\\r\n  --destination_path data\/dolly \\\r\n  --checkpoint_dir checkpoints\/togethercomputer\/RedPajama-INCITE-Base-3B-v1\r\n\r\n<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-5647925 size-large\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled-1-1024x136.png\" alt=\"\" width=\"1024\" height=\"136\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled-1-1024x136.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled-1-300x40.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled-1-1536x203.png 1536w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled-1-2048x271.png 2048w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled-1-300x40@2x.png 600w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/p>\n<h2>Finetuning the RedPajama-INCITE model<\/h2>\n<p>Once you have completed all the above steps, it is straightforward to start finetuning. 
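<\/p>\n<p>If you want to be sure the processed splits from the previous step are in place first, a quick look at the destination folder does it (a minimal sketch assuming the default <code>data\/dolly<\/code> destination path):<\/p>

```python
from pathlib import Path

# Hypothetical check: adjust data_dir if you used a different destination_path
data_dir = Path("data/dolly")
status = {name: (data_dir / name).exists() for name in ("train.pt", "test.pt")}
print(status)
```

<p>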
You need to run the <code>finetune\/adapter.py<\/code> script, providing your data path.<\/p>\n<pre class=\"hljs collapse-false\">python finetune\/adapter.py \\\r\n  --data_dir data\/dolly \\\r\n  --checkpoint_dir checkpoints\/togethercomputer\/RedPajama-INCITE-Base-3B-v1 \\\r\n  --out_dir out\/adapter\/dolly<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-5647926 size-large\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled-2-1024x433.png\" alt=\"\" width=\"1024\" height=\"433\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled-2-1024x433.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled-2-300x127.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled-2-1536x649.png 1536w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled-2.png 1642w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled-2-300x127@2x.png 600w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/p>\n<p>You can update the default number of GPUs, micro-batch size, and all the other hyperparameters in the finetuning script <a href=\"https:\/\/github.com\/Lightning-AI\/lit-gpt\/blob\/main\/finetune\/adapter.py#L23\">here<\/a>.<\/p>\n<p>You can play with your finetuned model using the <code>generate\/adapter.py<\/code> script by trying different prompts and tuning the model temperature.<\/p>\n<pre class=\"hljs collapse-false\">python generate\/adapter.py \\\r\n  --adapter_path out\/adapter\/dolly\/lit_model_adapter_finetuned.pth \\\r\n  --checkpoint_dir checkpoints\/togethercomputer\/RedPajama-INCITE-Base-3B-v1 \\\r\n  --prompt \"who is the author of Game of thrones?\"\r\n\r\n<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-5647927 size-large\" 
src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled-3-1024x272.png\" alt=\"\" width=\"1024\" height=\"272\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled-3-1024x272.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled-3-300x80.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled-3.png 1086w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/Untitled-3-300x80@2x.png 600w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/p>\n<h2><\/h2>\n<h2>Learn more about large language models and efficient finetuning techniques \ud83d\udc47<\/h2>\n<p>This article provided you with short step-by-step instructions to finetune your own large language model. We saw that thanks to parameter-efficient finetuning techniques, it\u2019s possible to do it on a single GPU. If you want to learn more about these techniques, check out our more in-depth guides below.<\/p>\n<ul>\n<li><a href=\"https:\/\/lightning.ai\/pages\/blog\/falcon-a-guide-to-finetune-and-inference\/\">Falcon \u2013 A guide to finetune and inference<\/a><\/li>\n<li><a href=\"https:\/\/lightning.ai\/pages\/community\/tutorial\/accelerating-large-language-models-with-mixed-precision-techniques\/\">Accelerating Large Language Models with Mixed-Precision Techniques<\/a><\/li>\n<li><a href=\"https:\/\/lightning.ai\/pages\/community\/tutorial\/lora-llm\/\">Parameter-Efficient LLM Finetuning With Low-Rank Adaptation (LoRA)<\/a><\/li>\n<li><a href=\"https:\/\/lightning.ai\/pages\/community\/article\/understanding-llama-adapters\/\">Understanding Parameter-Efficient Finetuning of Large Language Models: From Prefix Tuning to LLaMA-Adapters<\/a><\/li>\n<\/ul>\n<p>We would love to hear what you have built with Lit-GPT. 
Join our <a href=\"https:\/\/discord.gg\/nnAuZvqTu3\">Discord community<\/a> to chat and ask your questions!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The AI community&#8217;s effort has led to the development of many high-quality open-source LLMs, including but not limited to LLaMA 2, Falcon, StableLM, and Pythia. You can finetune these models on a custom instruction dataset to adapt to your specific task, such as training a chatbot to answer financial questions. Lightning AI recently launched Lit-GPT,<a class=\"excerpt-read-more\" href=\"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/\" title=\"ReadHow To Finetune GPT Like Large Language Models on a Custom Dataset\">&#8230; Read more &raquo;<\/a><\/p>\n","protected":false},"author":16,"featured_media":5648310,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":"","_links_to":"","_links_to_target":""},"categories":[29,41],"tags":[189,191,193,188,195],"glossary":[203,212,213,214,216,229],"acf":{"additional_authors":false,"mathjax":false,"default_editor":true,"show_table_of_contents":false,"hide_from_archive":false,"content_type":"Blog Post","sticky":true,"custom_styles":""},"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v24.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>How To Finetune GPT Like Large Language Models on a Custom Dataset - Lightning AI<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"How To Finetune GPT Like Large Language Models on a Custom Dataset 
- Lightning AI\" \/>\n<meta property=\"og:description\" content=\"The AI community&#8217;s effort has led to the development of many high-quality open-source LLMs, including but not limited to LLaMA 2, Falcon, StableLM, and Pythia. You can finetune these models on a custom instruction dataset to adapt to your specific task, such as training a chatbot to answer financial questions. Lightning AI recently launched Lit-GPT,... Read more &raquo;\" \/>\n<meta property=\"og:url\" content=\"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/\" \/>\n<meta property=\"og:site_name\" content=\"Lightning AI\" \/>\n<meta property=\"article:published_time\" content=\"2023-05-19T15:01:21+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-07-21T13:41:58+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/parrot.jpeg\" \/>\n\t<meta property=\"og:image:width\" content=\"1360\" \/>\n\t<meta property=\"og:image:height\" content=\"1008\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"JP Hennessy\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@LightningAI\" \/>\n<meta name=\"twitter:site\" content=\"@LightningAI\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"JP Hennessy\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/\"},\"author\":{\"name\":\"JP Hennessy\",\"@id\":\"https:\/\/lightning.ai\/pages\/#\/schema\/person\/2518f4d5541f8e98016f6289169141a6\"},\"headline\":\"How To Finetune GPT Like Large Language Models on a Custom Dataset\",\"datePublished\":\"2023-05-19T15:01:21+00:00\",\"dateModified\":\"2023-07-21T13:41:58+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/\"},\"wordCount\":777,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/lightning.ai\/pages\/#organization\"},\"image\":{\"@id\":\"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/parrot.jpeg\",\"keywords\":[\"GPT\",\"large language models\",\"LLaMA\",\"LLMs\",\"nanoGPT\"],\"articleSection\":[\"Blog\",\"Tutorials\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/\",\"url\":\"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/\",\"name\":\"How To Finetune GPT Like Large Language Models on a Custom Dataset - Lightning 
AI\",\"isPartOf\":{\"@id\":\"https:\/\/lightning.ai\/pages\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/parrot.jpeg\",\"datePublished\":\"2023-05-19T15:01:21+00:00\",\"dateModified\":\"2023-07-21T13:41:58+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/#primaryimage\",\"url\":\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/parrot.jpeg\",\"contentUrl\":\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/parrot.jpeg\",\"width\":1360,\"height\":1008},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/lightning.ai\/pages\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"How To Finetune GPT Like Large Language Models on a Custom Dataset\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/lightning.ai\/pages\/#website\",\"url\":\"https:\/\/lightning.ai\/pages\/\",\"name\":\"Lightning AI\",\"description\":\"The platform for teams to build 
AI.\",\"publisher\":{\"@id\":\"https:\/\/lightning.ai\/pages\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/lightning.ai\/pages\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/lightning.ai\/pages\/#organization\",\"name\":\"Lightning AI\",\"url\":\"https:\/\/lightning.ai\/pages\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/lightning.ai\/pages\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/02\/image-17.png\",\"contentUrl\":\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/02\/image-17.png\",\"width\":1744,\"height\":856,\"caption\":\"Lightning AI\"},\"image\":{\"@id\":\"https:\/\/lightning.ai\/pages\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/x.com\/LightningAI\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/lightning.ai\/pages\/#\/schema\/person\/2518f4d5541f8e98016f6289169141a6\",\"name\":\"JP Hennessy\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/lightning.ai\/pages\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/28ade268218ae45f723b0b62499f527a?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/28ade268218ae45f723b0b62499f527a?s=96&d=mm&r=g\",\"caption\":\"JP Hennessy\"},\"url\":\"https:\/\/lightning.ai\/pages\/author\/jplightning-ai\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"How To Finetune GPT Like Large Language Models on a Custom Dataset - Lightning AI","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/","og_locale":"en_US","og_type":"article","og_title":"How To Finetune GPT Like Large Language Models on a Custom Dataset - Lightning AI","og_description":"The AI community&#8217;s effort has led to the development of many high-quality open-source LLMs, including but not limited to LLaMA 2, Falcon, StableLM, and Pythia. You can finetune these models on a custom instruction dataset to adapt to your specific task, such as training a chatbot to answer financial questions. Lightning AI recently launched Lit-GPT,... Read more &raquo;","og_url":"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/","og_site_name":"Lightning AI","article_published_time":"2023-05-19T15:01:21+00:00","article_modified_time":"2023-07-21T13:41:58+00:00","og_image":[{"width":1360,"height":1008,"url":"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/parrot.jpeg","type":"image\/jpeg"}],"author":"JP Hennessy","twitter_card":"summary_large_image","twitter_creator":"@LightningAI","twitter_site":"@LightningAI","twitter_misc":{"Written by":"JP Hennessy","Est. 
reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/#article","isPartOf":{"@id":"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/"},"author":{"name":"JP Hennessy","@id":"https:\/\/lightning.ai\/pages\/#\/schema\/person\/2518f4d5541f8e98016f6289169141a6"},"headline":"How To Finetune GPT Like Large Language Models on a Custom Dataset","datePublished":"2023-05-19T15:01:21+00:00","dateModified":"2023-07-21T13:41:58+00:00","mainEntityOfPage":{"@id":"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/"},"wordCount":777,"commentCount":0,"publisher":{"@id":"https:\/\/lightning.ai\/pages\/#organization"},"image":{"@id":"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/#primaryimage"},"thumbnailUrl":"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/parrot.jpeg","keywords":["GPT","large language models","LLaMA","LLMs","nanoGPT"],"articleSection":["Blog","Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/","url":"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/","name":"How To Finetune GPT Like Large Language Models on a Custom Dataset - Lightning 
AI","isPartOf":{"@id":"https:\/\/lightning.ai\/pages\/#website"},"primaryImageOfPage":{"@id":"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/#primaryimage"},"image":{"@id":"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/#primaryimage"},"thumbnailUrl":"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/parrot.jpeg","datePublished":"2023-05-19T15:01:21+00:00","dateModified":"2023-07-21T13:41:58+00:00","breadcrumb":{"@id":"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/#primaryimage","url":"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/parrot.jpeg","contentUrl":"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/05\/parrot.jpeg","width":1360,"height":1008},{"@type":"BreadcrumbList","@id":"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/lightning.ai\/pages\/"},{"@type":"ListItem","position":2,"name":"How To Finetune GPT Like Large Language Models on a Custom Dataset"}]},{"@type":"WebSite","@id":"https:\/\/lightning.ai\/pages\/#website","url":"https:\/\/lightning.ai\/pages\/","name":"Lightning AI","description":"The platform for teams to build 
AI.","publisher":{"@id":"https:\/\/lightning.ai\/pages\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/lightning.ai\/pages\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/lightning.ai\/pages\/#organization","name":"Lightning AI","url":"https:\/\/lightning.ai\/pages\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/lightning.ai\/pages\/#\/schema\/logo\/image\/","url":"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/02\/image-17.png","contentUrl":"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/02\/image-17.png","width":1744,"height":856,"caption":"Lightning AI"},"image":{"@id":"https:\/\/lightning.ai\/pages\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/LightningAI"]},{"@type":"Person","@id":"https:\/\/lightning.ai\/pages\/#\/schema\/person\/2518f4d5541f8e98016f6289169141a6","name":"JP Hennessy","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/lightning.ai\/pages\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/28ade268218ae45f723b0b62499f527a?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/28ade268218ae45f723b0b62499f527a?s=96&d=mm&r=g","caption":"JP 
Hennessy"},"url":"https:\/\/lightning.ai\/pages\/author\/jplightning-ai\/"}]}},"_links":{"self":[{"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/posts\/5647923"}],"collection":[{"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/users\/16"}],"replies":[{"embeddable":true,"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/comments?post=5647923"}],"version-history":[{"count":0,"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/posts\/5647923\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/media\/5648310"}],"wp:attachment":[{"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/media?parent=5647923"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/categories?post=5647923"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/tags?post=5647923"},{"taxonomy":"glossary","embeddable":true,"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/glossary?post=5647923"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}