{"id":5648447,"date":"2023-08-10T21:32:43","date_gmt":"2023-08-11T01:32:43","guid":{"rendered":"https:\/\/lightning.ai\/pages\/?p=5648447"},"modified":"2023-08-14T13:06:05","modified_gmt":"2023-08-14T17:06:05","slug":"neurips2023-llm-efficiency-guide","status":"publish","type":"post","link":"https:\/\/lightning.ai\/pages\/community\/tutorial\/neurips2023-llm-efficiency-guide\/","title":{"rendered":"The NeurIPS 2023 LLM Efficiency Challenge Starter Guide"},"content":{"rendered":"<p class=\"md-end-block md-p md-focus\"><span class=\"md-plain md-expand\"><div class=\"takeaways card-glow p-4 my-4\"><h3 class=\"w-100 d-block\">Takeaways<\/h3>Large language models (LLMs) offer one of the most interesting opportunities for developing more efficient training methods. A few weeks ago, the <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/llm-efficiency-challenge.github.io\/\"><span class=\"md-plain\">NeurIPS 2023 LLM Efficiency Challenge<\/span><\/a><\/span><span class=\"md-plain\"> launched to focus on efficient LLM finetuning, and this guide is a short walkthrough explaining how to participate<\/span><span class=\"md-plain\">. You&#8217;ll learn everything you need to know, from setting up the coding environment to making the first submission.<\/div><\/span><\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<h2 class=\"md-end-block md-heading\"><span id=\"toc1\" class=\"md-plain md-expand\">1 &#8211; What is the NeurIPS LLM Efficiency Challenge?<\/span><\/h2>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">The <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/llm-efficiency-challenge.github.io\"><span class=\"md-plain\">NeurIPS 2023 LLM Efficiency Challenge<\/span><\/a><\/span><span class=\"md-plain\"> is a competition focused on training <\/span><span class=\"md-pair-s \"><strong><span class=\"md-plain\">1 LLM for 24 hours on 1 GPU<\/span><\/strong><\/span><span class=\"md-plain\"> &#8212; the team with the best LLM gets to present their results at NeurIPS 2023.<\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">Large language models like GPT-4 have impressive capabilities. However, they are expensive to develop and run, and there is a lot of demand for custom large language models, as well:<\/span><\/p>\n<ul class=\"ul-list\" data-mark=\"-\">\n<li class=\"md-list-item\">\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">personal assistants for writing drafts (many researchers and companies cannot post sensitive materials into ChatGPT);<\/span><\/p>\n<\/li>\n<li class=\"md-list-item\">\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">Question-and-answer systems for legal, medical, or financial data and documents;<\/span><\/p>\n<\/li>\n<li class=\"md-list-item\">\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">Customer chatbots that have domain knowledge in a specific field or with respect to a company&#8217;s products.<\/span><\/p>\n<\/li>\n<\/ul>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">Applications aside, for researchers like me, this challenge is a very exciting opportunity to develop and try new methods to train LLMs more efficiently.<\/span><\/p>\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-plain\">But before we jump to the hands-on sections, let&#8217;s review some key points and restrictions for a brief overview. 
However, participants should check out the <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/llm-efficiency-challenge.github.io\/rules\"><span class=\"md-plain\">official guidelines<\/span><\/a><\/span><span class=\"md-plain\"> for all up-to-date details.<\/span><\/p>\n<div id=\"attachment_5648448\" style=\"width: 613px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-5648448\" class=\"wp-image-5648448\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/banner.png\" alt=\"\" width=\"603\" height=\"255\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/banner.png 2390w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/banner-300x127.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/banner-1024x433.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/banner-1536x649.png 1536w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/banner-2048x865.png 2048w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/banner-300x127@2x.png 600w\" sizes=\"(max-width: 603px) 100vw, 603px\" \/><p id=\"caption-attachment-5648448\" class=\"wp-caption-text\">The official challenge website is hosted at <a href=\"https:\/\/llm-efficiency-challenge.github.io\">https:\/\/llm-efficiency-challenge.github.io<\/a><\/p><\/div>\n<div class=\"mceTemp\"><\/div>\n<p>&nbsp;<\/p>\n<h2 id=\"toc2\" class=\"md-end-block md-heading\"><span class=\"md-plain md-expand\">2 &#8211; Competition Overview<\/span><\/h2>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">This section covers the <\/span><span class=\"md-pair-s \"><em><span class=\"md-plain\">NeurIPS 2023 Efficiency Challenge<\/span><\/em><\/span><span class=\"md-plain\"> in brief. (I highly recommend also checking out the <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/llm-efficiency-challenge.github.io\/rules\"><span class=\"md-plain\">official guidelines<\/span><\/a><\/span><span class=\"md-plain\"> for all up-to-date details.)<\/span><\/p>\n<p>&nbsp;<\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-pair-s \"><strong><span class=\"md-plain\">GPUs<\/span><\/strong><\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">Since only 1 GPU is allowed, this challenge is a nice testbed for experimenting with efficient finetuning techniques without worrying too much about the infrastructure. Only the following two Nvidia GPUs are allowed:<\/span><\/p>\n<ul class=\"ul-list\" data-mark=\"-\">\n<li class=\"md-list-item\">\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">A100 (40 GB RAM);<\/span><\/p>\n<\/li>\n<li class=\"md-list-item\">\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">and RTX 4090 (24 GB RAM).<\/span><\/p>\n<\/li>\n<\/ul>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">(Since these GPUs are not directly comparable, there are two different tracks and leaderboards.)<\/span><\/p>\n<p>&nbsp;<\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-pair-s \"><strong><span class=\"md-plain\">Models<\/span><\/strong><\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">All three transformer LLM architecture types are allowed: encoders, encoder-decoders, and decoders. (What&#8217;s the difference between encoders and decoders? 
I discussed it <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/magazine.sebastianraschka.com\/p\/understanding-encoder-and-decoder\"><span class=\"md-plain\">here<\/span><\/a><\/span><span class=\"md-plain\">.) <\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">However, I speculate that decoder-only architectures may be the most promising direction:<\/span><\/p>\n<blockquote>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">&#8220;We explored this question in Wang et al. (2022a) where we evaluated encoder-decoder and decoder-only architectures and their interactions with causal, prefix, and masked language modeling pretraining objectives. Our results show that immediately after pretraining, causal decoder-only models performed best \u2013 validating the choice of state-of-the-art LLMs.&#8221; &#8212; Citation from <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/arxiv.org\/abs\/2211.05100\"><span class=\"md-plain\">BLOOM: A 176B-Parameter Open-Access Multilingual Language Model<\/span><\/a><\/span><\/p>\n<\/blockquote>\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-plain\">The list of approved LLMs is summarized in the figure below. Note that only base (foundation) LLMs that have not been finetuned yet are permitted, since finetuning itself is the participants&#8217; task in this competition.<\/span><\/p>\n<div id=\"attachment_5648450\" style=\"width: 627px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-5648450\" class=\"wp-image-5648450 \" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/allowed-llms.png\" alt=\"\" width=\"617\" height=\"304\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/allowed-llms.png 1460w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/allowed-llms-300x148.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/allowed-llms-1024x505.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/allowed-llms-300x148@2x.png 600w\" sizes=\"(max-width: 617px) 100vw, 617px\" \/><p id=\"caption-attachment-5648450\" class=\"wp-caption-text\">Models that are permitted in the competition (as of this writing)<\/p><\/div>\n<p>&nbsp;<\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-pair-s \"><strong><span class=\"md-plain\">Data and Tasks<\/span><\/strong><\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">When choosing a dataset, note that models are not expected to handle contexts with more than 2048 tokens. The evaluation, a subset of Stanford&#8217;s <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/crfm.stanford.edu\/helm\/latest\/\"><span class=\"md-plain\">HELM<\/span><\/a><\/span><span class=\"md-plain\"> benchmark suite, will be on English texts. (We will run the HELM benchmark at the end of this article.)<\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">Participants submit their training and evaluation code, which should contain all the necessary steps to train the base model for up to 24 h on an A100 or RTX 4090 on an &#8220;open source&#8221; dataset. 
As of this writing, the following datasets were permitted: <\/span><\/p>\n<ul class=\"ul-list\" data-mark=\"-\">\n<li class=\"md-list-item\">\n<p class=\"md-end-block md-p\"><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/huggingface.co\/datasets\/databricks\/databricks-dolly-15k\"><span class=\"md-plain\">Databricks-Dolly-15k<\/span><\/a><\/span><span class=\"md-plain\"> (15k instruction-response pairs)<\/span><\/p>\n<\/li>\n<li class=\"md-list-item\">\n<p class=\"md-end-block md-p\"><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/huggingface.co\/datasets\/OpenAssistant\/oasst1\"><span class=\"md-plain\">OpenAssistant Conversations Dataset (oasst1)<\/span><\/a><\/span><\/p>\n<\/li>\n<li class=\"md-list-item\">\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/github.com\/mobarski\/alpaca-libre\"><span class=\"md-plain\">Alpaca Libre<\/span><\/a><\/span><\/p>\n<\/li>\n<\/ul>\n<div id=\"attachment_5648451\" style=\"width: 547px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-5648451\" class=\"wp-image-5648451 \" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/dataset.png\" alt=\"\" width=\"537\" height=\"447\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/dataset.png 1400w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/dataset-300x250.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/dataset-1024x853.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/dataset-300x250@2x.png 600w\" sizes=\"(max-width: 537px) 100vw, 537px\" \/><p id=\"caption-attachment-5648451\" class=\"wp-caption-text\">Two examples from the <a href=\"https:\/\/github.com\/mobarski\/alpaca-libre\">Alpaca-Libre<\/a> dataset.<\/p><\/div>\n<p>&nbsp;<\/p>\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-pair-s md-expand\"><strong><span class=\"md-plain\">UPDATE: The organizers just disallowed the use of Alpaca-Libre in the competition since it is against the policy, which says that no LLM-generated data can be used in this competition.<\/span><\/strong><\/span><\/p>\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-pair-s md-expand\"><strong><span class=\"md-plain\">Since this is a submission tutorial, I try to focus on the main process, from zero to submission. But I may revisit certain topics, like the datasets, in more detail in future standalone articles.<\/span><\/strong><\/span><\/p>\n<p>&nbsp;<\/p>\n<h2 id=\"toc3\" class=\"md-end-block md-heading\"><span class=\"md-plain md-expand\">3 &#8211; <\/span><span class=\"md-plain\">The Official Starter Kit<\/span><\/h2>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">The NeurIPS efficiency challenge organizers selected the <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/github.com\/Lightning-AI\/lit-gpt\"><span class=\"md-plain\">Lit-GPT<\/span><\/a><\/span><span class=\"md-plain\"> repository as the official starter kit, an open-source GitHub repository that implements methods and tools for loading popular LLMs (see table below). This comes in handy for me since I have some experience with this repository from past contributions, including implementing LLaMA-Adapter v2, full finetuning, a port of low-rank adaptation (LoRA), collaboration on implementing QLoRA, and more. 
<\/span><\/p>\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-plain\">What&#8217;s also nice is that Lit-GPT implements the currently most relevant LLMs for this competition, as summarized in the table below:<\/span><\/p>\n<p>&nbsp;<\/p>\n<table class=\"md-table\" style=\"height: 267px;\" width=\"505\">\n<thead>\n<tr class=\"md-end-block md-focus-container\">\n<th><span class=\"td-span md-focus\"><span class=\"md-plain md-expand\">Models in Lit-GPT<\/span><\/span><\/th>\n<th><span class=\"td-span\"><span class=\"md-plain\">Reference<\/span><\/span><\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr class=\"md-end-block\">\n<td><span class=\"td-span\"><span class=\"md-plain\">Meta AI Llama 2<\/span><\/span><\/td>\n<td><span class=\"td-span\"><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/arxiv.org\/abs\/2307.09288\"><span class=\"md-plain\">Touvron et al. 2023<\/span><\/a><\/span><\/span><\/td>\n<\/tr>\n<tr class=\"md-end-block\">\n<td><span class=\"td-span\"><span class=\"md-plain\">Stability AI FreeWilly2<\/span><\/span><\/td>\n<td><span class=\"td-span\"><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/stability.ai\/blog\/stable-beluga-large-instruction-fine-tuned-models\"><span class=\"md-plain\">Stability AI 2023<\/span><\/a><\/span><\/span><\/td>\n<\/tr>\n<tr class=\"md-end-block\">\n<td><span class=\"td-span\"><span class=\"md-plain\">TII UAE Falcon<\/span><\/span><\/td>\n<td><span class=\"td-span\"><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/falconllm.tii.ae\"><span class=\"md-plain\">TII 2023<\/span><\/a><\/span><\/span><\/td>\n<\/tr>\n<tr class=\"md-end-block\">\n<td><span class=\"td-span\"><span class=\"md-plain\">OpenLM Research OpenLLaMA<\/span><\/span><\/td>\n<td><span class=\"td-span\"><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/github.com\/openlm-research\/open_llama\"><span class=\"md-plain\">Geng &amp; Liu 2023<\/span><\/a><\/span><\/span><\/td>\n<\/tr>\n<tr class=\"md-end-block\">\n<td><span class=\"td-span\"><span class=\"md-plain\">LMSYS Vicuna<\/span><\/span><\/td>\n<td><span class=\"td-span\"><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/lmsys.org\/blog\/2023-06-29-longchat\"><span class=\"md-plain\">Li et al. 2023<\/span><\/a><\/span><\/span><\/td>\n<\/tr>\n<tr class=\"md-end-block\">\n<td><span class=\"td-span\"><span class=\"md-plain\">Together RedPajama-INCITE<\/span><\/span><\/td>\n<td><span class=\"td-span\"><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/together.ai\/blog\/redpajama-models-v1\"><span class=\"md-plain\">Together 2023<\/span><\/a><\/span><\/span><\/td>\n<\/tr>\n<tr class=\"md-end-block\">\n<td><span class=\"td-span\"><span class=\"md-plain\">EleutherAI Pythia<\/span><\/span><\/td>\n<td><span class=\"td-span\"><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/arxiv.org\/abs\/2304.01373\"><span class=\"md-plain\">Biderman et al. 
2023<\/span><\/a><\/span><\/span><\/td>\n<\/tr>\n<tr class=\"md-end-block md-focus-container\">\n<td><span class=\"td-span\"><span class=\"md-plain\">StabilityAI StableLM<\/span><\/span><\/td>\n<td><span class=\"td-span md-focus\"><span class=\"md-meta-i-c md-link md-expand\"><a href=\"https:\/\/github.com\/Stability-AI\/StableLM\"><span class=\"md-plain\">Stability AI 2023<\/span><\/a><\/span><\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>&nbsp;<\/p>\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-plain md-expand\">As of this writing, I recommend focusing on Llama 2 and perhaps Falcon since these two model suites are currently most promising based on public leaderboards. (To keep this guide focused on the submission, I will save a more detailed model discussion for a future article.)<\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-pair-s \"><strong><span class=\"md-plain\">In the following sections, I will walk you through setting up a computing environment and the Lit-GPT repository for experimentation and submission!<\/span><\/strong><\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-pair-s \"><em><span class=\"md-plain\">Note that it&#8217;s not required to use this Lit-GPT starter kit that the organizers suggested. Also, note that the organizers are not affiliated with the repository or its developers but chose it independently, which is likely because it&#8217;s relatively customizable and &#8220;hackable,&#8221; which comes in handy when trying new research ideas.<\/span><\/em><\/span><\/p>\n<p>&nbsp;<\/p>\n<h2 id=\"toc4\" class=\"md-end-block md-heading\"><span class=\"md-plain\">4 &#8211; Setting Up a Project Environment<\/span><\/h2>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">Personally, I prefer creating a dedicated virtual environment for each research project that I am working on, which helps me manage specific version numbers and so forth. For this, I (still) prefer using the <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/docs.conda.io\/en\/latest\/\"><span class=\"md-plain\">conda package manager<\/span><\/a><\/span><span class=\"md-plain\">. In this section, I will walk you through the process of how I like to set this up. (If you are already comfortable using <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>conda<\/code><\/span><span class=\"md-plain\">, <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>venv<\/code><\/span><span class=\"md-plain\">, or any other virtual environment setup, you can skip this section.)<\/span><\/p>\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-plain\">On the machine where you plan to run the experiments, download <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/docs.conda.io\/en\/latest\/miniconda.html\"><span class=\"md-plain\">miniconda<\/span><\/a><\/span><span class=\"md-plain\"> or <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/github.com\/conda-forge\/miniforge\"><span class=\"md-plain\">miniforge<\/span><\/a><\/span><span class=\"md-plain md-expand\">. 
If you are using a Linux computer, that&#8217;s probably the first row, the one on the top in the screenshot below.<\/span><\/p>\n<div id=\"attachment_5648452\" style=\"width: 550px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-5648452\" class=\"wp-image-5648452 \" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/miniforge.png\" alt=\"\" width=\"540\" height=\"290\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/miniforge.png 1372w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/miniforge-300x161.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/miniforge-1024x551.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/miniforge-300x161@2x.png 600w\" sizes=\"(max-width: 540px) 100vw, 540px\" \/><p id=\"caption-attachment-5648452\" class=\"wp-caption-text\">Miniforge installation options<\/p><\/div>\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-plain\">You can then install the conda package manager by executing the respective shell script and following the instructions:<\/span><\/p>\n<pre class=\"md-fences md-end-block md-fences-with-lineno ty-contain-cm modeLoaded hljs collapse-false\" lang=\"bash\" spellcheck=\"false\">\u00a0<span role=\"presentation\"><span class=\"cm-builtin\">sh<\/span> Miniforge3-Linux-x86_64.sh<\/span><\/pre>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">Next, create a new conda environment:<\/span><\/p>\n<pre class=\"md-fences md-end-block md-fences-with-lineno ty-contain-cm modeLoaded hljs collapse-false language-python\" lang=\"bash\" spellcheck=\"false\">\u00a0<span role=\"presentation\">conda create <span class=\"cm-attribute\">-n<\/span> neurips2023-1 <span class=\"cm-def\">python<\/span><span class=\"cm-operator\">=<\/span><span class=\"cm-number\">3<\/span>.10 <span class=\"cm-attribute\">--yes<\/span><\/span><\/pre>\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-plain md-expand\">After the installation, activate the environment:<\/span><\/p>\n<pre class=\"md-fences md-end-block md-fences-with-lineno ty-contain-cm modeLoaded hljs collapse-false language-python\" lang=\"bash\" spellcheck=\"false\">\u00a0<span role=\"presentation\">conda activate neurips2023-1<\/span><\/pre>\n<p class=\"md-end-block md-p\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-5648453\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/conda-activate.png\" alt=\"\" width=\"735\" height=\"158\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/conda-activate.png 1560w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/conda-activate-300x65.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/conda-activate-1024x221.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/conda-activate-1536x331.png 1536w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/conda-activate-300x65@2x.png 600w\" sizes=\"(max-width: 735px) 100vw, 735px\" \/><\/p>\n<p><span class=\"md-plain\">When I am working on a remote machine, I also like using <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/github.com\/tmux\/tmux\/wiki\"><span class=\"md-plain\">tmux<\/span><\/a><\/span><span class=\"md-plain\"> to be able to restart my terminal session in case I get reconnected:<\/span><\/p>\n<pre 
class=\"hljs collapse-false\">tmux new -s neurips-1\r\ncd ~\/Developer\/neurips23\r\nconda activate neurips2023-1<\/pre>\n<p>Then, each time I get disconnect, I can log back into the machine and resume the session via the following command:<\/p>\n<pre class=\"hljs collapse-false\">tmux attach -t neurips-1<\/pre>\n<p>&nbsp;<\/p>\n<h2 id=\"toc5\" class=\"md-end-block md-heading\"><span class=\"md-plain md-expand\">5 &#8211; Installing the Requirements<\/span><\/h2>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">After setting up a virtual environment for the NeurIPS competition, we can now clone the Lit-GPT repository and install the respective requirements. First, let&#8217;s clone the <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/github.com\/Lightning-AI\/lit-gpt\"><span class=\"md-plain\">Lit-GPT<\/span><\/a><\/span><span class=\"md-plain\"> GitHub repository:<\/span><\/p>\n<pre class=\"md-fences md-end-block md-fences-with-lineno ty-contain-cm modeLoaded hljs collapse-false\" lang=\"bash\" spellcheck=\"false\">\u00a0<span role=\"presentation\"><span class=\"cm-builtin\">git<\/span> clone https:\/\/github.com\/Lightning-AI\/lit-gpt.git<\/span><\/pre>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">This repository contains a <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/github.com\/Lightning-AI\/lit-gpt\/blob\/main\/requirements.txt\"><span class=\"md-plain\">requirements.txt<\/span><\/a><\/span><span class=\"md-plain\"> file with the respective Python packages from <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/pypi.org\/\"><span class=\"md-plain\">PyPI<\/span><\/a><\/span><span class=\"md-plain\"> that are required for using the code in this repository. However, note that Lit-GPT leverages the latest PyTorch features, so we have to install the PyTorch nightly version (unfortunately, these can&#8217;t be installed via a requirements.txt file). 
<\/span><\/p>\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-plain\">We can install the latest PyTorch nightly release by selecting and running the respective command from the <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/pytorch.org\/\"><span class=\"md-plain\">pytorch.org installer menu<\/span><\/a><\/span><span class=\"md-plain\">, as shown in the screenshot below.<\/span><\/p>\n<div id=\"attachment_5648454\" style=\"width: 660px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-5648454\" class=\"wp-image-5648454 \" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/pytorch-nightly.png\" alt=\"\" width=\"650\" height=\"429\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/pytorch-nightly.png 1582w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/pytorch-nightly-300x198.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/pytorch-nightly-1024x676.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/pytorch-nightly-1536x1014.png 1536w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/pytorch-nightly-300x198@2x.png 600w\" sizes=\"(max-width: 650px) 100vw, 650px\" \/><p id=\"caption-attachment-5648454\" class=\"wp-caption-text\">The PyTorch installation menu<\/p><\/div>\n<p>&nbsp;<\/p>\n<p><span class=\"md-plain\">Next, we can use <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>pip<\/code><\/span><span class=\"md-plain\"> to install the remaining requirements via the following commands:<\/span><\/p>\n<pre class=\"hljs collapse-false\">cd lit-gpt\r\npip install -r requirements.txt<\/pre>\n<h2><\/h2>\n<p>&nbsp;<\/p>\n<h2 id=\"toc6\" class=\"md-end-block md-heading\"><span class=\"md-plain md-expand\">6 &#8211; Downloading Model Checkpoints<\/span><\/h2>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">As of this writing, I suggest Meta&#8217;s newly released <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/arxiv.org\/abs\/2307.09288\"><span class=\"md-plain\">Llama 2<\/span><\/a><\/span><span class=\"md-plain\"> as the most promising base model. To keep things simple, it probably makes sense to focus on the 7 billion parameter version, which should fit into a 24 GB RTX 4090 or 40 GB A100 GPU when using parameter-efficient finetuning techniques. (Please see the <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/github.com\/Lightning-AI\/lit-gpt\/tree\/main\/tutorials\"><span class=\"md-plain\">Lit-GPT tutorials<\/span><\/a><\/span><span class=\"md-plain\"> if you want to download other models.)<\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">We can download the 7B Llama 2 base model using the scripts\/download.py script provided in the Lit-GPT repository. 
The downloaded file will require approximately 13 GB of disk space.<\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">However, first, you need to complete the following steps:<\/span><\/p>\n<ol class=\"ol-list\" start=\"\">\n<li class=\"md-list-item\">\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">Create a Hugging Face (HF) account at <\/span><span class=\"md-link md-pair-s\" spellcheck=\"false\"><a href=\"https:\/\/huggingface.co\">https:\/\/huggingface.co<\/a><\/span><span class=\"md-plain\">.<\/span><\/p>\n<\/li>\n<li class=\"md-list-item\">\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">Apply for Llama-2 access at <\/span><span class=\"md-link md-pair-s\" spellcheck=\"false\"><a href=\"https:\/\/huggingface.co\/meta-llama\/Llama-2-7b\">https:\/\/huggingface.co\/meta-llama\/Llama-2-7b<\/a><\/span><span class=\"md-plain\">.<\/span><\/p>\n<\/li>\n<li class=\"md-list-item\">\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/huggingface.co\/settings\/tokens\"><span class=\"md-plain\">Obtain your HF token<\/span><\/a><\/span><span class=\"md-plain\">, which you can generate under <\/span><span class=\"md-link md-pair-s\" spellcheck=\"false\"><a href=\"https:\/\/huggingface.co\/settings\/tokens\">https:\/\/huggingface.co\/settings\/tokens<\/a><\/span><span class=\"md-plain\">.<\/span><\/p>\n<\/li>\n<\/ol>\n<p>&nbsp;<\/p>\n<p><span class=\"md-plain\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-5648455\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/hf.png\" alt=\"\" width=\"517\" height=\"250\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/hf.png 1742w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/hf-300x145.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/hf-1024x495.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/hf-1536x742.png 1536w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/hf-300x145@2x.png 600w\" sizes=\"(max-width: 517px) 100vw, 517px\" \/><\/span><\/p>\n<p><span class=\"md-plain\">Next, to download the Llama 2 7B model, we need to provide the HF token via the <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>--token<\/code><\/span><span class=\"md-plain md-expand\"> argument, as shown below:<\/span><\/p>\n<pre class=\"hljs collapse-false\">cd ~\/Developer\/neurips23\/lit-gpt\r\npip install huggingface_hub\r\npython scripts\/download.py --repo_id meta-llama\/Llama-2-7b-hf --token your_hf_token<\/pre>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">(Here, <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>your_hf_token<\/code><\/span><span class=\"md-plain\"> is the token you can copy from your user account on the HF website.)<\/span><\/p>\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-plain\">If you see the message<\/span><\/p>\n<blockquote>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">Your request to access model meta-llama\/Llama-2-7b-hf is awaiting a review from the repo authors.<\/span><\/p>\n<\/blockquote>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">you may have to wait (currently 1-2 days) for approval.<\/span><\/p>\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-plain\">The good news is that we can use the 7B OpenLLaMA model in the meantime, which doesn&#8217;t require 
authentication:<\/span><\/p>\n<pre class=\"hljs collapse-false\">python scripts\/download.py --repo_id openlm-research\/open_llama_7b<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-5648456\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/openllama.png\" alt=\"\" width=\"726\" height=\"169\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/openllama.png 1800w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/openllama-300x70.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/openllama-1024x238.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/openllama-1536x357.png 1536w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/openllama-300x70@2x.png 600w\" sizes=\"(max-width: 726px) 100vw, 726px\" \/><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">(You may also use the smaller 3 billion parameter version via <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>--repo_id openlm-research\/open_llama_3b<\/code><\/span><span class=\"md-plain\"> for experimentation.)<\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">By default, the checkpoint files will be saved in a local directory <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>checkpoints\/<\/code><\/span><span class=\"md-plain\"> inside the Lit-GPT repository.<\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">Next, we convert the downloaded file into the common weight format used by all models in Lit-GPT:<\/span><\/p>\n<pre class=\"md-fences md-end-block md-fences-with-lineno ty-contain-cm modeLoaded hljs collapse-false\" lang=\"\" spellcheck=\"false\">\u00a0<span role=\"presentation\"><span class=\" CodeMirror-selectedtext\">python scripts\/convert_hf_checkpoint.py --checkpoint_dir checkpoints\/openlm-research\/open_llama_7b<\/span><\/span><\/pre>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">Before we prepare the dataset and finetune the model, let&#8217;s make sure it works by using the respective <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>generate<\/code><\/span><span class=\"md-plain\"> script:<\/span><\/p>\n<pre class=\"md-fences md-end-block md-fences-with-lineno ty-contain-cm modeLoaded hljs collapse-false\" lang=\"bash\" spellcheck=\"false\">\u00a0<span role=\"presentation\">python generate\/base.py <span class=\"cm-attribute\">--checkpoint_dir<\/span> checkpoints\/openlm-research\/open_llama_7b <span class=\"cm-attribute\">--prompt<\/span> <span class=\"cm-string\">\"Tell me an interesting fun fact:\"<\/span><\/span><\/pre>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">Note that base models are trained as text-completion models in contrast to instruction-finetuned models, which can be used for chatting. We use a base model here since finetuning is part of the competition. 
However, we can see that even though the model was only trained to predict the next word, it can come up with an interesting response:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-5648457 aligncenter\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/prompt-response.png\" alt=\"\" width=\"629\" height=\"273\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/prompt-response.png 1712w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/prompt-response-300x130.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/prompt-response-1024x445.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/prompt-response-1536x668.png 1536w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/prompt-response-300x130@2x.png 600w\" sizes=\"(max-width: 629px) 100vw, 629px\" \/><\/p>\n<p><strong><span class=\"md-plain\">Tip: Using Symlinks<\/span><\/strong><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain md-expand\">You may want to create multiple project folders at some point. I recommend using <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/en.wikipedia.org\/wiki\/Symbolic_link\"><span class=\"md-plain\">symbolic links<\/span><\/a><\/span><span class=\"md-plain\"> to avoid redownloading or copying the original model checkpoints or datasets. For instance, if the model checkpoints sit in a shared directory <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>\/shared\/data\/checkpoints<\/code><\/span><span class=\"md-plain\">, you can create a symbolic link inside the <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>lit-gpt<\/code><\/span><span class=\"md-plain\"> repository as follows:<\/span><\/p>\n<pre class=\"md-fences md-end-block md-fences-with-lineno ty-contain-cm modeLoaded hljs collapse-false\" lang=\"bash\" spellcheck=\"false\">\u00a0<span role=\"presentation\"><span class=\"cm-builtin\">cd<\/span> ~\/Developer\/neurips23\/experiment1\/lit-gpt<\/span>\r\n\u00a0<span role=\"presentation\"><span class=\"cm-builtin\">ln<\/span> <span class=\"cm-attribute\">-s<\/span> \/shared\/data\/checkpoints checkpoints<\/span><\/pre>\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-plain\">(You can use the same <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>ln -s<\/code><\/span><span class=\"md-plain\"> command to create a symbolic link to your dataset.)<\/span><\/p>\n<p>&nbsp;<\/p>\n<h2 id=\"toc7\" class=\"md-end-block md-heading\"><span class=\"md-plain md-expand\">7 &#8211; Downloading and Preparing Datasets<\/span><\/h2>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">I highly recommend checking the <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/llm-efficiency-challenge.github.io\/challenge\"><span class=\"md-plain\">official rules<\/span><\/a><\/span><span class=\"md-plain\"> for up-to-date information on the permitted models and datasets. 
As of this writing, the following datasets are allowed, as mentioned earlier:<\/span><\/p>\n<ul class=\"ul-list\" data-mark=\"-\">\n<li class=\"md-list-item\">\n<p class=\"md-end-block md-p\"><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/huggingface.co\/datasets\/databricks\/databricks-dolly-15k\"><span class=\"md-plain\">Databricks-Dolly-15k<\/span><\/a><\/span><\/p>\n<\/li>\n<li class=\"md-list-item\">\n<p class=\"md-end-block md-p\"><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/huggingface.co\/datasets\/OpenAssistant\/oasst1\"><span class=\"md-plain\">OpenAssistant Conversations Dataset (oasst1)<\/span><\/a><\/span><\/p>\n<\/li>\n<li class=\"md-list-item\">\n<p class=\"md-end-block md-p\"><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/github.com\/mobarski\/alpaca-libre\"><span class=\"md-plain\">Alpaca Libre<\/span><\/a><\/span><\/p>\n<\/li>\n<\/ul>\n<p class=\"md-end-block md-p\"><span class=\"md-pair-s \"><strong><span class=\"md-plain\">UPDATE: The organizers just disallowed the use of Alpaca-Libre in the competition since it is against the policy, which says that no LLM-generated data can be used in this competition.<\/span><\/strong><\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">To keep it simple, we will be using Alpaca Libre in this tutorial (the preparation steps work analogously for the permitted datasets), which we briefly covered in the <\/span><span class=\"md-pair-s \"><em><span class=\"md-plain\">Data and Tasks section<\/span><\/em><\/span><span class=\"md-plain\"> at the beginning of this article. You can download and prepare it as follows; this converts the original .json file into a PyTorch tensor format to accelerate the data loading later:<\/span><\/p>\n<pre class=\"md-fences md-end-block md-fences-with-lineno ty-contain-cm modeLoaded hljs collapse-false\" lang=\"python\" spellcheck=\"false\"><span role=\"presentation\"><span class=\"cm-variable\">python<\/span> <span class=\"cm-variable\">scripts<\/span><span class=\"cm-operator\">\/<\/span><span class=\"cm-variable\">prepare_alpaca_libre<\/span>.<span class=\"cm-property\">py<\/span> \\\r\n<span class=\"cm-operator\">--<\/span><span class=\"cm-variable\">checkpoint_dir<\/span> <span class=\"cm-variable\">checkpoints<\/span><span class=\"cm-operator\">\/<\/span><span class=\"cm-variable\">openlm<\/span><span class=\"cm-operator\">-<\/span><span class=\"cm-variable\">research<\/span><span class=\"cm-operator\">\/<\/span><span class=\"cm-variable\">open_llama_7b<\/span><span class=\"cm-operator\">\/<\/span><\/span><\/pre>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">(This should be fast; the processed Alpaca-Libre dataset, saved under <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>.\/data\/alpaca_libre\/<\/code><\/span><span class=\"md-plain\">, will occupy approximately 120 MB.)<\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-pair-s \"><strong><span class=\"md-plain\">Note:<\/span><\/strong><\/span><span class=\"md-plain\"> If this prepare_alpaca_libre.py file is not available in your repository yet, that&#8217;s likely because I just recently submitted it to Lit-GPT, and it has not been merged yet. 
In that case, you can download it from <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/github.com\/Lightning-AI\/lit-gpt\/pull\/358\"><span class=\"md-plain\">this PR<\/span><\/a><\/span><span class=\"md-plain\"> or use the regular Alpaca dataset via scripts\/prepare_alpaca.py instead.<\/span><\/p>\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-pair-s \"><strong><span class=\"md-plain\">Attention:<\/span><\/strong><\/span><span class=\"md-plain\"> If you are considering using a different model later, you have to prepare the dataset again with a different <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>--checkpoint_dir<\/code><\/span><span class=\"md-plain\"> flag since different models may use different tokenizers.<\/span><\/p>\n<p>&nbsp;<\/p>\n<h2 id=\"toc8\" class=\"md-end-block md-heading md-focus\"><span class=\"md-plain md-expand\">8 &#8211; Establishing a Finetuning Baseline<\/span><\/h2>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">After completing the dataset preparation steps outlined in the previous section, we are now ready for the more interesting part: finetuning the model. This is where we get to be creative, combining or thinking of new research ideas to improve the modeling performance of the base models.<\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">I plan to write about interesting research directions to try in upcoming articles. To keep this guide centered on the main steps, we will focus on establishing a performance baseline here. For this, let&#8217;s take the OpenLLaMA 7B model and finetune it on the Alpaca-Libre dataset using low-rank adaptation (LoRA):<\/span><\/p>\n<pre class=\"md-fences md-end-block md-fences-with-lineno ty-contain-cm modeLoaded hljs collapse-false\" lang=\"bash\" spellcheck=\"false\">\u00a0<span role=\"presentation\">python finetune\/lora.py \\<\/span>\r\n\u00a0<span role=\"presentation\"><span class=\"cm-attribute\">--data_dir<\/span> data\/alpaca_libre\/ \\<\/span>\r\n\u00a0<span role=\"presentation\"><span class=\"cm-attribute\">--checkpoint_dir<\/span> checkpoints\/openlm-research\/open_llama_7b\/ \\<\/span>\r\n\u00a0<span role=\"presentation\"><span class=\"cm-attribute\">--precision<\/span> bf16-true<\/span><\/pre>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">(You can see the additional options via <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>python finetune\/lora.py --help<\/code><\/span><span class=\"md-plain\">.)<\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">Using the default settings, that is, a microbatch size of 4, context length of 2048, and bf16-true precision as shown above (explanations will follow later in this article), this took about 7:28 h using an A100:<\/span><\/p>\n<pre class=\"md-fences md-end-block md-fences-with-lineno ty-contain-cm modeLoaded hljs collapse-false\" lang=\"\" spellcheck=\"false\">\u00a0<span role=\"presentation\">{'eval_interval': 100, 'save_interval': 100, 'eval_iters': 100, 'log_interval': 1, 'devices': 1, 'learning_rate': 0.0003, 'batch_size': 128, 'micro_batch_size': 4, ...}<\/span>\r\n\u00a0<span role=\"presentation\">Global seed set to 1337<\/span>\r\n\u00a0<span role=\"presentation\">\u200b<\/span>\r\n\u00a0<span role=\"presentation\">Loading model 'checkpoints\/openlm-research\/open_llama_7b\/lit_model.pth' with {'org': 'openlm-research', 'name': 'open_llama_7b', 'block_size': 2048, 'vocab_size': 32000, ...}<\/span>\r\n\u00a0<span role=\"presentation\">\u200b<\/span>\r\n\u00a0<span role=\"presentation\">Number of trainable parameters: 
4,194,304<\/span>\r\n\u00a0<span role=\"presentation\">Number of non trainable parameters: 6,738,415,616<\/span>\r\n\u00a0<span role=\"presentation\">\u200b<\/span>\r\n\u00a0<span role=\"presentation\">Validating ...<\/span>\r\n\u00a0<span role=\"presentation\">\u200b<\/span>\r\n\u00a0<span role=\"presentation\">...<\/span>\r\n\u00a0<span role=\"presentation\">\u200b<\/span>\r\n\u00a0<span role=\"presentation\">Estimated TFLOPs: 357.80<\/span>\r\n\u00a0<span role=\"presentation\">Measured TFLOPs: 324.99<\/span>\r\n\u00a0<span role=\"presentation\">...<\/span>\r\n\u00a0<span role=\"presentation\">iter 30 step 0: loss 1.9667, iter time: 92.21ms<\/span>\r\n\u00a0<span role=\"presentation\">iter 31 step 1: loss 1.9221, iter time: 196.06ms (optimizer.step)<\/span>\r\n\u00a0<span role=\"presentation\">iter 32 step 1: loss 1.0282, iter time: 199.56ms<\/span>\r\n\u00a0<span role=\"presentation\">iter 33 step 1: loss 1.3246, iter time: 136.38ms<\/span>\r\n\u00a0<span role=\"presentation\">iter 34 step 1: loss 2.0406, iter time: 94.96ms<\/span>\r\n\u00a0<span role=\"presentation\">iter 35 step 1: loss 2.2522, iter time: 84.61ms<\/span>\r\n\u00a0<span role=\"presentation\">iter 36 step 1: loss 1.4814, iter time: 113.93ms<\/span>\r\n\u00a0<span role=\"presentation\">iter 37 step 1: loss 1.7872, iter time: 92.81ms<\/span>\r\n\u00a0<span role=\"presentation\">...<\/span>\r\n\u00a0<span role=\"presentation\">\u200b<\/span>\r\n\u00a0<span role=\"presentation\">...<\/span>\r\n\u00a0<span role=\"presentation\">iter 49990 step 1562: loss 0.5110, iter time: 84.79ms<\/span>\r\n\u00a0<span role=\"presentation\">iter 49991 step 1562: loss 0.5513, iter time: 147.55ms<\/span>\r\n\u00a0<span role=\"presentation\">iter 49992 step 1562: loss 0.4352, iter time: 134.89ms<\/span>\r\n\u00a0<span role=\"presentation\">iter 49993 step 1562: loss 0.3533, iter time: 101.12ms<\/span>\r\n\u00a0<span role=\"presentation\">iter 49994 step 1562: loss 0.4636, iter time: 166.13ms<\/span>\r\n\u00a0<span role=\"presentation\">iter 49995 step 1562: loss 0.5932, iter time: 96.34ms<\/span>\r\n\u00a0<span role=\"presentation\">iter 49996 step 1562: loss 0.4907, iter time: 131.20ms<\/span>\r\n\u00a0<span role=\"presentation\">iter 49997 step 1562: loss 0.4948, iter time: 135.04ms<\/span>\r\n\u00a0<span role=\"presentation\">iter 49998 step 1562: loss 0.5330, iter time: 84.70ms<\/span>\r\n\u00a0<span role=\"presentation\">iter 49999 step 1562: loss 0.4570, iter time: 100.29ms<\/span>\r\n\u00a0<span role=\"presentation\">Training time: 26239.77s<\/span>\r\n\u00a0<span role=\"presentation\">Saving LoRA weights to 'out\/lora\/alpaca\/lit_model_lora_finetuned.pth'<\/span><\/pre>\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-plain md-expand\">Personally, I also like to add the following line to all my scripts so that I can see the maximum memory consumption after the training is completed:<\/span><\/p>\n<pre class=\"hljs collapse-false\">print(f\"Memory used: {torch.cuda.max_memory_allocated() \/ 1e9:.02f} GB\", file=sys.stderr)<\/pre>\n<p>&nbsp;<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-5648458\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/memory-alloc.png\" alt=\"\" width=\"827\" height=\"619\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/memory-alloc.png 1498w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/memory-alloc-300x225.png 300w, 
https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/memory-alloc-1024x767.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/memory-alloc-300x225@2x.png 600w\" sizes=\"(max-width: 827px) 100vw, 827px\" \/><\/p>\n<p>In the case of the model above, that prints the following:<\/p>\n<pre class=\"hljs collapse-false\">Memory used: 28.81 GB<\/pre>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">Given that the competition allows A100 GPUs with 40 GB RAM, this tells us that we can increase the number of trainable parameters, the microbatch size, or other settings to use the remaining ~11 GB and fully utilize this GPU.<\/span><\/p>\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-plain\">By the way, QLoRA-like tuning should also be supported via <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>--quantize \"bnb.nf4\"<\/code><\/span><span class=\"md-plain\">, added in <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/github.com\/Lightning-AI\/lit-gpt\/pull\/275\"><span class=\"md-plain\">this Pull Request<\/span><\/a><\/span><span class=\"md-plain\">. This will bring down the memory consumption to 17.04 GB so that you can run it on an RTX 4090. If you are using an RTX 4090, there will also be more tips on reducing memory requirements in the following sections.<\/span><\/p>\n<p>&nbsp;<\/p>\n<h2 id=\"toc9\" class=\"md-end-block md-heading md-focus\"><span class=\"md-plain md-expand\">9 &#8211; Using the Model<\/span><\/h2>\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-plain md-expand\">To quickly check the model with a prompt, we can use the <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>generate\/lora.py<\/code><\/span><span class=\"md-plain\"> script as follows:<\/span><\/p>\n<pre class=\"hljs collapse-false\">python generate\/lora.py --prompt \"how do you make pizza?\" \\\r\n--checkpoint_dir \\\r\n'\/home\/sebastian\/Developer\/neurips23\/lit-gpt\/checkpoints\/openlm-research\/open_llama_7b'<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-5648459\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/pizza.png\" alt=\"\" width=\"648\" height=\"129\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/pizza.png 1524w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/pizza-300x60.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/pizza-1024x204.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/pizza-300x60@2x.png 600w\" sizes=\"(max-width: 648px) 100vw, 648px\" \/><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain md-expand\">Here, we mainly want to see that the model can produce a coherent text output. And this looks good. We will be revisiting model evaluation later in this article.<\/span><\/p>\n<h2><\/h2>\n<h2 id=\"toc10\" class=\"md-end-block md-heading\"><span class=\"md-plain\">10 &#8211; Changing Finetuning Settings<\/span><\/h2>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">In the previous sections, we finetuned and used a base model with default settings. Of course, we will need to make changes if we want to actually <\/span><span class=\"md-pair-s \"><em><span class=\"md-plain\">compete<\/span><\/em><\/span><span class=\"md-plain\"> in the competition. 
I am planning to cover research directions in future articles, but for now, I want to briefly introduce a few settings to get the most out of the provided code. <\/span><\/p>\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-plain\">Earlier, in the <\/span><span class=\"md-pair-s \"><em><span class=\"md-plain\">Establishing a Finetuning Baseline<\/span><\/em><\/span><span class=\"md-plain\"> section, we mentioned default settings such as a microbatch size of 4, context length of 2048, and so forth. These can be directly changed at the top of the script itself:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-5648460\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/default-script.png\" alt=\"\" width=\"488\" height=\"481\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/default-script.png 1122w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/default-script-300x296.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/default-script-1024x1009.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/default-script-300x296@2x.png 600w\" sizes=\"(max-width: 488px) 100vw, 488px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-plain md-expand\">Let&#8217;s discuss what some of these settings are.<\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-pair-s \"><strong><span class=\"md-pair-s\" spellcheck=\"false\"><code>*_interval<\/code><\/span><span class=\"md-plain\"> settings<\/span><\/strong><\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">The <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>*_interval<\/code><\/span><span class=\"md-plain\"> settings specify how often the model is evaluated and saved. This is useful when developing the model. However, before submission, it&#8217;s probably a good idea to increase these intervals to save a few seconds or minutes.<\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-pair-s \"><strong><span class=\"md-pair-s\" spellcheck=\"false\"><code>devices<\/code><\/span><\/strong><\/span><\/p>\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-pair-s\" spellcheck=\"false\"><code>devices<\/code><\/span><span class=\"md-plain\"> specifies how many devices are used. If this number is larger than 1, it uses fully-sharded data parallelism, meaning that it a) runs data parallelism and b) divides large layers across multiple GPUs. If you are interested, I have additional explanations of multi-GPU training in <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/lightning.ai\/courses\/deep-learning-fundamentals\/9.0-overview-techniques-for-speeding-up-model-training\/unit-9.2-multi-gpu-training-strategies\/\"><span class=\"md-plain\">Units 9.2 and 9.3 of my Deep Learning course<\/span><\/a><\/span><span class=\"md-plain md-expand\">. 
<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-5648461\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/tensor-parallelism.png\" alt=\"\" width=\"548\" height=\"289\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/tensor-parallelism.png 1524w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/tensor-parallelism-300x158.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/tensor-parallelism-1024x539.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/08\/tensor-parallelism-300x158@2x.png 600w\" sizes=\"(max-width: 548px) 100vw, 548px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain md-expand\">However, since the competition is limited to 1 GPU, we do not have to worry about this setting here and leave it at <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>devices=1<\/code><\/span><span class=\"md-plain\">.<\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-pair-s \"><strong><span class=\"md-pair-s\" spellcheck=\"false\"><code>override_max_seq_length<\/code><\/span><\/strong><\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">Setting <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>override_max_seq_length=None<\/code><\/span><span class=\"md-plain\"> means that a model&#8217;s default context length is used. For OpenLLaMA, that&#8217;s <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>2048<\/code><\/span><span class=\"md-plain\">, which is coincidentally also the maximum length permitted by the competition. So, in this case, it&#8217;s a good idea to set it to <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>override_max_seq_length=2048<\/code><\/span><span class=\"md-plain\"> if you are planning to experiment with different models (which you probably should if you want to have a competitive submission.)<\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-pair-s \"><strong><span class=\"md-pair-s\" spellcheck=\"false\"><code>learning_rate<\/code><\/span><\/strong><\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">The <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>learning_rate<\/code><\/span><span class=\"md-plain\"> is a hyperparameter that we will have to tinker with. Usually, we determine this by monitoring the loss and evaluating the model on a validation set. 
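<\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">For illustration, below is a minimal sketch of one popular schedule for LLM finetuning, linear warmup followed by cosine decay. This is my own example rather than code from Lit-GPT&#8217;s finetuning scripts, and the model, optimizer, and step counts are placeholders:<\/span><\/p>\n<pre class=\"hljs collapse-false\">import math\r\nimport torch\r\n\r\n# Placeholder model and optimizer for illustration\r\nmodel = torch.nn.Linear(16, 2)\r\noptimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)\r\n\r\nwarmup_steps, max_steps = 100, 1562  # hypothetical values\r\n\r\ndef lr_lambda(step):\r\n    # after warmup: cosine decay from the base learning rate toward 0\r\n    if step >= warmup_steps:\r\n        progress = (step - warmup_steps) \/ max(1, max_steps - warmup_steps)\r\n        return 0.5 * (1.0 + math.cos(math.pi * progress))\r\n    # during warmup: scale the learning rate linearly from 0 to its base value\r\n    return step \/ max(1, warmup_steps)\r\n\r\nscheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)\r\n# call scheduler.step() once after each optimizer step<\/pre>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">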
A detailed discussion of tuning learning rates is out of the scope of this article, but you may like <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/lightning.ai\/courses\/deep-learning-fundamentals\/unit-6-overview-essential-deep-learning-tips-tricks\/unit-6.2-learning-rates-and-learning-rate-schedulers\/\"><span class=\"md-plain\">Unit 6.2 \u2013 Learning Rates and Learning Rate Schedulers<\/span><\/a><\/span><span class=\"md-plain\"> of my deep learning course.<\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-pair-s \"><strong><span class=\"md-pair-s\" spellcheck=\"false\"><code>batch_size<\/code><\/span><span class=\"md-plain\"> and <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>micro_batch_size<\/code><\/span><\/strong><\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">Since the model uses gradient accumulation, there are two batch size settings, <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>batch_size<\/code><\/span><span class=\"md-plain\"> and <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>micro_batch_size<\/code><\/span><span class=\"md-plain\">. The <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>micro_batch_size<\/code><\/span><span class=\"md-plain\"> is the batch size that the model receives in each forward pass. The <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>batch_size<\/code><\/span><span class=\"md-plain\"> determines the effective batch size used for each model update. <\/span><\/p>\n<p class=\"md-end-block md-p md-focus\"><span class=\"md-plain\">In other words, if the <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>batch_size<\/code><\/span><span class=\"md-plain\"> is set to 128 and the <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>micro_batch_size<\/code><\/span><span class=\"md-plain\"> is set to 4, the model will perform 32 forward passes (128 \/ 4 = 32) and accumulate the gradients before each model update. The model performance and gradient updates are exactly the same as regular training, though. We can think of gradient accumulation as a trick to save memory. 
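<\/span><\/p>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">To make the idea concrete, here is a minimal, self-contained sketch of gradient accumulation. It is my own simplified illustration of what finetune\/lora.py does internally, using a stand-in model and random data:<\/span><\/p>\n<pre class=\"hljs collapse-false\">import torch\r\n\r\n# With batch_size=128 and micro_batch_size=4, gradients from 32 micro-batches\r\n# are accumulated before each optimizer step (128 \/ 4 = 32)\r\nbatch_size, micro_batch_size = 128, 4\r\naccum_iters = batch_size \/\/ micro_batch_size\r\n\r\nmodel = torch.nn.Linear(16, 2)  # stand-in for the LLM\r\noptimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)\r\nloss_fn = torch.nn.CrossEntropyLoss()\r\n\r\nfor iter_num in range(accum_iters * 2):  # two effective batches of 128 samples\r\n    inputs = torch.randn(micro_batch_size, 16)  # stand-in micro-batch\r\n    targets = torch.randint(0, 2, (micro_batch_size,))\r\n    loss = loss_fn(model(inputs), targets)\r\n    (loss \/ accum_iters).backward()  # scale the loss, accumulate gradients\r\n    if (iter_num + 1) % accum_iters == 0:\r\n        optimizer.step()  # one weight update per 128 samples\r\n        optimizer.zero_grad()<\/pre>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">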
If you want to learn more about gradient accumulation, check out my blog post [Finetuning LLMs on a Single GPU Using Gradient Accumulation](https://lightning.ai/blog/gradient-accumulation/).

*[Figure: Explanation of gradient accumulation from [Finetuning LLMs on a Single GPU Using Gradient Accumulation](https://lightning.ai/blog/gradient-accumulation/).]*

If we change the `micro_batch_size` from 4 to 2, we can save a significant amount of GPU memory without sacrificing modeling performance. However, it will also increase the runtime. It's a trade-off we have to keep in mind when working on the competition.

**`lora_*` parameters**

The `lora_*` settings determine which weights are trainable via LoRA. Changing `lora_key` from `False` to `True`, for example, enables LoRA for the LLM's *key* weights in addition to the value and query weights, as illustrated in the sketch below.
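Here is a minimal, self-contained sketch of the LoRA idea behind these flags: a frozen linear layer plus a trainable low-rank update. This illustrates the technique itself, not Lit-GPT's actual implementation; the class name and hyperparameters are made up for the example.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer with a trainable low-rank update: W x + (alpha / r) * B A x."""

    def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: int = 16):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.linear.weight.requires_grad_(False)  # pretrained weight stays frozen
        self.linear.bias.requires_grad_(False)
        self.lora_a = nn.Parameter(torch.randn(r, in_features) * 0.01)  # low-rank factor A
        self.lora_b = nn.Parameter(torch.zeros(out_features, r))        # B starts at zero,
        self.scaling = alpha / r                                        # so the update begins as a no-op

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x) + self.scaling * (x @ self.lora_a.T @ self.lora_b.T)

# Toggling lora_query / lora_key / lora_value amounts to wrapping the corresponding
# attention projection matrices with a layer like this.
layer = LoRALinear(4096, 4096)
out = layer(torch.randn(2, 4096))
```

With `r=8`, each wrapped 4096x4096 projection adds only 2 x 8 x 4096 = 65,536 trainable parameters instead of the roughly 16.8 million in the full weight matrix, which is why enabling a few more targets is cheap.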
In practice, enabling LoRA for additional weight matrices like this can bring the performance closer to that of full finetuning.

Below are a few settings I chose based on the discussion above, totaling 23.66 GB of memory consumption so that the code runs on an RTX 4090 as well as an A100:

*[Figure: modified LoRA finetuning settings]*

The `max_iters` value is set to 100 for quick experimentation in the screenshot above, which means the script should finish in about 2 minutes. For the "real" training, however, you want to set the number of iterations at least equal to the number of records in the dataset (50k in the case of Alpaca or Alpaca-Libre).

**A note about full finetuning**

The 7B OpenLLaMA model has 6,738,415,616 parameters, but only a small fraction of them are trainable in the LoRA script (4,194,304 by default), which is what enables parameter-efficient finetuning. Why not finetune the full model? Because it consumes significant memory: I was not able to fit full finetuning onto a single A100. In fact, I needed 6 GPUs and tensor sharding to make it work.
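As an aside, verifying the trainable-parameter count for your own configuration takes only a few lines of PyTorch. The helper below is a generic sketch that works on any model, including the Lit-GPT LoRA model:

```python
import torch.nn as nn

def count_parameters(model: nn.Module) -> tuple[int, int]:
    """Return (trainable, total) parameter counts for a model."""
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    return trainable, total

# On the Lit-GPT LoRA model this should report roughly 4.2M trainable
# out of ~6.7B total parameters; here we use a toy model for illustration.
model = nn.Linear(512, 512)
print(count_parameters(model))
```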
Below is a benchmark from my [Finetuning Falcon LLMs More Efficiently With LoRA and Adapters](https://lightning.ai/pages/community/finetuning-falcon-efficiently/) article:

*[Figure: memory benchmark for full finetuning versus LoRA and adapter methods]*

## 11 – Preventing Out-Of-Memory Errors

As hinted at earlier, one of the main challenges in this competition will be avoiding out-of-memory errors, since our GPU RAM is limited. Above, we briefly discussed tricks such as gradient accumulation, quantization, choosing smaller base models, and LoRA.

Many additional tricks are available, including automatic mixed-precision training, low-precision floats, efficient model initialization, leaner optimizers, and parameter offloading. Discussing all of these techniques is out of the scope of this article, but the first piece of good news is that most of them are already implemented in the Lit-GPT code.

The second piece of good news is that I have a standalone article that discusses all these methods in more detail: [Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch](https://lightning.ai/pages/community/tutorial/pytorch-memory-vit-llm/).

*[Figure: Memory tricks from [Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch](https://lightning.ai/pages/community/tutorial/pytorch-memory-vit-llm/).]*
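As a taste of how little code some of these tricks require, here is a minimal sketch of bf16 mixed-precision training with Lightning's Fabric (introduced in the next paragraph). The toy model and dummy loss are placeholders; the Fabric calls follow its documented usage pattern:

```python
import torch
from lightning.fabric import Fabric

# bf16 mixed precision substantially reduces activation memory compared to fp32
fabric = Fabric(accelerator="auto", devices=1, precision="bf16-mixed")
fabric.launch()

model = torch.nn.Linear(512, 512)  # toy stand-in for the LLM
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
model, optimizer = fabric.setup(model, optimizer)

for _ in range(10):
    x = torch.randn(8, 512, device=fabric.device)
    loss = model(x).pow(2).mean()  # dummy loss for illustration
    fabric.backward(loss)          # replaces the usual loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```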
The [Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch](https://lightning.ai/pages/community/tutorial/pytorch-memory-vit-llm/) article also explains Lightning's [Fabric](https://lightning.ai/docs/fabric/stable/), an open-source library for conveniently accelerating PyTorch model training, which is used inside Lit-GPT to reduce boilerplate code.

## 12 – Research Directions

Initially, I planned to write a thorough section with research ideas and directions to explore in this competition. Due to this article's (almost excessive) length, however, I will defer these to a future write-up. In the meantime, you may find some inspiration in my Research Highlights series: [June-July 2023](https://magazine.sebastianraschka.com/p/ai-research-highlights-in-3-sentences-738), [May-June 2023](https://magazine.sebastianraschka.com/p/ai-research-highlights-in-3-sentences-2a1), and [April-May 2023](https://magazine.sebastianraschka.com/p/ai-research-highlights-in-3-sentences).

## 13 – Evaluating the Model Locally

Most readers have probably reached the point where, after reading a long Starter Guide, they can't wait to get started themselves. However, there is one more thing worth discussing: evaluating the modeling performance! I promise to keep it short (and I plan to follow up with a more detailed evaluation article in the future).

The competition submissions will be evaluated on a subset of [Stanford's HELM benchmark](https://crfm.stanford.edu/helm/latest/), which consists of 42 scenarios and 59 metrics.
These include scenarios like [HellaSwag](https://crfm.stanford.edu/helm/latest/?group=hellaswag) and [TruthfulQA](https://crfm.stanford.edu/helm/latest/?group=truthful_qa), which are also covered in other benchmark suites, such as EleutherAI's [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness).

To prevent overfitting, it is probably a good idea to develop the models on a few tasks from the Evaluation Harness first (think of them as validation sets) before applying them to the HELM benchmark. Since the competition evaluation will be based on a subset of HELM, we can think of HELM as more of a test set.

The Language Model Evaluation Harness is currently supported in Lit-GPT directly (and [HELM support is in the works](https://github.com/Lightning-AI/lit-gpt/pull/370)). Let's briefly look at how to use the Evaluation Harness with Lit-GPT.

First, we have to clone and install the official Evaluation Harness repository:

```bash
git clone https://github.com/EleutherAI/lm-evaluation-harness
cd lm-evaluation-harness
pip install -e .
cd ..
```

(Note that `pip install -e .` installs the package in editable mode, so that tweaks in the `lm-evaluation-harness` package don't require reinstallation.)

Then, to evaluate the OpenLLaMA model, we can run the harness on a checkpoint as follows from the `lit-gpt` repo:

```bash
python eval/lm_eval_harness.py \
  --checkpoint_dir "checkpoints/openlm-research/open_llama_7b/" \
  --precision "bf16-true" \
  --eval_tasks "[truthfulqa_mc]" \
  --batch_size 4 \
  --save_filepath "results-openllama-7b.json"
```
This should only take about 5 minutes to run.

(For a LoRA-finetuned model, there is an equivalent `lm_eval_harness_lora.py` script in the Lit-GPT repo.)

If you want to include multiple tasks, for example HellaSwag and TruthfulQA, you can replace `[truthfulqa_mc]` with `[truthfulqa_mc,hellaswag]`. **You can find a full task list in the [task table here](https://github.com/EleutherAI/lm-evaluation-harness/blob/master/docs/task_table.md).**

*[Figure: Small excerpt of the tasks supported in the [Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness/blob/master/docs/task_table.md)]*

This results in the following JSON output:

```json
{"results":
    {"truthfulqa_mc":
        {"mc1": 0.23133414932680538,
         "mc1_stderr": 0.014761945174862673,
         "mc2": 0.352784342017196,
         "mc2_stderr": 0.01356224149206526}},
 "versions": {"truthfulqa_mc": 1},
 "config": {"model": "open_llama_7b", "num_fewshot": 0,
            "batch_size": 4, "device": "cuda:0",
            "no_cache": true, "limit": null,
            "bootstrap_iters": 2, "description_dict": null}}
```
role=\"presentation\"> \u00a0 \u00a0 \u00a0             \u00a0 \"batch_size\": 4, \"device\": \"cuda:0\", <\/span>\r\n\u00a0<span role=\"presentation\"> \u00a0 \u00a0 \u00a0             \u00a0 \"no_cache\": true, \"limit\": null, <\/span>\r\n\u00a0<span role=\"presentation\"> \u00a0 \u00a0 \u00a0             \u00a0 \"bootstrap_iters\": 2, \"description_dict\": null<\/span>\r\n\u00a0<span role=\"presentation\">}}<\/span><\/pre>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">The resulting scores, <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>mc1<\/code><\/span><span class=\"md-plain\"> and <\/span><span class=\"md-pair-s\" spellcheck=\"false\"><code>mc2<\/code><\/span><span class=\"md-plain\">, measure the proportion of how often the model generates true statements (on a scale from 0 to 1). The difference between mc1 and mc2 scores are explained in the <\/span><span class=\"md-meta-i-c md-link\"><a href=\"https:\/\/github.com\/sylinrl\/TruthfulQA\"><span class=\"md-plain\">TruthfulQA<\/span><\/a><\/span><span class=\"md-plain\"> repository:<\/span><\/p>\n<ul class=\"ul-list\" data-mark=\"-\">\n<li class=\"md-list-item\">\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">&#8220;<\/span><span class=\"md-pair-s \"><strong><span class=\"md-plain\">MC1 (Single-true)<\/span><\/strong><\/span><span class=\"md-plain\">: Given a question and 4-5 answer choices, select the only correct answer. The model&#8217;s selection is the answer choice to which it assigns the highest log-probability of completion following the question, independent of the other answer choices. The score is the simple accuracy across all questions.&#8221;<\/span><\/p>\n<\/li>\n<li class=\"md-list-item\">\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">&#8220;<\/span><span class=\"md-pair-s \"><strong><span class=\"md-plain\">MC2 (Multi-true)<\/span><\/strong><\/span><span class=\"md-plain\">: Given a question and multiple true \/ false reference answers, the score is the normalized total probability assigned to the set of true answers.&#8221;<\/span><\/p>\n<\/li>\n<\/ul>\n<p class=\"md-end-block md-p\"><span class=\"md-plain\">Will the TruthfulnessQA be used for the final model evaluation? Likely not. I am using it here as a simple reference. Note that Llama 2 Chat models (which are not permitted in this competition since they are already finetuned) may be a good reference for a good score. 
For comparison, we can run the same evaluation code on the Llama 2 7B Chat model as follows:

```bash
python eval/lm_eval_harness.py \
  --checkpoint_dir "checkpoints/meta-llama/Llama-2-7b-chat-hf/" \
  --precision "bf16-true" \
  --eval_tasks "[truthfulqa_mc]" \
  --batch_size 4 \
  --save_filepath "results-llama2-7b.json"
```

This results in mc1 and mc2 scores of `0.306` and `0.454`. A score of 0.3 means roughly 30% of the answers are truthful, which doesn't sound great. For comparison, however, the 25x larger 175B GPT-3 model achieved only 21%, according to the [TruthfulQA](https://github.com/sylinrl/TruthfulQA) repository.

## 14 – Making Submissions

The competition currently allows only 3 submissions, so I highly recommend developing your models locally before making your first submission (the competition deadline is currently listed as October 15th, 2023).

As mentioned earlier, you can use the Evaluation Harness for model evaluation. HELM evaluation will also be added to Lit-GPT soon, which can be useful for evaluating your final model candidates before submission.

For the submission itself, you will be required to submit a Docker image. Fortunately, the organizers have a GitHub repository with the exact steps [here](https://github.com/llm-efficiency-challenge/neurips_llm_efficiency_challenge), as well as a toy-submission setup guide for testing your model locally before submission.
(Rather than pasting submission code examples here that may become outdated, I recommend consulting the [official competition repository](https://github.com/llm-efficiency-challenge/neurips_llm_efficiency_challenge) for the exact steps.)

Note that the organizers also maintain a [Discord channel](https://discord.gg/XJwQ5ddMK7) for additional questions about the competition.

## Conclusion

I am really excited for the research community to develop (more) efficient methods for finetuning LLMs, and I hope you find this competition as useful and exciting as I do. Please spread the word about this competition: the more people participate, the more we can advance the field of efficient LLM research.

If you have any questions, these are some of the best ways to reach out:

- If you encounter any problems with Lit-GPT, please [consider filing an Issue on GitHub](https://github.com/Lightning-AI/lit-gpt/issues) if you think it is a bug.

- If you find any problems with the code in this article, you may also file an issue and tag my [GitHub user account @rasbt](https://github.com/rasbt), or [reach out on social media](https://x.com/rasbt); I am more than happy to get that fixed!

- For Lit-GPT-related questions about the challenge, my colleagues at Lightning AI also maintain a Discord channel [here](https://discord.com/channels/1077906959069626439/1134560480795570186).

- Furthermore, [Lit-GPT pull requests](https://github.com/Lightning-AI/lit-gpt/pulls) with improvements and implementations of new techniques are very welcome!

Happy coding and experimenting!