{"id":5648406,"date":"2023-07-22T19:54:32","date_gmt":"2023-07-22T23:54:32","guid":{"rendered":"https:\/\/lightning.ai\/pages\/?p=5648406"},"modified":"2024-07-02T13:42:59","modified_gmt":"2024-07-02T17:42:59","slug":"how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon","status":"publish","type":"post","link":"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/","title":{"rendered":"How to build a chatbot using open-source LLMs like Llama 2 and Falcon"},"content":{"rendered":"<header>\n<p class=\"page-description\"><div class=\"takeaways card-glow p-4 my-4\"><h3 class=\"w-100 d-block\">Takeaways<\/h3>In this blog, we&#8217;ll learn how to build a chatbot using open-source LLMs. We will be using <a href=\"https:\/\/github.com\/Lightning-AI\/lit-gpt\">Lit-GPT<\/a> and LangChain. Lit-GPT is an optimized collection of open-source LLMs for finetuning and inference. It supports &#8211; Falcon, Llama 2, Vicuna, LongChat, and other top-performing open-source large language models.<\/div>\n<\/header>\n<p>&nbsp;<\/p>\n<blockquote><p>This article is outdated with respect to the latest LitGPT version. Please check the <a href=\"https:\/\/github.com\/Lightning-AI\/litgpt\">official repo<\/a> for latest code samples.<\/p>\n<p>&nbsp;<\/p><\/blockquote>\n<div class=\"page-body\">\n<h2 id=\"6043da4a-aacc-4751-8d6d-e42359b26874\" class=\"\">Advancements of LLMs in Chatbot Development<\/h2>\n<p id=\"18635d2f-1d51-46a9-990b-5ce18c7f0336\" class=\"\">Chatbots have become integral to many industries, including e-commerce, customer service, and healthcare. With advancements in large language models, building a chatbot has become easier than ever. Chatbots can find relevant information from a large knowledge base and present it to the user. 
They can connect to a database, query a vector database, and answer questions based on documents.<\/p>\n<p id=\"da001bc2-4c78-4726-9823-a44585bc8ada\" class=\"\">LLMs have the ability to generate answers based on the given prompt. They can now even consume a whole document and respond to queries based on it. In-context learning lets an LLM learn to solve a new task at inference time, without any change to the model weights: the prompt contains an example of the task the model is expected to perform. Prompt engineering is a much deeper topic, but here we will focus on building your first MVP chatbot.<\/p>\n<h2 id=\"9238fe46-60a9-4bfd-bac1-76d04b13e978\" class=\"\">Making conversation with an LLM<\/h2>\n<p id=\"a2e1eed1-c6e1-44ab-a3d7-71b8004d8675\" class=\"\">Foundational large language models (LLMs) are trained to predict the next word. These LLMs are then finetuned on an instruction dataset, which consists of <em>user input<\/em>, <em>context<\/em>, and <em>expected output<\/em>, to follow human instructions and generate relevant responses. An example of an instruction prompt that makes the LLM behave like a chatbot is provided below.<\/p>\n<pre id=\"fa1988d9-5b72-4576-8c8f-aee32e5bbecc\" class=\"code hljs collapse-false\"><code>A chat between a curious user and an artificial intelligence assistant.\r\nThe assistant gives helpful, detailed, and polite answers to the user's questions.\r\n\r\nUSER: {input}\r\nASSISTANT:<\/code><\/pre>\n<p id=\"fc4c02b8-888c-42ca-922d-f7eaeb12a30f\" class=\"\">The first two lines serve as instructions to the LLM, directing it to provide helpful, detailed, and polite responses as a chatbot assistant. The third line, <code>USER: {input}<\/code>, represents the user input, where <code>{input}<\/code> will be replaced by the user&#8217;s query. 
The LLM will then start predicting the next words to produce a response to the user&#8217;s prompt.<\/p>\n<p id=\"c809cc12-9941-4b79-b725-2b5b5845724d\" class=\"\">In this tutorial, we will use an instruction-tuned model and provide the user input as a prompt. Let&#8217;s create our first chatbot by using the prompt defined above. We will use LongChat, a LLaMA-like model trained on a chat dataset with a context length of 16K tokens (<a href=\"https:\/\/help.openai.com\/en\/articles\/4936856-what-are-tokens-and-how-to-count-them\">~12K words<\/a>).<\/p>\n<p id=\"0c083c3d-9582-4ce4-afde-035b66698149\" class=\"\">We will define our prompt template as <code>longchat_prompt_template<\/code>, and for each user query, we will format the string and feed it to the model.<\/p>\n<pre id=\"5039783f-a366-41bb-99c5-1c5db969fcc2\" class=\"code hljs collapse-false language-python\"><code>longchat_template = \"\"\"A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.\r\nUSER: {input}\r\nASSISTANT:\"\"\"\r\n\r\n# Fill the template with the user's query to build the prompt\r\nprompt = longchat_template.format(input=\"My name is Aniket?\")<\/code><\/pre>\n<p id=\"c03f8874-8d6f-40a6-a068-f63ac278bbd8\" class=\"\">We will be using the <code>llm-inference<\/code> library, a thin wrapper over Lit-GPT that provides an API for loading the model and generating text. 
First, we load our model with 4-bit quantization (<code>bnb.nf4<\/code>), which requires about 6GB of GPU memory.<\/p>\n<pre id=\"4f5f67fd-2f91-4e77-9575-cd7744386359\" class=\"code hljs collapse-false language-python\"><code># pip install llm-inference\r\nfrom llm_inference import LLMInference, prepare_weights\r\nfrom rich import print\r\n\r\npath = str(prepare_weights(\"lmsys\/longchat-7b-16k\"))\r\nmodel = LLMInference(checkpoint_dir=path, quantize=\"bnb.nf4\")<\/code><\/pre>\n<p id=\"863ee7c9-434c-444f-9891-382fe6c57ccb\" class=\"\">Next, use the <code>model.chat<\/code> method, an API interface for the Lit-GPT <a href=\"https:\/\/github.com\/Lightning-AI\/lit-gpt\/blob\/main\/chat\/base.py\">chat script<\/a>, to generate the response from the formatted template with user input.<\/p>\n<pre id=\"3bc88772-c276-4c3f-b0bc-b570d7c9a9fd\" class=\"code hljs collapse-false language-python\"><code># pip install llm-inference\r\nfrom llm_inference import LLMInference, prepare_weights\r\nfrom rich import print\r\n\r\npath = str(prepare_weights(\"lmsys\/longchat-7b-16k\"))\r\nmodel = LLMInference(checkpoint_dir=path, quantize=\"bnb.nf4\")\r\n\r\nlongchat_template = \"\"\"A chat between a curious user and an artificial intelligence assistant.\r\nThe assistant gives helpful, detailed, and polite answers to the user's questions.\r\nUSER: {input}\r\nASSISTANT:\"\"\"\r\n\r\noutput = model.chat(longchat_template.format(input=\"My name is Aniket?\"))\r\nprint(output)<\/code><\/pre>\n<figure id=\"571e51b2-091f-49ab-8062-0f7b93a6481f\" class=\"image\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5648407\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/0.png\" alt=\"\" width=\"1656\" height=\"1074\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/0.png 1656w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/0-300x195.png 300w, 
https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/0-1024x664.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/0-1536x996.png 1536w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/0-300x195@2x.png 600w\" sizes=\"(max-width: 1656px) 100vw, 1656px\" \/><\/figure>\n<h2 id=\"efe62758-a9d5-4820-8ef1-526215909128\" class=\"\">Assistant doesn\u2019t have a memory<\/h2>\n<p id=\"b90bea06-1ff3-4c83-9fea-b984039bf533\" class=\"\">Our chatbot works great, but there is an issue: our assistant is stateless and forgets the previous interactions. As you can see below, the user prompts the bot with their name and then, in the second interaction, asks &#8220;What is my name?&#8221; However, the assistant is not able to provide the requested information due to its lack of memory.<\/p>\n<figure id=\"db67cb93-0952-477a-8074-56902140e3a8\" class=\"image\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5648408\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/1.png\" alt=\"\" width=\"2008\" height=\"1004\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/1.png 2008w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/1-300x150.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/1-1024x512.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/1-1536x768.png 1536w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/1-300x150@2x.png 600w\" sizes=\"(max-width: 2008px) 100vw, 2008px\" \/><\/figure>\n<p id=\"5918a2fc-6a7e-463c-a883-6ccfe481ac2c\" class=\"\">Fixing this issue is not too difficult. We can provide the context to the LLM through the prompt and it should be able to look it up. 
We can store our conversation history and inject it into the prompt, as in the example below, where the user provides their name and the model is able to recall it in a follow-up question.<\/p>\n<figure id=\"6e11bb56-644a-4f4c-8f95-d211ba6fda6d\" class=\"image\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5648409\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/2.png\" alt=\"\" width=\"2056\" height=\"830\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/2.png 2056w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/2-300x121.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/2-1024x413.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/2-1536x620.png 1536w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/2-2048x827.png 2048w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/2-300x121@2x.png 600w\" sizes=\"(max-width: 2056px) 100vw, 2056px\" \/><\/figure>\n<h2 id=\"cf68f211-e4d2-4d83-8b04-1898d681188d\" class=\"\">Memory and prompting &#8211; Maintaining a continued conversation<\/h2>\n<p id=\"f64bc834-88fb-4ff7-bb68-1acd04aed7d7\" class=\"\">LLMs are stateless interfaces that provide language understanding and retrieval capabilities. However, LLMs do not have memory, so subsequent queries are not affected by earlier calls. Therefore, if you provide your name in one query and expect the LLM to remember it for the next query, it will not happen.<\/p>\n<p id=\"902f446f-a76e-4855-92c6-23f927a6b4c6\" class=\"\">To give LLMs the ability to remember previous interactions, we can store the conversation history in the prompt template as context for the model. 
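Before introducing any framework, this bookkeeping can be done by hand. Below is a minimal plain-Python sketch (our own illustration; `build_prompt` and `record_turn` are hypothetical helper names, and the template mirrors the one used in this section):

```python
# Manual conversation memory: keep a list of turns and inject the
# transcript into the prompt's {history} slot on every call.
CHAT_TEMPLATE = """A chat between a curious user and an artificial intelligence assistant.
The assistant gives helpful, detailed, and polite answers to the user's questions.
Context:
{history}
USER: {input}
ASSISTANT:"""

history = []

def build_prompt(user_input: str) -> str:
    """Format the full prompt from the accumulated history and the new input."""
    return CHAT_TEMPLATE.format(history="\n".join(history), input=user_input)

def record_turn(user_input: str, response: str) -> None:
    """Append one user/assistant exchange to the history."""
    history.append(f"USER: {user_input}")
    history.append(f"ASSISTANT: {response}")

record_turn("Hi, I am Aniket!", "How can I help you, Aniket?")
prompt = build_prompt("What is my name?")
print(prompt)
```

The model now sees its earlier exchange with the user on every call, which is all "memory" means here; LangChain packages exactly this pattern.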
We create the following prompt template &#8211;<\/p>\n<pre id=\"4774f6ee-a9a2-43c4-8096-a0f0d6e837f8\" class=\"code hljs collapse-false\"><code>A chat between a curious user and an artificial intelligence assistant.\r\nThe assistant gives helpful, detailed, and polite answers to the user's questions.\r\nContext:\r\n{history}\r\nUSER: {input}\r\nASSISTANT:<\/code><\/pre>\n<p id=\"e5990ad9-6a23-4ee5-a26d-95fc20b8924a\" class=\"\"><code>{history}<\/code> is replaced with the previous conversations, and <code>{input}<\/code> is replaced by the current query from the user. We keep updating the <code>history<\/code> and <code>input<\/code> for each interaction. <code>LangChain<\/code> provides some useful classes for formatting prompts and updating the context in more advanced ways, like looking up context from a <a href=\"https:\/\/python.langchain.com\/docs\/modules\/memory\/how_to\/vectorstore_retriever_memory\">vector database<\/a>.<\/p>\n<p id=\"287f9c23-f4d7-4a31-bd84-93f8d1459c77\" class=\"\">We will use the <code>PromptTemplate<\/code> class with <code>history<\/code> and <code>input<\/code> as variables.<\/p>\n<pre id=\"f3e86dbc-3ba4-416d-a99d-cf1e8b15a8fc\" class=\"code hljs collapse-false language-python\"><code>from langchain.prompts import PromptTemplate\r\n\r\nlongchat_template = \"\"\"A chat between a curious user and an artificial intelligence assistant.\r\nThe assistant gives helpful, detailed, and polite answers to the user's questions.\r\nContext:\r\n{history}\r\nUSER: {input}\r\nASSISTANT:\"\"\"\r\n\r\nlongchat_prompt_template = PromptTemplate(\r\n    input_variables=[\"input\", \"history\"], template=longchat_template\r\n)\r\nprint(longchat_prompt_template.format(\r\n    input=\"What is my name?\",\r\n    history=\"USER: Hi, I am Aniket!\\nAssistant: How can I help you Aniket?\"\r\n))<\/code><\/pre>\n<p id=\"23fe038d-5528-493e-b61a-9fa66cdb0145\" class=\"\">You can format the <code>longchat_prompt_template<\/code> using 
the <code>format<\/code> method, providing <code>input<\/code> and <code>history<\/code>.<\/p>\n<figure id=\"59e8ff92-5111-4c3c-b1c6-476b83fd3167\" class=\"image\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5648410\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/3.png\" alt=\"\" width=\"1906\" height=\"1072\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/3.png 1906w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/3-300x169.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/3-1024x576.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/3-1536x864.png 1536w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/3-300x169@2x.png 600w\" sizes=\"(max-width: 1906px) 100vw, 1906px\" \/><\/figure>\n<p id=\"fbe6dc7d-d9d6-4898-a80e-bb305ec35767\" class=\"\">Next, we create a conversation chain using the <code>ConversationChain<\/code> class. It takes the LLM, prompt template, and a memory manager object as input. 
We will use <code>ConversationBufferMemory<\/code>, which stores all our conversations and updates the prompt template <code>history<\/code> on each interaction.<\/p>\n<pre id=\"1cc3b52d-1d2d-40ba-8960-97fe6b1fadb6\" class=\"code hljs collapse-false language-python\"><code>from langchain.chains import ConversationChain\r\nfrom langchain.memory import ConversationBufferMemory\r\nfrom llm_chain import LitGPTLLM\r\nfrom llm_inference import LLMInference\r\n\r\npath = \"checkpoints\/lmsys\/longchat-7b-16k\"\r\nmodel = LLMInference(checkpoint_dir=path, quantize=\"bnb.nf4\")\r\n\r\n# longchat_prompt_template is the PromptTemplate defined above\r\nllm = LitGPTLLM(model=model)\r\nconversation = ConversationChain(\r\n    llm=llm,\r\n    memory=ConversationBufferMemory(ai_prefix=\"Assistant\", human_prefix=\"User\"),\r\n    prompt=longchat_prompt_template,\r\n)\r\n\r\nconversation(\"hi, I am Aniket\")[\"response\"]<\/code><\/pre>\n<figure id=\"025f7252-1969-4add-8988-fa3e7a1a3da5\" class=\"image\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5648411\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/4.png\" alt=\"\" width=\"1106\" height=\"374\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/4.png 1106w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/4-300x101.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/4-1024x346.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/4-300x101@2x.png 600w\" sizes=\"(max-width: 1106px) 100vw, 1106px\" \/><\/figure>\n<p id=\"7d74cf53-54d7-407b-91c0-3f91a1d8df30\" class=\"\">You can access the memory using the <code>conversation<\/code> object and manipulate it as well. 
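Conceptually, the buffer memory does little more than append prefixed lines to a growing transcript. A toy stand-in (our sketch for illustration, not LangChain's actual implementation) makes that concrete:

```python
class ToyBufferMemory:
    """Minimal imitation of a buffer memory: store every turn verbatim."""

    def __init__(self, human_prefix: str = "User", ai_prefix: str = "Assistant"):
        self.human_prefix = human_prefix
        self.ai_prefix = ai_prefix
        self.buffer = ""

    def save_context(self, user_input: str, response: str) -> None:
        # Append the latest exchange; a chain would inject self.buffer
        # into the {history} slot of the prompt on the next call.
        self.buffer += f"{self.human_prefix}: {user_input}\n"
        self.buffer += f"{self.ai_prefix}: {response}\n"

memory = ToyBufferMemory()
memory.save_context("hi, I am Aniket", "Hello Aniket, how can I help you?")
memory.save_context("What is my name?", "Your name is Aniket.")
print(memory.buffer)
```

The real class is wired into the chain for you, but the stored transcript is the same idea.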
To print the current conversation, we can run <code>print(conversation.memory.buffer)<\/code>.<\/p>\n<figure id=\"30fa244b-f805-4388-be7f-2d98a328b084\" class=\"image\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5648412\" src=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/5.png\" alt=\"\" width=\"1680\" height=\"1012\" srcset=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/5.png 1680w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/5-300x181.png 300w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/5-1024x617.png 1024w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/5-1536x925.png 1536w, https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/5-300x181@2x.png 600w\" sizes=\"(max-width: 1680px) 100vw, 1680px\" \/><\/figure>\n<p id=\"137b38bb-73c7-43aa-b2d3-217c6df82fac\" class=\"\">This memory is injected into the prompt as context after each interaction.<\/p>\n<h2 id=\"3e839336-634c-466a-8d73-6d6ea4e0c672\" class=\"\">QA over Documents as context<\/h2>\n<p id=\"8ee9f000-1d8a-4a7b-8057-73db3432b5e8\" class=\"\">Chatbots can be extended to more complex tasks, like extracting documents from a database and answering questions based on the given document as context. You can build a document QnA bot with this technique. We won\u2019t go deep into this here, but if you\u2019re curious, you can replace the conversation memory with another LLM that searches for relevant documents and updates the context in the prompt template with the extracted document. 
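As a toy sketch of that idea, with naive word-overlap scoring standing in for a real vector search (our illustration, not LangChain's API):

```python
def retrieve(query: str, documents: list[str]) -> str:
    """Return the document sharing the most words with the query."""
    query_words = set(query.lower().split())
    return max(documents, key=lambda doc: len(query_words & set(doc.lower().split())))

docs = [
    "lit-gpt supports falcon, llama 2, vicuna and longchat models.",
    "langchain provides prompt templates, chains and memory classes.",
]
question = "which models does lit-gpt support?"
context = retrieve(question, docs)

# Inject the retrieved document as context, then query the LLM as before.
prompt = f"Context:\n{context}\nUSER: {question}\nASSISTANT:"
```

In practice the keyword overlap is replaced by embedding similarity against a vector store, but the prompt construction is the same.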
You can read more about it on the LangChain example <a href=\"https:\/\/python.langchain.com\/docs\/use_cases\/question_answering\/\">here<\/a>.<\/p>\n<h2 id=\"12d530a7-1c85-4451-8b4a-f2c70fcfb5ad\" class=\"\">Conclusion<\/h2>\n<div>\n<p id=\"394bde01-32df-4de6-b1a6-ac42cc6e09f0\" class=\"\">In conclusion, advancements in large language models have made building a chatbot with open-source LLMs easier than ever. The ability of LLMs to generate answers based on the given prompt, even from a whole document, has made them an integral part of many industries. By using instruction-tuned models and providing user input as a prompt, we can create chatbots that provide helpful, detailed, and polite responses. While LLMs are stateless, we can supply the chat history as context so the model can recall previous interactions. Finally, chatbots can be extended to more complex tasks, like extracting documents from a database and answering questions based on the given document as context.<\/p>\n<\/div>\n<h2 id=\"bb1f537a-9b70-49cf-b204-6b84b633a64f\" class=\"\">Resources<\/h2>\n<ul id=\"7e862fd2-14d0-456e-b04f-8787f8af0fa4\" class=\"bulleted-list\">\n<li><a href=\"https:\/\/lightning.ai\/pages\/blog\/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset\/\">How To Finetune GPT Like Large Language Models on a Custom Dataset<\/a><\/li>\n<\/ul>\n<ul id=\"df9df259-5461-44c2-a519-0ccbbf6e3080\" class=\"bulleted-list\">\n<li><a href=\"https:\/\/lightning.ai\/pages\/community\/tutorial\/accelerating-large-language-models-with-mixed-precision-techniques\/\">Accelerating Large Language Models with Mixed-Precision Techniques<\/a><\/li>\n<\/ul>\n<ul id=\"aa28bed1-2ee3-443c-abc2-d0e762953be4\" class=\"bulleted-list\">\n<li><a href=\"https:\/\/lightning.ai\/pages\/community\/tutorial\/lora-llm\/\">Parameter-Efficient LLM Finetuning With Low-Rank Adaptation (LoRA)<\/a><\/li>\n<\/ul>\n<ul id=\"bb16bb0b-9f9f-4c2a-9869-b52981246c6d\" 
class=\"bulleted-list\">\n<li><a href=\"https:\/\/lightning.ai\/pages\/community\/article\/understanding-llama-adapters\/\">Understanding Parameter-Efficient Finetuning of Large Language Models: From Prefix Tuning to LLaMA-Adapters<\/a><\/li>\n<\/ul>\n<p id=\"2ac6a20d-dd4e-49e7-9cac-9a713a0d4654\" class=\"\">Join our\u00a0<a href=\"https:\/\/discord.gg\/nnAuZvqTu3\">Discord community<\/a>\u00a0to chat and ask your questions!<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>&nbsp; This article is outdated with respect to the latest LitGPT version. Please check the official repo for latest code samples. &nbsp; Advancements of LLMs in Chatbot Development Chatbots have become integral to many industries, including e-commerce, customer service, and healthcare. With advancements in large language models, building a chatbot has become easier than ever.<a class=\"excerpt-read-more\" href=\"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/\" title=\"ReadHow to build a chatbot using open-source LLMs like Llama 2 and Falcon\">&#8230; Read more &raquo;<\/a><\/p>\n","protected":false},"author":16,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":"","_links_to":"","_links_to_target":""},"categories":[29,106,41],"tags":[],"glossary":[],"acf":{"additional_authors":[{"author_name":"Aniket Maurya","author_url":""}],"mathjax":false,"default_editor":true,"show_table_of_contents":false,"hide_from_archive":false,"content_type":"Blog Post","sticky":true,"custom_styles":"","code_embed":false,"tabs":false},"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v24.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>How to build a chatbot using open-source LLMs like Llama 2 and Falcon - Lightning AI<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, 
max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"How to build a chatbot using open-source LLMs like Llama 2 and Falcon - Lightning AI\" \/>\n<meta property=\"og:description\" content=\"&nbsp; This article is outdated with respect to the latest LitGPT version. Please check the official repo for latest code samples. &nbsp; Advancements of LLMs in Chatbot Development Chatbots have become integral to many industries, including e-commerce, customer service, and healthcare. With advancements in large language models, building a chatbot has become easier than ever.... Read more &raquo;\" \/>\n<meta property=\"og:url\" content=\"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/\" \/>\n<meta property=\"og:site_name\" content=\"Lightning AI\" \/>\n<meta property=\"article:published_time\" content=\"2023-07-22T23:54:32+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-07-02T17:42:59+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/0.png\" \/>\n<meta name=\"author\" content=\"JP Hennessy\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@LightningAI\" \/>\n<meta name=\"twitter:site\" content=\"@LightningAI\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"JP Hennessy\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/\"},\"author\":{\"name\":\"JP Hennessy\",\"@id\":\"https:\/\/lightning.ai\/pages\/#\/schema\/person\/2518f4d5541f8e98016f6289169141a6\"},\"headline\":\"How to build a chatbot using open-source LLMs like Llama 2 and Falcon\",\"datePublished\":\"2023-07-22T23:54:32+00:00\",\"dateModified\":\"2024-07-02T17:42:59+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/\"},\"wordCount\":1103,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/lightning.ai\/pages\/#organization\"},\"image\":{\"@id\":\"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/0.png\",\"articleSection\":[\"Blog\",\"Community\",\"Tutorials\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/\",\"url\":\"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/\",\"name\":\"How to build a chatbot using 
open-source LLMs like Llama 2 and Falcon - Lightning AI\",\"isPartOf\":{\"@id\":\"https:\/\/lightning.ai\/pages\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/0.png\",\"datePublished\":\"2023-07-22T23:54:32+00:00\",\"dateModified\":\"2024-07-02T17:42:59+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/#primaryimage\",\"url\":\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/0.png\",\"contentUrl\":\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/0.png\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/lightning.ai\/pages\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"How to build a chatbot using open-source LLMs like Llama 2 and Falcon\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/lightning.ai\/pages\/#website\",\"url\":\"https:\/\/lightning.ai\/pages\/\",\"name\":\"Lightning 
AI\",\"description\":\"The platform for teams to build AI.\",\"publisher\":{\"@id\":\"https:\/\/lightning.ai\/pages\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/lightning.ai\/pages\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/lightning.ai\/pages\/#organization\",\"name\":\"Lightning AI\",\"url\":\"https:\/\/lightning.ai\/pages\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/lightning.ai\/pages\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/02\/image-17.png\",\"contentUrl\":\"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/02\/image-17.png\",\"width\":1744,\"height\":856,\"caption\":\"Lightning AI\"},\"image\":{\"@id\":\"https:\/\/lightning.ai\/pages\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/x.com\/LightningAI\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/lightning.ai\/pages\/#\/schema\/person\/2518f4d5541f8e98016f6289169141a6\",\"name\":\"JP Hennessy\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/lightning.ai\/pages\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/28ade268218ae45f723b0b62499f527a?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/28ade268218ae45f723b0b62499f527a?s=96&d=mm&r=g\",\"caption\":\"JP Hennessy\"},\"url\":\"https:\/\/lightning.ai\/pages\/author\/jplightning-ai\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"How to build a chatbot using open-source LLMs like Llama 2 and Falcon - Lightning AI","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/","og_locale":"en_US","og_type":"article","og_title":"How to build a chatbot using open-source LLMs like Llama 2 and Falcon - Lightning AI","og_description":"&nbsp; This article is outdated with respect to the latest LitGPT version. Please check the official repo for latest code samples. &nbsp; Advancements of LLMs in Chatbot Development Chatbots have become integral to many industries, including e-commerce, customer service, and healthcare. With advancements in large language models, building a chatbot has become easier than ever.... Read more &raquo;","og_url":"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/","og_site_name":"Lightning AI","article_published_time":"2023-07-22T23:54:32+00:00","article_modified_time":"2024-07-02T17:42:59+00:00","og_image":[{"url":"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/0.png","type":"","width":"","height":""}],"author":"JP Hennessy","twitter_card":"summary_large_image","twitter_creator":"@LightningAI","twitter_site":"@LightningAI","twitter_misc":{"Written by":"JP Hennessy","Est. 
reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/#article","isPartOf":{"@id":"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/"},"author":{"name":"JP Hennessy","@id":"https:\/\/lightning.ai\/pages\/#\/schema\/person\/2518f4d5541f8e98016f6289169141a6"},"headline":"How to build a chatbot using open-source LLMs like Llama 2 and Falcon","datePublished":"2023-07-22T23:54:32+00:00","dateModified":"2024-07-02T17:42:59+00:00","mainEntityOfPage":{"@id":"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/"},"wordCount":1103,"commentCount":0,"publisher":{"@id":"https:\/\/lightning.ai\/pages\/#organization"},"image":{"@id":"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/#primaryimage"},"thumbnailUrl":"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/0.png","articleSection":["Blog","Community","Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/","url":"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/","name":"How to build a chatbot using open-source LLMs like Llama 2 and Falcon - Lightning 
AI","isPartOf":{"@id":"https:\/\/lightning.ai\/pages\/#website"},"primaryImageOfPage":{"@id":"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/#primaryimage"},"image":{"@id":"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/#primaryimage"},"thumbnailUrl":"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/0.png","datePublished":"2023-07-22T23:54:32+00:00","dateModified":"2024-07-02T17:42:59+00:00","breadcrumb":{"@id":"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/#primaryimage","url":"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/0.png","contentUrl":"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/07\/0.png"},{"@type":"BreadcrumbList","@id":"https:\/\/lightning.ai\/pages\/community\/tutorial\/how-to-build-a-chatbot-using-open-source-llms-like-llama-2-and-falcon\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/lightning.ai\/pages\/"},{"@type":"ListItem","position":2,"name":"How to build a chatbot using open-source LLMs like Llama 2 and Falcon"}]},{"@type":"WebSite","@id":"https:\/\/lightning.ai\/pages\/#website","url":"https:\/\/lightning.ai\/pages\/","name":"Lightning AI","description":"The platform for teams to build 
AI.","publisher":{"@id":"https:\/\/lightning.ai\/pages\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/lightning.ai\/pages\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/lightning.ai\/pages\/#organization","name":"Lightning AI","url":"https:\/\/lightning.ai\/pages\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/lightning.ai\/pages\/#\/schema\/logo\/image\/","url":"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/02\/image-17.png","contentUrl":"https:\/\/lightningaidev.wpengine.com\/wp-content\/uploads\/2023\/02\/image-17.png","width":1744,"height":856,"caption":"Lightning AI"},"image":{"@id":"https:\/\/lightning.ai\/pages\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/LightningAI"]},{"@type":"Person","@id":"https:\/\/lightning.ai\/pages\/#\/schema\/person\/2518f4d5541f8e98016f6289169141a6","name":"JP Hennessy","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/lightning.ai\/pages\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/28ade268218ae45f723b0b62499f527a?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/28ade268218ae45f723b0b62499f527a?s=96&d=mm&r=g","caption":"JP 
Hennessy"},"url":"https:\/\/lightning.ai\/pages\/author\/jplightning-ai\/"}]}},"_links":{"self":[{"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/posts\/5648406"}],"collection":[{"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/users\/16"}],"replies":[{"embeddable":true,"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/comments?post=5648406"}],"version-history":[{"count":0,"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/posts\/5648406\/revisions"}],"wp:attachment":[{"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/media?parent=5648406"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/categories?post=5648406"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/tags?post=5648406"},{"taxonomy":"glossary","embeddable":true,"href":"https:\/\/lightning.ai\/pages\/wp-json\/wp\/v2\/glossary?post=5648406"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}