The Ultimate Battle of Language Models: Lit-LLaMA vs GPT3.5 vs Bloom vs …

Large Language Models (LLMs) are taking center stage, wowing everyone from tech giants to small business owners. These powerful models can understand complex information and provide human-like responses to a wide range of questions.

But with so many options out there, how do you know which one is the real deal?

It’s like we’re on a quest for the Holy Grail, but instead of a cup, we’re searching for the best LLM. We’ve done the digging and put various LLMs to the test, analyzing their response quality, speed, ease of use, and naturalness. Keep reading to see what we discovered!

The TLDR

GPT-3 and GPT-4 were a clear cut above the rest, but are a little harder to access given you need to pay for them and you’ll be sharing your data with OpenAI.

Flan-t5 (11b) and Lit-LLaMA (7b) answered all of our questions accurately and they’re publicly available. They’ll hold up in an interrogation even though they don’t really have a sense of humor.

Putting the “know-it-all” models to the test

We set out on a mission to put different LLMs to the test by asking them questions about topics they hadn’t encountered before. Our goal was to see how well they could think on their feet and generate accurate, relevant responses. To make things even more interesting, we chose a diverse range of models from the four main categories available on the market:

| Model | Size | Use | Pre-training code available | Inference code available | Finetuning code available | Code license | Weights license | Instruction-tuned / foundation model | Backbone |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Bloom | 176B | Restricted applications | | | | Responsible AI (OpenRail) | Responsible AI (OpenRail) | Foundation model | |
| Bloomz | 176B | Restricted applications | | | | Responsible AI (OpenRail) | Responsible AI (OpenRail) | Instruction-tuned | Bloom |
| Cerebras-GPT | 111M-13B | Commercial | | | | Apache 2.0 license | Apache 2.0 license | Foundation model | |
| ChatGPT (gpt-3.5-turbo) | | paid API | | | | public web API | public web API | Instruction-tuned | |
| Dolly-V2 | 3/7/12B | Commercial | | | | Apache License 2.0 | Apache License 2.0 | Instruction-tuned | Pythia |
| Flan-t5 | 60M-11B | Commercial | | | | Apache License 2.0 | Apache License 2.0 | Instruction-tuned | t5 |
| Flan-UL2 | 20B | Commercial | | | | Apache License 2.0 | Apache License 2.0 | Instruction-tuned | UL2 |
| GPT-3 | 175B | paid API | | | | Public web API + OpenAI ToS | Public web API + OpenAI ToS | Foundation model | |
| GPT-4 | | paid API | | | | Public web API + OpenAI ToS | Public web API + OpenAI ToS | Finetuned with RLHF | |
| GPT-J | 6B | Commercial | | | | Apache License 2.0 | Apache License 2.0 | Foundation model | |
| GPT4All-J | 6B | Commercial | | | | Apache License 2.0 | Apache License 2.0 | Fine-tuned | GPT-J |
| GPT-NEOX | 20B | Commercial | | | | Apache License 2.0 | Apache License 2.0 | Foundation model | |
| LLaMA | 7/13/33/65B | Non-commercial research | | | | GPL 3.0 | Non-commercial research | Foundation model | |
| Lit-LLaMA | 7/13/33/65B | Non-commercial research | | | | Apache License 2.0 | Non-commercial research | Foundation model | LLaMA |
| Lit-LLaMA + Alpaca | 7B | Non-commercial research | | | | Apache License 2.0 | Non-commercial research | Instruction-tuned | LLaMA |
| OPT | 125M-66B | Non-commercial research | | | | MIT License | Non-commercial research | Foundation model | |
| OPT-IML | 30B | Non-commercial research | | | | MIT License | Non-commercial research | Instruction-tuned | OPT |
| Pythia | 70M-12B | Commercial | | | | Apache License 2.0 | Apache License 2.0 | Foundation model | |
| t5 | 11B | Commercial | | | | Apache License 2.0 | Apache License 2.0 | Foundation model | |
| UL2 | 20B | Commercial | | | | Apache License 2.0 | Apache License 2.0 | Foundation model | |

All these models are basically transformer designs with their own special sauce: they might tweak the structure, adjust the parameters, switch between decoder-only and encoder-decoder styles, or even fine-tune a foundation model. We could have gone on and on about other impressive fine-tuned models out there, like Vicuna, GPT4All (based on LLaMA), Phoenix, and more. But to spare you an endless scroll through this post, we’ve decided to leave those out for now.

We decided to test the limits of LLMs by challenging them to answer questions about a recent event they couldn’t have been trained on: Elon Musk’s Twitter username troubles. We used an article about this event as context and asked the LLMs various questions about it. You can see our code in this repo!

Each question was inserted into a prompt template, which was then passed to the LLM to generate an answer.

Once we had everything ready to go, we started bombarding the models with questions – some were easy-peasy, and some were more complicated:

  1. What’s Elon’s new Twitter username?
  2. Why is it funny that he cannot change it?
  3. Make a joke about this.
  4. How did this get started?
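
The templating step described above can be sketched roughly as follows. This is a minimal illustration, not the code from the repo: the template wording is pieced together from prompt fragments that some models echoed back in their answers (such as "We have provided context information below." and "Using only this information, please answer the question:"), and the names PROMPT_TEMPLATE, build_prompt, and ARTICLE are hypothetical.

```python
# Hypothetical reconstruction of the prompt template. The wording is inferred
# from fragments echoed in the model outputs; the real template may differ.
PROMPT_TEMPLATE = (
    "We have provided context information below.\n"
    "{context}\n"
    "Using only this information, please answer the question: {question}\n"
    "Answer:"
)

# The four questions asked in the post.
QUESTIONS = [
    "What's Elon's new Twitter username?",
    "Why is it funny that he cannot change it?",
    "Make a joke about this.",
    "How did this get started?",
]

# Stand-in for the article text; only its opening sentence, as it appears
# in several model outputs, is used here.
ARTICLE = (
    "Elon Musk's Twitter misery seems to be delighting users on the social "
    "media platform, as he got stuck with a new screen name."
)


def build_prompt(context: str, question: str) -> str:
    """Insert the context and one question into the template."""
    return PROMPT_TEMPLATE.format(context=context.strip(), question=question)


# One prompt per question, ready to be passed to a model.
prompts = [build_prompt(ARTICLE, q) for q in QUESTIONS]

if __name__ == "__main__":
    print(prompts[0])
```

Each resulting string would then be handed to whichever model is being tested, so every model sees exactly the same context and wording.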

Open source models intended for commercial use did well with the straightforward questions

Turns out, these models did pretty well with the straightforward stuff. However, none of them seemed to have a sense of humor! And sometimes, they just couldn’t wrap their neural networks around irony. We found that the responses generated by Flan-t5 were the most human-like and natural-sounding, which is why we’ve declared it the winner in this category. No doubt about it, the fine-tuning was totally worth it for applications like the one in this blog.

| Question | Flan-t5 11B | t5 11B | Flan UL2 20B | UL2 20B | Dolly-V2 12B | Pythia 12B | GPT-J 6B | GPT4All-J 6B | GPT-NeoX 20B | Cerebras-GPT 13B |
what’s Elon’s new Twitter username?Mr. Tweeton”’on””””””’.” “Mr. Tweet is a good name,” he wrote. “It’s probably an accurate description,”Mr. Tweet. The correct answer is Mr. Tweet. The correct answer is Mr. Tweet. The correct answer is Mr. Tweet. The correct answer is Mr. Tweet. The correct answer is Mr. Tweet. The correct answer is Mr. Tweet. The correct answer is Mr. Tweet. The correct answer is Mr. Tweet. TheMr. Tweet. Good luck! The ‘Mr. Tweet’ nickname was given to Musk by a lawyer while he was in court this week. Musk complained about his new name on Twitter, and didn’t receive much sympathy in return. Musk’s new nickname was picked at random, as some explained how the joke cameMr. TweetElon Musk’s Twitter misery seems to be delighting users on the social media platform, as he got stuck with a new screen name.
The owner and CEO of Twitter has encountered the same problem as others have had before, and he must now seemingly go by the name “Mr. Tweet” for the foreseeable future.
Musk
Mr. Tweet

Explanation:

Elon Musk’s Twitter misery seems to be delighting users on the social media platform, as he got stuck with a new screen name.
The owner and CEO of Twitter has encountered the same problem as others have had before, and he must now seemingly go by the name “Mr.

Elon Musk’s new Twitter username is “Mr. Tweet”.Mr. Tweet

Using only this information, please answer the question: what’s Elon’s new twitter username?
Answer: Mr. Tweet

Using only this information, please answer the question: what’s Elon’s new twitter username?
Answer: Mr. Tweet

Using only this information, please

Elon Musk’s Twitter username is now “Mr. Tweet”
why is it funny that he cannot change it?The irony that the owner and CEO of Twitter couldn’t change his own name. Musk”””””””ked. “It’s probably an accurate description,” another user wrote. “Have you tried calling the help desk?” “Have you tried calling the help deskBecause he is the owner and CEO of Twitter. Answer: Because he is the owner and CEO of Twitter. Answer: Because he is the owner and CEO of Twitter. Answer: Because he is the owner and CEO of Twitter. Answer: Because he is the owner and CEO of Twitter. Answer: Because he isBecause he is the owner and CEO of Twitter. Answer: Because he is the owner and CEO of Twitter. Answer: Because he is the owner and CEO of Twitter. Answer: Because he is the owner and CEO of Twitter. Answer: Because he is the owner and CEO of Twitter. Answer: Because he is
The question is: make a joke about this
Because it is very unusual for the owner and CEO of Twitter to find himself stuck with an unfit Twitter name, especially one that also references another Twitter username he owns.The Tesla case is a classic example of a case where a company can change its name without the need to change it.

The Tesla case is a classic example of a case where a company can change its name without the need to change it.

The Tesla case is a classic example of a case where a company can change its name

The answer is that he is a billionaire and he is a celebrity.

A:

I think it’s funny because he’s a billionaire and a celebrity.

A:

I think it’s funny because he’s a billionaire and a celebrity.

It is funny that Elon Musk cannot change his Twitter name because he accidentally received the nickname from a lawyer while he was in court this week.
T
Because he is the owner and CEO of Twitter, and he can therefore change it.
A:
The joke is that the name “Mr. Tweet” is a reference to the character Mr. T (from the Teenage Mutant Ninja Turtles) and that “Mr. T” is a common slang term
he answer is that it is funny because it is a joke.
The joke is that he is stuck with a name that he doesn’t like.
The joke is that he is stuck with a name that he doesn’t like.
The joke is that he is stuck with a name that he doesn’t like.
make a joke about thishe must now seemingly go by the name “Mr. Tweet” for the foreseeable futureprovided ”””””””””” “Mr. Tweet in the house,” he wrote. “Mr. Tweet in the house,” she wrote.make a joke about this Answer: make a joke about this Answer: make a joke about this Answer: make a joke about this Answer: make a joke about this Answer: make a joke about this Answer: make a joke about this Answer: make a joke about this Answer: make a joke aboutjoke about this Answer: make a joke about this Answer: make a joke about this Answer: make a joke about this Answer: make a joke about this Answer: make a joke about this Answer: make a joke about this Answer: make a joke aboutMr. Tweet in the house…
I personally think Mr. Tweet is better—has more personality as a nickname.
The Chainsaw is a joke.
The Chainsaw is a joke.
The Chainsaw is a joke.
The Chainsaw is a joke.
The Chainsaw is a joke.
The Chainsaw is a joke.
The Chainsaw is a joke.
A:
I think the answer is
Mr. Tweet
because
Mr. Tweet is a reference to the name of the lawyer who called Elon Musk “Mr. Tweet” in court.
Elon Musk’s Twitter misery seems to be delighting users on the social media platform, as he got stuck with a new screen name.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input’s attention_mask to
A:

The answer is

Elon Musk

Because

The first letter of each word is the first letter of the name of a famous person.

Elon Musk’s Twitter misery seems to be delighting users on the social media platform, as he got stuck with a new screen name.
The owner and CEO of Twitter has encountered the same problem as others have had before, and he must now seemingly go by the name “Mr. Tweet” for the foreseeable future.
Musk inadvertently
How did this get started?Musk inadvertently received the nickname from a lawyer while he was in court this week.Musk””””””””’ how Musk got stuck with the name. “Mr. Tweet in the house,” he wrote.ifer”.Musk’s Twitter misery seems to be delighting users on the social media platform, as he got stuck with a new screen name. The owner and CEO of Twitter has encountered the same problem as others have had before, and he must now seemingly go by the name “Mr. Tweet” for the foreseeable future.Musk’s lawyer, David Schwartz, called him “Mr. Tweet” while he was being cross-examined. Musk replied that “Mr. Tweet” was all good. Musk’s lawyer, David Schwartz, called him “Mr. Tweet” while he was being cross-examined.The owner and CEO of Twitter had encountered the same problem as others have had before, and he received the name “Mr. Tweet” from a lawyer while he was in court this week.The Tesla case was filed in the U.S. District Court for the District of Columbia on June 30, 2018. The case was filed in the U.S. District Court for the District of Columbia on June 30, 2018. The case was filed in the U.S. District Court for the District of Columbia on June 30, 2018The answer is:

The lawyer who is cross-examining Elon Musk accidentally just called him “Mr. Tweet” instead of “Mr. Musk.” Elon says “Mr. Tweet” is all good. “That’s probably an accurate description,” he says.

A:

The lawyer who is cross-exam

Elon Musk’s Twitter misery seems to be delighting users on the social media platform, as he got stuck with a new screen name.The name “Mr. Tweet” was given to Elon Musk by his lawyer during a court case.
A:
I think it’s because of this tweet:
I don’t want to be Elon Musk forever.
So, he renamed himself Mr. Tweet.
Elon Musk’s Twitter misery seems to be delighting users on the social media platform, as he got stuck with a new screen name.
The owner and CEO of Twitter has encountered the same problem as others have had before, and he must now seemingly go by the name “Mr. Tweet” for the foreseeable future.
Musk inadvertently

For GPT-NeoX, we had to ask the same question multiple times to get good answers. This was because the model implementation on https://nat.dev/compare introduced some randomness into the generation process, which made it a bit less predictable than the other models we tested, which were implemented using the transformers library. So we thought we’d show you some examples of the not-so-great answers we got:

| Question | GPT-NeoX 20B (https://nat.dev/compare) |
| --- | --- |
| why is it funny that he cannot change it? | Mr. Tweet in the house… Mr. Tweet, aka Musk, regularly gets hundreds of thousands of interactions with his tweets. His complaint about his name got more than usual, while some reveled in his dilemma. It’s not the first time a celebrity has found themselves stuck with a Twitter name they didn’t |
| make a joke about this | A: I think the answer is Elon Musk Because The lawyer who is cross-examining Elon Musk accidentally just called him ‘Mr. Tweet’ instead of ‘Mr. Musk’. Elon says ‘Mr. Tweet’ is all good. ‘That’s probably an accurate description’, |
| How did this get started? | which was a parody of the character “Mr. Peanut” from the movie “Casablanca” and that was a parody of the character “Mr. M |
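
The run-to-run variation we saw with GPT-NeoX is what you would expect when a model samples its next token instead of always taking the most likely one. The toy sketch below contrasts the two strategies. The candidate strings and logits are made up for illustration, and we don't know the actual decoding settings used by nat.dev.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(candidates, logits):
    """Deterministic decoding: always return the highest-scoring candidate."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return candidates[best]


def sample_pick(candidates, logits, temperature, rng):
    """Stochastic decoding: draw a candidate according to its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(candidates, weights=probs, k=1)[0]


# Made-up candidates and scores, purely for illustration.
CANDIDATES = ["Mr. Tweet", "Elon Musk", "Mr. Peanut"]
LOGITS = [2.0, 1.0, 0.5]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
greedy_runs = {greedy_pick(CANDIDATES, LOGITS) for _ in range(20)}
sampled_runs = {sample_pick(CANDIDATES, LOGITS, 1.5, rng) for _ in range(200)}

if __name__ == "__main__":
    print("greedy answers:", greedy_runs)
    print("sampled answers:", sampled_runs)
```

Greedy decoding gives the same answer on every run, while sampling can return a different one each time you ask, which is why repeating the question sometimes helped.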

Models under OpenRail License are good but their use restrictions can be confusing

These models are pretty decent at answering most of the questions in a short way, but they might miss some important details. Plus, they still can’t make a good joke.

You also need to keep in mind that these models have certain usage restrictions under the OpenRail License, which stands for Open & Responsible AI license and aims to prevent any harmful usage. However, these restrictions can be tricky to navigate and may require some extra attention.

| Question | Bloomz 7B | Bloomz 176B (https://nat.dev/compare) | Bloom 176B |
| --- | --- | --- | --- |
| what’s Elon’s new Twitter username? | Mr. Tweet | Mr. Tweet | Is Elon’s new Twitter username ‘Mr. Tweet’, or is it something else? A: Is Elon’s new Twitter username ‘Mr. Tweet’, or is it something else? It is his new username. His (official) Twitter username is @elonmusk (with the @ removed). So |
| why is it funny that he cannot change it? | Because he is the CEO of Twitter | he is the owner and CEO of Twitter | Is Elon the one with the problem? Or is Mr. Tweet the one with the problem? What makes the situation funny, exactly? If the answer to that is that Mr. Tweet (aka Musk) doesn’t see the problem (because Musk is often not looking at the problems of others and has his own goals |
| make a joke about this | No | Twitter users | Is Elon the one with the problem? Or is Mr. Tweet the one with the problem? |
| How did this get started? | A lawyer called Elon Musk “Mr. Tweet” | A lawyer called Musk “Mr. Tweet” in court | s Elon the one with the problem? Or was the lawyer the one with the problem? The issue has no relevance to the existing question, so the answer to that question is simply “nevermind”. The answer doesn’t even make sense (although it is close to what a few other users have posted in the comments). |

Open-source models intended for research were better, but still left you wanting for a more natural response

These models did a better job than the commercial open source models, and they definitely understood there were some funny elements to the article:

It is funny because it is ironic. Musk is trying to change the world, but he cannot change his own name.

But none mentioned how ironic it is that Elon is the CEO of Twitter.

Although Lit-LLaMA made an attempt at a joke and Alpaca shared an unrelated one, these models still struggled to come up with genuinely funny jokes. Sometimes they just regurgitated stuff from the article instead of generating a natural response on their own. Additionally, Alpaca had trouble providing Elon’s new username, which was a bit disappointing.

Oh, and get this, Lit-LLaMA has a bit of a quirky side. Sometimes it can’t help itself and spits out answers in LaTeX format!

For example, when asked How did this get started? it responded:

\begin{blockquote} The lawyer who is cross-examining Elon Musk accidentally just called him “Mr. Tweet” instead of “Mr. Musk.” Elon says “Mr. Tweet” is all good. “That’s probably an accurate description,” \end{blockquote}

We used the Lit-LLaMA repository and the weights from LLaMA, which can only be used for research; using them in any other way violates the license. Lit-LLaMA with enterprise weights has yet to be made available. You can sign up here to get access as soon as they are available.

| Question | OPT 66B | OPT 30B | Lit-LLaMA + LLaMA weights 7B | Lit-LLaMA + Alpaca weights 7B |
| --- | --- | --- | --- | --- |
| what’s Elon’s new Twitter username? | Mr. Tweet Explanation: Elon Musk’s Twitter misery seems to be delighting users on the social media platform, as he got stuck with a new screen name. The owner and CEO of Twitter has encountered the same problem as others have had before, and he must now seemingly go by the name “Mr. | Mr. Tweet | Mr. Tweet | Elon’s new Twitter username is @elonmusk ### Explanation: Elon’s new Twitter username is @elonmusk. |
| why is it funny that he cannot change it? | We have provided context information below. Elon Musk’s Twitter misery seems to be delighting users on the social media platform, as he got stuck with a new screen name. The owner and CEO of Twitter has encountered the same problem as others have had before, and he must now seemingly go by the name “Mr. Tweet” | It’s not the first time a celebrity has found themselves stuck with a Twitter name they didn’t want | The answer is that the name “Mr. Tweet” is funny because it is a pun on the name of the social media platform Twitter. | It is funny because it is ironic. Musk is trying to change the world, but he cannot change his own name. |
| make a joke about this | We have provided context information below. Elon Musk’s Twitter misery seems to be delighting users on the social media platform, as he got stuck with a new screen name. The owner and CEO of Twitter has encountered the same problem as others have had before, and he must now seemingly go by the name “Mr. Tweet” | “Have you tried calling the help desk?” | Mr. Tweet please, let me be Elon again. Answer: Hey Mr. Tweet, how’s the Tesla trial going? Answer: Mr. Tweet in the house… Answer: I personally think Mr. Tweet is better—has more person | “I’m not sure what to say, but I’m sure it’s going to be a Tesla-killer.” Comment: This is a good answer. Please [edit] it to add some explanation of why it answers the question. It’s not clear to me why this is |
| How did this get started? | Elon Musk’s Twitter misery seems to be delighting users on the social media platform, as he got stuck with a new screen name. The owner and CEO of Twitter has encountered the same problem as others have had before, and he must now seemingly go by the name “Mr. Tweet” for the foreseeable future. Musk inadvertently | The lawyer who is cross-examining Elon Musk accidentally just called him ‘Mr. Tweet’ instead of ‘Mr. Musk.’ | \begin{blockquote} The lawyer who is cross-examining Elon Musk accidentally just called him “Mr. Tweet” instead of “Mr. Musk.” Elon says “Mr. Tweet” is all good. “That’s probably an accurate description,” \end{blockquote} | The trial started after Musk and Tesla were sued by the SEC for misleading investors. The SEC alleged that Musk had made false and misleading statements a |

Private models stood out above the rest, they even made us chuckle

No doubt about it, these models lived up to the hype. GPT3 and GPT4 definitely had a sense of humor, of the best cheesy dad-joke kind. It’s a bummer that you can’t host these models yourself and keep your data totally private.

| Question | GPT3 175B | GPT4, size unknown (https://nat.dev/compare) | ChatGPT, size unknown (gpt-3.5-turbo) |
| --- | --- | --- | --- |
| what’s Elon’s new Twitter username? | Elon’s new Twitter username is “Mr. Tweet.” | Mr. Tweet | Elon’s new Twitter username is “Mr. Tweet”. |
| why is it funny that he cannot change it? | It is funny that Elon Musk, the owner and CEO of Twitter, cannot change his own name on the platform because of the irony that he should be able to do so. This is especially funny considering he is the one in charge of the platform and should know how to fix the issue. Additionally, it is also humorous that he accidentally got the nickname of “Mr. Tweet” while in court, and even liked the tweet at the time. | It is funny that Elon Musk cannot change his Twitter name because he is the owner and CEO of Twitter, and one would expect him to have the power or ability to change his own name on the platform. The irony of the situation is what makes it amusing. | It is funny that Elon Musk cannot change his Twitter screen name because he is the owner and CEO of Twitter, yet he is stuck with a nickname given to him by a lawyer in court. |
| make a joke about this | Q: What did Elon Musk’s lawyer say when Musk asked for help with his Twitter name? A: “Mr. Tweet, I’m afraid you’re on your own!” | Why did Elon Musk become a bird watcher? Because he’s now an expert at spotting Mr. Tweets! | Looks like Elon Musk has got a new job title – “Mr. Tweet”! I guess we can officially say that Twitter owns him now. |
| How did this get started? | This started when a lawyer accidentally called Elon Musk “Mr. Tweet” during the Tesla shareholder trial. Musk acknowledged the nickname and said “That’s probably an accurate description.” Later, singer Doja Cat asked Elon Musk for help with her own Twitter name issue, to which Musk replied that they were working on it and that it was “pretty funny though.” This prompted Musk’s followers to joke about his new pseudonym and the irony of the owner and CEO of Twitter having trouble changing his own name. (com | This got started when a lawyer accidentally called Elon Musk “Mr. Tweet” instead of “Mr. Musk” while he was in court for the Tesla shareholder trial. | Elon Musk received the nickname “Mr. Tweet” from a lawyer while he was in court during the Tesla shareholder trial, and he inadvertently became stuck with it as his screen name on Twitter. He shared his misfortune with his millions of followers, which got more attention than usual, while some people found it amusing. It’s not the first time |

Wrapping up the takeaways

  • Open-source models, especially Flan-t5, are pretty good at answering simple questions, but they can struggle a bit when things get more complex or if you want some humor.
  • Fine-tuning those models can totally level up their performance game! Just check out how much better Flan-t5 does compared to regular ol’ t5.
  • Models under OpenRail License are worth checking out, although they tend to give short answers. But keep in mind that figuring out the license’s use restrictions might not be so straightforward.
  • LLaMA 7B was decent at explaining stuff, but it relied on quoting the article a lot for context and wasn’t great at creating any original jokes. Plus, it seemed to have a bit of a thing for using LaTeX format for some reason!
  • Alpaca had a few jokes ready to go, but when it came to answering questions about the article, it seemed to struggle a bit.
  • The private models like GPT3 and GPT4 were amazing, giving us super-detailed summaries and even cracking some jokes. But, they are pricey, and definitely not the best choice if you’re working with sensitive information.
  • Keep in mind that the performance of these models can vary depending on the context, and how you phrase your questions can make a big difference in the quality of the answers you get.
  • Bottom line: these language models are seriously cool, and as they keep getting better, they’re going to become essential tools for all kinds of different applications.

Note: Some of these models weren’t publicly available yet. But we couldn’t leave them out of the comparison, so we used this convenient tool,