Falcon 180B is a Large Language Model (LLM) that was released on September 6th, 2023 by the Technology Innovation Institute. This model is a descendant of the Falcon 40B model. Here’s a quick overview of the model:
- 180B parameter model in two versions (base and chat)
- trained on 3.5 trillion tokens using the RefinedWeb dataset
- context window of 2,048 tokens
Falcon 180B is the largest publicly available model on the Hugging Face model hub. It is comparable in size to GPT-3.5 (the model behind ChatGPT), which reportedly has 175B parameters. Is it also the best?
While the Falcon 180B model is publicly available, commercial use is heavily restricted. Refer to the license for the details and consult your legal team.
Model Variants (Base and Chat)
The Falcon 180B model comes in two versions — base and chat.
Falcon-180B (base): a causal decoder-only model. This version is the better starting point for further fine-tuning on your own data.
Falcon-180B-Chat: the same 180-billion-parameter causal decoder-only architecture, further fine-tuned on a mix of the UltraChat, Platypus, and Airoboros instruction (chat) datasets.
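The chat variant was tuned on instruction data, so it expects a conversation rendered as plain-text turns. A minimal sketch of a prompt builder, assuming the `User:`/`Falcon:` turn convention used in the public Hugging Face demo (the model card does not mandate an official chat template):

```python
def build_falcon_chat_prompt(turns, system_prompt=""):
    """Render a list of (user, assistant) turns into a single prompt
    string for Falcon-180B-Chat.

    The "User:"/"Falcon:" labels are an assumption based on the public
    demo, not an official specification. Pass assistant=None for the
    final turn so the model completes it.
    """
    parts = [system_prompt] if system_prompt else []
    for user_msg, assistant_msg in turns:
        parts.append(f"User: {user_msg}")
        if assistant_msg is not None:
            parts.append(f"Falcon: {assistant_msg}")
    parts.append("Falcon:")  # leave the last turn open for generation
    return "\n".join(parts)


prompt = build_falcon_chat_prompt([("What is Falcon 180B?", None)])
```

The resulting string can be passed to any text-generation backend (for example, a `transformers` pipeline loaded with the `tiiuae/falcon-180B-chat` checkpoint); the helper itself is backend-agnostic.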
How Good is the Model?
In terms of what it can do, Falcon 180B is a real powerhouse. It’s at the top of the charts for open-access models, and it gives even the big-name proprietary models, like PaLM-2, a run for their money. While it’s a bit tricky to rank them at this point, it’s safe to say that Falcon 180B stands shoulder-to-shoulder with PaLM-2 Large, making it one of the most powerful publicly available language models out there.