Introducing Stable LM Zephyr 3B: A New Addition to Stable LM, Bringing Powerful LLM Assistants to Edge Devices

Key Takeaways:

  • Stable LM Zephyr 3B is a 3 billion parameter Large Language Model (LLM), 60% smaller than 7B models, allowing accurate and responsive output on a variety of devices without requiring high-end hardware. 

  • Download the model weights here.

  • See an example notebook for how to optimize speed for this model here.

  • This model is released under a non-commercial license that permits non-commercial use only; see the Commercial Applications section below for commercial licensing.

Today, we are releasing Stable LM Zephyr 3B: a new chat model representing the latest iteration in our series of lightweight LLMs, preference-tuned for instruction following and Q&A-type tasks. This model is an extension of the pre-existing Stable LM 3B-4e1t model and is inspired by the Zephyr 7B model from Hugging Face. With its 3 billion parameters, Stable LM Zephyr efficiently caters to a wide range of text generation needs on edge devices, from simple queries to complex instructional contexts. 
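To illustrate how a chat model like this is typically used, here is a minimal sketch based on the Hugging Face transformers library. The repository id stabilityai/stablelm-zephyr-3b and the generation settings are assumptions following common Hugging Face conventions; consult the model card for the officially supported loading code.

```python
# Minimal sketch: loading Stable LM Zephyr 3B and asking it a question.
# The repo id and settings below are assumptions; see the model card for
# the exact, officially supported snippet.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stablelm-zephyr-3b"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision keeps the memory footprint small
    device_map="auto",
)  # older transformers versions may additionally require trust_remote_code=True

# Build a chat-style prompt with the tokenizer's built-in chat template.
messages = [
    {"role": "user", "content": "Explain what a 3B-parameter LLM is in two sentences."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate a response and strip the prompt tokens from the output.
outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```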

Training Insights

The development of Stable LM Zephyr 3B focused on creating a model that performs well at text generation and aligns with human preferences. Inspired by Zephyr 7B, we adapted its training pipeline. The first step is supervised fine-tuning on multiple instruction datasets, including UltraChat, MetaMathQA, the Evol Wizard dataset, and the Capybara dataset. In the second step, we aligned the model with the Direct Preference Optimization (DPO) algorithm using the UltraFeedback dataset. This dataset, from the OpenBMB research group, comprises 64,000 prompts and corresponding model responses. Recently released models such as Zephyr-7B, Neural-Chat-7B, and Tulu-2-DPO-70B have successfully used DPO, but Stable LM Zephyr is one of the first models of this type at the efficient size of 3 billion parameters.
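To make the second stage concrete, the sketch below shows the core DPO objective on a batch of preference pairs. It is a simplified illustration of the algorithm, not the actual training code used for Stable LM Zephyr 3B; the dpo_loss helper and its precomputed sequence-level log-probabilities are hypothetical.

```python
# Illustrative sketch of the Direct Preference Optimization (DPO) loss,
# assuming sequence-level log-probabilities have already been computed.
import torch
import torch.nn.functional as F

def dpo_loss(
    policy_chosen_logps: torch.Tensor,    # log p_theta(chosen | prompt), shape (batch,)
    policy_rejected_logps: torch.Tensor,  # log p_theta(rejected | prompt)
    ref_chosen_logps: torch.Tensor,       # log p_ref(chosen | prompt), reference model frozen
    ref_rejected_logps: torch.Tensor,     # log p_ref(rejected | prompt)
    beta: float = 0.1,                    # controls how far the policy may drift from the reference
) -> torch.Tensor:
    # The implicit reward of each response is its log-ratio against the reference model.
    chosen_rewards = policy_chosen_logps - ref_chosen_logps
    rejected_rewards = policy_rejected_logps - ref_rejected_logps
    # Maximize the margin between the preferred and dispreferred response.
    logits = beta * (chosen_rewards - rejected_rewards)
    return -F.logsigmoid(logits).mean()

# Example with dummy log-probabilities for a batch of two preference pairs.
loss = dpo_loss(
    torch.tensor([-12.0, -20.0]),
    torch.tensor([-15.0, -18.0]),
    torch.tensor([-13.0, -19.0]),
    torch.tensor([-14.0, -19.0]),
)
print(loss)
```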

Model Performance

Benchmarked on platforms such as MT Bench and AlpacaEval, Stable LM Zephyr 3B demonstrates superior capabilities in generating contextually relevant, coherent, and linguistically accurate text. 

In these tests, Stable LM Zephyr 3B’s performance was found to be competitive with several larger models, such as Falcon-4b-Instruct, WizardLM-13B-v1, Llama-2-70b-chat, and Claude-V1.

The MT-Bench score is calculated by using LLMs to evaluate models on open-ended questions, while AlpacaEval focuses on a model’s ability to follow general user instructions. 

More details on our testing, datasets, and safety can be found in our model card. In summary, our performance tests have shown that Stable LM Zephyr 3B is capable of surpassing models of larger size tailored for similar use cases, showcasing the power and efficiency inherent in this new model.

Enabling Diverse Applications

Stable LM Zephyr 3B is a lightweight model that handles a range of linguistic tasks efficiently and accurately. It has been tuned to assist with instructional and Q&A-type tasks and is versatile enough for complex applications, from crafting creative content such as copywriting and summarization to supporting instructional design and content personalization. It can also provide insightful analysis of input data. All of this is accomplished while retaining its efficient 3 billion parameter size, 60% smaller than 7B models, enabling use on devices that lack the computational power of dedicated high-end systems.
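As a rough, back-of-the-envelope illustration of that footprint advantage, the sketch below estimates weight memory only, at common precisions; these are assumed approximations, not measured benchmarks.

```python
# Approximate weight memory of a model at common precisions. Activations, the
# KV cache, and runtime overhead are ignored, so treat these as rough lower bounds.
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed to hold the model weights, in gigabytes."""
    return num_params * bits_per_param / 8 / 1e9

for name, params in [("Stable LM Zephyr 3B", 3e9), ("a 7B model", 7e9)]:
    estimates = ", ".join(
        f"~{weight_memory_gb(params, bits):.1f} GB at {bits}-bit" for bits in (16, 8, 4)
    )
    print(f"{name}: {estimates}")
# Stable LM Zephyr 3B: ~6.0 GB at 16-bit, ~3.0 GB at 8-bit, ~1.5 GB at 4-bit
# a 7B model: ~14.0 GB at 16-bit, ~7.0 GB at 8-bit, ~3.5 GB at 4-bit
```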

Commercial Applications

If you want to use this model for your commercial products or purposes, please contact us here to learn more.

You can also stay updated on our progress by signing up for our newsletter, following us on Twitter, Instagram, and LinkedIn, and joining our Discord Community.
