IBM has announced the third generation of its open source Granite LLM family, which includes a range of models suited to different use cases.
“Reflecting our focus on the balance between powerful and practical, the new IBM Granite 3.0 models deliver state-of-the-art performance relative to model size while maximizing safety, speed and cost-efficiency for enterprise use cases,” IBM wrote in a blog post.
The Granite 3.0 family includes general-purpose models, guardrail- and safety-focused models, and mixture-of-experts models.
The main model in this family is Granite 3.0 8B Instruct, an instruction-tuned, dense decoder-only model that offers strong performance in RAG, classification, summarization, entity extraction, and tool use. It matches similarly sized open models on academic benchmarks and exceeds them on enterprise tasks and safety, according to IBM.
“Trained using a novel two-phase method on over 12 trillion tokens of carefully vetted data across 12 different natural languages and 116 different programming languages, the developer-friendly Granite 3.0 8B Instruct is a workhorse enterprise model intended to serve as a primary building block for sophisticated workflows and tool-based use cases,” IBM wrote.
This release also includes new Granite Guardian models that safeguard against social bias, hate, toxicity, profanity, violence, and jailbreaking, and that perform RAG-specific checks such as groundedness, context relevance, and answer relevance.
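To make the RAG-specific checks concrete, here is a toy sketch of what a groundedness check does: flag an answer whose content is not supported by the retrieved context. This is only an illustration of the concept; the `groundedness` function and its word-overlap heuristic are inventions for this example, whereas Granite Guardian uses a trained LLM judge, not lexical overlap.

```python
# Toy groundedness check: an answer is "grounded" if enough of its
# content words also appear in the retrieved context. This is a
# stand-in heuristic to illustrate the idea, not Granite Guardian's
# actual method.

def groundedness(answer: str, context: str, threshold: float = 0.5) -> bool:
    ctx_words = set(context.lower().split())
    # Ignore very short words ("is", "the", ...) as a crude stopword filter.
    ans_words = [w for w in answer.lower().split() if len(w) > 3]
    if not ans_words:
        return True  # nothing substantive to verify
    overlap = sum(w in ctx_words for w in ans_words) / len(ans_words)
    return overlap >= threshold

# A supported answer passes; an answer unrelated to the context fails.
print(groundedness("Granite models offer strong tool use",
                   "The Granite 3.0 models offer strong tool use performance"))
print(groundedness("Paris is the capital of France",
                   "Granite 3.0 was released by IBM"))
```

A real guardian model would also score context relevance (is the retrieved passage related to the question?) and answer relevance (does the answer address the question?), each as a separate judgment.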
There are also a number of other models in the Granite 3.0 family, including:
- Granite-3.0-8B-Base, Granite-3.0-2B-Instruct, and Granite-3.0-2B-Base, which are general-purpose LLMs
- Granite-3.0-3B-A800M-Instruct and Granite-3.0-1B-A400M-Instruct, which are mixture-of-experts models designed to minimize latency and cost
- Granite-3.0-8B-Instruct-Accelerator, a speculative decoder that improves inference speed and efficiency
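Speculative decoding, the technique behind the Accelerator model, works by letting a small, cheap draft model propose several tokens ahead, which the large target model then verifies in a single pass, keeping the longest agreeing prefix. The sketch below illustrates the accept/reject loop with two stand-in "models" (simple deterministic functions invented for this example, not real LLMs):

```python
# Minimal illustration of speculative decoding. draft_next and
# target_next are toy stand-ins: the "target" mostly agrees with the
# "draft" but diverges after token 4, to show a rejection.

def draft_next(seq):
    # cheap draft model: always predicts previous token + 1 (mod 10)
    return (seq[-1] + 1) % 10

def target_next(seq):
    # expensive target model: agrees with the draft except after a 4
    if seq[-1] == 4:
        return 0
    return (seq[-1] + 1) % 10

def speculative_step(seq, k=4):
    # 1) Draft model cheaply proposes k tokens.
    draft = list(seq)
    for _ in range(k):
        draft.append(draft_next(draft))
    proposed = draft[len(seq):]
    # 2) Target model verifies each proposal; accept matches, and on the
    #    first mismatch substitute the target's own token and stop.
    accepted = []
    cur = list(seq)
    for tok in proposed:
        t = target_next(cur)
        accepted.append(t)
        cur.append(t)
        if t != tok:
            break
    return seq + accepted

print(speculative_step([1, 2, 3]))  # target rejects after the 4
print(speculative_step([5, 6]))     # all k drafted tokens accepted
```

When the draft model agrees with the target most of the time, several tokens are emitted for roughly the cost of one target-model forward pass, which is where the speed gain comes from.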
All of the models are available under the Apache 2.0 license on Hugging Face, and Granite 3.0 8B and 2B and Granite Guardian 3.0 8B and 2B are available for commercial use on watsonx.
The company also revealed that by the end of 2024, it plans to expand all model context windows to 128K tokens, further improve multilingual support, and introduce multimodal image-in, text-out capabilities.
In addition to the new Granite models, the company announced the upcoming availability of the newest version of watsonx Code Assistant, along with plans to release new tools for developers building, customizing, and deploying AI through watsonx.ai.