SmolLM2: Very Good Alternatives to Qwen2.5 and Llama 3.2

And it’s fully open!

Hugging Face has doubled down on their SmolLM initiative.

They released SmolLM2: 1.7B, 360M, and 135M models trained on up to 11T tokens (against 1T for SmolLM). They released base and instruct versions:

  • Hugging Face Collection: SmolLM2 (Apache 2.0 license)

They used new datasets for pre-training that they will release soon. To make the instruct versions, they used a recipe similar to the one they used to train Zephyr (SFT followed by DPO on UltraFeedback).
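To make the DPO half of that recipe more concrete, here is a minimal sketch of the standard per-example DPO loss in plain Python. It is not Hugging Face's training code: the function name, the argument names, and the `beta=0.1` default are assumptions chosen for illustration. The inputs are the summed log-probabilities that the model being trained (the policy) and the frozen SFT model (the reference) assign to the chosen and rejected answers of one preference pair.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * margin)).

    All names and the default beta are illustrative assumptions.
    """
    # How much more the policy prefers each answer than the reference does.
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp

    # Positive margin: the policy favors the chosen answer more than the
    # reference does, relative to the rejected answer.
    x = beta * (chosen_gain - rejected_gain)

    # -log(sigmoid(x)) == log(1 + exp(-x)), computed in a numerically stable way.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    # The policy has moved toward the chosen answer and away from the rejected one.
    print(dpo_loss(-10.0, -14.0, -12.0, -12.0))
```

When the policy and the reference agree exactly, the margin is zero and the loss is log(2). The loss drops below that value as the policy learns to prefer the chosen answers more strongly than the reference does.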

It looks like SmolLM2 performs very well:

Note that Hugging Face fully releases the pre-training data and the recipe they used to prevent data contamination. In other words, their published evaluation results are probably accurate and fully reproducible.

Hugging Face used its own framework for pre-training, Nanotron. I’ve never written about Nanotron but I think it’s a very interesting project that deserves to be better known, especially if you are interested in understanding how pre-training is done. I’ll try to find the time to publish an article explaining Nanotron before 2025!

Meta also released a series of small models, MobileLLM:

  • Hugging Face Collection: MobileLLM (CC-BY-NC)

This is a new release, but note that these models are actually quite old. They were trained for this paper, published in February 2024:

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
