SmolLM2: Very Good Alternatives to Qwen2.5 and Llama 3.2
- 10 Nov, 2024
And it’s fully open!
Hugging Face has doubled down on their SmolLM initiative.
They released SmolLM2: 1.7B, 360M, and 135M models trained on 11T tokens (versus 1T for the original SmolLM), in both base and instruct versions:
- Hugging Face Collection: SmolLM2 (Apache 2.0 license)
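If you want to try one of these models, here is a minimal sketch that loads an instruct checkpoint with the transformers library and generates a reply. The model id HuggingFaceTB/SmolLM2-1.7B-Instruct is assumed from the collection, and the generation settings are only illustrative:

```python
# Minimal sketch: load a SmolLM2 instruct model and generate a reply.
# Assumes transformers, torch, and accelerate (for device_map="auto").
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM2-1.7B-Instruct"  # assumed id from the SmolLM2 collection
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build a chat prompt with the model's chat template.
messages = [{"role": "user", "content": "Explain what a small language model is in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Greedy decoding; sampling parameters are a matter of preference.
outputs = model.generate(inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```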
They used new datasets for pre-training that they plan to release soon. For the instruct versions, they followed a recipe similar to the one used to train Zephyr: SFT followed by DPO on UltraFeedback.
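To give an idea of what the DPO step looks like, here is a minimal sketch using TRL's DPOTrainer on the public HuggingFaceH4/ultrafeedback_binarized preference dataset. This is not Hugging Face's exact recipe: the hyperparameters are illustrative, the starting checkpoint would normally be the SFT model, and the argument names follow recent TRL versions (older versions use tokenizer= instead of processing_class=):

```python
# Sketch of a Zephyr-style DPO step on UltraFeedback preference pairs.
# Assumes trl, transformers, and datasets are installed.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_name = "HuggingFaceTB/SmolLM2-1.7B"  # in practice, the SFT checkpoint would be used here
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Each row contains a prompt with "chosen" and "rejected" responses.
dataset = load_dataset("HuggingFaceH4/ultrafeedback_binarized", split="train_prefs")

training_args = DPOConfig(
    output_dir="smollm2-dpo",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=5e-7,
    beta=0.1,  # strength of the KL penalty toward the (implicit) reference model
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,            # the reference model is created automatically if not provided
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()
```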
It looks like SmolLM2 performs very well:
Note that Hugging Face fully releases the pre-training data and the training recipe, which makes it possible to check for data contamination. In other words, their published evaluation results are probably accurate and fully reproducible.
Hugging Face used its own pre-training framework, Nanotron. I've never written about Nanotron, but I think it's a very interesting project that deserves to be better known, especially if you are interested in understanding how pre-training is done. I'll try to find the time to publish an article explaining Nanotron before 2025!
Meta also released a series of small models, MobileLLM:
- Hugging Face Collection: MobileLLM (CC-BY-NC)
This is a new release, but note that these models are actually quite old. They were trained for this work, published in February 2024:
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Learn everything you need to know about using and fine-tuning Large Language Models with my new book "LLMs on a Budget":