# Smol Models

Lightweight AI models for text and vision tasks.
Smol Models provides a family of efficient, lightweight AI models for text (SmolLM2) and vision-language (SmolVLM) tasks. Designed for on-device deployment without sacrificing capability, these models suit researchers and developers who need compact yet capable AI solutions.
## How It Works
The SmolLM2 family offers language models in 135M, 360M, and 1.7B parameter sizes, with instruction-tuned variants for assistant-like interactions. SmolVLM is a multimodal model that processes both images and text for tasks such as visual question answering and image description, and it supports multiple images per conversation. The repository is organized into `text/`, `vision/`, and `tools/` directories for development and inference.
## Quick Start & Requirements
Load SmolLM2 for text generation:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM2-1.7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)
```
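Once loaded, the instruct variant can generate replies via the tokenizer's chat template. A minimal sketch follows; the example prompt is illustrative and not taken from the repository:

```python
# Minimal generation sketch for SmolLM2-1.7B-Instruct.
# The user message is a placeholder; greedy decoding keeps output deterministic.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM2-1.7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

messages = [{"role": "user", "content": "What is the capital of France?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=64, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The smaller 135M and 360M instruct checkpoints can be substituted for `checkpoint` when memory is constrained.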
Load SmolVLM for vision-language tasks:

```python
from transformers import AutoProcessor, AutoModelForVision2Seq

processor = AutoProcessor.from_pretrained("HuggingFaceTB/SmolVLM-Instruct")
model = AutoModelForVision2Seq.from_pretrained("HuggingFaceTB/SmolVLM-Instruct")
```
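For SmolVLM, images are passed alongside a chat-formatted prompt. A minimal sketch, assuming a local image file `photo.jpg` (placeholder name):

```python
# Minimal vision-language inference sketch for SmolVLM-Instruct.
# The image path and question are placeholders.
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

model_id = "HuggingFaceTB/SmolVLM-Instruct"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id)

image = Image.open("photo.jpg")  # placeholder path
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]
# Build the text prompt, then combine it with the image inputs.
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt")
generated = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(generated, skip_special_tokens=True)[0])
```

Multiple images per conversation are supported by adding further `{"type": "image"}` entries and passing the corresponding images in order.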
Both model families require the `transformers` library.

## Highlighted Details
Tooling is provided in the `tools/smol_tools` directory.

## Maintenance & Community
## Licensing & Compatibility
## Limitations & Caveats
The README does not specify the exact license under which the models and code are distributed, which may impact commercial adoption. Detailed performance benchmarks for each model size are not provided.