smollm  by huggingface

Lightweight AI models for text and vision tasks

created 9 months ago
3,054 stars

Top 16.0% on sourcepulse

GitHubView on GitHub
Project Summary

Smol Models provides a family of efficient, lightweight AI models for text (SmolLM2) and vision-language (SmolVLM) tasks. Targeting on-device deployment and strong performance, these models are suitable for researchers and developers seeking compact yet capable AI solutions.

How It Works

The SmolLM2 family offers language models in 135M, 360M, and 1.7B parameter sizes, with instruction-tuned variants for assistant-like interactions. SmolVLM is a multimodal model capable of processing both images and text for tasks like visual QA and image description, supporting multiple images per conversation. The repository is structured into text/, vision/, and tools/ directories for organized development and inference.

Quick Start & Requirements

  • SmolLM2:
    from transformers import AutoModelForCausalLM, AutoTokenizer
    checkpoint = "HuggingFaceTB/SmolLM2-1.7B-Instruct"
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(checkpoint)
    
  • SmolVLM:
    from transformers import AutoProcessor, AutoModelForVision2Seq
    processor = AutoProcessor.from_pretrained("HuggingFaceTB/SmolVLM-Instruct")
    model = AutoModelForVision2Seq.from_pretrained("HuggingFaceTB/SmolVLM-Instruct")
    
  • Prerequisites: Hugging Face transformers library.
  • Resources: Models are designed for efficient on-device execution.
  • Documentation: SmolLM2 Documentation, SmolVLM Documentation, Local Inference Guide.

Highlighted Details

  • Includes continual pretraining code for Llama 3.2 3B on FineMath & FineWeb-Edu with nanotron.
  • Features FineMath, a public dataset for mathematics pretraining.
  • SmolVLM supports processing multiple images within a single conversation.
  • Offers lightweight AI-powered tools within the tools/smol_tools directory.

Maintenance & Community

Licensing & Compatibility

  • The specific license for the models and code is not explicitly stated in the README. Compatibility for commercial use or closed-source linking requires clarification.

Limitations & Caveats

The README does not specify the exact license under which the models and code are distributed, which may impact commercial adoption. Detailed performance benchmarks for each model size are not provided.

Health Check
Last commit

4 days ago

Responsiveness

1+ week

Pull Requests (30d)
16
Issues (30d)
4
Star History
833 stars in the last 90 days

Explore Similar Projects

Feedback? Help us improve.