modelzoo by Cerebras

Model zoo for Cerebras hardware

Created 3 years ago
1,065 stars

Top 35.5% on SourcePulse

Project Summary

This repository provides a collection of deep learning models and utilities optimized for Cerebras hardware, targeting researchers and engineers who need to train and deploy models efficiently on Cerebras systems. It offers reference implementations, configuration files, and tools to streamline workflows, enabling faster development and deployment of advanced AI models.

How It Works

The ModelZoo uses a Command-Line Interface (CLI) as a single entry point for all tasks, including data preprocessing, model training, and validation. It includes optimized reference implementations and configuration files for a wide range of NLP, vision, and multimodal models such as Llama, Mixtral, and DINOv2. It also supports custom training loops, custom model implementations, and sequence-length scaling techniques such as rotary position embedding (RoPE) scaling.
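To make the RoPE scaling idea concrete, here is a minimal sketch of one common variant, linear position interpolation, in plain Python. It is not taken from the ModelZoo code: the function names, the `scaling_factor` parameter, and the default `base` of 10000 are assumptions chosen for illustration, and the repository may implement scaling differently.

```python
import math


def rope_angles(position, head_dim, base=10000.0, scaling_factor=1.0):
    """Return the RoPE rotation angles for one token position.

    Hypothetical helper: with linear position interpolation, the position is
    divided by `scaling_factor` so that longer sequences map back into the
    position range seen during training.
    """
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even")
    scaled_position = position / scaling_factor
    # One angle per pair of dimensions, with geometrically decreasing frequency.
    return [scaled_position * base ** (-2.0 * i / head_dim)
            for i in range(head_dim // 2)]


def apply_rope(vec, position, base=10000.0, scaling_factor=1.0):
    """Rotate consecutive (even, odd) pairs of `vec` by the RoPE angles."""
    angles = rope_angles(position, len(vec), base, scaling_factor)
    rotated = []
    for i, theta in enumerate(angles):
        x, y = vec[2 * i], vec[2 * i + 1]
        c, s = math.cos(theta), math.sin(theta)
        rotated.extend([x * c - y * s, x * s + y * c])
    return rotated
```

With `scaling_factor=2.0`, position 4096 receives the same angles as position 2048 does without scaling, which is how this variant stretches a model trained on shorter sequences over a longer context.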

Quick Start & Requirements

Highlighted Details

  • Includes reference implementations for numerous popular models (Llama, Mixtral, DINOv2, LLaVA, etc.).
  • Provides tools for checkpoint conversion (Cerebras ↔ HuggingFace) and PyTorch model porting.
  • Supports advanced training optimizations like maximal update parameterization (μP) and RoPE scaling.
  • Features a CLI for streamlined data preprocessing, model training, and validation.
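
As a rough illustration of what μP scaling involves, the sketch below applies the commonly cited μP rules for Adam-style optimizers when widening a model from a tuned base width. It is an assumption-laden example, not the ModelZoo's implementation: `MupScales`, `mup_scales`, and all parameter names are hypothetical, and the repository's configuration keys may differ.

```python
import math
from dataclasses import dataclass


@dataclass
class MupScales:
    hidden_lr: float                # learning rate for hidden weight matrices
    hidden_init_std: float          # init standard deviation for hidden weights
    output_logit_multiplier: float  # scale applied to the output logits


def mup_scales(base_width, target_width, base_lr, base_init_std):
    """Transfer hyperparameters tuned at `base_width` to `target_width`.

    Assumed rules (Adam-style optimizer): hidden learning rate and output
    logits shrink by the width multiplier, and hidden init std shrinks by
    its square root.
    """
    if base_width <= 0 or target_width <= 0:
        raise ValueError("widths must be positive")
    width_mult = target_width / base_width
    return MupScales(
        hidden_lr=base_lr / width_mult,
        hidden_init_std=base_init_std / math.sqrt(width_mult),
        output_logit_multiplier=1.0 / width_mult,
    )
```

For example, `mup_scales(256, 1024, base_lr=6e-3, base_init_std=0.02)` uses a width multiplier of 4, so the hidden learning rate is divided by 4 and the hidden init std by 2. The intent is that hyperparameters tuned on a small proxy model can be reused at larger widths.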

Maintenance & Community

The project is maintained by Cerebras. Further community and roadmap details are not explicitly provided in the README.

Licensing & Compatibility

  • License: Apache License 2.0.
  • Compatibility: Designed specifically for Cerebras hardware. Commercial use is permitted under the Apache 2.0 license.

Limitations & Caveats

This ModelZoo is specifically optimized for Cerebras hardware, implying limited utility or performance on non-Cerebras systems. Access to Cerebras hardware is a prerequisite for utilizing the full functionality of the repository.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: 1 week
  • Pull requests (30d): 3
  • Issues (30d): 0

Star History

7 stars in the last 30 days

Explore Similar Projects

Starred by Yaowei Zheng (Author of LLaMA-Factory), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 1 more.

VeOmni by ByteDance-Seed

3.4%
1k
Framework for scaling multimodal model training across accelerators
Created 5 months ago
Updated 3 weeks ago
Starred by Théophile Gervet (Cofounder of Genesis AI), Jason Knight (Director AI Compilers at NVIDIA; Cofounder of OctoML), and 6 more.

lingua by facebookresearch

0.1%
5k
LLM research codebase for training and inference
Created 11 months ago
Updated 2 months ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Lewis Tunstall (Research Engineer at Hugging Face), and 13 more.

torchtitan by pytorch

0.7%
4k
PyTorch platform for generative AI model training research
Created 1 year ago
Updated 21 hours ago
Starred by Tobi Lutke (Cofounder of Shopify), Li Jiang (Coauthor of AutoGen; Engineer at Microsoft), and 26 more.

ColossalAI by hpcaitech

0.1%
41k
AI system for large-scale parallel training
Created 3 years ago
Updated 14 hours ago