zoology by HazyResearch

Playground for efficient language model architecture research

Created 2 years ago
251 stars

Top 99.9% on SourcePulse

View on GitHub
Project Summary

Zoology provides ML researchers with a simplified framework for understanding and testing language model architectures on synthetic tasks. It enables debugging and reproducing results from efficient language model research before committing to costly large-scale pretraining.

How It Works

The project offers a straightforward playground with few dependencies and readable architecture implementations. It supports a diverse range of efficient models, including Mamba, RetNet, Hyena, and Based, enabling study of recall and throughput trade-offs on tasks such as Multi-Query Associative Recall (MQAR). Its design prioritizes simplicity for research and debugging.
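
To make the task concrete: an MQAR instance presents a sequence of key-value pairs followed by queries on those keys, and the model must emit the value bound to each queried key. Below is a minimal, self-contained PyTorch sketch of such an instance; the vocabulary split and token layout are illustrative assumptions, not Zoology's exact generator.

    import torch

    def mqar_example(num_pairs: int = 4, vocab_size: int = 64, seed: int = 0):
        # Illustrative MQAR instance; a sketch, not Zoology's implementation.
        g = torch.Generator().manual_seed(seed)
        # Distinct keys from the lower half of the vocab, values from the upper half.
        keys = torch.randperm(vocab_size // 2, generator=g)[:num_pairs]
        values = torch.randint(vocab_size // 2, vocab_size, (num_pairs,), generator=g)
        # Context "k1 v1 k2 v2 ..." followed by the keys queried in random order.
        context = torch.stack([keys, values], dim=1).flatten()
        order = torch.randperm(num_pairs, generator=g)
        inputs = torch.cat([context, keys[order]])
        targets = values[order]  # correct recalls for the queried keys
        return inputs, targets

    inputs, targets = mqar_example()
    print(inputs.tolist(), targets.tolist())

Because each query can reference any earlier binding, the task stresses a model's recall capacity rather than local pattern matching.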

Quick Start & Requirements

Installation involves cloning the repository and installing with pip install -e .[extra,analysis]. Core dependencies include torch, einops, tqdm, pydantic, and wandb. Optional dependencies such as mamba_ssm and conv1d can have compatibility issues (e.g., mamba_ssm with PyTorch 2.5; the fla module requires Python 3.10+). Experiments are launched via python -m zoology.launch <config_path>, with parallel execution across configurations supported via Ray (the -p flag). Weights & Biases is used for logging.
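
For orientation, a launchable config is an ordinary Python file. The sketch below is hypothetical: the import path, class names, and fields are assumptions based on the README's description of the Pydantic-based config system, so consult the example configs shipped in the repository for the actual schema.

    # hypothetical_config.py -- run with: python -m zoology.launch hypothetical_config.py
    # All names and values below are illustrative assumptions, not the verified API.
    from zoology.config import TrainConfig, ModelConfig, DataConfig

    configs = [
        TrainConfig(
            model=ModelConfig(d_model=128, n_layers=2),          # assumed fields
            data=DataConfig(vocab_size=8192, input_seq_len=64),  # assumed fields
            max_epochs=20,
            learning_rate=1e-3,
        )
    ]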

Highlighted Details

  • Facilitates reproduction of key results from papers on efficient language models.
  • Integrates implementations for numerous state-of-the-art efficient architectures.
  • Offers a flexible configuration system using Pydantic models for experiments and tasks.
  • Supports custom synthetic task creation by subclassing DataSegmentConfig (see the sketch after this list).
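
A hedged sketch of the custom-task extension point from the last bullet; the import path, the build method signature, and the Pydantic-field style are assumptions inferred from the README, not verified against the source.

    # Illustrative subclass; names and signatures are assumptions.
    from zoology.config import DataSegmentConfig  # assumed import path

    class MyRecallTaskConfig(DataSegmentConfig):
        # Configs are Pydantic models, so task parameters are declared as fields.
        num_pairs: int = 4

        def build(self, seed: int):
            # Construct and return the data segment (inputs and labels) here,
            # e.g., using a generator like the MQAR sketch above.
            ...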

Maintenance & Community

Developed by the HazyResearch group. No specific community channels (Discord/Slack) or roadmap details are provided in the README.

Licensing & Compatibility

The README includes a GitHub license shield but does not explicitly state the license type or any restrictions on commercial use or closed-source linking.

Limitations & Caveats

The training harness is deliberately simplified and not intended for large-scale model training. Some dependencies have strict version requirements and can be problematic to install. The MQAR synthetic task is noted as simplistic. A specific version of the RWKV-7 file requires manually updating its state-size computation.

Health Check

  • Last Commit: 2 weeks ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 4 stars in the last 30 days

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Pawel Garbacki (Cofounder of Fireworks AI), and 4 more.

Explore Similar Projects

alpaca_farm by tatsu-lab

0.1%
840 stars
RLHF simulation framework for accessible instruction-following/alignment research
Created 2 years ago
Updated 1 year ago
Starred by Yaowei Zheng (Author of LLaMA-Factory), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 1 more.

VeOmni by ByteDance-Seed

1.7%
2k stars
Framework for scaling multimodal model training across accelerators
Created 10 months ago
Updated 2 days ago
Starred by Théophile Gervet (Cofounder of Genesis AI), Jason Knight (Director AI Compilers at NVIDIA; Cofounder of OctoML), and 7 more.

lingua by facebookresearch

0.0%
5k stars
LLM research codebase for training and inference
Created 1 year ago
Updated 6 months ago
Starred by Peter Norvig (Author of "Artificial Intelligence: A Modern Approach"; Research Director at Google), Elvis Saravia (Founder of DAIR.AI), and 3 more.

Hands-On-Large-Language-Models by HandsOnLLM

0.7%
20k stars
Code examples for "Hands-On Large Language Models" book
Created 1 year ago
Updated 1 month ago