zoology by HazyResearch

Playground for efficient language model architecture research

Created 2 years ago
260 stars

Top 97.6% on SourcePulse

Project Summary

Zoology provides ML researchers with a simplified framework for understanding and testing language model architectures on synthetic tasks. It enables debugging and reproducing results from efficient language model research before committing to costly large-scale pretraining.

How It Works

The project offers a straightforward playground with limited dependencies and understandable architecture implementations. It supports a diverse range of efficient models, including Mamba, RetNet, Hyena, and Based, facilitating the study of recall and throughput trade-offs on tasks like Multi-Query Associative Recall (MQAR). Its design prioritizes simplicity for research and debugging.
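MQAR asks a model to recall the value bound to a key earlier in the same sequence. As a minimal, hypothetical sketch (not zoology's actual implementation, and all names below are illustrative), generating one such example might look like:

```python
import random

def make_mqar_example(num_kv_pairs=4, num_queries=2, vocab_size=64, seed=0):
    """Generate one MQAR-style sequence: interleaved key-value pairs,
    followed by queries whose labels are the values bound to those keys."""
    rng = random.Random(seed)
    # Draw disjoint key and value tokens from split halves of the vocabulary.
    keys = rng.sample(range(0, vocab_size // 2), num_kv_pairs)
    values = rng.sample(range(vocab_size // 2, vocab_size), num_kv_pairs)
    kv = dict(zip(keys, values))

    # Context: interleaved (key, value) token pairs.
    context = [tok for pair in zip(keys, values) for tok in pair]

    # Queries: re-show a subset of keys; the target is each key's bound value.
    query_keys = rng.sample(keys, num_queries)
    inputs = context + query_keys
    targets = [kv[k] for k in query_keys]
    return inputs, targets

inputs, targets = make_mqar_example()
```

A model solves the task by attending back to the position where each queried key first appeared, which is why MQAR stresses the recall capacity of efficient architectures.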

Quick Start & Requirements

Installation involves cloning the repository and installing with pip install -e .[extra,analysis]. Core dependencies include torch, einops, tqdm, pydantic, and wandb. Optional dependencies such as mamba_ssm and conv1d may have compatibility issues (e.g., mamba_ssm with PyTorch 2.5, or the fla module with Python 3.10+). Experiments are launched via python -m zoology.launch <config_path>, with parallel execution supported via Ray (the -p flag). Weights & Biases is used for logging.
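The setup steps above can be collected into a short script; the config path below is illustrative, not a file the README guarantees exists:

```shell
# Clone and install in editable mode with the optional extras (from the README).
git clone https://github.com/HazyResearch/zoology.git
cd zoology
pip install -e .[extra,analysis]

# Launch an experiment from a config file; -p parallelizes runs via Ray.
# (The config path here is a hypothetical example.)
python -m zoology.launch zoology/experiments/my_experiment.py -p
```

Runs are logged to Weights & Biases, so a wandb login may be needed before launching.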

Highlighted Details

  • Facilitates reproduction of key results from papers on efficient language models.
  • Integrates implementations for numerous state-of-the-art efficient architectures.
  • Offers a flexible configuration system using Pydantic models for experiments and tasks.
  • Supports custom synthetic task creation by subclassing DataSegmentConfig.

Maintenance & Community

Developed by the HazyResearch group. No specific community channels (Discord/Slack) or roadmap details are provided in the README.

Licensing & Compatibility

The README includes a GitHub license shield but does not explicitly state the license type or any restrictions for commercial use or closed-source linking.

Limitations & Caveats

The training harness is simplified and not intended for large-scale model training. Some dependencies have strict version requirements and can be problematic to install. The MQAR synthetic task is noted as simplistic. The RWKV-7 implementation requires manual updates to its state-size computation.

Health Check

  • Last Commit: 2 weeks ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 8 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Pawel Garbacki (Cofounder of Fireworks AI), and 4 more.

alpaca_farm by tatsu-lab

0%
842
RLHF simulation framework for accessible instruction-following/alignment research
Created 2 years ago
Updated 1 year ago
Starred by Yaowei Zheng (Author of LLaMA-Factory), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 1 more.

VeOmni by ByteDance-Seed

1.3%
2k
Framework for scaling multimodal model training across accelerators
Created 11 months ago
Updated 19 hours ago
Starred by Théophile Gervet (Cofounder of Genesis AI), Jason Knight (Director AI Compilers at NVIDIA; Cofounder of OctoML), and 7 more.

lingua by facebookresearch

0.1%
5k
LLM research codebase for training and inference
Created 1 year ago
Updated 7 months ago
Starred by Peter Norvig (Author of "Artificial Intelligence: A Modern Approach"; Research Director at Google), Elvis Saravia (Founder of DAIR.AI), and 3 more.

Hands-On-Large-Language-Models by HandsOnLLM

0.5%
23k
Code examples for "Hands-On Large Language Models" book
Created 1 year ago
Updated 2 months ago