zoology by HazyResearch

Playground for efficient language model architecture research

Created 2 years ago
260 stars

Top 97.6% on SourcePulse

Project Summary

Zoology provides ML researchers with a simplified framework for understanding and testing language model architectures on synthetic tasks. It enables debugging and reproducing results from efficient language model research before committing to costly large-scale pretraining.

How It Works

The project offers a straightforward playground with limited dependencies and understandable architecture implementations. It supports a diverse range of efficient models, including Mamba, RetNet, Hyena, and Based, facilitating the study of recall and throughput trade-offs on tasks like Multi-Query Associative Recall (MQAR). Its design prioritizes simplicity for research and debugging.
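MQAR asks a model to recall the value bound to a key earlier in the same sequence. As a minimal, hypothetical sketch (not zoology's actual implementation, and all names below are illustrative), generating one such example might look like:

```python
import random

def make_mqar_example(num_kv_pairs=4, num_queries=2, vocab_size=64, seed=0):
    """Generate one MQAR-style sequence: interleaved key-value pairs,
    followed by queries whose labels are the values bound to those keys."""
    rng = random.Random(seed)
    # Draw disjoint key and value tokens from split halves of the vocabulary.
    keys = rng.sample(range(0, vocab_size // 2), num_kv_pairs)
    values = rng.sample(range(vocab_size // 2, vocab_size), num_kv_pairs)
    kv = dict(zip(keys, values))

    # Context: interleaved (key, value) token pairs.
    context = [tok for pair in zip(keys, values) for tok in pair]

    # Queries: re-show a subset of keys; the target is each key's bound value.
    query_keys = rng.sample(keys, num_queries)
    inputs = context + query_keys
    targets = [kv[k] for k in query_keys]
    return inputs, targets

inputs, targets = make_mqar_example()
```

A model solves the task by attending back to the position where each queried key first appeared, which is why MQAR stresses the recall capacity of efficient architectures.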

Quick Start & Requirements

Installation involves cloning the repository and installing with pip install -e .[extra,analysis]. Core dependencies include torch, einops, tqdm, pydantic, and wandb. Optional dependencies such as mamba_ssm and conv1d may have compatibility issues (e.g., mamba_ssm with PyTorch 2.5, or the fla module with Python 3.10+). Experiments are launched via python -m zoology.launch <config_path>, with parallel execution supported via Ray (the -p flag). Weights & Biases is used for logging.
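The setup steps above can be collected into a short script; the config path below is illustrative, not a file the README guarantees exists:

```shell
# Clone and install in editable mode with the optional extras (from the README).
git clone https://github.com/HazyResearch/zoology.git
cd zoology
pip install -e .[extra,analysis]

# Launch an experiment from a config file; -p parallelizes runs via Ray.
# (The config path here is a hypothetical example.)
python -m zoology.launch zoology/experiments/my_experiment.py -p
```

Runs are logged to Weights & Biases, so a wandb login may be needed before launching.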

Highlighted Details

  • Facilitates reproduction of key results from papers on efficient language models.
  • Integrates implementations for numerous state-of-the-art efficient architectures.
  • Offers a flexible configuration system using Pydantic models for experiments and tasks.
  • Supports custom synthetic task creation by subclassing DataSegmentConfig.

Maintenance & Community

Developed by the HazyResearch group. No specific community channels (Discord/Slack) or roadmap details are provided in the README.

Licensing & Compatibility

The README includes a GitHub license shield but does not explicitly state the license type or any restrictions for commercial use or closed-source linking.

Limitations & Caveats

The training harness is simplified and not intended for large-scale model training. Some dependencies have strict version requirements and can be problematic to install. The MQAR synthetic task is noted as simplistic. The RWKV-7 implementation requires manual updates to its state-size computation.

Health Check

  • Last Commit: 2 weeks ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 8 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Pawel Garbacki (Cofounder of Fireworks AI), and 4 more.

alpaca_farm by tatsu-lab

0%
842
RLHF simulation framework for accessible instruction-following/alignment research
Created 2 years ago
Updated 1 year ago
Starred by Yaowei Zheng (Author of LLaMA-Factory), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 1 more.

VeOmni by ByteDance-Seed

1.3%
2k
Framework for scaling multimodal model training across accelerators
Created 11 months ago
Updated 19 hours ago
Starred by Théophile Gervet (Cofounder of Genesis AI), Jason Knight (Director AI Compilers at NVIDIA; Cofounder of OctoML), and 7 more.

lingua by facebookresearch

0.1%
5k
LLM research codebase for training and inference
Created 1 year ago
Updated 7 months ago
Starred by Peter Norvig (Author of "Artificial Intelligence: A Modern Approach"; Research Director at Google), Elvis Saravia (Founder of DAIR.AI), and 3 more.

Hands-On-Large-Language-Models by HandsOnLLM

0.5%
23k
Code examples for "Hands-On Large Language Models" book
Created 1 year ago
Updated 2 months ago