HazyResearch
Playground for efficient language model architecture research
Zoology gives ML researchers a simple framework for understanding and testing language model architectures on synthetic tasks. It helps them debug and reproduce results from efficient language model research before committing to costly large-scale pretraining.
How It Works
The project offers a straightforward playground with limited dependencies and understandable architecture implementations. It supports a diverse range of efficient models, including Mamba, RetNet, Hyena, and Based, facilitating the study of recall and throughput trade-offs on tasks like Multi-Query Associative Recall (MQAR). Its design prioritizes simplicity for research and debugging.
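To make the associative-recall idea concrete, here is a minimal sketch of how one MQAR-style example could be generated. It is not Zoology's own data code: the function name make_mqar_example, the IGNORE sentinel, the vocabulary split, and the layout (key-value pairs first, then query keys) are all assumptions chosen for illustration.

```python
import random

IGNORE = -100  # hypothetical sentinel for positions that carry no training target


def make_mqar_example(num_pairs=4, num_queries=2, vocab_size=64, seed=0):
    """Build one toy multi-query associative-recall sequence.

    Layout (an illustrative assumption): k1 v1 k2 v2 ... followed by query keys.
    Targets are IGNORE everywhere except at query positions, where the target
    is the value that was bound to that key earlier in the sequence.
    """
    rng = random.Random(seed)
    half = vocab_size // 2
    # Keys come from the lower half of the vocabulary, values from the upper
    # half, so a token's role is unambiguous. Keys are distinct.
    keys = rng.sample(range(half), num_pairs)
    values = [rng.randrange(half, vocab_size) for _ in range(num_pairs)]
    binding = dict(zip(keys, values))

    inputs, targets = [], []
    for k, v in zip(keys, values):
        inputs += [k, v]
        targets += [IGNORE, IGNORE]
    # Query a random subset of the keys already shown in the context.
    for q in rng.sample(keys, num_queries):
        inputs.append(q)
        targets.append(binding[q])
    return inputs, targets


if __name__ == "__main__":
    xs, ys = make_mqar_example()
    print("inputs: ", xs)
    print("targets:", ys)
```

A model with strong recall should predict the bound value at every query position. Increasing num_pairs for a fixed model state size is one simple way to probe the recall side of the trade-off described above.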
Quick Start & Requirements
Installation involves cloning the repository and installing with pip install -e .[extra,analysis]. Core dependencies include torch, einops, tqdm, pydantic, and wandb. Optional dependencies like mamba_ssm and conv1d may have compatibility issues (e.g., mamba_ssm with PyTorch 2.5, fla module with Python 3.10+). Experiments are launched via python -m zoology.launch <config_path>, with parallel execution supported via Ray (-p flag). Weights and Biases is used for logging.
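Because some of these packages are version-sensitive, it can help to check which ones are importable before launching an experiment. The sketch below uses only the standard library. The helper name check_modules is hypothetical, and the import names are taken from the text above on the assumption that they match the installed package names, which may not hold for every optional dependency.

```python
from importlib.util import find_spec

# Names as listed in the summary above; actual import names may differ (assumption).
CORE = ("torch", "einops", "tqdm", "pydantic", "wandb")
OPTIONAL = ("mamba_ssm", "conv1d", "fla")


def check_modules(names):
    """Return {module_name: True/False} depending on whether it can be located."""
    found = {}
    for name in names:
        try:
            found[name] = find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or partially initialised modules.
            found[name] = False
    return found


if __name__ == "__main__":
    for label, names in (("core", CORE), ("optional", OPTIONAL)):
        for name, ok in check_modules(names).items():
            print(f"{label:8s} {name:10s} {'found' if ok else 'missing'}")
```

This only confirms that a module can be located; it does not verify version compatibility, such as the mamba_ssm and PyTorch 2.5 issue mentioned above.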
Highlighted Details
DataSegmentConfig.
Maintenance & Community
Developed by the HazyResearch group. No specific community channels (Discord/Slack) or roadmap details are provided in the README.
Licensing & Compatibility
The README includes a GitHub license shield but does not explicitly state the license type or any restrictions for commercial use or closed-source linking.
Limitations & Caveats
The training harness is simplified and not intended for large-scale model training. Some dependencies have specific version requirements and can be problematic. The MQAR synthetic task is noted as simplistic. A specific version of the RWKV-7 file requires manual state size computation updates.