arman-bd: Accessible LLM training and inference
Top 19.2% on SourcePulse
GuppyLM is a ~9M parameter language model designed to demystify the end-to-end process of training an LLM from scratch. It targets engineers and researchers who want to understand LLM mechanics without requiring extensive resources or expertise. The project provides a fully runnable example, from data generation to inference, enabling users to build their own LLM in minutes, fostering practical learning and demystifying "black box" models.
How It Works
The project employs a deliberately simple, vanilla transformer architecture with 8.7 million parameters, eschewing advanced optimizations like GQA, RoPE, or SwiGLU for clarity and ease of implementation. It utilizes a BPE tokenizer with a 4,096-token vocabulary and a 128-token maximum sequence length. Training is performed on 60,000 synthetic conversations generated via template composition across 60 distinct topics, ensuring a consistent, fish-like personality. This minimalist approach prioritizes educational value and rapid iteration over raw performance.
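The template-composition idea above can be sketched as follows. This is a minimal illustration, not the project's actual data pipeline: the topics, question templates, and fish-flavored answer templates here are hypothetical stand-ins, and the real project composes 60,000 conversations across 60 topics.

```python
import random

# Hypothetical stand-ins for the project's topics and templates.
TOPICS = ["the ocean", "coral reefs", "plankton"]
QUESTION_TEMPLATES = [
    "What do you think about {topic}?",
    "Can you tell me something about {topic}?",
]
ANSWER_TEMPLATES = [
    "Glub glub! I think {topic} is fin-tastic.",
    "As a humble guppy, I find {topic} simply fin-credible.",
]

def generate_conversations(n, seed=0):
    """Compose n synthetic single-turn conversations from templates.

    A fixed seed makes the dataset reproducible across runs.
    """
    rng = random.Random(seed)
    conversations = []
    for _ in range(n):
        topic = rng.choice(TOPICS)
        conversations.append({
            "user": rng.choice(QUESTION_TEMPLATES).format(topic=topic),
            "assistant": rng.choice(ANSWER_TEMPLATES).format(topic=topic),
        })
    return conversations

data = generate_conversations(5)
```

Because every answer template carries the same persona, the resulting dataset bakes a consistent personality into whatever model is trained on it.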
Quick Start & Requirements
Install dependencies with pip install torch tokenizers, then start a chat session with python -m guppylm chat.
Highlighted Details
Maintenance & Community
The README does not name maintainers, link to community channels (such as Discord or Slack), or publish a roadmap.
Licensing & Compatibility
Released under the MIT license, permitting commercial use and integration into closed-source projects, with the only obligation being preservation of the copyright and license notice.
Limitations & Caveats
The model's 128-token context window severely limits multi-turn conversation quality, leading to degradation after a few exchanges. Its personality is hardcoded into the weights rather than being controllable via system prompts, and it does not comprehend complex human abstractions.
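A rough illustration of why the 128-token window degrades multi-turn chat: once the running conversation exceeds the window, the oldest tokens must be dropped, so earlier turns silently vanish from the model's view. The per-turn token count below is a hypothetical simplification, not the project's BPE tokenization.

```python
MAX_SEQ_LEN = 128  # GuppyLM's maximum sequence length

def truncate_context(token_ids, max_len=MAX_SEQ_LEN):
    """Keep only the most recent max_len tokens; older context is lost."""
    return token_ids[-max_len:]

# Simulate a multi-turn conversation where each turn adds 40 tokens.
# Each token is tagged with its turn number so we can see what survives.
history = []
for turn in range(5):
    history.extend([turn] * 40)

# After 5 turns (200 tokens), only the last 128 fit in the window,
# so turn 0 has fallen out of the model's view entirely.
earliest_visible_turn = truncate_context(history)[0]
```

With 40-token turns, the entire first turn is already invisible by the fifth exchange, which matches the observed degradation after a few turns.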