openai/gpt-oss: Open-weight LLMs for reasoning and agents
Top 2.3% on SourcePulse
OpenAI's gpt-oss models (120B and 20B parameters) are open-weight language models designed for advanced reasoning, agentic tasks, and developer use cases. They offer full chain-of-thought, fine-tunability, and native agentic capabilities like function calling and code execution, all under a permissive Apache 2.0 license.
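The agentic loop behind function calling can be illustrated with a minimal, model-free sketch. The "model output" below is a hand-written JSON payload standing in for a real gpt-oss tool call (actual model output is wrapped in the harmony format); the tool names and dispatch helper are hypothetical, not part of the gpt-oss API:

```python
import json

# Hypothetical tool registry; in a real harness the model chooses among these.
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",
    "add": lambda a, b: a + b,
}

def dispatch(tool_call_json: str):
    """Parse a JSON tool call and execute the named local function."""
    call = json.loads(tool_call_json)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# Stand-in for a model-emitted tool call; the result would normally be
# appended to the conversation and sent back to the model.
model_output = '{"name": "add", "arguments": {"a": 2, "b": 3}}'
print(dispatch(model_output))  # → 5
```

In a real deployment the loop repeats: the model emits a call, the host executes it, and the result is returned as a tool message until the model produces a final answer.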
How It Works
These models utilize a Mixture-of-Experts (MoE) architecture, with the 120B model featuring 5.1B active parameters and the 20B model featuring 3.6B active parameters. A key innovation is their native MXFP4 quantization for MoE layers, enabling efficient inference on single GPUs (H100 for 120B) and reduced memory footprints. They are trained with a specific "harmony" response format, crucial for correct operation.
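The "active parameters" figures follow from MoE routing: each token is dispatched to only a few experts, so only those experts' weights participate in the forward pass. A toy sketch of top-k gating (illustrative numbers, not the real gpt-oss router):

```python
import math

def top_k_experts(logits, k=2):
    """Pick the k highest-scoring experts and softmax-normalize their gates."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    exps = [math.exp(logits[i]) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

# Toy router scores for 8 experts; only 2 are activated for this token.
gates = top_k_experts([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3], k=2)
print(gates)  # experts 1 and 4 carry all of the gate weight

# If each expert held, say, 1B parameters, activating 2 of 8 would mean
# ~2B "active" parameters out of 8B total -- the same accounting that
# yields the 5.1B active-parameter figure for the 120B model.
```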
Quick Start & Requirements
Install the reference implementations with `pip install gpt-oss[torch]`, `pip install gpt-oss[triton]`, or `pip install gpt-oss[metal]`. For vLLM: `uv pip install --pre vllm==0.10.1+gptoss --extra-index-url https://wheels.vllm.ai/gpt-oss/ --extra-index-url https://download.pytorch.org/whl/nightly/cu128 --index-strategy unsafe-best-match`. Model weights can be fetched with `huggingface-cli download`.
Highlighted Details
The models must be used with the harmony response format; they will not operate correctly otherwise.
Maintenance & Community
This repository focuses on reference implementations; OpenAI does not intend to accept new feature contributions beyond bug fixes. Contributions to the awesome-gpt-oss.md list are welcome.
Licensing & Compatibility
Permissive Apache 2.0 license. Compatible with commercial and closed-source applications.
Limitations & Caveats
The PyTorch reference implementation is inefficient and requires multiple H100 GPUs. Metal and Triton implementations are for educational purposes and not production-ready. The Python tool implementation runs in a permissive Docker container, posing potential security risks.