Open framework for large reasoning models
LLaMA-O1 provides an open framework for training, inference, and evaluation of large reasoning models, specifically targeting the development of open-source Large Language Models (LLMs) with enhanced reasoning capabilities. It is designed for researchers and developers working within the PyTorch and HuggingFace ecosystems.
How It Works
The framework leverages PyTorch and HuggingFace libraries for model implementation and training. It focuses on curated datasets for pretraining and supervised fine-tuning, with a roadmap including Reinforcement Learning from Human Feedback (RLHF) and inference-time reasoning enhancements. The approach emphasizes structured reasoning through datasets like OpenLongCoT.
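To make the structured-reasoning idea concrete, here is a minimal sketch of how a long chain-of-thought record might be flattened into a single supervised fine-tuning string. The field names (`question`, `cot_steps`, `answer`) and the output format are illustrative assumptions, not the actual OpenLongCoT schema.

```python
# Hypothetical sketch: flatten a chain-of-thought record into one SFT
# training string. Field names are assumptions, not the real schema.
def format_cot_example(record):
    """Join question, numbered reasoning steps, and answer into one string."""
    steps = "\n".join(
        f"Step {i + 1}: {step}" for i, step in enumerate(record["cot_steps"])
    )
    return (
        f"Question: {record['question']}\n"
        f"{steps}\n"
        f"Answer: {record['answer']}"
    )

example = {
    "question": "What is 2 + 2?",
    "cot_steps": ["Add the two operands.", "2 + 2 equals 4."],
    "answer": "4",
}
print(format_cot_example(example))
```

In practice, strings like this would be tokenized with a HuggingFace tokenizer and fed to a standard causal-LM training loop; the framework itself may use a different serialization.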
Quick Start & Requirements
Highlighted Details
Maintenance & Community
The project is hosted on GitHub: https://github.com/SimpleBerry/LLaMA-O1. Related research papers are linked for further context.
Licensing & Compatibility
The README does not explicitly state the license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The framework is under active development; features such as RLHF and inference-time reasoning enhancements are still in progress. The online demo runs on CPU only, so interactive performance will be limited without dedicated hardware.