Dromedary by IBM

Research paper and code for self-aligning language models with minimal human supervision

created 2 years ago
1,148 stars

Top 34.3% on sourcepulse

View on GitHub
Project Summary

Dromedary is an open-source project focused on creating helpful, ethical, and reliable large language models (LLMs) through principle-driven self-alignment with minimal human supervision. It targets researchers and developers aiming to train or deploy LLMs with enhanced alignment capabilities, offering a novel approach to reduce reliance on extensive human feedback.

How It Works

Dromedary employs a "SELF-ALIGN" process, notably updated in Dromedary-2. This process leverages diverse datasets like ShareGPT and OpenOrca, incorporating an improved prompt style (general-specific-general) to guide LLM responses. This method enhances performance by directly using curated exemplars, eliminating the need for verbose cloning or inference-time few-shot examples, and simplifying the training pipeline. The SALMON pipeline for Dromedary-2's RLAIF is available in a separate IBM/SALMON repository.
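
To make the principle-driven prompting concrete, here is a minimal, hypothetical Python sketch of how a prompt combining alignment principles with a general-specific-general response instruction might be assembled. The principle texts, the style instruction, and build_self_align_prompt are illustrative placeholders, not Dromedary's actual prompts or code.

    # Hypothetical sketch: composing a principle-driven prompt with a
    # general-specific-general response instruction. All strings and names
    # below are illustrative placeholders, not Dromedary's actual prompts.

    PRINCIPLES = [
        "1 (ethical): avoid producing harmful or unethical content.",
        "2 (informative): provide accurate and relevant information.",
        "3 (helpful): address the user's request directly and completely.",
    ]

    RESPONSE_STYLE = (
        "Answer in a general-specific-general style: open with a brief "
        "overview, then give concrete details, and close with a short summary."
    )

    def build_self_align_prompt(user_query: str) -> str:
        """Assemble one prompt that asks the base LLM to answer while
        following the alignment principles and the response-style guide."""
        principle_block = "\n".join(PRINCIPLES)
        return (
            "You are an AI assistant that follows these principles:\n"
            f"{principle_block}\n\n"
            f"{RESPONSE_STYLE}\n\n"
            f"User: {user_query}\nAssistant:"
        )

    if __name__ == "__main__":
        print(build_self_align_prompt("How do vaccines work?"))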

Quick Start & Requirements

  • Installation: For custom training or inference on non-standard GPU counts, install the llama_dromedary package (cd llama_dromedary && pip install -r requirements.txt && pip install -e .). For standard inference on 1, 2, 4, 8, or 16 GPUs, use the original LLaMA repo. Additional inference packages live in the inference directory.
  • Prerequisites: PyTorch with CUDA support.
  • Model Weights: Released as delta weights that must be applied to the original LLaMA weights (see the sketch after this list).
  • Resources: Training requires significant computational resources. Inference setup is detailed in the provided guides.
  • Links: Project page and paper (linked in README), Hugging Face Datasets Hub for synthetic data.
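
Because the released checkpoints are deltas rather than full weights, the following is a minimal, hypothetical sketch of how such deltas could be added back onto base LLaMA weights. It assumes plain PyTorch state dicts with matching keys and that the delta was computed as (target - base); apply_delta and the file names are placeholders, and the repository's own recovery script should be used for the actual checkpoints.

    # Minimal sketch of recovering full weights from delta weights, assuming
    # both checkpoints are plain PyTorch state dicts with matching keys and
    # shapes, and that the delta was computed as (target - base). The file
    # names are placeholders; use the repository's own recovery script for
    # the released checkpoints.

    import torch

    def apply_delta(base_path: str, delta_path: str, out_path: str) -> None:
        base = torch.load(base_path, map_location="cpu")
        delta = torch.load(delta_path, map_location="cpu")
        # Recovered weight = base weight + delta weight, parameter by parameter.
        recovered = {name: tensor + delta[name] for name, tensor in base.items()}
        torch.save(recovered, out_path)

    if __name__ == "__main__":
        apply_delta("llama-65b.pth", "dromedary-65b-delta.pth", "dromedary-65b.pth")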

Highlighted Details

  • Accepted as a NeurIPS 2023 Spotlight.
  • Offers a full training pipeline for reproduction.
  • Releases synthetic data used for training Dromedary-65b and Dromedary-2-70b.
  • Provides delta weights for compliance with LLaMA license.

Maintenance & Community

The project acknowledges contributions from the Meta LLaMA, Stanford Alpaca, and Vicuna teams, among others. Further community or roadmap details are not explicitly provided in the README.

Licensing & Compatibility

Model weights are released as delta weights and remain subject to the original LLaMA model license. The code license is not stated in this summary; consult the repository for details. Compatibility with commercial use depends on the LLaMA license terms.

Limitations & Caveats

Model weights are provided as deltas, requiring users to obtain and combine them with the base LLaMA weights. The effectiveness of the self-alignment process may vary depending on the quality and diversity of the input prompts and exemplars.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 10 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Jeff Hammerbacher (Cofounder of Cloudera), and 3 more.

LLaMA-Adapter by OpenGVLab

0.0%
6k
Efficient fine-tuning for instruction-following LLaMA models
created 2 years ago
updated 1 year ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Ying Sheng (Author of SGLang), and 9 more.

alpaca-lora by tloen

0.0%
19k
LoRA fine-tuning for LLaMA
created 2 years ago
updated 1 year ago