Dromedary by IBM

Self-aligned language model research paper with minimal human supervision

Created 2 years ago
1,146 stars

Top 33.6% on SourcePulse

Project Summary

Dromedary is an open-source project focused on creating helpful, ethical, and reliable large language models (LLMs) through principle-driven self-alignment with minimal human supervision. It targets researchers and developers aiming to train or deploy LLMs with enhanced alignment capabilities, offering a novel approach to reduce reliance on extensive human feedback.

How It Works

Dromedary employs a "SELF-ALIGN" process, notably updated in Dromedary-2. This process leverages diverse datasets like ShareGPT and OpenOrca, incorporating an improved prompt style (general-specific-general) to guide LLM responses. This method enhances performance by directly using curated exemplars, eliminating the need for verbose cloning or inference-time few-shot examples, and simplifying the training pipeline. The SALMON pipeline for Dromedary-2's RLAIF is available in a separate IBM/SALMON repository.
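
A hypothetical sketch of the general-specific-general idea follows; the principle list and template wording below are assumptions for illustration, not the prompts Dromedary actually uses (those live in the repository's prompt files).

    # Illustrative sketch only: these principles and strings are assumptions,
    # not Dromedary's actual prompt text (see the repo's prompt files).
    PRINCIPLES = [
        "1 (ethical): avoid harmful or unethical content.",
        "2 (informative): provide accurate, relevant information.",
        "3 (helpful): address the user's request directly.",
    ]

    def build_prompt(user_query: str) -> str:
        # General-specific-general shape: broad framing first, then
        # specifics, then a closing summary instruction.
        principles = "\n".join(PRINCIPLES)
        return (
            "You are an AI assistant guided by the following principles:\n"
            f"{principles}\n\n"
            f"User: {user_query}\n"
            "Assistant (give a general overview, then specific details, "
            "then a general concluding summary):"
        )

    print(build_prompt("How do delta weights work?"))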

Quick Start & Requirements

  • Installation: For custom training or inference on non-standard GPU counts, install the llama_dromedary package (cd llama_dromedary && pip install -r requirements.txt && pip install -e .). For standard inference on 1, 2, 4, 8, or 16 GPUs, use the original LLaMA repo. Additional inference packages live in the inference directory.
  • Prerequisites: PyTorch with CUDA support.
  • Model Weights: Released as delta weights that must be applied to the original LLaMA weights (see the sketch after this list).
  • Resources: Training requires significant computational resources. Inference setup is detailed in the provided guides.
  • Links: The project page and paper are linked in the README; the synthetic training data is on the Hugging Face Datasets Hub.
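
As referenced above, the released checkpoints are deltas, so usable weights are recovered by adding each delta tensor to its matching base tensor. Below is a minimal sketch of that arithmetic, assuming single-file PyTorch checkpoints with identical state-dict keys; apply_delta and the file names are hypothetical, and the repository's official conversion script should be used in practice.

    # Minimal sketch, not the project's official conversion script.
    # Assumes both checkpoints are single torch.save() files whose
    # state dicts share the same keys and tensor shapes.
    import torch

    def apply_delta(base_path: str, delta_path: str, out_path: str) -> None:
        base = torch.load(base_path, map_location="cpu")
        delta = torch.load(delta_path, map_location="cpu")
        # Recovered weight = original LLaMA weight + released delta.
        merged = {name: base[name] + delta[name] for name in base}
        torch.save(merged, out_path)

    apply_delta("llama-65b.pth", "dromedary-65b-delta.pth", "dromedary-65b.pth")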

Highlighted Details

  • Accepted as a spotlight presentation at NeurIPS 2023.
  • Offers a full training pipeline for reproduction.
  • Releases synthetic data used for training Dromedary-65b and Dromedary-2-70b.
  • Distributes model weights as deltas to comply with the LLaMA license.

Maintenance & Community

The project acknowledges contributions from the Meta LLaMA, Stanford Alpaca, and Vicuna teams, among others. Further community or roadmap details are not explicitly provided in the README.

Licensing & Compatibility

Model weights are released as delta weights, subject to the original LLaMA model license. The code license is not explicitly stated; the acknowledgments of other open-source projects suggest a permissive intent, but this should be confirmed in the repository before reuse. Suitability for commercial use depends on the LLaMA license terms.

Limitations & Caveats

Model weights are provided as deltas, requiring users to obtain and combine them with the base LLaMA weights. The effectiveness of the self-alignment process may vary depending on the quality and diversity of the input prompts and exemplars.

Health Check

  • Last Commit: 23 hours ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 0 stars in the last 30 days
