Dromedary by IBM

Research project on self-aligned language models trained with minimal human supervision

Created 2 years ago

1,145 stars

Top 33.6% on SourcePulse

4 Experts Love This Project

Zhuohan Li

Coauthor of vLLM

Junyang Lin

Core Maintainer at Alibaba Qwen

Lewis Tunstall

Research Engineer at Hugging Face

Edward Sun

Research Scientist at Meta Superintelligence Lab

Project Summary

Dromedary is an open-source project focused on creating helpful, ethical, and reliable large language models (LLMs) through principle-driven self-alignment with minimal human supervision. It targets researchers and developers aiming to train or deploy LLMs with enhanced alignment capabilities, offering a novel approach to reduce reliance on extensive human feedback.

How It Works

Dromedary employs a "SELF-ALIGN" process, which was updated in Dromedary-2. The updated process draws prompts from diverse datasets such as ShareGPT and OpenOrca and uses an improved prompt style (general-specific-general) to guide LLM responses. Because it uses curated exemplars directly, it no longer needs the verbose cloning step or inference-time few-shot examples, which simplifies the training pipeline. The SALMON pipeline used for Dromedary-2's reinforcement learning from AI feedback (RLAIF) is available in a separate IBM/SALMON repository.

Quick Start & Requirements

Installation: For standard inference (1, 2, 4, 8, or 16 GPUs), use the original LLaMA repo. Additional inference packages are in the inference directory. For custom training or inference on non-standard GPU counts, install the llama_dromedary package by running: cd llama_dromedary && pip install -r requirements.txt && pip install -e .
Prerequisites: PyTorch with CUDA support.
Model Weights: Released as delta weights requiring application to original LLaMA weights.
Resources: Training requires significant computational resources. Inference setup is detailed in the provided guides.
Links: Project page and paper (linked in README), Hugging Face Datasets Hub for synthetic data.
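
The Model Weights item above says the released deltas must be applied to the original LLaMA weights. As a rough illustration of that idea, here is a minimal sketch that assumes a purely additive scheme (merged = base + delta). That assumption, the function name apply_delta, and the toy parameter names are all hypothetical; the repository's own scripts define the actual procedure and should be used for real checkpoints.

```python
from typing import Dict, List

# Stand-in for a real tensor type; plain lists keep the sketch dependency-free.
Tensor = List[float]


def apply_delta(base: Dict[str, Tensor], delta: Dict[str, Tensor]) -> Dict[str, Tensor]:
    """Return merged weights under an ASSUMED additive scheme: merged = base + delta."""
    if base.keys() != delta.keys():
        # Report parameter names that appear in only one of the two checkpoints.
        mismatched = sorted(base.keys() ^ delta.keys())
        raise KeyError(f"parameter names differ: {mismatched}")
    merged: Dict[str, Tensor] = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"shape mismatch for {name}")
        # Element-wise addition of base weight and delta.
        merged[name] = [b + d for b, d in zip(base_values, delta_values)]
    return merged


# Toy example with hypothetical parameter names.
base = {"layer.weight": [1.0, 2.0], "layer.bias": [0.5]}
delta = {"layer.weight": [0.25, -1.0], "layer.bias": [0.5]}
merged = apply_delta(base, delta)
```

The name and shape checks matter in practice: a delta is only meaningful against the exact base checkpoint it was computed from, so a mismatch should fail loudly rather than produce silently wrong weights.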

Highlighted Details

Achieved a NeurIPS 2023 Spotlight presentation.
Offers a full training pipeline for reproduction.
Releases synthetic data used for training Dromedary-65b and Dromedary-2-70b.
Provides delta weights for compliance with LLaMA license.

Maintenance & Community

The project acknowledges contributions from the Meta LLaMA, Stanford Alpaca, and Vicuna teams, among others. Further community or roadmap details are not explicitly provided in the README.

Licensing & Compatibility

Model weights are released as delta weights, subject to the original LLaMA model license. The code license is not stated in this summary, and acknowledgments of other open-source projects do not by themselves indicate a permissive license, so check the repository's license file before reuse. Compatibility for commercial use depends on the LLaMA license terms.

Limitations & Caveats

Model weights are provided as deltas, requiring users to obtain and combine them with the base LLaMA weights. The effectiveness of the self-alignment process may vary depending on the quality and diversity of the input prompts and exemplars.

Health Check

Last Commit

3 months ago

Responsiveness

1 week

Star History

0 stars in the last 30 days