Instruction-following LLM trained on the Databricks Machine Learning Platform
Databricks' Dolly is an open-source, instruction-following large language model derived from EleutherAI's Pythia-12b. It is fine-tuned on a ~15K instruction/response dataset created by Databricks employees, covering various capability domains. Dolly is designed for commercial use and aims to democratize access to instruction-tuned LLMs, offering a surprisingly capable model despite its limitations.
How It Works
Dolly-v2-12b is a 12 billion parameter causal language model. It leverages the Pythia-12b foundation model and is fine-tuned on a custom dataset of ~15,000 instruction-response pairs. This dataset was curated by Databricks employees, focusing on capabilities like brainstorming, classification, question answering, generation, information extraction, and summarization, inspired by the InstructGPT paper. This approach aims to imbue the model with strong instruction-following abilities.
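The instruction corpus was later released publicly as databricks-dolly-15k on Hugging Face. A minimal sketch of inspecting it with the `datasets` library (the field names in the comment reflect the published dataset schema):

```python
from datasets import load_dataset

# Load the ~15K instruction/response records Databricks released as
# databricks-dolly-15k (the corpus Dolly was fine-tuned on).
ds = load_dataset("databricks/databricks-dolly-15k", split="train")

example = ds[0]
# Each record has: instruction, context, response, category
print(example["category"], "->", example["instruction"])
```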
Quick Start & Requirements
To generate text, load the model on a machine with GPUs using the `transformers` library:

```python
from transformers import pipeline
import torch

instruct_pipeline = pipeline(model="databricks/dolly-v2-12b", torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto")
```

On hardware that lacks bfloat16 support, `torch.float16` can be passed instead. Training requires a multi-A100 instance such as `Standard_ND96asr_v4` (Azure) or `p4d.24xlarge` (AWS); training on A10 or V100 GPUs is possible with modifications.
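Once loaded, the pipeline can be prompted directly with an instruction. A minimal sketch using the prompt from the model card; the exact return format is defined by the custom pipeline code pulled in via `trust_remote_code=True`, and is assumed here to follow the usual text-generation convention:

```python
# Output format assumed: a list of dicts with a "generated_text" key,
# as in standard transformers text-generation pipelines.
res = instruct_pipeline("Explain to me the difference between nuclear fission and fusion.")
print(res[0]["generated_text"])
```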
Maintenance & Community
The project is hosted by Databricks Labs. Further community engagement details are not explicitly provided in the README.
Licensing & Compatibility
Dolly is licensed for commercial use: the dolly-v2 model weights are released under the MIT license, and the databricks-dolly-15k training dataset under CC BY-SA 3.0.
Limitations & Caveats
Dolly-v2-12b is not state-of-the-art and struggles with complex prompts, programming, math, factual accuracy, and nuanced tasks like humor or stylistic mimicry. The training data may reflect biases present in the internet and Wikipedia, as well as the specific demographics of Databricks employees.