Language model prompting strategy research paper
This repository provides code for the "Ask Me Anything" (AMA) prompt-aggregation strategy, designed to improve language model performance on various tasks. It's targeted at researchers and practitioners looking to enhance LLM accuracy through systematic prompt engineering and ensemble methods. The core benefit is a structured approach to generating and combining multiple prompt-based predictions, leading to more robust and accurate outputs.
How It Works
AMA employs a two-stage process. First, it uses the language model itself to recursively transform task inputs into effective open-ended question-answering formats, producing a diverse collection of prompts. Second, it aggregates the predictions from these prompts with a weak supervision framework that models the accuracies of, and dependencies between, the individual prompts. This structured aggregation combines noisy per-prompt predictions into a single output that outperforms single-prompt baselines.
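To make the two-stage flow concrete, here is a minimal, self-contained sketch. It is not the repository's actual API: the LM call is stubbed with a canned function, and the aggregation step is simplified to majority voting, whereas the real system combines predictions with a weak supervision model (Snorkel MeTaL).

```python
# Illustrative sketch of AMA-style prompt aggregation (hypothetical names,
# not the repo's API). A real setup would call an LLM via `manifest` and
# aggregate with weak supervision instead of majority vote.
from collections import Counter

def fake_lm(prompt: str) -> str:
    """Stand-in for a language model call; returns a canned answer."""
    # Pretend the true/false-style prompt elicits a different answer.
    return "no" if "True or False" in prompt else "yes"

def make_prompts(context: str, question: str) -> list[str]:
    """Stage 1: reformat one task input into several QA-style prompts."""
    return [
        f"{context}\nQuestion: {question}\nAnswer:",
        f"Passage: {context}\nQ: {question}\nA:",
        f"{context}\nTrue or False: {question}\nAnswer:",
    ]

def aggregate(predictions: list[str]) -> str:
    """Stage 2 (simplified): majority vote over per-prompt predictions."""
    return Counter(predictions).most_common(1)[0][0]

prompts = make_prompts("John went to the store.", "Did John leave the house?")
preds = [fake_lm(p) for p in prompts]
print(aggregate(preds))  # prints "yes": the majority answer across prompts
```

The point of the sketch is the shape of the pipeline: one input fans out into many prompt formats, and a principled combiner turns the noisy votes back into one prediction.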
Quick Start & Requirements
- Setup uses conda. Clone the repository, then install dependencies for ama_prompting, metal-ama (weak supervision), and manifest (model loading).
- Set the AMA_DATA environment variable (default /home/data).
- Download the PromptSource (P3) dataset from Hugging Face, along with the other datasets specified in the README.
- Models are loaded with the manifest tool (e.g., EleutherAI/gpt-j-6B).
Highlighted Details
- The weak supervision component (metal-ama) models dependencies between prompts for improved aggregation.
- Uses manifest for efficient model inference and caching.
Maintenance & Community
The project is associated with HazyResearch and acknowledges support from Together Computer, Numbers Station, Snorkel, Stanford Center for Research on Foundation Models, and Stanford HAI. Specific contributors are listed in the paper citation.
Licensing & Compatibility
The repository's license is not explicitly stated in the README. However, it cites and depends on other projects like Hugging Face datasets and Snorkel MeTaL, which have their own licenses. Compatibility for commercial use would require verifying the licenses of all dependencies.
Limitations & Caveats
The setup involves cloning multiple repositories and managing environment variables, which can be complex. The project relies on specific versions of datasets and models, and the inference process for large models can be time-consuming and resource-intensive.
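As a rough illustration of the environment setup described under Quick Start & Requirements, the steps might look like the following; the repository URL and directory layout are assumptions based on the project's association with HazyResearch, not taken from the README.

```shell
# Hypothetical setup sketch -- repository URL and layout are assumptions.
export AMA_DATA="$HOME/data"   # README default is /home/data; adjust as needed
mkdir -p "$AMA_DATA"

# Clone the main repo and install the companion dependencies named in the
# README (metal-ama and manifest are installed similarly, per their own docs):
#   git clone https://github.com/HazyResearch/ama_prompting.git
#   pip install -e ama_prompting
echo "AMA_DATA=$AMA_DATA"
```

Keeping the dataset root in a single environment variable is what lets the separately cloned repositories agree on where to find the downloaded P3 data.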
Last commit: about 2 years ago; the repository is inactive.