ama_prompting by HazyResearch

Language model prompting strategy research paper

created 2 years ago
547 stars

Top 59.2% on sourcepulse

Project Summary

This repository provides code for the "Ask Me Anything" (AMA) prompt-aggregation strategy, designed to improve language model performance on various tasks. It's targeted at researchers and practitioners looking to enhance LLM accuracy through systematic prompt engineering and ensemble methods. The core benefit is a structured approach to generating and combining multiple prompt-based predictions, leading to more robust and accurate outputs.

How It Works

AMA employs a two-stage process: first, it recursively uses the language model itself to transform task inputs into an effective question-answering format, producing a set of diverse prompts. Second, it aggregates the predictions from these prompts using a weak supervision framework that accounts for their varying accuracies and dependencies. Combining many imperfect prompts this way outperforms single-prompt baselines without requiring one perfectly engineered prompt.
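To make the two stages concrete, here is a minimal sketch of the loop for a yes/no task, assuming a hypothetical `lm(prompt) -> str` callable (in the repository this would be backed by manifest); the actual prompt chains and the weak-supervision aggregator are more involved:

```python
from collections import Counter

# Hypothetical reformatting templates; AMA varies these to obtain diverse prompts.
QUESTION_TEMPLATES = [
    "Rewrite the statement as a yes/no question.\nStatement: {claim}\nQuestion:",
    "Turn the following claim into a question.\nClaim: {claim}\nQuestion:",
    "What yes/no question does this statement answer?\nStatement: {claim}\nQuestion:",
]

def ama_predict(lm, passage: str, claim: str) -> str:
    votes = []
    for template in QUESTION_TEMPLATES:
        # Stage 1: the LM itself reformats the task input into a question...
        question = lm(template.format(claim=claim)).strip()
        # ...and then answers that question against the passage.
        answer = lm(f"Context: {passage}\nQuestion: {question}\nAnswer:").strip()
        votes.append("yes" if answer.lower().startswith("yes") else "no")
    # Stage 2: aggregate the per-prompt predictions. The paper fits a
    # weak-supervision model over these votes; plain majority vote stands in here.
    return Counter(votes).most_common(1)[0][0]
```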

Quick Start & Requirements

  • Installation: Requires Python 3.8 and conda. Clone the repository, then install dependencies for ama_prompting, metal-ama (weak supervision), and manifest (model loading).
  • Data: Assumes datasets live at the path given by the AMA_DATA environment variable (default /home/data). Requires downloading the PromptSource (P3) dataset from Hugging Face, plus other specified datasets.
  • Models: Uses the manifest tool for loading models (e.g., EleutherAI/gpt-j-6B); a minimal usage sketch follows this list.
  • Setup: Detailed instructions are provided for setting up the environment, data, and model loading.
  • Docs: Hugging Face P3, Manifest Repo
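As a concrete example of the model-loading step, here is a minimal sketch of querying a locally served GPT-J through manifest. The client names and cache options follow the Manifest repo's documented usage, but exact server flags vary by version, so take the precise commands from this project's setup instructions:

```python
# Assumes a Manifest API server has been started first, e.g.:
#   python3 -m manifest.api.app --model_type huggingface \
#       --model_name_or_path EleutherAI/gpt-j-6B
# (see the Manifest repo for the current server flags).
from manifest import Manifest

manifest = Manifest(
    client_name="huggingface",                  # local Hugging Face model server
    client_connection="http://127.0.0.1:5000",  # Manifest server's default address
    cache_name="sqlite",                        # cache completions locally so
    cache_connection="manifest_cache.sqlite",   # prompt-ensemble runs avoid re-querying
)
print(manifest.run("Is the following statement true? The sky is blue.\nAnswer:"))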

Highlighted Details

  • Implements zero-shot, few-shot, and AMA (decomposed) prompting baselines.
  • Weak supervision component (metal-ama) models dependencies between prompts for improved aggregation; a toy illustration follows this list.
  • Utilizes manifest for efficient model inference and caching.
  • Includes scripts for running experiments on benchmarks like SuperGLUE.
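To illustrate why the learned aggregation in metal-ama can beat a simple majority vote, here is a toy accuracy-weighted vote. This is not the actual MeTaL label model, which also learns prompt accuracies and inter-prompt dependencies from unlabeled data; the accuracies below are hypothetical:

```python
import numpy as np

def weighted_vote(votes: np.ndarray, accuracies: np.ndarray) -> np.ndarray:
    """votes: (n_prompts, n_examples) in {-1, +1}; accuracies: (n_prompts,)."""
    weights = np.log(accuracies / (1 - accuracies))  # per-prompt log-odds
    return np.sign(weights @ votes)                  # aggregated labels in {-1, +1}

votes = np.array([[+1, -1, +1],    # prompt 1 (most reliable)
                  [+1, +1, -1],    # prompt 2
                  [-1, +1, +1]])   # prompt 3
accuracies = np.array([0.9, 0.6, 0.55])  # hypothetical per-prompt accuracies
print(weighted_vote(votes, accuracies))  # [ 1. -1.  1.]: on example 2, prompt 1
                                         # outvotes the 2-1 majority against it
```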

Maintenance & Community

The project is associated with HazyResearch and acknowledges support from Together Computer, Numbers Station, Snorkel, Stanford Center for Research on Foundation Models, and Stanford HAI. Specific contributors are listed in the paper citation.

Licensing & Compatibility

The repository's license is not explicitly stated in the README. However, it cites and depends on other projects like Hugging Face datasets and Snorkel MeTaL, which have their own licenses. Compatibility for commercial use would require verifying the licenses of all dependencies.

Limitations & Caveats

The setup involves cloning multiple repositories and managing environment variables, which can be complex. The project relies on specific versions of datasets and models, and the inference process for large models can be time-consuming and resource-intensive.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 90 days
