prodigy-openai-recipes by explosion

Prodigy recipes for zero/few-shot learning via OpenAI GPT-3

Created 3 years ago

323 stars

Top 84.3% on SourcePulse

1 Expert Loves This Project

Jeff Hammerbacher

Cofounder of Cloudera

Project Summary

This repository provides recipes for integrating OpenAI's GPT-3 with Prodigy for efficient data annotation. It targets NLP practitioners and researchers looking to bootstrap annotation workflows using zero- and few-shot learning, enabling faster creation of high-quality datasets for training custom models.

How It Works

The core approach leverages OpenAI's LLMs to generate initial predictions for tasks like Named Entity Recognition (NER) and text classification. These predictions are then presented within the Prodigy annotation interface, allowing users to quickly curate them. Users can refine prompts and provide examples interactively, with corrections feeding back into the LLM's context for improved future predictions.
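The feedback loop described above can be sketched as a small store of corrected examples that are replayed as few-shot examples in the next prompt. This is only an illustration: the class name `FewShotMemory`, its methods, and the prompt wording are hypothetical and are not taken from the repository, and no OpenAI or Prodigy call is made.

```python
from collections import deque


class FewShotMemory:
    """Keeps the most recent human-corrected examples for reuse in prompts.

    Hypothetical helper for illustration; not part of the repository.
    """

    def __init__(self, max_examples=3):
        # A bounded deque silently drops the oldest correction when full.
        self._examples = deque(maxlen=max_examples)

    def add_correction(self, text, entities):
        """Store one corrected annotation, e.g. {"ORG": ["Cloudera"]}."""
        self._examples.append({"text": text, "entities": dict(entities)})

    def build_prompt(self, text, labels):
        """Build a prompt: instructions, stored corrections, then the new text."""
        lines = [f"Extract entities with these labels: {', '.join(labels)}.", ""]
        for ex in self._examples:
            lines.append(f"Text: {ex['text']}")
            for label in labels:
                spans = ex["entities"].get(label, [])
                lines.append(f"{label}: {', '.join(spans)}")
            lines.append("")
        lines.append(f"Text: {text}")
        return "\n".join(lines)
```

With no stored corrections the prompt is zero-shot; each correction an annotator accepts turns it into a few-shot prompt, which is the general idea behind the loop described here.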

Quick Start & Requirements

Install Prodigy, substituting your license key for the XXXX-XXXX-XXXX-XXXX placeholder: python -m pip install prodigy -f https://XXXX-XXXX-XXXX-XXXX@download.prodi.gy
Install the repository's dependencies: python -m pip install -r requirements.txt
Set the OPENAI_ORG and OPENAI_KEY environment variables.
Requires an OpenAI API key and a Prodigy license key.
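Before running a recipe, it can help to confirm that both variables are actually set. The check below is a minimal sketch using only the standard library; the function name `missing_openai_vars` is an assumption made for illustration, and only the two variable names come from the steps above.

```python
import os

# Variable names as listed in the setup steps above.
REQUIRED_VARS = ("OPENAI_ORG", "OPENAI_KEY")


def missing_openai_vars(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to the process environment; passing a dict makes
    the check easy to exercise without touching real credentials.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_openai_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("OpenAI credentials found in the environment.")
```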

Highlighted Details

Recipes for NER, text classification, term/pattern generation, and prompt A/B testing.
Supports zero-shot and few-shot learning via customizable Jinja2 prompt templates.
Allows interactive prompt tuning, with a feedback loop that improves LLM predictions.
Includes utilities for exporting annotations and training downstream spaCy or Hugging Face models.
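To illustrate how one Jinja2 template can serve both zero-shot and few-shot prompting, here is a minimal sketch. The template text and the `render_prompt` helper are hypothetical, and the repository's own templates may look quite different. It assumes the `jinja2` package is installed.

```python
from jinja2 import Template

# Hypothetical template for illustration only. The loop over `examples`
# renders nothing when the list is empty, which gives a zero-shot prompt.
PROMPT_TEMPLATE = Template(
    "Classify the text below as one of: {{ labels | join(', ') }}.\n"
    "{% for ex in examples %}"
    "Text: {{ ex.text }}\n"
    "Answer: {{ ex.label }}\n"
    "{% endfor %}"
    "Text: {{ text }}\n"
    "Answer:"
)


def render_prompt(labels, text, examples=()):
    """Render the prompt; `examples` is a sequence of {"text", "label"} dicts."""
    return PROMPT_TEMPLATE.render(labels=labels, text=text, examples=examples)


if __name__ == "__main__":
    # Zero-shot: no examples supplied.
    print(render_prompt(["POS", "NEG"], "Great film"))
    # Few-shot: one labeled example precedes the text to classify.
    print(render_prompt(["POS", "NEG"], "Great film",
                        [{"text": "Dull plot", "label": "NEG"}]))
```

Keeping the examples as a template variable means the prompt wording can be edited without touching the code that collects examples.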

Maintenance & Community

This repository is archived; its functionality has moved into Prodigy and spacy-llm, where it continues to be maintained and upgraded.

Licensing & Compatibility

The README does not explicitly state a license. Whether the recipes can be used in commercial or closed-source projects depends on the license terms of the maintained versions in Prodigy.

Limitations & Caveats

The repository is archived and no longer actively maintained; its features have migrated to Prodigy and spacy-llm, so users should consult the Prodigy documentation for current implementations and support. OpenAI's prompt size limit (4079 tokens) also caps prompt length, which limits how many few-shot examples and how much input text fit in a single request.
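One way to stay under the prompt limit is to drop the oldest few-shot examples until the prompt fits. The sketch below is an assumption-laden illustration, not the repository's behavior: `trim_examples` is a hypothetical helper, and `rough_token_count` is a crude character-based estimate rather than a real tokenizer, so an exact counter should be passed in for production use.

```python
PROMPT_TOKEN_LIMIT = 4079  # limit quoted in the caveat above


def rough_token_count(text):
    """Crude estimate, assuming about 4 characters per token (an assumption)."""
    return (len(text) + 3) // 4


def trim_examples(instructions, examples, text,
                  limit=PROMPT_TOKEN_LIMIT, count=rough_token_count):
    """Join instructions, examples, and text, dropping oldest examples to fit.

    `count` is any callable that maps a string to a token count; swap in
    an exact tokenizer if one is available.
    """
    kept = list(examples)
    while True:
        prompt = "\n\n".join([instructions, *kept, text])
        if count(prompt) <= limit:
            return prompt
        if not kept:
            raise ValueError("prompt exceeds the token limit even without examples")
        kept.pop(0)  # drop the oldest example first
```

Dropping the oldest examples first is one possible policy; keeping the most informative ones instead would need a scoring step that is out of scope here.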

Health Check

Last Commit

2 years ago

Responsiveness

Inactive

Star History

0 stars in the last 30 days