EasyInstruct by zjunlp

Instruction processing framework for LLMs

Created 2 years ago

409 stars

Top 71.3% on SourcePulse

Project Summary

EasyInstruct is a Python framework designed to simplify the processing of instructions for Large Language Models (LLMs). It targets researchers and developers working with LLMs, offering modularized tools for instruction generation, selection, and prompting, thereby streamlining experimental workflows.

How It Works

The framework modularizes instruction processing into distinct components: Generators, Selectors, Prompts, and Engines. Generators implement various instruction creation techniques like Self-Instruct, Evol-Instruct, Backtranslation, and KG2Instruct. Selectors offer metrics such as length, perplexity, ROUGE, and GPT scores to filter and refine instruction datasets. The Prompts and Engines modules handle the construction and execution of prompts on specified LLMs, supporting a range of commercial and locally deployed models.
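The four-stage flow described above can be sketched in plain Python. This is a minimal illustration, not EasyInstruct's API: every name here (`Instruction`, `template_generator`, `length_selector`, `build_prompt`, `echo_engine`, `run_pipeline`) is hypothetical, the generator uses a fixed template instead of calling an LLM, and the engine is a stub that returns a placeholder string.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Instruction:
    text: str


def template_generator(seed_tasks: List[str]) -> List[Instruction]:
    """Stand-in for a Generator; a real one would ask an LLM for new instructions."""
    return [Instruction(f"Explain how to {task}.") for task in seed_tasks]


def length_selector(items: List[Instruction], min_words: int, max_words: int) -> List[Instruction]:
    """Stand-in for a Selector; keeps instructions whose word count is in range."""
    return [i for i in items if min_words <= len(i.text.split()) <= max_words]


def build_prompt(instruction: Instruction) -> str:
    """Stand-in for the Prompts module; wraps an instruction in a fixed template."""
    return f"Instruction: {instruction.text}\nResponse:"


def echo_engine(prompt: str) -> str:
    """Stand-in for an Engine; a real one would send the prompt to a model."""
    return f"[stub reply to {len(prompt)} chars]"


def run_pipeline(seed_tasks: List[str], engine: Callable[[str], str]) -> List[str]:
    """Generate, filter, build prompts, then execute them on the given engine."""
    generated = template_generator(seed_tasks)
    selected = length_selector(generated, min_words=5, max_words=12)
    return [engine(build_prompt(i)) for i in selected]


if __name__ == "__main__":
    for reply in run_pipeline(["sort a list", "x"], echo_engine):
        print(reply)
```

Keeping each stage behind a plain function boundary is what makes it possible to swap one generator or selector for another without touching the rest of the pipeline.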

Quick Start & Requirements

Installation: pip install git+https://github.com/zjunlp/EasyInstruct@main (latest) or pip install easyinstruct (PyPI, not latest).
Prerequisites: An OpenAI API key is required for most generation methods. Supported models include GPT-3.5, GPT-4, Claude, and models from Cohere.
Demo: A Gradio app is available at demo/app.py or via HuggingFace Spaces.
Documentation: https://zjunlp.gitbook.io/easyinstruct/documentations

Highlighted Details

Supports multiple instruction generation methods: Self-Instruct, Evol-Instruct, Backtranslation, KG2Instruct.
Offers a comprehensive suite of instruction selection metrics, including ROUGE, GPT score, and CIRS for code.
Provides modular components for custom generator/selector development.
Includes a Gradio demo for quick experimentation and a shell script for batch processing.
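To make the idea of a ROUGE-based custom selector concrete, here is a small self-contained sketch that drops near-duplicate instructions. It is an assumption-laden illustration rather than the library's implementation: the class name `RougeDedupSelector`, its `select` method, the whitespace tokenization, and the 0.7 threshold are all chosen for this example, and the score is a simple ROUGE-L F1 computed from the longest common subsequence of tokens.

```python
from typing import List


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """ROUGE-L F1 over lowercased, whitespace-split tokens."""
    c, r = candidate.lower().split(), reference.lower().split()
    if not c or not r:
        return 0.0
    lcs = lcs_length(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)


class RougeDedupSelector:
    """Hypothetical selector: keeps an instruction only if it is not too
    similar (by ROUGE-L F1) to any instruction already kept."""

    def __init__(self, threshold: float = 0.7):
        self.threshold = threshold

    def select(self, instructions: List[str]) -> List[str]:
        kept: List[str] = []
        for text in instructions:
            if all(rouge_l_f1(text, k) < self.threshold for k in kept):
                kept.append(text)
        return kept


if __name__ == "__main__":
    pool = [
        "Write a poem about the sea",
        "Write a poem about the ocean",
        "Sort a list of integers",
    ]
    print(RougeDedupSelector().select(pool))
```

The greedy pass compares each candidate against everything kept so far, so cost grows quadratically with the pool size; that is acceptable for small experiments but would need batching or indexing for large datasets.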

Maintenance & Community

The project welcomes Pull Requests and is a subproject of KnowLM. The last commit was about a year ago and responsiveness is listed as inactive, so the project does not appear to be actively maintained.

Licensing & Compatibility

License: MIT License.
Compatibility: Permissive license allows for commercial use and integration with closed-source projects.

Limitations & Caveats

The PyPI version is not the latest. While supporting various LLMs, many generation methods rely on API access (e.g., OpenAI), incurring costs and requiring API keys.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

1 star in the last 30 days