EasyInstruct by zjunlp

Instruction processing framework for LLMs

Created 2 years ago
406 stars

Top 71.7% on SourcePulse

Project Summary

EasyInstruct is a Python framework designed to simplify the processing of instructions for Large Language Models (LLMs). It targets researchers and developers working with LLMs, offering modularized tools for instruction generation, selection, and prompting, thereby streamlining experimental workflows.

How It Works

The framework modularizes instruction processing into distinct components: Generators, Selectors, Prompts, and Engines. Generators implement various instruction creation techniques like Self-Instruct, Evol-Instruct, Backtranslation, and KG2Instruct. Selectors offer metrics such as length, perplexity, ROUGE, and GPT scores to filter and refine instruction datasets. The Prompts and Engines modules handle the construction and execution of prompts on specified LLMs, supporting a range of commercial and locally deployed models.
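
The division of labor between these components can be sketched in plain Python. This is a minimal illustration only: it does not import EasyInstruct, and every name in it (Instruction, TemplateGenerator, LengthSelector, run_pipeline, echo_engine) is hypothetical and is not the library's API. The generator uses fixed templates where a real method such as Self-Instruct would call an LLM, and the engine returns a canned string instead of querying a model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Instruction:
    """One instruction/response pair (hypothetical record type)."""
    instruction: str
    output: str = ""


class TemplateGenerator:
    """Stand-in for a Generator: expands seed topics with fixed templates.

    A real generator (e.g. Self-Instruct) would call an LLM here instead.
    """

    def __init__(self, templates: Iterable[str]):
        self.templates = list(templates)

    def generate(self, seeds: Iterable[str]) -> List[Instruction]:
        return [
            Instruction(template.format(topic=seed))
            for seed in seeds
            for template in self.templates
        ]


class LengthSelector:
    """Stand-in for a Selector: keeps instructions within a word-count range."""

    def __init__(self, min_words: int, max_words: int):
        self.min_words = min_words
        self.max_words = max_words

    def select(self, items: Iterable[Instruction]) -> List[Instruction]:
        return [
            item for item in items
            if self.min_words <= len(item.instruction.split()) <= self.max_words
        ]


def echo_engine(prompt: str) -> str:
    """Placeholder Engine: returns a canned string instead of calling a model."""
    return f"[model output for: {prompt}]"


def run_pipeline(seeds, generator, selectors, engine: Callable[[str], str]):
    """Generate, filter through each selector in order, then execute prompts."""
    items = generator.generate(seeds)
    for selector in selectors:
        items = selector.select(items)
    for item in items:
        item.output = engine(item.instruction)
    return items


if __name__ == "__main__":
    generator = TemplateGenerator([
        "Explain {topic}.",
        "Write a short, clear summary of {topic} for a beginner.",
    ])
    for result in run_pipeline(["recursion"], generator, [LengthSelector(4, 20)], echo_engine):
        print(result.instruction, "->", result.output)
```

Keeping each stage behind a small interface (generate, select, and a callable engine) is what lets one technique be swapped for another without touching the rest of the pipeline.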

Quick Start & Requirements

  • Installation: pip install git+https://github.com/zjunlp/EasyInstruct@main (latest) or pip install easyinstruct (PyPI, not latest).
  • Prerequisites: An OpenAI API key is required for most generation methods. Supported models include GPT-3.5, GPT-4, Claude, and models from Cohere.
  • Demo: A Gradio app is available at demo/app.py or via HuggingFace Spaces.
  • Documentation: https://zjunlp.gitbook.io/easyinstruct/documentations

Highlighted Details

  • Supports multiple instruction generation methods: Self-Instruct, Evol-Instruct, Backtranslation, KG2Instruct.
  • Offers a comprehensive suite of instruction selection metrics, including ROUGE, GPT score, and CIRS for code.
  • Provides modular components for custom generator/selector development.
  • Includes a Gradio demo for quick experimentation and a shell script for batch processing.
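
To make the ROUGE-based selection concrete, the sketch below filters near-duplicate instructions with a word-level ROUGE-L F1 score. It is a self-contained illustration, not EasyInstruct's implementation: the function names, the whitespace tokenization, and the 0.7 default threshold are all assumptions chosen for this example.

```python
from typing import List


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """ROUGE-L F1 over lowercased, whitespace-split tokens (assumed tokenization)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def filter_near_duplicates(instructions: List[str], threshold: float = 0.7) -> List[str]:
    """Keep an instruction only if it scores below `threshold` against all kept ones.

    The 0.7 default is an assumption for this sketch, not a library setting.
    """
    kept: List[str] = []
    for text in instructions:
        if all(rouge_l_f1(text, existing) < threshold for existing in kept):
            kept.append(text)
    return kept


if __name__ == "__main__":
    pool = [
        "Translate the sentence into French.",
        "Translate the sentence into German.",
        "Summarize the article.",
    ]
    print(filter_near_duplicates(pool))
```

The filter is greedy and order-dependent: the first instruction in a cluster of near-duplicates is the one that survives, so sorting the pool by a quality score beforehand changes which items are kept.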

Maintenance & Community

The project welcomes Pull Requests and is a subproject of KnowLM. Recent activity is low: the last commit was 8 months ago, with no pull requests or issues in the past 30 days.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: Permissive license allows for commercial use and integration with closed-source projects.

Limitations & Caveats

The PyPI release lags behind the main branch on GitHub. Although the framework supports various LLMs, many generation methods rely on API access (e.g., OpenAI), which requires API keys and incurs usage costs.

Health Check

  • Last Commit: 8 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 0 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Junyang Lin (Core Maintainer at Alibaba Qwen), and 3 more.

Alpaca-CoT by PhoebusSi

0.1%
3k
IFT platform for instruction collection, parameter-efficient methods, and LLMs
Created 2 years ago
Updated 1 year ago
Starred by Vincent Weisser (Cofounder of Prime Intellect), Ross Taylor (Cofounder of General Reasoning; Cocreator of Papers with Code), and 11 more.

open-instruct by allenai

0.7%
3k
Training codebase for instruction-following language models
Created 2 years ago
Updated 17 hours ago
Starred by Vincent Weisser (Cofounder of Prime Intellect), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 18 more.

WizardLM by nlpxucan

0.0%
9k
LLMs built using Evol-Instruct for complex instruction following
Created 2 years ago
Updated 3 months ago