Toolkit-for-Prompt-Compression by 3DAgentWorld

Prompt compression toolkit for LLM inference efficiency

Created 1 year ago
271 stars

Top 95.0% on SourcePulse

View on GitHub
Project Summary

PCToolkit is a unified, plug-and-play toolkit for prompt compression in Large Language Models (LLMs). It offers researchers and developers a standardized framework to experiment with, evaluate, and integrate various state-of-the-art prompt compression techniques, aiming to improve inference efficiency and reduce computational costs. The toolkit supports multiple compression methods, diverse datasets, and evaluation metrics, facilitating reproducible research and practical application.

How It Works

PCToolkit employs a modular design that separates functionality into four components: Compressor, Dataset, Metric, and Runner. This architecture makes it straightforward to integrate new compression algorithms, datasets, and evaluation metrics. The toolkit provides a unified interface to five distinct compressors: Selective Context, LLMLingua, LongLLMLingua, SCRL, and Keep it Simple. It supports evaluation across various NLP tasks, including reconstruction, summarization, and question answering, using 11 datasets and more than five evaluation metrics.
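The plug-and-play idea behind such a design can be sketched as a common `Compressor` interface plus a registry that maps method names to implementations. This is a minimal illustration, not PCToolkit's actual API: all class and function names here are invented, and the toy stop-word filter only stands in for real methods such as Selective Context or LLMLingua.

```python
from abc import ABC, abstractmethod


class Compressor(ABC):
    """Common interface that every compression method implements."""

    @abstractmethod
    def compress(self, prompt: str, ratio: float) -> str:
        """Return a shortened prompt, keeping roughly `ratio` of the tokens."""


class StopwordCompressor(Compressor):
    """Toy method: drop common function words, then truncate to the target size."""

    STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "that", "it"}

    def compress(self, prompt: str, ratio: float) -> str:
        tokens = prompt.split()
        target = max(1, int(len(tokens) * ratio))
        kept = [t for t in tokens if t.lower().strip(".,") not in self.STOPWORDS]
        return " ".join(kept[:target] if len(kept) > target else kept)


# Registry that makes methods swappable behind one interface.
REGISTRY: dict[str, type[Compressor]] = {"stopword": StopwordCompressor}


def get_compressor(name: str) -> Compressor:
    """Look up a compression method by name."""
    return REGISTRY[name]()


if __name__ == "__main__":
    c = get_compressor("stopword")
    print(c.compress("Summarize the main findings of the attached report in a paragraph.", ratio=0.5))
```

A new method plugs in by subclassing `Compressor` and registering itself, which is the kind of extensibility the modular design described above is aiming for.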

Quick Start & Requirements

Highlighted Details

  • Supports 5 distinct prompt compression methods (Selective Context, LLMLingua, LongLLMLingua, SCRL, Keep it Simple).
  • Integrates 11 datasets and 5+ evaluation metrics for comprehensive benchmarking.
  • Modular design allows for easy addition of new compressors, datasets, and metrics.
  • Evaluated across a wide range of NLP tasks including reconstruction, summarization, mathematical problem-solving, and code completion.

Maintenance & Community

The project is associated with research by Li et al. and Jiang et al.; further details can be found in the linked paper and technical report.

Licensing & Compatibility

The repository is licensed under the MIT license. This license generally permits commercial use and integration into closed-source projects.

Limitations & Caveats

Model weights need to be downloaded manually, and API keys for services like OpenAI must be configured. The README indicates that modifications to metrics might be necessary, especially for the LongBench dataset.

Health Check

  • Last commit: 7 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 4 stars in the last 30 days

Explore Similar Projects

Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake), and 3 more.

prompt-lookup-decoding by apoorvumang

0.2% · 566 stars
Decoding method for faster LLM generation
Created 1 year ago · Updated 1 year ago
Starred by Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 8 more.

EAGLE by SafeAILab

10.6% · 2k stars
Speculative decoding research paper for faster LLM inference
Created 1 year ago · Updated 1 week ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Pawel Garbacki (Cofounder of Fireworks AI), and 4 more.

LongLoRA by dvlab-research

0.1% · 3k stars
LongLoRA: Efficient fine-tuning for long-context LLMs
Created 2 years ago · Updated 1 year ago