Selective_Context  by liyucheng09

Context compressor for LLM inference efficiency (EMNLP 2023)

created 2 years ago
390 stars

Project Summary

Selective Context works around LLM context window limits by compressing the input text before it reaches the model. According to the project, this lets a model process about twice as much content and cuts memory use and GPU time by about 40%, with performance reported as comparable to using the full context. It is aimed at researchers and practitioners who work with long documents or extended conversations.

How It Works

The core approach scores the informativeness of lexical units (sentences, phrases, or tokens) within a given context. A base language model computes a self-information score for each unit, that is, the negative log-probability the model assigns to it, so predictable units score low and surprising ones score high. The highest-scoring units are retained and the rest are discarded, which makes better use of a fixed context length.
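
The scoring-and-filtering idea above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: a toy unigram frequency model with add-one smoothing stands in for the base language model, and the names self_information and filter_units, as well as the sample corpus, are hypothetical.

```python
import math
from collections import Counter


def self_information(units, corpus_tokens):
    """Score each unit as -log2 P(unit) under a toy unigram model.

    Assumption: a real setup would take probabilities from a causal
    language model; add-one smoothed counts are used here only so the
    sketch runs without any model download.
    """
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen units
    return [-math.log2((counts[u] + 1) / (total + vocab)) for u in units]


def filter_units(units, scores, reduce_ratio=0.5):
    """Drop the reduce_ratio fraction of units with the lowest scores.

    The surviving units keep their original order.
    """
    n_drop = int(len(units) * reduce_ratio)
    by_score = sorted(range(len(units)), key=lambda i: scores[i])
    dropped = set(by_score[:n_drop])
    return [u for i, u in enumerate(units) if i not in dropped]


# Hypothetical sample data for illustration only.
corpus = "the cat sat on the mat and the dog sat by the door".split()
units = "the cat sat on the mat".split()

scores = self_information(units, corpus)
kept = filter_units(units, scores, reduce_ratio=0.5)
print(kept)
```

Frequent, predictable units such as "the" receive low scores and are dropped first, while rarer units survive. Filtering by rank rather than by a fixed score threshold keeps the compression ratio predictable regardless of the text.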

Quick Start & Requirements

  • Install via pip: pip install selective-context
  • Download spaCy models: python -m spacy download en_core_web_sm (and zh_core_web_sm for Chinese).
  • Official demo available as a Hugging Face Space.
  • Streamlit app included: streamlit run app/app.py
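
Once the package and spaCy model are installed, usage from Python might look like the sketch below. It assumes the SelectiveContext class, its model_type and lang arguments, and the reduce_ratio call parameter behave as shown in the project's README; verify these against the installed version. The wrapper name compress is hypothetical.

```python
def compress(text: str, reduce_ratio: float = 0.5):
    """Compress `text` with Selective Context.

    Assumption: the class name, constructor arguments, and the
    two-value return follow the project's README and may differ
    between versions.
    """
    # Imported lazily so this module loads even before installation.
    from selective_context import SelectiveContext

    # Loads GPT-2 as the base model, so the first call may download weights.
    sc = SelectiveContext(model_type="gpt2", lang="en")
    context, reduced_content = sc(text, reduce_ratio=reduce_ratio)
    return context, reduced_content
```

The first returned value is the compressed context to pass to the downstream model; the second lists what was filtered out, which is useful for checking that nothing essential was removed.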

Highlighted Details

  • Enables processing of 2x more content.
  • Claims 40% reduction in memory and GPU time.
  • Evaluated on summarization, QA, context reconstruction, and conversation tasks.
  • Accepted for EMNLP 2023.
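
The "2x more content" figure follows from simple arithmetic if roughly half of the input is filtered out. The sketch below makes that explicit; the function name effective_capacity is hypothetical, and it assumes the reduction ratio is measured in tokens, which is a simplification.

```python
def effective_capacity(window_tokens: int, reduce_ratio: float) -> int:
    """Original tokens that fit in a window after compression.

    Assumption: compression removes exactly `reduce_ratio` of the
    tokens, so each original token costs (1 - reduce_ratio) of a slot.
    """
    if not 0 <= reduce_ratio < 1:
        raise ValueError("reduce_ratio must be in [0, 1)")
    return int(window_tokens / (1 - reduce_ratio))


# With half the content removed, a 4096-token window holds
# about 8192 tokens of original text.
print(effective_capacity(4096, 0.5))
```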

Maintenance & Community

  • Project accepted to EMNLP 2023.
  • The repository links to its datasets on the Hugging Face Hub and to the earlier arXiv version of the paper.

Licensing & Compatibility

  • Licensed under the MIT License.
  • Permissive license suitable for commercial use and closed-source integration.

Limitations & Caveats

The repository focuses on specific models and languages (e.g., GPT-2, English/Chinese) for its evaluations, and reproduction of paper experiments requires downloading custom datasets.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 17 stars in the last 90 days
