jia-gao
Reduce LLM token costs with intelligent prompt compression
Top 87.0% on SourcePulse
Summary
Leanctx is a Python SDK that aims to cut LLM input token costs by 40-60% in production applications by compressing prompts without requiring changes to application code. It targets developers building RAG systems, conversational agents, and document processing pipelines, where large contexts lead to high token bills. It compresses dynamic content while preserving critical elements such as code and tool calls, and it reports both cost savings and improved accuracy on long-context benchmarks. Compression runs locally by default, which keeps prompt content on the user's machine.
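To make the 40-60% figure concrete, the arithmetic can be sketched as below. The workload size and the price of $3 per million input tokens are made-up illustration values, not figures from the project, and `monthly_input_cost` is a hypothetical helper rather than part of the SDK.

```python
def monthly_input_cost(tokens_per_request: int,
                       requests_per_month: int,
                       usd_per_million_tokens: float,
                       reduction: float = 0.0) -> float:
    """Input-token spend per month after applying a fractional reduction."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    effective_tokens = tokens_per_request * (1.0 - reduction)
    return effective_tokens * requests_per_month * usd_per_million_tokens / 1_000_000


# Hypothetical workload: 8,000 input tokens per request, 100,000 requests per month.
baseline = monthly_input_cost(8_000, 100_000, 3.0)
low_end = monthly_input_cost(8_000, 100_000, 3.0, reduction=0.40)
high_end = monthly_input_cost(8_000, 100_000, 3.0, reduction=0.60)
print(f"baseline ${baseline:,.0f}, compressed ${high_end:,.0f} to ${low_end:,.0f}")
```

Under these assumed numbers the input bill drops from about $2,400 to somewhere between about $960 and $1,440 per month. Output tokens are unaffected, so the total saving on a real bill would be smaller.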
How It Works
Leanctx acts as a drop-in wrapper around existing LLM SDKs (OpenAI, Anthropic, Gemini). It intercepts each request and applies a configurable compression pipeline before sending it to the LLM provider. The pipeline includes middleware for mode (on/off) and triggers (e.g., a minimum token threshold). A content classifier then routes message parts (code, errors, prose, etc.) to specific compressors: verbatim preservation for critical data, the local LLMLingua-2 model (Lingua) for general prose, or a configured LLM (SelfLLM) for higher-quality summarization. This keeps essential information intact while removing redundant tokens, and it lets users trade off compression ratio, cost, and quality.
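The flow described above (mode switch, token trigger, classifier, per-type compressor) can be sketched roughly as follows. This is a toy illustration and not leanctx's implementation or API: every name here (`Part`, `classify`, `drop_filler`, `compress`) is hypothetical, the classifier is a pair of crude heuristics, and the filler-word filter only stands in for a learned compressor such as LLMLingua-2.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Part:
    """One piece of a message (hypothetical type, not leanctx's)."""
    text: str


def estimate_tokens(text: str) -> int:
    # Crude whitespace count, standing in for a real tokenizer.
    return len(text.split())


def classify(part: Part) -> str:
    """Label a part as 'error', 'code', or 'prose' using simple heuristics."""
    if "Traceback (most recent call last)" in part.text:
        return "error"
    if re.search(r"^\s*(def |class |import |from \w+ import )", part.text, re.M):
        return "code"
    return "prose"


def keep_verbatim(text: str) -> str:
    # Critical content passes through untouched.
    return text


FILLER = {"very", "really", "basically", "actually", "just", "quite"}


def drop_filler(text: str) -> str:
    # Toy prose compressor: removes a fixed set of filler words.
    return " ".join(w for w in text.split() if w.lower().strip(".,") not in FILLER)


COMPRESSORS: Dict[str, Callable[[str], str]] = {
    "code": keep_verbatim,
    "error": keep_verbatim,
    "prose": drop_filler,
}


def compress(parts: List[Part], enabled: bool = True, min_tokens: int = 20) -> List[Part]:
    """Apply mode and trigger checks, then route each part to its compressor."""
    total = sum(estimate_tokens(p.text) for p in parts)
    if not enabled or total < min_tokens:
        return list(parts)  # mode off or below the trigger: pass through
    return [Part(COMPRESSORS[classify(p)](p.text)) for p in parts]


parts = [
    Part("def add(a, b):\n    return a + b"),
    Part("This is really just a very long explanation that basically repeats itself. " * 3),
]
compressed = compress(parts, min_tokens=5)
```

In this sketch the code part comes back unchanged while the prose part loses its filler words; a request below `min_tokens`, or one sent with `enabled=False`, is returned as-is.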
Quick Start & Requirements
Install core functionality and provider SDKs with:
pip install 'leanctx[openai,anthropic,gemini]'
To enable local LLMLingua-2 compression, add:
pip install 'leanctx[lingua]'
This requires a ~1.2 GB download of model weights to ~/.cache/huggingface/ on first use.
The project includes a CLI for benchmarking (leanctx bench run).
Highlighted Details
Maintenance & Community
The project is actively maintained, with version 0.3.1 released on April 26, 2026. The roadmap outlines planned features including full LongBench v2 sweep, Docker Hub publishing, multimodal compression, and TypeScript SDK porting. Community links (Discord/Slack) are not explicitly mentioned in the README.
Licensing & Compatibility
Leanctx is released under the MIT License, permitting commercial use and integration into closed-source applications.
Limitations & Caveats
As of v0.3.1, Gemini multimodal requests and function calls automatically fall back to passthrough mode ("opaque-bailout") because compression is not yet supported for these request types. Support for them is targeted for a later v0.3.x release.
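The passthrough fallback can be pictured as a guard that runs before compression. The sketch below is an assumption about the general pattern, not leanctx's code: the `kind` field, the `UNSUPPORTED_KINDS` set, and both function names are hypothetical and do not mirror Gemini's request schema.

```python
from typing import Any, Callable, Dict, List

# Hypothetical part kinds that the compressor cannot handle yet.
UNSUPPORTED_KINDS = {"image", "audio", "function_call", "function_response"}


def needs_passthrough(parts: List[Dict[str, Any]]) -> bool:
    """True if any part is of a kind that compression does not support."""
    return any(p.get("kind") in UNSUPPORTED_KINDS for p in parts)


def prepare_request(parts: List[Dict[str, Any]],
                    compress_text: Callable[[str], str]) -> List[Dict[str, Any]]:
    """Compress text-only requests; forward anything else unchanged."""
    if needs_passthrough(parts):
        return parts  # bail out: the request is sent exactly as received
    return [{**p, "text": compress_text(p["text"])} for p in parts]


def shorten(text: str) -> str:
    # Stand-in compressor for the example: keep the first five characters.
    return text[:5]


text_only = [{"kind": "text", "text": "hello world"}]
mixed = text_only + [{"kind": "function_call", "name": "lookup"}]
```

Here `prepare_request(text_only, shorten)` returns a compressed copy, while `prepare_request(mixed, shorten)` returns the original list untouched. Bailing out on the whole request, rather than compressing only its text parts, is the conservative choice assumed in this sketch.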