claw-compactor by aeromomo

Slash AI agent token costs with advanced compression

Created 2 weeks ago

1,040 stars

Top 35.9% on SourcePulse

Project Summary

Claw Compactor is a command-line tool that compresses AI agent workspace files to cut the significant token spend those agents incur. Aimed at AI agent developers and power users, it can roughly halve token usage through a suite of deterministic, rule-based compression techniques, so no costly LLM processing is needed during compression.

How It Works

The tool runs a five-layer compression pipeline:

  • Rule engine: deduplication and filler removal.
  • Dictionary encoding: an auto-learned codebook substitutes short codes for frequent phrases.
  • Observation compression: converts session JSONL into structured summaries.
  • Run-length encoding (RLE): collapses repeated shorthand patterns.
  • Compressed Context Protocol: systematic abbreviation.

Because every layer is deterministic, compression itself adds no LLM inference cost. Some layers are lossless roundtrips; the lossy ones preserve all facts and decisions while stripping only verbose formatting.
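
The first two layers can be sketched in a few lines. Note this is an illustrative sketch only: the function names, filler list, and thresholds below are assumptions, not the actual mem_compress.py implementation.

```python
import re
from collections import Counter

# Assumed filler words for the rule-engine layer; the real rule set may differ.
FILLERS = re.compile(r"\b(?:basically|actually|really|very)\b ?", re.IGNORECASE)

def rule_engine(text: str) -> str:
    """Layer 1: drop exact duplicate lines and strip common filler words."""
    seen, kept = set(), []
    for line in text.splitlines():
        if line.strip() and line in seen:
            continue  # duplicate non-blank line: keep only the first occurrence
        seen.add(line)
        kept.append(line)
    return FILLERS.sub("", "\n".join(kept))

def learn_codebook(text: str, min_len: int = 10, min_count: int = 3) -> dict:
    """Layer 2: auto-learn a codebook mapping frequent long words to short codes."""
    counts = Counter(w for w in text.split() if len(w) >= min_len)
    frequent = [w for w, c in counts.most_common() if c >= min_count]
    return {w: f"§{i}" for i, w in enumerate(frequent)}

def dict_encode(text: str) -> tuple[str, dict]:
    """Apply the codebook; keeping the codebook makes this layer a lossless roundtrip."""
    book = learn_codebook(text)
    for word, code in book.items():
        text = text.replace(word, code)
    return text, book
```

Decompression simply replaces each code with its original word, which is why the codebook file must survive alongside the compressed workspace (see the limitations below regarding memory/.codebook.json).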

Quick Start & Requirements

  • Install: Clone the repository (git clone https://github.com/aeromomo/claw-compactor.git) and navigate into the directory (cd claw-compactor).
  • Prerequisites: Python 3.9+. Installing tiktoken (pip install tiktoken) is optional but recommended for exact token counts; without it, a CJK-aware heuristic is used.
  • Usage:
    • Benchmark (dry-run): python3 scripts/mem_compress.py /path/to/workspace benchmark
    • Compress: python3 scripts/mem_compress.py /path/to/workspace full
  • Links: GitHub Repository
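
The optional-tiktoken setup mentioned above follows a common pattern, sketched here. The exact fallback heuristic and encoding name are assumptions; the project's own heuristic (stated to be roughly 90% accurate) may differ.

```python
import re

try:
    import tiktoken
    _ENC = tiktoken.get_encoding("cl100k_base")
except ImportError:  # tiktoken not installed: fall back to a heuristic
    _ENC = None

# Hiragana/katakana, CJK unified ideographs, and hangul syllables.
_CJK = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff\uac00-\ud7af]")

def count_tokens(text: str) -> int:
    """Exact count via tiktoken when available; CJK-aware estimate otherwise."""
    if _ENC is not None:
        return len(_ENC.encode(text))
    if not text:
        return 0
    cjk = len(_CJK.findall(text))   # CJK characters ≈ one token each
    rest = len(text) - cjk          # other text ≈ four characters per token
    return cjk + (rest + 3) // 4
```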

Highlighted Details

  • Achieves ~97% savings on session transcripts via observation extraction.
  • Offers 50-70% savings on verbose/new workspaces and 10-20% on regular maintenance runs.
  • Fully CJK-aware, supporting Chinese, Japanese, and Korean characters.
  • Complements prompt caching for up to 95% effective cost reduction (50% compression + 90% cache discount).
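
The stacked-savings figure in the last bullet is simple arithmetic, assuming the 90% cache discount applies to all of the remaining (compressed) input tokens:

```python
# Combined effect of 50% compression plus a 90% prompt-cache discount.
compression_savings = 0.50   # fraction of tokens removed by compression
cache_discount = 0.90        # provider discount on cached input tokens

# You pay full price only for the uncompressed remainder, then the cache
# discount applies on top of that.
cost_fraction = (1 - compression_savings) * (1 - cache_discount)  # 0.05
effective_reduction = 1 - cost_fraction                           # 0.95

print(f"{effective_reduction:.0%}")  # → 95%
```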

Maintenance & Community

The provided README does not detail specific contributors, sponsorships, partnerships, or community channels (e.g., Discord, Slack).

Licensing & Compatibility

The project is released under the MIT License, generally permitting commercial use and integration into closed-source projects without significant restrictions.

Limitations & Caveats

Observation compression and the Compressed Context Protocol are lossy but designed to preserve all facts and decisions. If tiktoken is not installed, token counts rely on a heuristic that is approximately 90% accurate. Dictionary decompression requires the presence of the memory/.codebook.json file. Ensure the workspace path is correct to avoid FileNotFoundError.
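
The codebook caveat can be made concrete with a small guard. The memory/.codebook.json path comes from the README, but load_codebook itself is a hypothetical helper for illustration, not part of the tool:

```python
import json
from pathlib import Path

def load_codebook(workspace: str) -> dict:
    """Load the codebook required to reverse dictionary encoding."""
    path = Path(workspace) / "memory" / ".codebook.json"
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} is missing; dictionary-encoded files cannot be restored "
            "without the codebook produced during compression."
        )
    return json.loads(path.read_text(encoding="utf-8"))
```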

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 1
Star History

1,051 stars in the last 16 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), Philipp Schmid (DevRel at Google DeepMind), and 1 more.

text-splitter by benbrandt

565 stars · 0.2%
Rust crate for splitting text into semantic chunks
Created 2 years ago · Updated 20 hours ago
Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), Jeff Hammerbacher (cofounder of Cloudera), and 7 more.

LLMLingua by microsoft

6k stars · 0.1%
Prompt compression for accelerated LLM inference
Created 2 years ago · Updated 4 months ago