claude-token-efficient by drona23

LLM output token efficiency via context rules

Created 3 days ago


1,809 stars

Top 23.4% on SourcePulse

View on GitHub
Project Summary

This project provides a single-file solution (CLAUDE.md) that reduces Claude's output token usage by approximately 63% by targeting verbosity, sycophancy, and formatting noise. It is designed for automation pipelines, agent loops, and code-generation tasks where consistent, parseable, concise output is critical, offering cost savings and improved response quality without any code modifications.

How It Works

The core approach involves placing a CLAUDE.md file in the project's root directory. Claude automatically reads this file, applying its contained rules to modify its output behavior. This mechanism targets common LLM response patterns like unnecessary pleasantries, restating prompts, and overly verbose code, enforcing conciseness and directness. The advantage lies in its "drop-in" nature, requiring zero code changes and immediately impacting Claude's responses.
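For illustration, a minimal CLAUDE.md along these lines can be dropped into the project root; the rule text below is a hypothetical sketch, not the repository's actual file:

```shell
# Write an illustrative CLAUDE.md into the project root (rules are made up here).
cat > CLAUDE.md <<'EOF'
# Output rules
- No greetings, praise, or closing offers of further help.
- Do not restate the prompt before answering.
- Use plain ASCII punctuation; no em dashes or smart quotes.
- If a fact is uncertain, answer "I don't know" rather than guessing.
EOF
```

Claude picks the file up automatically; no flags or code changes are needed.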

Quick Start & Requirements

  • Primary install: download the file with curl -o CLAUDE.md https://raw.githubusercontent.com/drona23/claude-token-efficient/main/CLAUDE.md, clone the repository and copy a profile, or paste the file contents into your project root.
  • Prerequisites: primarily tested on Claude models; untested on other models and local runtimes such as llama.cpp or Mistral.
  • Overhead: The CLAUDE.md file itself consumes input tokens on every message. Net savings are only realized when output volume is sufficiently high to offset this persistent cost.
  • Links: Full benchmark results are available in BENCHMARK.md.
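The overhead trade-off above can be sketched numerically. All four numbers below are assumptions for illustration (check current Anthropic pricing), not figures from the repository:

```shell
# Break-even sketch: net saving per prompt = value of output tokens saved minus
# the input-token cost of shipping CLAUDE.md with every message.
OVERHEAD_TOKENS=400   # input tokens CLAUDE.md adds per message (assumed)
SAVED_TOKENS=300      # output tokens saved per response (assumed)
IN_PRICE=3            # dollars per million input tokens (assumed)
OUT_PRICE=15          # dollars per million output tokens (assumed)
net=$(awk -v s="$SAVED_TOKENS" -v o="$OVERHEAD_TOKENS" -v ip="$IN_PRICE" -v op="$OUT_PRICE" \
  'BEGIN { printf "%.6f", (s * op - o * ip) / 1000000 }')
echo "net saving per prompt: \$$net"
```

With these assumptions the saving is a fraction of a cent per prompt, which is why the project frames savings in terms of high-volume use.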

Highlighted Details

  • Achieves an average output token reduction of ~63% across various prompts, including code review and explanations.
  • Fixes specific issues: bans sycophantic openers/closers, prompt restating, em dashes/smart quotes, "As an AI..." framing, unsolicited suggestions, and over-engineered code, and enforces "I don't know" for uncertain facts.
  • Scalability estimates suggest potential monthly savings of ~$0.86 for 100 prompts/day and ~$8.64 for 1,000 prompts/day (using Sonnet pricing).
  • Supports composability via global, project-level, and subdirectory-level CLAUDE.md files for layered rule management.
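The layered setup in the last bullet can be sketched as a directory layout; the paths and rule text here are illustrative, not from the repository:

```shell
# Layered CLAUDE.md files: a global copy (typically ~/.claude/CLAUDE.md) applies
# everywhere, the project-root copy applies repo-wide, and a subdirectory copy
# adds rules for code in that subtree. Paths and rule text are made up.
mkdir -p demo/packages/api
printf -- '- Project-wide output rules\n' > demo/CLAUDE.md
printf -- '- Extra rules for the api package\n' > demo/packages/api/CLAUDE.md
find demo -name CLAUDE.md
```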

Maintenance & Community

This project actively incorporates community feedback, with specific GitHub issues cited as direct inspirations for fixes. Contributions via PRs and issues are welcomed, and community submissions are integrated into future versions.

Licensing & Compatibility

The project is licensed under the MIT license, permitting free use, modification, and distribution. No specific restrictions for commercial use or closed-source linking are mentioned, aligning with the permissive nature of the MIT license.

Limitations & Caveats

This solution is not cost-effective for single, short queries or casual, low-volume use, as the input token overhead can result in a net token increase. It does not address deep failure modes like hallucinated implementations or architectural drift, which require more robust API-level enforcement. Its effectiveness is primarily validated on Claude models, with performance on other LLMs being untested.

Health Check

  • Last Commit: 14 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 3
  • Issues (30d): 3
  • Star History: 1,964 stars in the last 3 days

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla and OpenAI; author of CS 231n), Travis Fischer (Founder of Agentic), and 6 more.
