C2C by thu-nics

Direct LLM communication via KV-Cache projection

Created 1 month ago
274 stars

Top 94.4% on SourcePulse

View on GitHub
Project Summary

Cache-to-Cache (C2C) enables direct semantic communication between Large Language Models (LLMs) by exchanging KV-caches instead of generated text. The approach targets researchers and engineers looking to improve LLM collaboration, reporting higher accuracy at lower latency than text-based exchange.

How It Works

C2C projects and fuses KV-caches between different LLMs, creating a shared semantic space in which models exchange information directly at the neural-representation level, much as the Rosetta Stone bridges languages. This bypasses the generation overhead and potential information loss of text-based inter-model dialogue; the accuracy and latency figures below quantify the gains.
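The README stops at this high-level description, so the following is a minimal PyTorch sketch of the idea rather than the repository's actual API: C2CProjector, its gating scheme, and the tensor shapes ([batch, heads, seq_len, head_dim], with aligned sequence lengths) are all assumptions for illustration.

```python
import torch
import torch.nn as nn

class C2CProjector(nn.Module):
    """Hypothetical sketch: map a sharer model's KV-cache entries into a
    receiver model's KV space, then gate-fuse with the receiver's own cache."""

    def __init__(self, src_heads, src_dim, tgt_heads, tgt_dim):
        super().__init__()
        self.proj_k = nn.Linear(src_heads * src_dim, tgt_heads * tgt_dim)
        self.proj_v = nn.Linear(src_heads * src_dim, tgt_heads * tgt_dim)
        self.gate = nn.Parameter(torch.zeros(1))  # fusion weight, g = sigmoid(gate)
        self.tgt_heads, self.tgt_dim = tgt_heads, tgt_dim

    def forward(self, src_k, src_v, tgt_k, tgt_v):
        # src_*: [batch, src_heads, seq, src_dim]; tgt_*: [batch, tgt_heads, seq, tgt_dim]
        b, _, s, _ = src_k.shape
        # Flatten heads and project into the receiver's KV width.
        k = self.proj_k(src_k.transpose(1, 2).reshape(b, s, -1))
        v = self.proj_v(src_v.transpose(1, 2).reshape(b, s, -1))
        k = k.view(b, s, self.tgt_heads, self.tgt_dim).transpose(1, 2)
        v = v.view(b, s, self.tgt_heads, self.tgt_dim).transpose(1, 2)
        # Blend the projected sharer cache with the receiver's own cache.
        g = torch.sigmoid(self.gate)
        return g * k + (1 - g) * tgt_k, g * v + (1 - g) * tgt_v
```

The learned gate lets training decide how much of the sharer's projected cache to blend into the receiver's, rather than hard-coding the mix.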

Quick Start & Requirements

Installation involves creating a Python 3.10 environment (conda create -n rosetta python=3.10, then conda activate rosetta), cloning the repository, and installing the package in editable mode (pip install -e .). Optional training and evaluation dependencies are available via pip install -e ".[training,evaluation]". The project builds on PyTorch and Hugging Face Transformers, and GPU acceleration is necessary for practical use.
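Because GPU acceleration is required, a quick environment sanity check can save a failed run; this snippet only assumes the dependencies named above are installed.

```python
import sys
import torch
import transformers

# The README recommends Python 3.10 for the "rosetta" environment.
assert sys.version_info[:2] == (3, 10), f"expected Python 3.10, got {sys.version}"
print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("transformers:", transformers.__version__)
```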

Highlighted Details

  • Achieves 8.5–10.5% higher accuracy than the individual LLMs working alone.
  • Outperforms text-based communication by 3.0–5.0% in accuracy.
  • Roughly halves inference latency (2.0× speedup) relative to text-based exchange.
  • Supports arbitrary LLM pairs, automatically adapting to differing architectures (hidden dimensions, layer counts, head counts, tokenizers); see the sketch after this list.
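Bridging mismatched depths needs a layer alignment in addition to the width projection. The proportional-depth mapping below is an illustrative assumption (the paper may align layers differently), and it reuses the hypothetical C2CProjector from the earlier sketch.

```python
# Illustrative sketch: pairing layers of two differently shaped models.
# Configs are hypothetical; C2CProjector is the class sketched above.
sharer_cfg = {"layers": 28, "heads": 16, "head_dim": 128}
receiver_cfg = {"layers": 36, "heads": 8, "head_dim": 64}

def aligned_sharer_layer(recv_layer: int) -> int:
    """Map a receiver layer index to the sharer layer at the same relative depth."""
    frac = recv_layer / max(receiver_cfg["layers"] - 1, 1)
    return round(frac * (sharer_cfg["layers"] - 1))

# One projector per receiver layer handles the head-count/width mismatch.
projectors = [
    C2CProjector(sharer_cfg["heads"], sharer_cfg["head_dim"],
                 receiver_cfg["heads"], receiver_cfg["head_dim"])
    for _ in range(receiver_cfg["layers"])
]

print(aligned_sharer_layer(0), aligned_sharer_layer(35))  # 0, 27
```

Tokenizer mismatches are a separate alignment problem not covered by this sketch.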

Maintenance & Community

The project is associated with authors Tianyu Fu, Zihan Min, et al. Community interest is evident from the star growth, but specific channels such as Discord or Slack, and a public roadmap, are not mentioned in the README.

Licensing & Compatibility

The README does not state an open-source license. That omission needs clarification before adoption, especially for commercial use or integration into closed-source projects. The framework itself is designed for broad compatibility with diverse LLM architectures.

Limitations & Caveats

The README does not document known limitations, bugs, or deprecation plans. The unspecified license is a significant adoption blocker, and with the accompanying research paper dated 2025, the codebase is likely still early-stage.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 1

Star History

191 stars in the last 30 days

Explore Similar Projects

Starred by Taranjeet Singh (Cofounder of Mem0), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 4 more.

LMCache by LMCache

Top 1.1% on SourcePulse · 6k stars
LLM serving engine extension for reduced TTFT and increased throughput
Created 1 year ago · Updated 23 hours ago