C2C by thu-nics

Direct LLM communication via KV-Cache projection

Created 1 month ago
274 stars

Top 94.4% on SourcePulse

View on GitHub
Project Summary

Cache-to-Cache (C2C) enables direct semantic communication between Large Language Models (LLMs) by exchanging KV-caches instead of generated text. The approach targets researchers and engineers looking to improve LLM collaboration, reporting higher accuracy at lower latency than text-based exchange.

How It Works

C2C projects and fuses KV-caches between different LLMs, creating a shared semantic space in which models exchange information directly at the neural-representation level, much as the Rosetta Stone bridges languages. This bypasses the generation overhead and potential information loss of text-based inter-model dialogue; the accuracy and latency figures below quantify the gains.
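The README stops at this high-level description, so the following is a minimal PyTorch sketch of the idea rather than the repository's actual API: C2CProjector, its gating scheme, and the tensor shapes ([batch, heads, seq_len, head_dim], with aligned sequence lengths) are all assumptions for illustration.

```python
import torch
import torch.nn as nn

class C2CProjector(nn.Module):
    """Hypothetical sketch: map a sharer model's KV-cache entries into a
    receiver model's KV space, then gate-fuse with the receiver's own cache."""

    def __init__(self, src_heads, src_dim, tgt_heads, tgt_dim):
        super().__init__()
        self.proj_k = nn.Linear(src_heads * src_dim, tgt_heads * tgt_dim)
        self.proj_v = nn.Linear(src_heads * src_dim, tgt_heads * tgt_dim)
        self.gate = nn.Parameter(torch.zeros(1))  # fusion weight, g = sigmoid(gate)
        self.tgt_heads, self.tgt_dim = tgt_heads, tgt_dim

    def forward(self, src_k, src_v, tgt_k, tgt_v):
        # src_*: [batch, src_heads, seq, src_dim]; tgt_*: [batch, tgt_heads, seq, tgt_dim]
        b, _, s, _ = src_k.shape
        # Flatten heads and project into the receiver's KV width.
        k = self.proj_k(src_k.transpose(1, 2).reshape(b, s, -1))
        v = self.proj_v(src_v.transpose(1, 2).reshape(b, s, -1))
        k = k.view(b, s, self.tgt_heads, self.tgt_dim).transpose(1, 2)
        v = v.view(b, s, self.tgt_heads, self.tgt_dim).transpose(1, 2)
        # Blend the projected sharer cache with the receiver's own cache.
        g = torch.sigmoid(self.gate)
        return g * k + (1 - g) * tgt_k, g * v + (1 - g) * tgt_v
```

The learned gate lets training decide how much of the sharer's projected cache to blend into the receiver's, rather than hard-coding the mix.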

Quick Start & Requirements

Installation involves creating a Python 3.10 environment (conda create -n rosetta python=3.10, then conda activate rosetta), cloning the repository, and installing the package in editable mode (pip install -e .). Optional training and evaluation dependencies are available via pip install -e ".[training,evaluation]". The project builds on PyTorch and Hugging Face Transformers, and GPU acceleration is necessary for practical use.
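Because GPU acceleration is required, a quick environment sanity check can save a failed run; this snippet only assumes the dependencies named above are installed.

```python
import sys
import torch
import transformers

# The README recommends Python 3.10 for the "rosetta" environment.
assert sys.version_info[:2] == (3, 10), f"expected Python 3.10, got {sys.version}"
print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("transformers:", transformers.__version__)
```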

Highlighted Details

  • Achieves 8.5–10.5% higher accuracy than the individual LLMs working alone.
  • Outperforms text-based communication by 3.0–5.0% in accuracy.
  • Roughly halves inference latency (2.0× speedup) relative to text-based exchange.
  • Supports arbitrary LLM pairs, automatically adapting to differing architectures (hidden dimensions, layer counts, head counts, tokenizers); see the sketch after this list.
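Bridging mismatched depths needs a layer alignment in addition to the width projection. The proportional-depth mapping below is an illustrative assumption (the paper may align layers differently), and it reuses the hypothetical C2CProjector from the earlier sketch.

```python
# Illustrative sketch: pairing layers of two differently shaped models.
# Configs are hypothetical; C2CProjector is the class sketched above.
sharer_cfg = {"layers": 28, "heads": 16, "head_dim": 128}
receiver_cfg = {"layers": 36, "heads": 8, "head_dim": 64}

def aligned_sharer_layer(recv_layer: int) -> int:
    """Map a receiver layer index to the sharer layer at the same relative depth."""
    frac = recv_layer / max(receiver_cfg["layers"] - 1, 1)
    return round(frac * (sharer_cfg["layers"] - 1))

# One projector per receiver layer handles the head-count/width mismatch.
projectors = [
    C2CProjector(sharer_cfg["heads"], sharer_cfg["head_dim"],
                 receiver_cfg["heads"], receiver_cfg["head_dim"])
    for _ in range(receiver_cfg["layers"])
]

print(aligned_sharer_layer(0), aligned_sharer_layer(35))  # 0, 27
```

Tokenizer mismatches are a separate alignment problem not covered by this sketch.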

Maintenance & Community

The project is associated with authors Tianyu Fu, Zihan Min, et al. Community interest is evident from the star growth, but specific channels such as Discord or Slack, and a public roadmap, are not mentioned in the README.

Licensing & Compatibility

The README does not state an open-source license. That omission needs clarification before adoption, especially for commercial use or integration into closed-source projects. The framework itself is designed for broad compatibility with diverse LLM architectures.

Limitations & Caveats

The README does not document known limitations, bugs, or deprecation plans. The unspecified license is a significant adoption blocker, and with the accompanying research paper dated 2025, the codebase is likely still early-stage.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 1

Star History

191 stars in the last 30 days

Explore Similar Projects

Starred by Taranjeet Singh (Cofounder of Mem0), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 4 more.

LMCache by LMCache

Top 1.1% on SourcePulse · 6k stars
LLM serving engine extension for reduced TTFT and increased throughput
Created 1 year ago · Updated 23 hours ago