LatentMAS by Gen-Verse

Multi-agent reasoning framework enabling latent-space collaboration

Created 1 month ago
690 stars

Top 49.4% on SourcePulse

Project Summary

LatentMAS is a multi-agent reasoning framework that lets agents collaborate by exchanging information within the model's latent space rather than generating lengthy textual outputs. This approach significantly reduces token consumption and accelerates reasoning, making it valuable for researchers and practitioners seeking to optimize the performance of complex multi-agent systems.

How It Works

LatentMAS facilitates agent communication through "latent thoughts" passed via the agent's working memory, bypassing the need for explicit textual reasoning traces. This method employs training-free latent-space alignment for stable generation and is designed as a general technique compatible with any Hugging Face model, with optional acceleration via vLLM backends.
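The mechanism described above can be pictured with a toy sketch: one agent produces hidden-state vectors ("latent thoughts") instead of decoded tokens, a fixed linear map aligns them to the receiving agent's latent space, and the result is appended to that agent's working memory. Everything here (dimensions, the random projection, the function names) is illustrative, not the project's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical hidden sizes for a sender ("planner") and receiver ("executor").
D_PLANNER, D_EXECUTOR = 8, 6

def latent_thoughts(d_model: int, steps: int = 4) -> np.ndarray:
    """Stand-in for an autoregressive latent rollout: each step yields one
    hidden-state vector rather than a decoded text token."""
    return rng.standard_normal((steps, d_model))

def align(latents: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Training-free alignment sketch: a fixed linear projection into the
    receiver's latent space (the real alignment is described in the paper)."""
    return latents @ w

# Agent A "thinks" in latent space; no text is generated.
thoughts_a = latent_thoughts(d_model=D_PLANNER)

# Project A's thoughts into B's space and append them to B's working memory,
# analogous to extending B's KV cache instead of its text prompt.
w_align = rng.standard_normal((D_PLANNER, D_EXECUTOR)) / np.sqrt(D_PLANNER)
memory_b = np.concatenate(
    [rng.standard_normal((32, D_EXECUTOR)),  # B's existing context memory
     align(thoughts_a, w_align)],
    axis=0,
)

print(memory_b.shape)  # (36, 6): B now conditions on A's latent thoughts
```

The point of the sketch is the data flow: four latent steps replace what would otherwise be a multi-sentence textual reasoning trace handed between agents.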

Quick Start & Requirements

  • Install: Clone the repository, create and activate a conda environment (conda create -n latentmas python=3.10 -y, conda activate latentmas), then install dependencies (pip install -r requirements.txt). For vLLM support, run pip install vllm.
  • Prerequisites: Python 3.10, Hugging Face models. vLLM integration is optional. Recommended environment variables: HF_HOME, TRANSFORMERS_CACHE, HF_DATASETS_CACHE.
  • Resource Footprint: Two GPUs are recommended for optimal performance with the hybrid vLLM setup.
  • Links: Repository: https://github.com/YourRepo/LatentMAS.git
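Put together, the quick-start steps above look like the following. The commands are taken from the bullets; the clone URL is the placeholder from the project links, and the cache paths for the recommended environment variables are illustrative:

```shell
# Clone the repository (URL is the placeholder given in the project links).
git clone https://github.com/YourRepo/LatentMAS.git
cd LatentMAS

# Conda environment with Python 3.10, then the project dependencies.
conda create -n latentmas python=3.10 -y
conda activate latentmas
pip install -r requirements.txt

# Optional: vLLM backend for accelerated inference.
pip install vllm

# Recommended cache locations (example paths; adjust to your setup).
export HF_HOME=~/.cache/huggingface
export TRANSFORMERS_CACHE=$HF_HOME/hub
export HF_DATASETS_CACHE=$HF_HOME/datasets
```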

Highlighted Details

  • Achieves ~50–80% reduction in token usage.
  • Provides ~3×–7× wall-clock speedups compared to standard Text-MAS or chain-of-thought baselines.
  • Features training-free latent-space alignment for stable generation.
  • Compatible with any Hugging Face model and optionally vLLM backends.
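To make the headline efficiency numbers concrete, a quick back-of-envelope calculation (the 10,000-token baseline is a made-up figure; only the 50–80% reduction range comes from the claims above):

```python
baseline_tokens = 10_000  # hypothetical Text-MAS token budget for one task
reduction_low, reduction_high = 0.50, 0.80  # ~50-80% reduction claimed above

latent_low = baseline_tokens * (1 - reduction_high)   # best case
latent_high = baseline_tokens * (1 - reduction_low)   # worst case
print(f"{latent_low:.0f}-{latent_high:.0f} tokens")   # 2000-5000 tokens
```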

Maintenance & Community

The project has recently released its paper and code, and was featured by Hugging Face. No specific community channels (e.g., Discord, Slack) or detailed contributor information are provided in the README.

Licensing & Compatibility

The provided text does not explicitly state a software license. The citation points to an arXiv preprint, indicating an academic research context. Commercial use compatibility is not specified.

Limitations & Caveats

The vLLM backend requires modifications to internal vLLM packages, which may result in minor numeric differences compared to the official Hugging Face backend due to variations in decoding strategies. To reproduce official published results, the Hugging Face backend is recommended.

Health Check

  • Last Commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 3
  • Issues (30d): 9
  • Star History: 136 stars in the last 30 days

Explore Similar Projects

Starred by Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), Tri Dao (Chief Scientist at Together AI), and 1 more.

hnet by goombalab

Hierarchical sequence modeling with dynamic chunking. 800 stars, top 0.3% on SourcePulse. Created 6 months ago; updated 1 month ago.