Language modeling research paper in a sentence representation space
The Large Concept Models (LCM) repository offers an implementation of language modeling in a sentence representation space, targeting researchers and practitioners interested in novel sequence-to-sequence architectures. It enables autoregressive sentence prediction using explicit, language-agnostic "concepts" derived from the SONAR embedding space, and supports multilingual text and speech.
How It Works
LCM models operate directly on sentence embeddings, treating each embedding as a single concept. The repository includes two generation approaches: Mean Squared Error (MSE) regression and diffusion-based prediction of the next concept. This concept-centric design aims to capture higher-level semantic meaning, moving beyond token-level prediction toward potentially more robust and interpretable language generation.
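As a toy illustration of the MSE regression objective described above (a hand-rolled sketch, not the repository's code: the 4-dimensional "concepts", the scalar-weight "model", and all values are invented for illustration):

```python
# Toy sketch of next-"concept" prediction with an MSE objective.
# Everything here is illustrative: a real LCM operates on SONAR sentence
# embeddings (hundreds of dimensions) with a Transformer, not a scalar weight.

def mse(pred, target):
    """Mean squared error between two concept vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def predict_next(concept, weight):
    """Stand-in 'model': scale the previous concept by one learned weight."""
    return [weight * c for c in concept]

# Two consecutive sentence embeddings (hypothetical 4-d concepts).
prev_concept = [0.1, 0.2, 0.3, 0.4]
next_concept = [0.2, 0.4, 0.6, 0.8]

# A perfect model (weight = 2.0) drives the regression loss to zero;
# a weaker one (weight = 1.0) leaves a positive residual.
print(mse(predict_next(prev_concept, 2.0), next_concept))           # 0.0
print(round(mse(predict_next(prev_concept, 1.0), next_concept), 3)) # 0.075
```

Training then amounts to minimizing this loss over consecutive sentence pairs; the diffusion variant instead learns to denoise toward the next concept rather than regress it directly.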
Quick Start & Requirements
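The setup steps in this section can be collected into one script (a sketch assembled from the commands quoted here; the `cu121` wheels assume a CUDA 12.1-compatible driver, so adjust the tags to your environment):

```shell
# CPU-only development install via uv, including the eval and data extras.
uv sync --extra cpu --extra eval --extra data

# GPU support: manually install matching torch and fairseq2 wheels
# (versions and index URLs as given in this README; cu121 assumes CUDA 12.1).
uv pip install torch==2.5.1 --extra-index-url https://download.pytorch.org/whl/cu121
uv pip install fairseq2==v0.3.0rc1 --pre \
    --extra-index-url https://fair.pkg.atmeta.com/fairseq2/whl/rc/pt2.5.1/cu121
```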
Install dependencies with `uv` (`uv sync --extra cpu --extra eval --extra data`) or `pip`. GPU support requires manually installing compatible `torch` and `fairseq2` versions (e.g., `uv pip install torch==2.5.1 --extra-index-url https://download.pytorch.org/whl/cu121` and `uv pip install fairseq2==v0.3.0rc1 --pre --extra-index-url https://fair.pkg.atmeta.com/fairseq2/whl/rc/pt2.5.1/cu121`). Key dependencies are `fairseq2` (a release candidate), `torch`, and `nltk` (for evaluation). A CUDA-capable GPU is recommended for training and inference.

Highlighted Details
Maintenance & Community
The project is built on `fairseq2`.

Licensing & Compatibility
Limitations & Caveats
Depends on a pre-release (release-candidate) version of `fairseq2`.