SONAR  by facebookresearch

Multilingual/multimodal embeddings for text and speech tasks

Created 2 years ago
881 stars

Top 40.7% on SourcePulse

Project Summary

SONAR provides a fixed-size, multilingual, and multimodal sentence embedding space that outperforms earlier sentence embeddings on cross-lingual similarity search. It ships a text encoder and decoder plus speech encoders for tasks such as translation and similarity search, which makes it useful to researchers and developers working across many languages and both modalities.

How It Works

SONAR uses a teacher-student training approach on speech transcription data: speech encoders (the students) learn to embed speech segments into the same space as the text encoder (the teacher). Because both modalities share one space, a speech encoder can be paired with the text decoder, which enables cross-modal translation and zero-shot language combinations. The encoders and decoders are separate components, so they can be combined into different pipelines for NLP and speech processing tasks.
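
The teacher-student idea above can be illustrated with a toy objective. This is a minimal sketch, not SONAR's training code: it assumes the student speech embedding is pulled toward the frozen text embedding of the transcript with a mean-squared-error loss (the summary does not name the loss), and the function name and 4-dimensional vectors are hypothetical stand-ins for real 1024-dimensional embeddings.

```python
from typing import Sequence


def mse_distillation_loss(student: Sequence[float], teacher: Sequence[float]) -> float:
    """Mean squared error between a student (speech) and teacher (text) embedding.

    Assumption: MSE is the distillation objective. The summary only says
    "teacher-student", so treat this as illustrative.
    """
    if len(student) != len(teacher):
        raise ValueError("embeddings must have the same dimensionality")
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(teacher)


# Hypothetical toy vectors, not real model outputs.
text_embedding = [0.5, -1.0, 0.0, 2.0]    # teacher: embedding of the transcript
speech_embedding = [0.5, -0.5, 0.0, 1.0]  # student: embedding of the audio

loss = mse_distillation_loss(speech_embedding, text_embedding)
print(loss)
```

Driving this loss toward zero for paired audio and transcripts is what would place speech and text in one shared space, so that a decoder trained on text embeddings can also consume speech embeddings.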

Quick Start & Requirements

  • Install via pip: pip install sonar-space
  • Requires a fairseq2 build that matches your installed PyTorch and CUDA versions (e.g., for PyTorch 2.6.0 with CUDA 12.4: pip install fairseq2 --extra-index-url https://fair.pkg.atmeta.com/fairseq2/whl/pt2.6.0/cu124).
  • Models are automatically downloaded to $TORCH_HOME/hub.
  • GPU acceleration is recommended for performance.
  • Demo notebooks are available for detailed examples.
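
To make the version matching concrete, the sketch below builds the fairseq2 index URL from a PyTorch and CUDA version. The URL pattern is inferred from the single example above, the helper name is hypothetical, and the `cpu` variant is an assumption; check the fairseq2 documentation for the variants that are actually published.

```python
from typing import Optional

# Base taken from the example URL in the list above.
FAIRSEQ2_WHEEL_BASE = "https://fair.pkg.atmeta.com/fairseq2/whl"


def fairseq2_index_url(torch_version: str, cuda_version: Optional[str] = None) -> str:
    """Return the extra index URL for a given PyTorch/CUDA combination.

    Assumption: the path is pt<torch version>/cu<CUDA version without dots>,
    with "cpu" used when no CUDA version is given.
    """
    if cuda_version is None:
        variant = "cpu"
    else:
        variant = "cu" + cuda_version.replace(".", "")
    return f"{FAIRSEQ2_WHEEL_BASE}/pt{torch_version}/{variant}"


url = fairseq2_index_url("2.6.0", "12.4")
print(f"pip install fairseq2 --extra-index-url {url}")
```

Passing the versions reported by your own PyTorch install should reproduce the command shown above for PyTorch 2.6.0 with CUDA 12.4.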

Highlighted Details

  • Supports 200 languages for text and 37 for speech.
  • Enables text-to-text, speech-to-text, and speech-to-embedding tasks.
  • Includes BLASER 2.0 models for MT quality evaluation and MuTox for multilingual toxicity classification.
  • Embeddings are 1024-dimensional.
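
Because every sentence maps to one fixed-size vector, similarity search reduces to comparing vectors. The sketch below assumes cosine similarity as the metric and uses made-up 3-dimensional vectors in place of real 1024-dimensional encoder outputs; the function names and labels are hypothetical and no SONAR API is called.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query: Sequence[float], candidates: Dict[str, Sequence[float]]) -> str:
    """Return the key of the candidate embedding closest to the query."""
    return max(candidates, key=lambda name: cosine_similarity(query, candidates[name]))


# Hypothetical toy embeddings: a query sentence, a translation of it,
# and an unrelated sentence.
query = [1.0, 0.0, 0.0]
candidates = {
    "translation": [0.9, 0.1, 0.0],
    "unrelated": [0.0, 1.0, 0.0],
}

best = most_similar(query, candidates)
print(best)
```

With real embeddings the same comparison works across languages and across text and speech, since all encoders target the same space.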

Maintenance & Community

  • Developed by Facebook Research.
  • Contribution guidelines are provided.
  • Citation information for the associated paper is available.

Licensing & Compatibility

  • SONAR code is MIT licensed.
  • Caution: Some SONAR models are released under a non-commercial license (NC_MODEL_LICENSE). Refer to LICENSE for details.

Limitations & Caveats

  • The dependency on fairseq2 requires careful version matching for PyTorch and CUDA.
  • Some models have non-commercial restrictions, limiting their use in commercial applications.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

  • 6 stars in the last 30 days
