edge by cartesia-ai

Open-source library for efficient state space models (SSMs) on-device

Created 1 year ago

392 stars

Top 73.4% on SourcePulse

View on GitHub

1 Expert Loves This Project

Alex Chen

Cofounder of Nexa AI

Project Summary

Edge provides an open-source library for developing and deploying efficient State Space Models (SSMs) on-device, targeting researchers and developers building real-time AI applications. It addresses the limitations of large, cloud-dependent models by offering optimized SSM architectures that achieve constant tokens per second and memory consumption, making them ideal for edge devices.

How It Works

Edge leverages State Space Models (SSMs), which offer a more computationally efficient alternative to Transformer architectures. The library focuses on custom, hardware-specialized inference kernels for SSMs like Mamba, enabling optimized performance across various accelerators. It also provides access to open-weight SSM models, pre-optimized for multiple hardware platforms, including CPU, CUDA GPUs, and Apple Silicon via Metal and MLX.

Quick Start & Requirements

Install: pip install cartesia-pytorch or pip install cartesia-metal or pip install cartesia-mlx.
Prerequisites: PyTorch, MLX, or Metal support depending on the package. Specific hardware requirements depend on the chosen backend (e.g., Apple Silicon for cartesia-metal).
Resources: Model sizes range from 1B to 8B parameters.
Docs: https://github.com/cartesia-ai/edge/tree/main/mlx (for MLX examples).

Highlighted Details

Supports custom hardware-specialized inference kernels for SSM architectures.
Offers open-weight SSM models (Rene-v0.1, Llamba family) optimized for PyTorch, MLX, and Metal.
Includes distilled models (Llamba family) derived from Llama-3.2 and Llama-3.1.
Provides quantization support for optimized on-device performance.

Maintenance & Community

The project is actively developed by Cartesia AI.
Contact information for custom support is available.

Licensing & Compatibility

The specific license is not explicitly stated in the provided README snippet. Compatibility for commercial use or closed-source linking would require clarification of the license.

Limitations & Caveats

The README does not explicitly state the license, which is crucial for determining commercial use compatibility. While it supports multiple backends, the performance and feature set may vary across different hardware accelerators.

edge by cartesia-ai

Explore Similar Projects

LLM-TPU by sophgo

Nanoflow by efeslab

prima.cpp by Lizonghang

xFasterTransformer by intel

marlin by IST-DASLab

LiteRT-LM by google-ai-edge

rtp-llm by alibaba

ollm by Mega4alik

distributed-llama by b4rtaz

tiny-llm by skyzh

intel-extension-for-pytorch by intel

CTranslate2 by OpenNMT