felafax by felafax

AI infra for non-NVIDIA GPUs, enabling LLM fine-tuning

Created 1 year ago
566 stars

Top 56.8% on SourcePulse

Project Summary

Felafax provides an AI infrastructure framework for fine-tuning and continued training of open-source Large Language Models (LLMs) on non-NVIDIA hardware, primarily Google TPUs. It targets ML researchers and developers seeking cost-effective and scalable LLM training solutions, offering a simplified workflow and enabling efficient utilization of diverse hardware accelerators.

How It Works

Felafax leverages JAX and its XLA backend for efficient computation across various hardware, including TPUs, AWS Trainium, and AMD/Intel GPUs. This approach allows for seamless scaling from single-core VMs to large TPU pods and supports advanced features like model and data sharding for handling large models and datasets. The framework supports both full-precision and LoRA fine-tuning.
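As a concrete illustration of the LoRA path, here is a minimal sketch of a LoRA-augmented linear layer in JAX. The function names, shapes, and alpha/rank scaling convention are illustrative assumptions, not Felafax's actual implementation:

```python
# Minimal LoRA sketch in JAX (illustrative only; not Felafax's actual code).
import jax
import jax.numpy as jnp

def init_lora(key, d_in, d_out, rank=8):
    # Only these two small factors are trained; the base weight stays frozen.
    return {
        "a": jax.random.normal(key, (d_in, rank)) * 0.01,  # down-projection
        "b": jnp.zeros((rank, d_out)),                     # up-projection, zero-init
    }

def lora_linear(frozen_w, lora, x, alpha=16.0, rank=8):
    # Base projection with the frozen weight, plus the low-rank update
    # scaled by alpha / rank (a common LoRA convention).
    return x @ frozen_w + (x @ lora["a"] @ lora["b"]) * (alpha / rank)

# Usage: stop_gradient keeps the pretrained weight frozen, so gradients
# flow only through the small LoRA factors.
key = jax.random.PRNGKey(0)
w = jax.random.normal(key, (512, 512))  # stand-in for a pretrained weight
lora = init_lora(key, 512, 512)
y = lora_linear(jax.lax.stop_gradient(w), lora, jnp.ones((4, 512)))
```
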

Quick Start & Requirements

  • Install: pip install pipx followed by pipx install felafax-cli.
  • Authentication: Requires a token from preview.felafax.ai.
  • Prerequisites: Python and pipx. Supports Llama 3 family models (1B, 3B, 8B, 70B, 405B).
  • Resources: free TPUs are available through Google Colab.
  • Docs: felafax.ai

Highlighted Details

  • Provides a JAX implementation of Llama 3.1, ported from PyTorch for performance.
  • Offers a CLI for easy fine-tuning setup, job monitoring, and model interaction.
  • Demonstrated fine-tuning of 405B-parameter Llama 3.1 on 8x AMD MI300X GPUs using JAX sharding (see the sketch after this list).
  • Enables free fine-tuning on Google Colab TPUs.
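
As a rough sketch of the JAX sharding technique referenced above (not the actual 405B configuration), the snippet below partitions one large weight matrix column-wise across all visible devices; the mesh shape, axis name, and partition spec are assumptions for illustration:

```python
# Hypothetical sharding sketch with jax.sharding (not the real 405B setup).
import jax
import jax.numpy as jnp
from jax.experimental import mesh_utils
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Build a 1-D mesh over all visible accelerators (e.g. 8 GPUs or TPU cores).
mesh = Mesh(mesh_utils.create_device_mesh((jax.device_count(),)),
            axis_names=("model",))

# Split the weight's columns across the "model" axis (simple tensor
# parallelism); each device holds only a 1/N slice of the full matrix.
w = jnp.zeros((8192, 8192))
w_sharded = jax.device_put(w, NamedSharding(mesh, P(None, "model")))
print(w_sharded.sharding)  # shows how the array is laid out across devices
```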

Maintenance & Community

The repository's last commit was 7 months ago, with no pull-request or issue activity in the past 30 days (see the Health Check section below).

Licensing & Compatibility

  • The README does not state a license, so suitability for commercial use is unspecified.

Limitations & Caveats

  • The 405B fine-tuning run was executed in JAX eager mode due to infrastructure constraints, so JIT compilation should yield significant further speedups (see the sketch following this list).
  • The project appeared to be in active development when summarized (the 405B fine-tuning was marked "New!"), though the activity metrics below suggest it has since gone quiet.
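
To make the eager-versus-JIT caveat concrete, the toy snippet below shows what JIT-compiling a training step looks like in JAX; the loss and step functions are hypothetical, not Felafax's training code:

```python
# Toy example of JIT-compiling a training step with XLA (illustrative only).
import jax
import jax.numpy as jnp

def loss_fn(w, x, y):
    return jnp.mean((x @ w - y) ** 2)

@jax.jit  # remove this decorator to run eagerly, op by op, as in the 405B run
def train_step(w, x, y, lr=1e-3):
    grads = jax.grad(loss_fn)(w, x, y)
    return w - lr * grads
```
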
Health Check

  • Last commit: 7 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 2 stars in the last 30 days

Explore Similar Projects

Starred by Clement Delangue (Cofounder of Hugging Face), Chip Huyen (Author of "AI Engineering" and "Designing Machine Learning Systems"), and 20 more.

accelerate by huggingface
0.3% · 9k stars
PyTorch training helper for distributed execution
Created 4 years ago · Updated 1 day ago
Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla and OpenAI; author of CS 231n), Stefan van der Walt (Core Contributor to the scientific Python ecosystem), and 12 more.

litgpt by Lightning-AI
0.1% · 13k stars
LLM SDK for pretraining, finetuning, and deploying 20+ high-performance LLMs
Created 2 years ago · Updated 5 days ago
Starred by Junyang Lin (Core Maintainer at Alibaba Qwen), Vincent Weisser (Cofounder of Prime Intellect), and 25 more.

alpaca-lora by tloen
0.0% · 19k stars
LoRA fine-tuning for LLaMA
Created 2 years ago · Updated 1 year ago