byzer-llm by allwefantasy

Ray-based lifecycle solution for LLMs: pretrain, finetune, serving

created 2 years ago
311 stars

Top 87.6% on sourcepulse

View on GitHub: https://github.com/allwefantasy/byzer-llm
Project Summary

Byzer-LLM is a comprehensive framework for managing the entire lifecycle of Large Language Models (LLMs), from pre-training and fine-tuning to deployment and serving. It offers a unified Python and SQL API, making LLM operations accessible to a broad range of users, including engineers, researchers, and power users. The project aims to simplify and democratize LLM development and deployment.

How It Works

Byzer-LLM leverages Ray for distributed computing, enabling scalable pre-training, fine-tuning, and inference. It supports various inference backends, including vLLM, DeepSpeed, Transformers, and llama_cpp, allowing users to choose the most suitable option for their needs. The framework provides a consistent API for both open-source and SaaS LLMs, abstracting away underlying complexities. Key features include support for function calling, Pydantic class responses, prompt templating, and multi-modal capabilities.
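
As a rough illustration of the unified Python API described above, the sketch below queries a model that has already been deployed on the Ray cluster under the name "chat". The calls used here (byzerllm.connect_cluster, ByzerLLM, setup_default_model_name, chat_oai) follow the patterns shown in the project's README, but exact signatures may differ between releases, so treat this as an assumption to verify against your installed version.

    import byzerllm

    # Attach to the running Ray cluster (assumes `ray start --head` was run).
    byzerllm.connect_cluster()

    # Target a model previously deployed under the placeholder name "chat".
    llm = byzerllm.ByzerLLM()
    llm.setup_default_model_name("chat")

    # The same OpenAI-style conversation format is used whether "chat" is
    # backed by vLLM, DeepSpeed, Transformers, llama_cpp, or a SaaS provider.
    result = llm.chat_oai(conversations=[
        {"role": "user", "content": "Explain what Byzer-LLM does in one sentence."}
    ])
    print(result[0].output)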

Quick Start & Requirements

  • Installation: pip install -U byzerllm
  • Prerequisites: Python 3.10.11 and CUDA 12.1.0 (CUDA is not required when using only SaaS models). For specific backends, such as vLLM on CUDA 11.8, the README provides custom installation steps.
  • Setup: ray start --head to launch the Ray cluster, then deploy a model (see the sketch after this list).
  • Documentation: https://github.com/allwefantasy/byzer-llm
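
After the Ray head node is up, a model is typically deployed onto the cluster under a name that later calls refer to, as sketched below. The parameter names (model_path, pretrained_model_type, udf_name, infer_params) and the value "custom/auto" mirror examples in the README but should be verified against your byzerllm version; the model path is a placeholder.

    import byzerllm

    # Connect to the cluster started with `ray start --head`.
    byzerllm.connect_cluster()

    llm = byzerllm.ByzerLLM()

    # Deploy a local open-source model under the name "chat".
    # model_path is a placeholder; pretrained_model_type selects how the
    # weights are loaded (a local/custom model here, vs. a SaaS provider).
    llm.deploy(
        model_path="/path/to/your/model",
        pretrained_model_type="custom/auto",
        udf_name="chat",
        infer_params={},
    )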

Highlighted Details

  • Supports a wide array of open-source and SaaS LLMs, including Llama, Qwen, Baichuan, OpenAI, Azure OpenAI, and more.
  • Offers advanced features such as function calling with Pydantic models, LLM-friendly function/data class definitions, and prompt templating (including Jinja2); see the sketch after this list.
  • Provides robust deployment options with support for vLLM, DeepSpeed, and llama_cpp backends, including quantization (4-bit, 8-bit, GPTQ, AWQ).
  • Includes capabilities for multi-modal inference (image input) and text-to-speech (TTS) using various providers.
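
The sketch below illustrates the prompt templating and Pydantic-oriented features from a user's point of view: a function whose docstring is a Jinja2 template, plus a Pydantic class that can serve as a structured response type. The decorator name byzerllm.prompt and the commented invocation style follow examples in the README; the IssueSummary class and summarize_issue function are purely illustrative, and none of this should be read as verified API for your installed version.

    import pydantic
    import byzerllm

    # A Pydantic class the model can be asked to fill in as a structured
    # response (the "function calling with Pydantic models" feature above).
    class IssueSummary(pydantic.BaseModel):
        title: str
        severity: str

    # The docstring is a Jinja2 template; byzerllm renders it with the
    # function's arguments before sending it to the model.
    @byzerllm.prompt()
    def summarize_issue(report: str) -> str:
        """
        Read the bug report below and summarize it in one sentence.

        {{ report }}
        """

    # Typical invocation per the README pattern (assumes llm is a configured
    # ByzerLLM client as in the earlier sketches):
    # summary = summarize_issue.with_llm(llm).run(report="Login crashes on Safari.")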

Maintenance & Community

The project is actively maintained, with updates noted as recently as April 2024. Community links are provided in the README, though dedicated channels such as Discord or Slack are not explicitly mentioned.

Licensing & Compatibility

The README does not explicitly state a license, so suitability for commercial use or closed-source linking cannot be determined without clarification from the maintainers.

Limitations & Caveats

The README mentions troubleshooting for vLLM versions and provides workarounds, indicating potential compatibility issues. The pre-training and fine-tuning sections primarily refer to Byzer-SQL, suggesting the Python API for these specific tasks might be less mature or primarily integrated through the SQL interface.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 1
  • Issues (30d): 0

Star History

18 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering, Designing Machine Learning Systems), Philipp Schmid (DevRel at Google DeepMind), and 2 more.

  • LightLLM by ModelTC: Python framework for LLM inference and serving. Top 0.7% on sourcepulse, 3k stars, created 2 years ago, updated 15 hours ago.