byzer-llm by allwefantasy

Ray-based lifecycle solution for LLMs: pretrain, finetune, serving

Created 2 years ago
315 stars

Top 85.6% on SourcePulse

Project Summary

Byzer-LLM is a comprehensive framework for managing the entire lifecycle of Large Language Models (LLMs), from pre-training and fine-tuning to deployment and serving. It offers a unified Python and SQL API, making LLM operations accessible to a broad range of users, including engineers, researchers, and power users. The project aims to simplify and democratize LLM development and deployment.

How It Works

Byzer-LLM leverages Ray for distributed computing, enabling scalable pre-training, fine-tuning, and inference. It supports various inference backends, including vLLM, DeepSpeed, Transformers, and llama_cpp, allowing users to choose the most suitable option for their needs. The framework provides a consistent API for both open-source and SaaS LLMs, abstracting away underlying complexities. Key features include support for function calling, Pydantic class responses, prompt templating, and multi-modal capabilities.
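
A minimal sketch of that unified client API, based on the ByzerLLM entry point shown in the project README (the model path and UDF name below are placeholders, and the setup calls follow the README's examples rather than a verified current signature):

```python
import byzerllm

# Attach to the Ray cluster started with `ray start --head`.
byzerllm.connect_cluster()

llm = byzerllm.ByzerLLM()

# Deploy an open-source model behind a named UDF. Switching backends
# (vLLM, DeepSpeed, Transformers, llama_cpp) is a deployment choice,
# not a client API change.
llm.deploy(
    model_path="/models/llama2-13b",        # placeholder path
    pretrained_model_type="custom/llama2",  # model family / backend selector
    udf_name="llama2_chat",
    infer_params={},
)

llm.setup_default_model_name("llama2_chat")

# The same chat call is used for open-source and SaaS models alike.
result = llm.chat_oai(conversations=[
    {"role": "user", "content": "Hello, what can you do?"},
])
print(result[0].output)
```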

Quick Start & Requirements

  • Installation: pip install -U byzerllm
  • Prerequisites: Python 3.10.11; CUDA 12.1.0 (not required when serving only SaaS models; see the sketch after this list). For specific backends, such as vLLM with CUDA 11.8, the README provides custom installation steps.
  • Setup: ray start --head
  • Documentation: https://github.com/allwefantasy/byzer-llm
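
As a concrete starting point, here is a hedged sketch of deploying a SaaS-backed model after ray start --head, which needs no local GPU. The saas.* parameter naming follows the README's convention, but the exact pretrained_model_type string is an assumption and the API key is a placeholder:

```python
import byzerllm

byzerllm.connect_cluster()

llm = byzerllm.ByzerLLM()

# SaaS backends run remotely, matching the "CUDA optional" prerequisite.
llm.deploy(
    model_path="",                                 # unused for SaaS backends
    pretrained_model_type="saas/official_openai",  # assumed type string
    udf_name="gpt_chat",
    infer_params={
        "saas.api_key": "sk-...",        # placeholder key
        "saas.model": "gpt-3.5-turbo",
    },
)

llm.setup_default_model_name("gpt_chat")
result = llm.chat_oai(conversations=[{"role": "user", "content": "ping"}])
print(result[0].output)
```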

Highlighted Details

  • Supports a wide array of open-source and SaaS LLMs, including Llama, Qwen, Baichuan, OpenAI, Azure OpenAI, and more.
  • Offers advanced features such as function calling with Pydantic models, LLM-friendly function/data class definitions, and prompt templating (including Jinja2); see the sketch after this list.
  • Provides robust deployment options with support for vLLM, DeepSpeed, and llama_cpp backends, including quantization (4-bit, 8-bit, GPTQ, AWQ).
  • Includes capabilities for multi-modal inference (image input) and text-to-speech (TTS) using various providers.
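
A short sketch of the templating and structured-output features, assuming a UDF named "chat" has already been deployed. The @byzerllm.prompt decorator and the response_class keyword mirror the README's examples, but exact signatures may vary across versions:

```python
import pydantic
import byzerllm

byzerllm.connect_cluster()
llm = byzerllm.ByzerLLM()
llm.setup_default_model_name("chat")  # assumes this UDF is deployed

# Prompt templating: the docstring is a Jinja2 template; calling the
# function renders it and sends the prompt to the attached model.
@byzerllm.prompt(llm=llm)
def summarize(text: str) -> str:
    """
    Summarize the following text in one sentence:

    {{ text }}
    """

print(summarize("Byzer-LLM manages the full LLM lifecycle on Ray."))

# Structured output: pass a Pydantic class so the reply is parsed into
# an object (keyword and attribute names taken from the README's examples).
class Summary(pydantic.BaseModel):
    sentence: str

result = llm.chat_oai(
    conversations=[{"role": "user", "content": "Summarize Ray in one sentence."}],
    response_class=Summary,
)
print(result[0].value.sentence)
```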

Maintenance & Community

The project is actively maintained, with updates noted as recently as April 2024. Community links appear in the README, though no dedicated channel such as Discord or Slack is explicitly mentioned.

Licensing & Compatibility

The README does not explicitly state a license, so suitability for commercial use or closed-source linking cannot be assessed until the licensing is clarified.

Limitations & Caveats

The README documents version-specific troubleshooting and workarounds for vLLM, indicating potential compatibility issues. Its pre-training and fine-tuning sections refer primarily to Byzer-SQL, suggesting that the Python API for those tasks may be less mature or exposed mainly through the SQL interface.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 1 star in the last 30 days
