This project provides a collection of microservice-based Generative AI examples, such as ChatQnA and Text2Image, designed for developers to easily deploy, test, and scale AI applications. It aims to simplify the adoption of GenAI by offering a flexible, service-based architecture compatible with various hardware and deployment environments.
How It Works
The project is built from three components: GenAIComps (microservices for LLM, embedding, and reranking), GenAIInfra (a cloud-native deployment suite), and GenAIEval (performance benchmarking). These microservices are composed into use cases such as ChatQnA and DocSum, which are deployed with Docker Compose or Kubernetes so that the same example can be scaled and run on different hardware.
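The composition idea can be illustrated with a minimal sketch. Everything here is hypothetical: `embed`, `rerank`, `generate`, and `chat_qna` are toy in-process stand-ins, not the GenAIComps API, and real services would run as separate processes behind network endpoints.

```python
from typing import List

Vector = List[float]


def embed(text: str) -> Vector:
    """Toy embedding stand-in: a normalized letter-frequency vector."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    norm = sum(c * c for c in counts) ** 0.5 or 1.0
    return [c / norm for c in counts]


def rerank(query: str, docs: List[str], top_k: int = 1) -> List[str]:
    """Toy reranker stand-in: order docs by cosine similarity to the query."""
    q = embed(query)

    def score(doc: str) -> float:
        return sum(a * b for a, b in zip(q, embed(doc)))

    return sorted(docs, key=score, reverse=True)[:top_k]


def generate(query: str, context: List[str]) -> str:
    """Toy LLM stand-in: echoes the query with the selected context."""
    return f"Q: {query} | context: {' '.join(context)}"


def chat_qna(query: str, docs: List[str]) -> str:
    """Chain the stand-in services: rerank (which embeds), then generate."""
    return generate(query, rerank(query, docs))


if __name__ == "__main__":
    print(chat_qna("deploy with helm", ["zzz qqq", "helm chart deployment"]))
```

Because each stage is a separate function with a plain input and output, any one of them could be swapped for a different implementation without touching the others, which is the property the service-based layout is meant to provide.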
Quick Start & Requirements
- Installation: Deployment via Python startup, Docker Compose, or Kubernetes.
- Prerequisites: Docker Compose (for Docker Compose deployment), Kubernetes cluster (for Kubernetes deployment), Helm (optional, v3.15+).
- Hardware: Supports Intel Gaudi accelerators, Intel Xeon CPUs, NVIDIA GPUs, and AMD GPUs. Reference configurations are provided for Intel Tiber Developer Cloud and AWS c7i.16xlarge instances.
- Documentation: GenAIExamples Documentation
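A quick way to check the prerequisites before deploying is sketched below. This is an illustration rather than part of the project: the helper names are hypothetical, and the tool names (`docker`, `kubectl`, `helm`) and the `v3.15.2+g...` version format are assumptions about a typical setup.

```python
import re
import shutil
from typing import List, Tuple


def missing_tools(tools: List[str]) -> List[str]:
    """Return the tools from the list that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse strings such as 'v3.15.2+g1a500d5' into (3, 15, 2)."""
    match = re.match(r"v?(\d+)\.(\d+)(?:\.(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def meets_minimum(text: str, minimum: Tuple[int, ...]) -> bool:
    """True if the parsed version is at least the given minimum."""
    return parse_version(text) >= minimum


if __name__ == "__main__":
    # Assumed tool names for the two deployment paths.
    print("missing for Docker Compose:", missing_tools(["docker"]))
    print("missing for Kubernetes:", missing_tools(["kubectl", "helm"]))
    # The version string would come from the installed Helm client.
    print("Helm new enough:", meets_minimum("v3.15.2", (3, 15)))
```

The version string is passed in rather than read from the Helm client so the check stays testable without Helm installed.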
Highlighted Details
- Offers a wide range of GenAI use cases including ChatQnA, VisualQnA, Text2Image, DocSum, CodeGen, and more.
- Supports deployment across diverse hardware, including Intel Gaudi accelerators, Intel Xeon CPUs, and NVIDIA and AMD GPUs.
- Includes GenAIEval for performance benchmarking (throughput, latency, accuracy) across different hardware.
- Provides flexible deployment options: Python startup, Docker Compose, and Kubernetes (via Helm or GMC).
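To make the throughput and latency metrics concrete, here is a minimal sketch of how they can be measured for any callable. It does not use GenAIEval; `benchmark` and `fake_request` are hypothetical names, and the sleep stands in for a real service call.

```python
import math
import statistics
import time
from typing import Callable, Dict


def benchmark(call: Callable[[], object], requests: int = 20) -> Dict[str, float]:
    """Issue sequential requests and report latency and throughput."""
    latencies = []
    start = time.perf_counter()
    for _ in range(requests):
        t0 = time.perf_counter()
        call()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start

    # Nearest-rank 95th percentile over the sorted latencies.
    ordered = sorted(latencies)
    p95_index = min(len(ordered) - 1, math.ceil(0.95 * len(ordered)) - 1)

    return {
        "requests": float(requests),
        "mean_latency_s": statistics.mean(latencies),
        "p95_latency_s": ordered[p95_index],
        "throughput_rps": requests / elapsed if elapsed > 0 else float("inf"),
    }


def fake_request() -> None:
    """Stand-in for a call to a deployed service."""
    time.sleep(0.001)


if __name__ == "__main__":
    for name, value in benchmark(fake_request).items():
        print(f"{name}: {value:.4f}")
```

Requests are issued one at a time here, so throughput is roughly the inverse of mean latency; measuring a service under concurrent load would need parallel clients.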
Maintenance & Community
- Actively seeking contributions for bug fixes, new components, documentation, and use cases.
- Contribution guidelines are available.
Licensing & Compatibility
- The README does not explicitly state the license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
- Some use cases, such as SearchQnA and MultimodalQnA, have limited hardware support; for example, SearchQnA is not supported on ROCm.
- The project's license is not clearly stated in the README, which may impact commercial adoption.