ai-virtual-assistant by NVIDIA-AI-Blueprints

AI virtual assistant for customer service automation

Created 1 year ago
265 stars

Top 96.2% on SourcePulse

Project Summary

This NVIDIA AI Blueprint offers a customizable, AI-driven virtual assistant to enhance customer service. It addresses traditional system limitations by providing automated, context-aware, and secure responses, boosting user satisfaction and streamlining inquiry handling. The solution targets IT engineers seeking to integrate NVIDIA NIM microservices into agentic virtual assistant applications.

How It Works

The system employs Retrieval-Augmented Generation (RAG) with NVIDIA NeMo Retriever™ and NIM™ microservices. User queries initiate data retrieval from structured (Postgres) and unstructured (Milvus vector DB) sources. A large language model then uses this context to generate responses. LangGraph orchestrates sub-agents for multi-turn dialogue and personalized Q&A.
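The flow described above (route the query, retrieve context, build a prompt for the LLM) can be sketched roughly as follows. This is a minimal illustration, not the blueprint's code: `ORDERS`, `DOCS`, `State`, and every function name are hypothetical, in-memory data stands in for Postgres and Milvus, word overlap stands in for vector similarity search, and a plain dictionary dispatch stands in for LangGraph orchestration.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for structured data (Postgres in the blueprint).
ORDERS = {
    "user-1": [{"order_id": "A100", "item": "graphics card", "status": "shipped"}],
}

# Hypothetical stand-in for unstructured documents (Milvus in the blueprint).
DOCS = [
    "Returns are accepted within 30 days of delivery.",
    "Shipping normally takes three to five business days.",
]


@dataclass
class State:
    user_id: str
    query: str
    context: list = field(default_factory=list)


def route(state: State) -> str:
    """Pick a sub-agent by keyword; a real router would likely ask the LLM."""
    return "structured" if "order" in state.query.lower() else "unstructured"


def retrieve_structured(state: State) -> State:
    """Look up the user's rows, as a SQL query against Postgres might."""
    for row in ORDERS.get(state.user_id, []):
        state.context.append(f"Order {row['order_id']}: {row['item']}, {row['status']}")
    return state


def retrieve_unstructured(state: State, top_k: int = 1) -> State:
    """Rank documents by word overlap, a crude substitute for vector search."""
    words = set(state.query.lower().split())
    ranked = sorted(DOCS, key=lambda d: len(words & set(d.lower().split())), reverse=True)
    state.context.extend(ranked[:top_k])
    return state


def build_prompt(state: State) -> str:
    """Assemble the retrieved context and the question into an LLM prompt."""
    context = "\n".join(f"- {c}" for c in state.context)
    return f"Context:\n{context}\n\nQuestion: {state.query}"


def answer(user_id: str, query: str) -> str:
    state = State(user_id=user_id, query=query)
    agents = {"structured": retrieve_structured, "unstructured": retrieve_unstructured}
    state = agents[route(state)](state)
    return build_prompt(state)  # a real system would send this prompt to the LLM
```

Calling `answer("user-1", "Where is my order?")` takes the structured path and returns a prompt containing the user's order row; a question without the word "order" falls through to document retrieval.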

Quick Start & Requirements

  • Installation: Docker Compose for single-node or Helm charts for Kubernetes.
    • Docker Compose: docker compose -f deploy/compose/docker-compose.yaml up -d
    • Helm: cd deploy/helm/ && bash deploy.sh
  • Prerequisites: Docker Engine, Docker Compose (v2.29.1+), NVIDIA Container Toolkit, Git, and API keys (NVIDIA NIMs, NGC). Helm deployments also need Kubernetes and the NVIDIA GPU Operator.
  • Hardware: Self-hosting the NIMs requires 8x H100 or A100 GPUs, with an L40 GPU for the vector store. With NVIDIA-hosted NIMs, an L40 GPU is still needed to run the pipeline.
  • Setup Time: ~10 minutes for initial model download/deployment.
  • Docs: NIMs: https://build.nvidia.com/nim, NeMo Retriever: https://build.nvidia.com/explore/retrieval.
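
Before running either install command, it can help to confirm that the basic prerequisites are in place. The following is a small sketch under stated assumptions: the environment variable names `NVIDIA_API_KEY` and `NGC_API_KEY` are guesses, not confirmed by this summary, and the helper `missing_prerequisites` is hypothetical. It only checks that the `docker` and `git` executables are on the PATH and that the assumed variables are set.

```python
import os
import shutil

# Executables implied by the prerequisites list above.
REQUIRED_TOOLS = ["docker", "git"]
# Assumed variable names; check the repository docs for the real ones.
REQUIRED_ENV = ["NVIDIA_API_KEY", "NGC_API_KEY"]


def missing_prerequisites(which=shutil.which, env=None):
    """Return human-readable descriptions of whatever is not set up yet."""
    env = os.environ if env is None else env
    problems = [f"executable not on PATH: {tool}" for tool in REQUIRED_TOOLS if which(tool) is None]
    problems += [f"environment variable not set: {name}" for name in REQUIRED_ENV if not env.get(name)]
    return problems


if __name__ == "__main__":
    issues = missing_prerequisites()
    print("\n".join(issues) if issues else "Prerequisites look present.")
```

The sketch does not verify the Docker Compose version, the NVIDIA Container Toolkit, or GPU availability; those still need to be checked separately.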

Highlighted Details

  • NVIDIA NIM microservices for inference, embeddings, and reranking.
  • Retrieval-Augmented Generation (RAG) for data-driven, context-aware responses.
  • LangGraph for AI agent orchestration.
  • Data ingestion from structured (Postgres) and unstructured (Milvus) sources.
  • Includes sample UI, analytics (sentiment, summarization), and a data flywheel for continuous improvement.

Maintenance & Community

Community contributions are welcomed via GitHub issues and pull requests, with contributing guidelines provided.

Licensing & Compatibility

Licensed under the Apache License, Version 2.0, which permits commercial use and integration into closed-source applications. The sample data is covered by a separate NVIDIA asset license.

Limitations & Caveats

Cloud-hosted NIMs may add significant latency. Self-hosting demands substantial GPU hardware. Prompts are optimized for the Llama 3.1 70B NIM and may require tuning for other models. Production environments should consider a dedicated vector database (VDB) service.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 0

Star History

14 stars in the last 30 days
