LLM_AppDev-HandsOn by sroecker

Workshop for local LLM application development

created 1 year ago
401 stars

Top 73.3% on sourcepulse

Project Summary

This repository provides a hands-on workshop and example code for developing applications with local Large Language Models (LLMs). It targets developers and researchers interested in building Retrieval Augmented Generation (RAG) chatbots that can query custom documents, with a focus on open-source tools and local deployment. The primary benefit is enabling users to create private, document-aware AI assistants without relying on external cloud services.

How It Works

The application utilizes Streamlit for the user interface, LlamaIndex for document indexing and retrieval, and Ollama for serving local LLMs. This stack allows for RAG by indexing documents into a vector store and then retrieving relevant chunks to augment LLM prompts. The approach emphasizes using open-source components and local LLMs, making it accessible for users without powerful GPUs or those prioritizing data privacy.
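
A minimal sketch of this pipeline, for orientation only, is shown below. It assumes a recent llama-index release with the llama-index-llms-ollama and llama-index-embeddings-huggingface packages installed, a ./docs folder of sample documents, and an illustrative embedding model name; it is not the workshop's app.py.

    # Sketch: index local documents and answer questions with a local Ollama model.
    from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
    from llama_index.embeddings.huggingface import HuggingFaceEmbedding
    from llama_index.llms.ollama import Ollama

    # Generation goes through the local Ollama service; embeddings run in-process.
    Settings.llm = Ollama(model="zephyr", request_timeout=120.0)
    Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

    # Load and chunk the documents, then embed them into an in-memory vector store.
    documents = SimpleDirectoryReader("docs").load_data()
    index = VectorStoreIndex.from_documents(documents)

    # Retrieve the most relevant chunks and let the LLM answer with them as context.
    query_engine = index.as_query_engine()
    print(query_engine.query("What do these documents say about deployment?"))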

Quick Start & Requirements

  • Local Setup: a Mac M1 with 16 GB+ RAM is recommended. Install Ollama from ollama.ai.
  • Installation:
    python -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt
    streamlit run app.py
    
  • Prerequisites: the Ollama service must be running and the Zephyr model pulled (ollama pull zephyr); a quick check is shown after this list.
  • Configuration: set the OLLAMA_HOST environment variable if Ollama is not served on the default host and port.
  • Resources: Local LLM inference can be resource-intensive.
  • Docs: Streamlit App, Ollama API
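
To confirm the service is reachable before launching the app, a quick request against Ollama's REST API (default port 11434) works; the prompt below is only illustrative, and the host should match OLLAMA_HOST if it is set:

    # Verify that Ollama responds and the zephyr model is available.
    curl http://localhost:11434/api/generate \
      -d '{"model": "zephyr", "prompt": "Say hello", "stream": false}'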

Highlighted Details

  • Demonstrates RAG with custom documents using local LLMs.
  • Supports deployment via Podman and OpenShift (Kubernetes).
  • Includes options for GPU acceleration with the NVIDIA Container Toolkit or AMD KFD/DRI device passthrough (a sketch follows this list).
  • Offers guidance on disabling the Ollama service for debugging on Linux.
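
For container-based GPU access, the usual pattern is to pass the GPU devices through to the Ollama container. The commands below are a hedged sketch of that pattern (image, volume name, and ports are generic, and the NVIDIA variant assumes a CDI spec generated with the NVIDIA Container Toolkit), not the repository's exact invocation:

    # NVIDIA: expose GPUs via the Container Device Interface (nvidia-ctk cdi generate).
    podman run -d --device nvidia.com/gpu=all -v ollama:/root/.ollama \
      -p 11434:11434 docker.io/ollama/ollama

    # AMD: pass the KFD and DRI device nodes through and use the ROCm image.
    podman run -d --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama \
      -p 11434:11434 docker.io/ollama/ollama:rocm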

Maintenance & Community

  • The repository is maintained by sroecker.
  • References include "AI on Openshift" and "Open Sourcerers."

Licensing & Compatibility

  • The repository itself does not explicitly state a license in the README.
  • The software stack uses open-source tools (Streamlit, LlamaIndex, Ollama), which have their own licenses. Compatibility for commercial use depends on the licenses of these underlying components.

Limitations & Caveats

  • Local LLM performance is highly dependent on hardware, especially for GPU-less setups.
  • Generating embeddings directly within the Streamlit app requires increasing shared memory for PyTorch (see the example after this list), and LlamaIndex does not yet support generating embeddings through the Ollama service itself.
  • GPU support requires specific setup with NVIDIA Container Toolkit or AMD drivers.
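
If embeddings are computed inside the containerized Streamlit app, the container's shared memory typically has to be raised for PyTorch. A hedged example of the flag involved follows; the image name and size are placeholders, not values from the repository:

    # Give the app container more /dev/shm so in-process embedding generation does not fail.
    podman run -d --shm-size=2g -p 8501:8501 localhost/llm-appdev-streamlit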

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 5 stars in the last 90 days
