keras-llm-robot by smalltong02

Web UI for LLMs, built with Keras, Langchain, and Fastchat

created 1 year ago
252 stars

Top 99.7% on sourcepulse

View on GitHub
Project Summary

This project provides a web UI for interacting with and learning about large language models, targeting developers and researchers. It enables offline deployment and testing of Hugging Face models, offering features like chat, quantization, fine-tuning, RAG, and multimodal capabilities, aiming to simplify LLM experimentation.

How It Works

The project leverages Langchain and Fastchat for its core architecture, with a Streamlit-based UI. It supports loading various open-source LLMs, including quantized versions, and integrates auxiliary models for RAG, code interpretation, speech recognition/generation, and image recognition/generation. This modular approach allows for combining multiple models to achieve complex functionalities like Agents and multimodal interactions.
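The modular approach can be illustrated with a minimal sketch. This is plain Python with hypothetical names, not the project's actual classes: each auxiliary model (speech recognition, the core LLM, speech synthesis, etc.) acts as a stage that can be chained into a pipeline.

```python
# Hypothetical sketch of the model-chaining idea; stage names are
# illustrative stand-ins, not the project's actual API.
from dataclasses import dataclass
from typing import Callable, List

# Each stage transforms text (or a textual reference to media).
Stage = Callable[[str], str]

@dataclass
class Pipeline:
    stages: List[Stage]

    def run(self, user_input: str) -> str:
        out = user_input
        for stage in self.stages:
            out = stage(out)
        return out

# Stand-ins for speech recognition, the core LLM, and speech synthesis.
def speech_to_text(audio_ref: str) -> str:
    return f"transcript({audio_ref})"

def llm_chat(prompt: str) -> str:
    return f"reply({prompt})"

def text_to_speech(text: str) -> str:
    return f"audio({text})"

# A voice-chat "multimodal" flow is just a composition of stages.
voice_chat = Pipeline([speech_to_text, llm_chat, text_to_speech])
print(voice_chat.run("mic.wav"))  # audio(reply(transcript(mic.wav)))
```

Swapping a stage (e.g. a different speech model) leaves the rest of the chain untouched, which is what lets the project mix and match models for Agents and multimodal interactions.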

Quick Start & Requirements

  • Install: Clone the repository, create a conda environment (conda create -n keras-llm-robot python==3.11.5), activate it (conda activate keras-llm-robot), and install dependencies (pip install -r requirements-ubuntu.txt or platform-specific equivalent).
  • Prerequisites: Python 3.10/3.11, Conda/Miniconda, Git. NVIDIA GPU with CUDA Toolkit (matching PyTorch version) is recommended for GPU acceleration. Linux requires build-essential, ffmpeg, portaudio19-dev. Windows requires CMake.
  • Running: python __webgui_server__.py --webui for local access. Reverse proxy setup is recommended for cloud deployments.
  • Docs: Langchain Project, Fastchat Project
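Collected into one place, the Ubuntu setup described above looks roughly like the following (the clone URL is inferred from the project and author names; verify it against the repo):

```shell
# Clone and set up keras-llm-robot (Ubuntu example)
git clone https://github.com/smalltong02/keras-llm-robot.git
cd keras-llm-robot
conda create -n keras-llm-robot python==3.11.5
conda activate keras-llm-robot
pip install -r requirements-ubuntu.txt  # use the platform-specific file on macOS/Windows
# Launch the web UI (local access)
python __webgui_server__.py --webui
```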

Highlighted Details

  • Supports dozens of LLMs, including quantized (GGUF, GPTQ, AWQ) and multimodal models.
  • Integrates RAG with various vector databases (Faiss, Milvus, PGVector) and embedding models.
  • Offers a code interpreter with local execution or Docker sandbox modes.
  • Includes speech (Whisper, XTTS-v2, Azure) and image (BLIP, OpenDalle, Stable Video Diffusion) capabilities.
  • Supports function calling and Google Toolboxes (Mail, Calendar, Drive, Maps, YouTube).
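At its core, the RAG feature reduces to embedding documents and queries and ranking by similarity. The dependency-free sketch below illustrates that principle only; a toy bag-of-words embedding stands in for the real embedding models and vector stores (Faiss, Milvus, PGVector) listed above.

```python
# Toy retrieval sketch: bag-of-words vectors and cosine similarity
# stand in for real embedding models and vector databases.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Naive "embedding": token counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list, k: int = 1) -> list:
    # Rank documents by similarity to the query and return the top k.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "GGUF is a quantized model format used by llama.cpp",
    "Whisper performs speech recognition",
    "Stable Video Diffusion generates video from images",
]
print(retrieve("what format does quantization use", docs))
```

In the real application, the top-ranked chunks are inserted into the LLM prompt as context before answering.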

Maintenance & Community

The project is actively updated; recent additions include support for Gemma and Qwen2 and a Google Photos tool. It builds on the Langchain and Fastchat projects.

Licensing & Compatibility

The project appears to be open-source, but the specific license is not explicitly stated in the README. Compatibility for commercial use would require verification of underlying dependencies and the project's own license.

Limitations & Caveats

Fine-tuning is currently limited to Linux. macOS support for flash-attn and bitsandbytes is not guaranteed, and installing these libraries on Windows may require manual downloads. Other advanced features, such as specific model quantizations, may likewise carry platform-specific requirements or limitations.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 4 stars in the last 90 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Nat Friedman (former CEO of GitHub), and 32 more.

llama.cpp by ggml-org

  • C/C++ library for local LLM inference
  • 84k stars (0.4%)
  • created 2 years ago, updated 19 hours ago