swiss_army_llama  by Dicklesworthstone

FastAPI service for semantic text search using precomputed embeddings

created 2 years ago
1,020 stars

Top 37.3% on sourcepulse

Project Summary

This project provides a FastAPI service for semantic text search and LLM operations, targeting developers and researchers who need to integrate local LLMs into applications. It offers a unified API for text embedding, completion, and advanced similarity measures, simplifying complex LLM workflows.

How It Works

The service uses llama_cpp for LLM inference, textract for broad file format support (including OCR for scanned documents), and Whisper for audio transcription. Embeddings are cached in SQLite for efficiency. A Rust library, fast_vector_similarity, provides similarity metrics beyond standard cosine similarity, which enables a two-step search: an initial FAISS filtering pass, followed by more nuanced similarity calculations on the filtered results. Multiple embedding pooling methods are supported to create fixed-length vectors from token embeddings.
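The two-step search described above can be sketched in plain Python. This is only an illustration of the idea, not the project's code: a brute-force cosine ranking stands in for the FAISS filtering pass, a hand-written Spearman's rho stands in for the fast_vector_similarity measures, and mean pooling is just one possible pooling method. All function names here (`mean_pool`, `two_step_search`, and so on) are hypothetical.

```python
import math

def mean_pool(token_vectors):
    """Average per-token vectors into one fixed-length vector (one possible pooling method)."""
    count = len(token_vectors)
    return [sum(column) / count for column in zip(*token_vectors)]

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared_rank
        i = j + 1
    return result

def spearman_rho(a, b):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    ra, rb = ranks(a), ranks(b)
    mean_a, mean_b = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant vector has no rank ordering to compare
    return cov / math.sqrt(var_a * var_b)

def two_step_search(query, corpus, shortlist_size=3):
    """Step 1: cheap cosine shortlist. Step 2: rerank the shortlist by Spearman's rho."""
    shortlist = sorted(
        corpus, key=lambda name: cosine(query, corpus[name]), reverse=True
    )[:shortlist_size]
    scored = [(name, spearman_rho(query, corpus[name])) for name in shortlist]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy corpus with made-up four-dimensional "embeddings".
corpus = {
    "same": [2.0, 4.0, 6.0, 8.0],
    "mixed": [1.0, 3.0, 2.0, 4.0],
    "reversed": [4.0, 3.0, 2.0, 1.0],
    "noise": [4.0, 1.0, 1.0, 0.0],
}
results = two_step_search([1.0, 2.0, 3.0, 4.0], corpus, shortlist_size=2)
```

The design point is that the expensive measure only runs on the shortlist, so the cost of the second step depends on `shortlist_size` rather than on the size of the whole corpus.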

Quick Start & Requirements

  • Docker Install: sudo ./setup_dockerized_app_on_fresh_machine.sh (installs Docker and the app)
  • Native Install: Requires build-essential, libxml2-dev, libxslt1-dev, antiword, unrtf, poppler-utils, pstotext, tesseract-ocr, flac, ffmpeg, lame, libmad0, libsox-fmt-mp3, sox, libjpeg-dev, swig, redis-server, libpoppler-cpp-dev, pkg-config.
  • Python Dependencies: listed in requirements.txt, including llama-cpp-python, fastapi, faster-whisper, faiss-cpu, fast_vector_similarity, redis.
  • Access and Docs: http://localhost:8089 (Swagger UI)
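Once the service is running, one way to see which endpoints it exposes without opening the browser is to read the OpenAPI schema that backs the Swagger UI. The sketch below assumes the service keeps FastAPI's default schema location, `/openapi.json`; that path is an assumption and has not been confirmed for this project. The helper names `fetch_schema` and `list_endpoints` are hypothetical.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}

def fetch_schema(base_url="http://localhost:8089", timeout=5.0):
    """Download the OpenAPI schema (assumes FastAPI's default /openapi.json path)."""
    with urlopen(f"{base_url}/openapi.json", timeout=timeout) as response:
        return json.load(response)

def list_endpoints(schema):
    """Return 'METHOD /path' strings for every operation in an OpenAPI schema dict."""
    endpoints = []
    for path, operations in sorted(schema.get("paths", {}).items()):
        for method in sorted(operations):
            if method in HTTP_METHODS:  # skip non-operation keys such as 'parameters'
                endpoints.append(f"{method.upper()} {path}")
    return endpoints

# Usage against a running instance (not executed here):
#     for line in list_endpoints(fetch_schema()):
#         print(line)
```

`list_endpoints` works on any OpenAPI schema dictionary, so it can be tried on a saved copy of the schema if the service is not reachable.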

Highlighted Details

  • Supports advanced similarity measures: Spearman's Rho, Kendall's Tau, etc.
  • Handles diverse file types (PDF, DOCX, images with OCR) and audio (Whisper transcription).
  • Optional RAM disk usage for faster model loading.
  • Includes a real-time log viewer accessible via the browser.
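To illustrate why rank-based measures such as Kendall's Tau (first bullet above) can be useful alongside cosine similarity, here is a small pure-Python sketch. It is not the fast_vector_similarity implementation; it computes the simple tau-a variant, which does not correct for ties, and the function names are hypothetical.

```python
import math
from itertools import combinations

def kendall_tau_a(a, b):
    """Kendall's tau-a: (concordant pairs - discordant pairs) / total pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        direction = (a[i] - a[j]) * (b[i] - b[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    total_pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / total_pairs

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Two vectors with the same ordering of components but very different magnitudes.
u = [1.0, 2.0, 3.0, 4.0]
v = [1.0, 4.0, 9.0, 100.0]

tau = kendall_tau_a(u, v)
cos = cosine(u, v)
```

In this example the components of `u` and `v` rise in the same order, so the rank-based score is at its maximum, while the cosine score is pulled down by the single large value in `v`. Which behavior is preferable depends on the data.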

Maintenance & Community

The project is actively maintained by the author. Community contributions are encouraged. Links to commercial web apps by the author are provided.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: Permissive license suitable for commercial use and integration into closed-source applications.

Limitations & Caveats

The setup script installs Docker via apt, which may not suit environments that do not use apt or that manage Docker another way. RAM disk setup requires specific sudo permissions and configuration. The project relies heavily on external dependencies, including system libraries and specific Python packages.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 9 stars in the last 90 days
