secret-llama by abi

In-browser chatbot for local, private LLM inference

Created 1 year ago

2,675 stars

Top 17.5% on SourcePulse

1 Expert Loves This Project

Omar Sanseviero

DevRel at Google DeepMind

Project Summary

This project provides a fully private, in-browser Large Language Model (LLM) chatbot, enabling users to interact with models like Llama 3 and Mistral without any server-side components. It targets users seeking privacy and offline LLM capabilities, offering a ChatGPT-like interface directly within their web browser.

How It Works

The chatbot runs on the WebLLM inference engine, which executes LLMs directly in the browser using WebGPU. Because no server infrastructure is involved, all conversation data stays on the user's local machine. It supports quantized models, which lowers the memory footprint so that larger models fit within browser constraints.
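To see why quantization matters for in-browser use, a back-of-the-envelope estimate of weight memory helps. The sketch below is illustrative only: the function names are hypothetical, the 8-billion parameter count is an assumed round figure, and the estimate ignores the KV cache, activations, and quantization metadata, so real downloads and runtime usage will differ.

```typescript
// Rough weight-memory estimate for a quantized model (illustrative only).
// Ignores the KV cache, activations, and per-group quantization metadata.
function estimateWeightBytes(paramCount: number, bitsPerWeight: number): number {
  if (paramCount <= 0 || bitsPerWeight <= 0) {
    throw new RangeError("paramCount and bitsPerWeight must be positive");
  }
  // Each weight takes bitsPerWeight bits; 8 bits make one byte.
  return Math.ceil((paramCount * bitsPerWeight) / 8);
}

// Format a byte count in decimal gigabytes with one decimal place.
function toGB(bytes: number): string {
  return (bytes / 1e9).toFixed(1) + " GB";
}

const params = 8e9; // assumed parameter count for an 8B-class model
console.log("16-bit:", toGB(estimateWeightBytes(params, 16))); // 16.0 GB
console.log(" 4-bit:", toGB(estimateWeightBytes(params, 4)));  // 4.0 GB
```

Under these assumptions, 4-bit weights take roughly a quarter of the space of 16-bit weights, which is the difference between a model that is impractical in a browser tab and one that may fit.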

Quick Start & Requirements

Install/Run: No installation required; run directly via the provided web link. For local development: yarn install, then yarn dev.
Prerequisites: Modern browser with WebGPU support (Chrome, Edge; Firefox/Safari require manual enabling).
Resources: Model downloads range from 600MB to 4.3GB; the device needs enough free memory to hold the chosen model.
Links: Try it out: https://secret-llama.vercel.app/ | Discord: https://discord.gg/QkVzykMc9V
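Since WebGPU is the one hard prerequisite, a page can check for it before offering a multi-gigabyte model download. The following is a minimal sketch, not the project's own code: `GpuCapableNavigator`, `exposesWebGPU`, and `canUseWebGPU` are hypothetical names, and the interface describes only the small part of the WebGPU API (`navigator.gpu.requestAdapter()`) that the check relies on.

```typescript
// Minimal structural type so the check can run against a real `navigator`
// or a stand-in object; only the part of the WebGPU API used here is described.
interface GpuCapableNavigator {
  gpu?: { requestAdapter(): Promise<unknown> };
}

// Synchronous check: does the browser expose the WebGPU entry point at all?
function exposesWebGPU(nav: GpuCapableNavigator): boolean {
  return nav.gpu != null;
}

// Asynchronous check: the entry point can exist while no usable adapter does,
// in which case requestAdapter() resolves to null.
async function canUseWebGPU(nav: GpuCapableNavigator): Promise<boolean> {
  const gpu = nav.gpu;
  if (gpu == null) return false;
  try {
    const adapter = await gpu.requestAdapter();
    return adapter != null;
  } catch {
    // Treat any failure while requesting an adapter as "not usable".
    return false;
  }
}
```

In a page, the global `navigator` object would be passed in. If the check fails, the interface can point the user to a browser with WebGPU enabled instead of starting a download that cannot run.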

Highlighted Details

Supports Llama 3 and Mistral 7B Instruct models.
Fully private: no data leaves the user's computer.
Works offline once models are downloaded.
Easy-to-use interface comparable to ChatGPT.

Maintenance & Community

The project is seeking contributors for interface improvements, model support, and bug fixes. A Discord server is available for community interaction.

Licensing & Compatibility

The repository's license is not explicitly stated in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

WebGPU support is required, which may mean enabling it manually in some browsers, such as Firefox and Safari. Performance and model availability depend on the user's hardware and browser capabilities.

Health Check

Last Commit: 1 year ago
Responsiveness: Inactive

Star History

8 stars in the last 30 days