llama-node by Atome-FE

Node.js library for local LLM inference

Created 2 years ago
873 stars

Top 41.1% on SourcePulse

View on GitHub
Project Summary

LLaMA Node provides a Node.js interface for running various large language models (LLMs) locally on consumer hardware, including CPUs. It targets Node.js developers seeking to integrate LLM capabilities into their applications without relying on cloud APIs, enabling offline inference for models like LLaMA, Alpaca, Vicuna, and RWKV.

How It Works

The library leverages native Node.js addons (N-API) to bridge JavaScript with three inference engines: llama.cpp, llm (a Rust inference library formerly known as llama-rs), and rwkv.cpp. This architecture executes LLM inference directly within the Node.js process, offloading the heavy computation to the native backends. N-API lets the computationally intensive inference threads run alongside the Node.js event loop without blocking it.
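
From JavaScript, this surfaces as an ordinary async API: the addon runs inference on native threads and streams tokens back through a callback while the event loop stays responsive. A minimal sketch, patterned on the README's llama.cpp example; option names such as nCtx and nTokPredict may differ between versions:

    import path from "path";
    import { LLM } from "llama-node";
    import { LLamaCpp } from "llama-node/dist/llm/llama-cpp.js";

    const llama = new LLM(LLamaCpp);

    // Loading parses the GGML weights inside the native backend.
    await llama.load({
        modelPath: path.resolve(process.cwd(), "ggml-vic7b-q5_1.bin"),
        enableLogging: false,
        nCtx: 1024, // context window, in tokens
    });

    // Tokens arrive via the callback as the native threads produce them;
    // the Node.js event loop is not blocked during generation.
    await llama.createCompletion(
        { prompt: "USER: What is a llama?\nASSISTANT:", nThreads: 4, nTokPredict: 128, temp: 0.2 },
        (response) => process.stdout.write(response.token),
    );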

Quick Start & Requirements

  • Install the core package: npm install llama-node
  • Install at least one backend (selection is sketched after this list):
    • llama.cpp: npm install @llama-node/llama-cpp
    • llm: npm install @llama-node/core
    • rwkv.cpp: npm install @llama-node/rwkv-cpp
  • Supported platforms: macOS (x64, arm64), Linux (glibc >= 2.31), Windows (x64).
  • Node.js version: >= 16.
  • CUDA support requires manual compilation.
  • Official documentation: [link not provided in README]
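
Switching backends amounts to passing a different adapter class to the LLM wrapper. A hedged sketch of backend selection; the llm and rwkv.cpp import paths and the tokenizerPath option are assumed by analogy with the llama.cpp backend and may not match every release:

    import { LLM } from "llama-node";
    import { LLamaCpp } from "llama-node/dist/llm/llama-cpp.js"; // backed by @llama-node/llama-cpp
    import { LLMRS } from "llama-node/dist/llm/llm-rs.js";       // backed by @llama-node/core (assumed path)
    import { Rwkv } from "llama-node/dist/llm/rwkv-cpp.js";      // backed by @llama-node/rwkv-cpp (assumed path)

    // Same wrapper, different native engine underneath.
    const llama = new LLM(LLamaCpp); // or: new LLM(LLMRS)

    // RWKV models also need a tokenizer file alongside the GGML weights.
    const rwkv = new LLM(Rwkv);
    await rwkv.load({
        modelPath: "./rwkv-4-raven-7b-q4_0.bin",
        tokenizerPath: "./20B_tokenizer.json", // assumed option name
        nThreads: 4,
        enableLogging: false,
    });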

Highlighted Details

  • Supports a wide range of GGML-formatted models including LLaMA, Alpaca, Vicuna, GPT4All, and RWKV.
  • Utilizes N-API for efficient in-process communication between the Node.js event loop and native inference threads.
  • Offers multiple backend options (llama.cpp, llm, rwkv.cpp) for flexibility.
  • Enables local, CPU-based inference, democratizing access to LLMs.

Licensing & Compatibility

  • Licensed under MIT/Apache-2.0.
  • Recommends citing dependencies if code is reused.

Limitations & Caveats

The project describes itself as early-stage and not production-ready, so breaking API changes are possible. CUDA support requires manual compilation.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 1 star in the last 30 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Gabriel Almeida (Cofounder of Langflow), and 2 more.

torchchat by pytorch

Top 0.1% · 4k stars
PyTorch-native SDK for local LLM inference across diverse platforms
Created 1 year ago · Updated 1 week ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Vincent Weisser (Cofounder of Prime Intellect), and 7 more.

dalai by cocktailpeanut

Top 0% · 13k stars
Local LLM inference via CLI tool and Node.js API
Created 2 years ago · Updated 1 year ago