llama-coder by ex3ndr

VS Code extension for local AI code completion

created 1 year ago
2,033 stars

Top 22.3% on sourcepulse

1 Expert Loves This Project
Project Summary

Llama Coder offers a self-hosted, privacy-focused alternative to GitHub Copilot for VS Code users. It leverages Ollama and Code Llama models to provide local AI-powered code completion, aiming for performance comparable to commercial solutions while running on user hardware.

How It Works

The extension integrates with Ollama, a framework for running large language models locally. Users can select from various Code Llama model sizes and quantization levels, trading completion quality against speed and memory: larger models generally produce better completions, while heavier quantization lowers memory use and latency. The system is designed for flexibility, allowing users to run inference on their local machine or offload it to a dedicated server via a configurable Ollama endpoint.
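
To make the flow concrete, here is a minimal sketch of the kind of request such an extension ultimately drives: a fill-in-the-middle completion sent to a local Ollama server's /api/generate endpoint. The <PRE>/<SUF>/<MID> prompt layout follows the Code Llama infill format; llama-coder's actual internal prompting may differ, and the model name is taken from the list further down this page.

```python
import json
import urllib.request

# Assumptions: Ollama is serving on its default port and the
# codellama:7b-code-q4_K_M model (listed below) has already been pulled.
ENDPOINT = "http://localhost:11434"

def fim_complete(prefix: str, suffix: str, max_tokens: int = 64) -> str:
    """Request a fill-in-the-middle completion from Code Llama via Ollama."""
    payload = {
        "model": "codellama:7b-code-q4_K_M",
        # Code Llama infill prompt; spacing around the tags matters in practice.
        "prompt": f"<PRE> {prefix} <SUF>{suffix} <MID>",
        "raw": True,       # bypass Ollama's chat templating
        "stream": False,   # return one JSON object instead of a stream
        "options": {"num_predict": max_tokens, "stop": ["<EOT>"]},
    }
    req = urllib.request.Request(
        f"{ENDPOINT}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

if __name__ == "__main__":
    # Complete the body of a small function; the suffix anchors what follows.
    print(fim_complete("def add(a: int, b: int) -> int:\n    return ", "\n"))
```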

Quick Start & Requirements

  • Installation: Install the VS Code extension. Ensure Ollama is installed and running locally or on a remote machine.
  • Prerequisites: Ollama, VS Code. Recommended hardware includes Apple Silicon (M1/M2/M3) or NVIDIA RTX 4090 for optimal performance. Minimum 16GB RAM required; more is recommended.
  • Setup: Local setup is straightforward if Ollama is running. Remote setup requires configuring the Ollama endpoint in extension settings; a quick connectivity check is sketched after this list.
  • Documentation: https://github.com/ex3ndr/llama-coder
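
As referenced in the Setup bullet, a quick way to confirm that the extension will be able to reach Ollama, locally or on a remote machine, is to query the server's /api/tags model listing. The endpoint URL below assumes Ollama's default port:

```python
import json
import urllib.request

# Change ENDPOINT to your remote server if Ollama is not running locally.
ENDPOINT = "http://localhost:11434"

try:
    with urllib.request.urlopen(f"{ENDPOINT}/api/tags", timeout=5) as resp:
        models = [m["name"] for m in json.load(resp)["models"]]
    print("Ollama reachable; installed models:", ", ".join(models) or "(none)")
except OSError as exc:
    print(f"Cannot reach Ollama at {ENDPOINT}: {exc}")
```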

Highlighted Details

  • Aims for Copilot-level performance.
  • Supports any programming or human language.
  • No telemetry or tracking.
  • Offers remote inference capabilities.
  • Supports various Code Llama models and quantization levels (e.g., stable-code:3b-code-q4_0, codellama:7b-code-q4_K_M).
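
A listed model must be pulled into Ollama before the extension can use it. In a terminal that is simply `ollama pull stable-code:3b-code-q4_0`; the sketch below does the same through Ollama's REST API, assuming the /api/pull endpoint accepts a non-streaming JSON request:

```python
import json
import urllib.request

# Equivalent to `ollama pull stable-code:3b-code-q4_0` on the command line.
# Assumption: /api/pull accepts {"name": ..., "stream": false} and replies
# with a single status object once the download completes.
payload = {"name": "stable-code:3b-code-q4_0", "stream": False}
req = urllib.request.Request(
    "http://localhost:11434/api/pull",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp).get("status"))  # "success" once the model is local
```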

Maintenance & Community

The most recent releases added features such as pausing completions, bearer token support for remote inference, and improved Jupyter notebook support. Release activity was frequent during active development, though the health check below shows the last commit landed about a year ago.
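
For the bearer-token feature, a plausible setup (an assumption, not documented here) is a remote Ollama instance behind a reverse proxy that validates a standard Authorization header, which is what the sketch below sends; the endpoint URL and token are placeholders:

```python
import json
import urllib.request

# Placeholders: a hypothetical proxied Ollama endpoint and token. The proxy,
# not Ollama itself, is assumed to check the Authorization header.
ENDPOINT = "https://ollama.example.com"
TOKEN = "replace-with-your-token"

req = urllib.request.Request(
    f"{ENDPOINT}/api/tags",
    headers={"Authorization": f"Bearer {TOKEN}"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp))  # a model list confirms authenticated access works
```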

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Performance depends heavily on user hardware: Apple Silicon or a high-end NVIDIA GPU (e.g., RTX 4090) is recommended, and some models run slowly on older NVIDIA cards or on macOS. Support beyond the Code Llama and DeepSeek model families is not documented.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull requests opened (30d): 0
  • Issues opened (30d): 0

Star History

38 stars in the last 90 days

Explore Similar Projects

llama.cpp by ggml-org

C/C++ library for local LLM inference
84k stars · Top 0.4% on sourcepulse
Created 2 years ago · updated 22 hours ago
Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla and OpenAI; author of CS 231n), Nat Friedman (former CEO of GitHub), and 32 more.