llama-coder by ex3ndr

VS Code extension for local AI code completion

created 1 year ago
2,033 stars

Top 22.3% on sourcepulse

1 Expert Loves This Project
Project Summary

Llama Coder offers a self-hosted, privacy-focused alternative to GitHub Copilot for VS Code users. It leverages Ollama and Code Llama models to provide local AI-powered code completion, aiming for performance comparable to commercial solutions while running on user hardware.

How It Works

The extension integrates with Ollama, a framework for running large language models locally. Users can select from various Code Llama model sizes and quantization levels, trading completion quality against speed and memory: larger models generally produce better completions, while heavier quantization lowers memory use and latency. The system is designed for flexibility, allowing users to run inference on their local machine or offload it to a dedicated server via a configurable Ollama endpoint.
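
To make the flow concrete, here is a minimal sketch of the kind of request such an extension ultimately drives: a fill-in-the-middle completion sent to a local Ollama server's /api/generate endpoint. The <PRE>/<SUF>/<MID> prompt layout follows the Code Llama infill format; llama-coder's actual internal prompting may differ, and the model name is taken from the list further down this page.

```python
import json
import urllib.request

# Assumptions: Ollama is serving on its default port and the
# codellama:7b-code-q4_K_M model (listed below) has already been pulled.
ENDPOINT = "http://localhost:11434"

def fim_complete(prefix: str, suffix: str, max_tokens: int = 64) -> str:
    """Request a fill-in-the-middle completion from Code Llama via Ollama."""
    payload = {
        "model": "codellama:7b-code-q4_K_M",
        # Code Llama infill prompt; spacing around the tags matters in practice.
        "prompt": f"<PRE> {prefix} <SUF>{suffix} <MID>",
        "raw": True,       # bypass Ollama's chat templating
        "stream": False,   # return one JSON object instead of a stream
        "options": {"num_predict": max_tokens, "stop": ["<EOT>"]},
    }
    req = urllib.request.Request(
        f"{ENDPOINT}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

if __name__ == "__main__":
    # Complete the body of a small function; the suffix anchors what follows.
    print(fim_complete("def add(a: int, b: int) -> int:\n    return ", "\n"))
```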

Quick Start & Requirements

  • Installation: Install the VS Code extension. Ensure Ollama is installed and running locally or on a remote machine.
  • Prerequisites: Ollama, VS Code. Recommended hardware includes Apple Silicon (M1/M2/M3) or NVIDIA RTX 4090 for optimal performance. Minimum 16GB RAM required; more is recommended.
  • Setup: Local setup is straightforward if Ollama is running. Remote setup requires configuring the Ollama endpoint in extension settings; a quick connectivity check is sketched after this list.
  • Documentation: https://github.com/ex3ndr/llama-coder
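
As referenced in the Setup bullet, a quick way to confirm that the extension will be able to reach Ollama, locally or on a remote machine, is to query the server's /api/tags model listing. The endpoint URL below assumes Ollama's default port:

```python
import json
import urllib.request

# Change ENDPOINT to your remote server if Ollama is not running locally.
ENDPOINT = "http://localhost:11434"

try:
    with urllib.request.urlopen(f"{ENDPOINT}/api/tags", timeout=5) as resp:
        models = [m["name"] for m in json.load(resp)["models"]]
    print("Ollama reachable; installed models:", ", ".join(models) or "(none)")
except OSError as exc:
    print(f"Cannot reach Ollama at {ENDPOINT}: {exc}")
```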

Highlighted Details

  • Aims for Copilot-level performance.
  • Supports any programming or human language.
  • No telemetry or tracking.
  • Offers remote inference capabilities.
  • Supports various Code Llama models and quantization levels (e.g., stable-code:3b-code-q4_0, codellama:7b-code-q4_K_M).
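
A listed model must be pulled into Ollama before the extension can use it. In a terminal that is simply `ollama pull stable-code:3b-code-q4_0`; the sketch below does the same through Ollama's REST API, assuming the /api/pull endpoint accepts a non-streaming JSON request:

```python
import json
import urllib.request

# Equivalent to `ollama pull stable-code:3b-code-q4_0` on the command line.
# Assumption: /api/pull accepts {"name": ..., "stream": false} and replies
# with a single status object once the download completes.
payload = {"name": "stable-code:3b-code-q4_0", "stream": False}
req = urllib.request.Request(
    "http://localhost:11434/api/pull",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp).get("status"))  # "success" once the model is local
```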

Maintenance & Community

The most recent releases added features such as pausing completions, bearer token support for remote inference, and improved Jupyter notebook support. Release activity was frequent during active development, though the health check below shows the last commit landed about a year ago.
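
For the bearer-token feature, a plausible setup (an assumption, not documented here) is a remote Ollama instance behind a reverse proxy that validates a standard Authorization header, which is what the sketch below sends; the endpoint URL and token are placeholders:

```python
import json
import urllib.request

# Placeholders: a hypothetical proxied Ollama endpoint and token. The proxy,
# not Ollama itself, is assumed to check the Authorization header.
ENDPOINT = "https://ollama.example.com"
TOKEN = "replace-with-your-token"

req = urllib.request.Request(
    f"{ENDPOINT}/api/tags",
    headers={"Authorization": f"Bearer {TOKEN}"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp))  # a model list confirms authenticated access works
```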

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Performance depends heavily on user hardware: Apple Silicon or a high-end NVIDIA GPU (e.g., RTX 4090) is recommended, and some models run slowly on older NVIDIA cards or on macOS. Support beyond the Code Llama and DeepSeek model families is not documented.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull requests opened (30d): 0
  • Issues opened (30d): 0

Star History

38 stars in the last 90 days

Explore Similar Projects

llama.cpp by ggml-org

C/C++ library for local LLM inference
84k stars · Top 0.4% on sourcepulse
Created 2 years ago · updated 22 hours ago
Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla and OpenAI; author of CS 231n), Nat Friedman (former CEO of GitHub), and 32 more.