VectorCode by Davidyz

CLI tool for code repository indexing to enhance LLM prompting

created 6 months ago
580 stars

Top 56.6% on sourcepulse

Project Summary

VectorCode is a command-line tool and Neovim plugin designed to enhance Large Language Model (LLM) interactions with codebases. It indexes code repositories, allowing users to programmatically inject relevant context into LLM prompts, thereby improving output quality and reducing hallucinations for closed-source, niche, or cutting-edge projects.
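To make the idea of injecting context into a prompt concrete, here is a minimal sketch in Python. It is not VectorCode's API: `RetrievedChunk`, `build_prompt`, and the `max_chars` budget are hypothetical names chosen for illustration, and the prompt wording is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RetrievedChunk:
    """Hypothetical container for one retrieved piece of code."""
    path: str
    text: str


def build_prompt(question: str, chunks: list[RetrievedChunk], max_chars: int = 4000) -> str:
    """Prepend retrieved code to a question, within a character budget."""
    parts = []
    used = 0
    for chunk in chunks:
        block = f"# File: {chunk.path}\n{chunk.text}\n"
        if used + len(block) > max_chars:
            break  # stop once the next chunk would exceed the budget
        parts.append(block)
        used += len(block)
    context = "\n".join(parts)
    return f"Relevant code from the repository:\n\n{context}\nQuestion: {question}"
```

A character budget is only a rough proxy for a model's token limit; a real integration would likely count tokens instead.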

How It Works

VectorCode stores embeddings of code chunks in a vector database (ChromaDB). It supports several embedding engines, with SentenceTransformer as the default. Features include configurable chunking, metadata inclusion, and syntax-aware chunking via py-tree-sitter. It also respects .gitignore and uses project root anchors to detect the project root more reliably.
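The store-then-query flow described above can be sketched with the standard library alone. This is a toy under stated assumptions: normalised token counts stand in for a SentenceTransformer model, an in-memory list stands in for ChromaDB, and `embed`, `cosine`, and `ToyIndex` are hypothetical names, not part of VectorCode.

```python
import math
import re
from collections import Counter


def embed(text: str) -> dict[str, float]:
    """Toy embedding: L2-normalised token counts (stand-in for a real model)."""
    counts = Counter(re.findall(r"[A-Za-z_]\w*", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()} if norm else {}


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Dot product of two unit-length sparse vectors."""
    return sum(w * b.get(tok, 0.0) for tok, w in a.items())


class ToyIndex:
    """In-memory stand-in for a vector database collection."""

    def __init__(self) -> None:
        self._items: list[tuple[str, str, dict[str, float]]] = []

    def add(self, path: str, chunk: str) -> None:
        # Store the chunk alongside its embedding and source path (metadata).
        self._items.append((path, chunk, embed(chunk)))

    def query(self, text: str, k: int = 1) -> list[tuple[str, str]]:
        # Rank stored chunks by similarity to the query and keep the top k.
        q = embed(text)
        ranked = sorted(self._items, key=lambda it: cosine(q, it[2]), reverse=True)
        return [(path, chunk) for path, chunk, _ in ranked[:k]]


index = ToyIndex()
index.add("io.py", "def read_config(path): return open(path).read()")
index.add("math.py", "def add_numbers(a, b): return a + b")
best_path, best_chunk = index.query("read_config path", k=1)[0]
```

Token counts only match exact identifiers; a learned embedding model is what lets a real index match a natural-language question to semantically related code.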

Quick Start & Requirements

Highlighted Details

  • Syntax-aware chunking via py-tree-sitter.
  • .gitignore support for indexing.
  • Neovim plugin integration for seamless workflow.
  • Configurable chunk size and document selection.
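As an illustration of what syntax-aware chunking means, the sketch below splits Python source at top-level statement boundaries so that functions and classes stay whole. It uses Python's built-in `ast` module as a stand-in for py-tree-sitter, so it only handles Python input; `chunk_python_source` is a hypothetical name and this is not VectorCode's implementation.

```python
import ast


def chunk_python_source(source: str) -> list[str]:
    """Split Python source into one chunk per top-level statement.

    Functions and classes are kept intact rather than being cut at
    arbitrary character offsets.
    """
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        # lineno / end_lineno are 1-based and inclusive (Python 3.8+).
        start = node.lineno
        # Decorators sit above the node's own lineno, so include them.
        for deco in getattr(node, "decorator_list", []):
            start = min(start, deco.lineno)
        chunks.append("\n".join(lines[start - 1 : node.end_lineno]))
    return chunks
```

A fuller version would presumably also split oversized definitions and merge very small neighbouring statements to respect a configured chunk size.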

Maintenance & Community

The project acknowledges contributions from @milanglacier and @olimorris. Discussions are available for questions and usage sharing.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README. Users should verify licensing for commercial or closed-source integration.

Limitations & Caveats

The project is of beta quality, with only basic retrieval and embedding functionality. Remote ChromaDB support with authentication is not yet implemented, and there is currently no way to view or delete individual files within a collection without re-vectorizing the entire project.

Health Check

Last commit: 16 hours ago
Responsiveness: 1 day
Pull Requests (30d): 19
Issues (30d): 9

Star History

264 stars in the last 90 days

Explore Similar Projects

repomix by yamadashy

CLI tool to pack codebases into AI-friendly formats for LLMs

created 1 year ago, updated 5 days ago
18k stars

Top 0.8% on sourcepulse

Starred by David Cournapeau (author of scikit-learn), Chip Huyen (author of AI Engineering and Designing Machine Learning Systems), and 6 more.