codeqai by fynnfluegge

CLI tool for local semantic code search and chat

Created 2 years ago
488 stars

Top 63.2% on SourcePulse

Project Summary

This project provides a local-first solution for semantic code search and chat, enabling users to build custom copilots by fine-tuning models with code datasets. It targets developers seeking private, efficient code analysis and interaction tools.

How It Works

The system parses codebases using Tree-sitter for accurate syntax analysis, generating embeddings with Sentence-Transformers, Instructor-Embedding, or OpenAI models. These embeddings are stored in a FAISS vector database for fast semantic search. For chat functionality, it integrates with llama.cpp or Ollama for local LLM inference, or supports OpenAI/Azure OpenAI/Anthropic APIs. Synchronization with Git ensures the vector store remains up-to-date with code changes.
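The index-then-search flow described above can be sketched in a few lines. This is a toy illustration, not codeqai's actual implementation: a bag-of-words counter stands in for the Sentence-Transformers embedder, and a brute-force L2 scan stands in for the FAISS index; all names and snippets are invented.

```python
import numpy as np

# Toy vocabulary and embedder standing in for a real embedding model.
VOCAB = ["parse", "tree", "embed", "vector", "search", "chat", "llm", "git"]

def toy_embed(text: str) -> np.ndarray:
    """Stand-in embedder: count vocabulary words in the snippet."""
    words = text.lower().split()
    return np.array([words.count(w) for w in VOCAB], dtype=np.float32)

# 1. "Index" the codebase: embed each code snippet.
#    (codeqai would store these vectors in a FAISS index instead.)
snippets = [
    "def parse(tree): walk the tree and yield nodes",
    "def embed(text): return vector for semantic search",
    "def chat(llm, prompt): stream llm reply",
]
index = np.stack([toy_embed(s) for s in snippets])

# 2. Query: embed the question and return the nearest snippets by L2
#    distance, mirroring what a FAISS flat index does internally.
def search(query: str, k: int = 1) -> list[str]:
    q = toy_embed(query)
    dists = np.linalg.norm(index - q, axis=1)
    return [snippets[i] for i in np.argsort(dists)[:k]]

print(search("vector search over embeddings")[0])
```

The same two-phase shape (embed everything once, then embed each query and compare) is why the tool needs an initial indexing pass and a Git-driven sync step to keep the vector store current.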

Quick Start & Requirements

  • Install via `pipx install codeqai`.
  • Requires Python >=3.9, <3.12.
  • faiss-cpu or faiss-gpu (recommended for CUDA 7.5+) must be installed.
  • Local LLM usage requires sentence-transformers, instructor, or llama.cpp.
  • Remote model usage requires API keys (OpenAI, Azure OpenAI, Anthropic).
  • Initial indexing may take time.
  • See Troubleshooting for installation issues.

Highlighted Details

  • Supports dataset generation for fine-tuning in Alpaca, conversational, instruction, or completion formats.
  • Integrates with Tree-sitter for parsing multiple languages including Python, TypeScript, JavaScript, Java, Rust, Kotlin, Go, C++, C, C#, and Ruby.
  • Offers 100% local processing for embeddings and LLMs, ensuring data privacy.
  • Provides a Streamlit UI for an interactive experience.
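Of the dataset formats listed above, Alpaca is the most common: each example pairs an instruction with an optional input and the expected output. A sketch of one such training example follows; the field names are the standard Alpaca schema, but the values are invented placeholders, not actual codeqai output.

```python
import json

# One training example in the Alpaca format (instruction / input / output).
# Field values here are illustrative only.
example = {
    "instruction": "Explain what this function does.",
    "input": "def add(a, b):\n    return a + b",
    "output": "The function returns the sum of its two arguments.",
}

# Fine-tuning datasets are commonly stored as JSON Lines, one example per line.
line = json.dumps(example)
print(line)
```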

Maintenance & Community

The project is actively maintained with CI/CD pipelines for build and publish. Contributions are welcomed via issues or pull requests. Development can be managed with Conda or Poetry.

Licensing & Compatibility

Licensed under Apache 2.0, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

faiss-cpu wheels are not available for Python 3.12, so an earlier Python version is required for installation. Search and chat quality improves when the codebase is well documented. llama.cpp inference requires models in GGUF format to be downloaded beforehand.

Health Check

  • Last Commit: 7 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 1 star in the last 30 days
