codeqai by fynnfluegge

CLI tool for local semantic code search and chat

created 1 year ago
489 stars

Top 64.0% on sourcepulse

Project Summary

This project provides a local-first solution for semantic code search and chat, and can generate datasets from your code for fine-tuning custom copilots. It targets developers seeking private, efficient code analysis and interaction tools.

How It Works

The system parses codebases using Tree-sitter for accurate syntax analysis, generating embeddings with Sentence-Transformers, Instructor-Embedding, or OpenAI models. These embeddings are stored in a FAISS vector database for fast semantic search. For chat functionality, it integrates with llama.cpp or Ollama for local LLM inference, or supports OpenAI/Azure OpenAI/Anthropic APIs. Synchronization with Git ensures the vector store remains up-to-date with code changes.
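The embed-store-search loop described above can be illustrated with a toy sketch. Here a simple bag-of-words vector stands in for the real Sentence-Transformers embeddings and brute-force cosine similarity stands in for the FAISS index; the function names are illustrative, not codeqai's API:

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words frequency vector over lowercase words.
    codeqai would use Sentence-Transformers, Instructor, or OpenAI here."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# "Index" some code chunks (codeqai extracts chunks with Tree-sitter
# and stores their embeddings in a FAISS index on disk).
chunks = [
    "def read_config(path): return json.load(open(path))",
    "def connect_db(url): return psycopg2.connect(url)",
    "def render_template(name, ctx): return env.get_template(name).render(ctx)",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

def search(query, k=1):
    """Return the k chunks most similar to the natural-language query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

print(search("load a config file from path"))
```

The same shape scales up in the real tool: better embeddings make semantically related code (not just keyword matches) rank highest, and FAISS makes the nearest-neighbor lookup fast over large codebases.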

Quick Start & Requirements

  • Install via pipx install codeqai.
  • Requires Python >=3.9, <3.12.
  • faiss-cpu or faiss-gpu (recommended for CUDA 7.5+) must be installed.
  • Local embeddings require sentence-transformers or instructor; local LLM inference requires llama.cpp or Ollama.
  • Remote model usage requires API keys (OpenAI, Azure OpenAI, Anthropic).
  • Initial indexing may take time.
  • See Troubleshooting for installation issues.
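A typical first session might look like the following (assuming a Python 3.9–3.11 interpreter and pipx are available; command names follow the project's CLI but should be checked against codeqai --help):

```shell
# Install the CLI into an isolated environment (Python >=3.9, <3.12)
pipx install codeqai

# Choose embeddings/LLM backends and set API keys on first run
codeqai configure

# Index the current repository (first run) and search it semantically
codeqai search

# Chat with the codebase using the configured LLM
codeqai chat
```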

Highlighted Details

  • Supports dataset generation for fine-tuning in Alpaca, conversational, instruction, or completion formats.
  • Integrates with Tree-sitter for parsing multiple languages including Python, TypeScript, JavaScript, Java, Rust, Kotlin, Go, C++, C, C#, and Ruby.
  • Offers 100% local processing for embeddings and LLMs, ensuring data privacy.
  • Provides a Streamlit UI for an interactive experience.
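For the dataset-generation feature, the Alpaca format mentioned above uses standard instruction/input/output triples. The record below is a hand-written illustration of that schema, not codeqai's actual output:

```python
import json

# A standard Alpaca-format record: instruction/input/output triples.
# codeqai can emit fine-tuning datasets in this shape; the generated
# records pair prompts with code from your own repository.
record = {
    "instruction": "Write a Python function that returns the nth Fibonacci number.",
    "input": "",
    "output": (
        "def fib(n):\n"
        "    a, b = 0, 1\n"
        "    for _ in range(n):\n"
        "        a, b = b, a + b\n"
        "    return a"
    ),
}

# Fine-tuning datasets are typically stored as JSON Lines, one record per line.
line = json.dumps(record)
print(line)
```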

Maintenance & Community

The project is actively maintained with CI/CD pipelines for build and publish. Contributions are welcomed via issues or pull requests. Development can be managed with Conda or Poetry.

Licensing & Compatibility

Licensed under Apache 2.0, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

faiss-cpu wheels are not available for Python 3.12, so an earlier Python version is required for installation. Search and chat quality depends on how well the code is documented. llama.cpp requires models in GGUF format to be downloaded in advance.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 14 stars in the last 90 days
