sankalp1999/code_qa
Explore codebases with natural language RAG
Top 98.1% on SourcePulse
Summary
sankalp1999/code_qa is a RAG-powered system for querying codebases in natural language. It targets developers and researchers who need to understand complex code, providing contextual answers and interactive chat. It uses Tree-sitter for AST parsing and LanceDB for vector storage.
How It Works
The system parses codebases into abstract syntax trees (ASTs) using Tree-sitter, then indexes code chunks with OpenAI or Jina embeddings stored in LanceDB. A natural language query retrieves relevant code snippets via vector search, and an LLM such as GPT-4o generates a contextual answer from them. An optional ColBERT-based reranker can improve the relevance of the retrieved snippets.
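The chunk, embed, and search steps described above can be illustrated with a minimal, standard-library-only sketch. It is not the project's implementation: Python's built-in ast module stands in for Tree-sitter, a word-count vector stands in for OpenAI or Jina embeddings, and an in-memory list stands in for LanceDB. All function names and the sample SOURCE are hypothetical.

```python
import ast
import math
import re
from collections import Counter

# Hypothetical sample module to index (not taken from the project).
SOURCE = '''
def load_config(path):
    """Read settings from a config file."""
    return open(path).read()

def connect_database(url):
    """Open a database connection."""
    return url

class Cache:
    """Store recent query results."""
    def get(self, key):
        return None
'''

def chunk_source(source):
    """Split a module into one chunk per top-level function or class."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append((node.name, ast.get_source_segment(source, node)))
    return chunks

def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search(query, index, k=2):
    """Return the names of the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[2]), reverse=True)
    return [name for name, _, _ in ranked[:k]]

# Build the in-memory "vector store": (name, code, embedding) per chunk.
index = [(name, code, embed(code)) for name, code in chunk_source(SOURCE)]
top = search("open a database connection", index, k=1)
print(top)
```

In a real pipeline the retrieved chunks would then be passed to the LLM as context. Chunking along function and class boundaries, rather than fixed-size windows, keeps each retrieved snippet syntactically complete.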
Quick Start & Requirements
Install dependencies with pip install -r requirements.txt and run redis-server (Redis is expected at localhost:6379).
Create a .env file with OPENAI_API_KEY (required) and, optionally, JINA_API_KEY.
Index a codebase with ./index_codebase.sh <path>, start the server with python app.py <folder_path>, and open the UI at http://localhost:5001.
Highlighted Details
The feature/optimization branch offers 2.5x faster performance (10-20s worst case) through reduced HyDE token limits and enhanced context processing with SambaNova Llama 3.1 models.
Maintenance & Community
The README does not provide specific details on maintainers, community channels, or project roadmap.
Licensing & Compatibility
Licensed under the MIT License, permitting broad use and modification.
Limitations & Caveats
The primary branch's performance may differ from the claimed 2.5x speedup achieved in the feature/optimization branch. Performance is dependent on specific LLM configurations and API availability. Requires a local Redis instance and OpenAI API key for core functionality.
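Since core functionality depends on a local Redis instance and an OpenAI API key, a small preflight check can catch a missing requirement before indexing. The sketch below is illustrative only and is not part of the project: the helper names are hypothetical, it parses .env-style text by hand instead of using a dotenv library, and it only tests that a TCP connection to the Redis port can be opened.

```python
import socket

def parse_env(text):
    """Parse KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values

def missing_keys(env, required=("OPENAI_API_KEY",)):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]

def redis_reachable(host="localhost", port=6379, timeout=1.0):
    """Return True if a TCP connection to the Redis port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example .env contents with a placeholder key (JINA_API_KEY is optional).
example = parse_env("# example .env\nOPENAI_API_KEY=sk-placeholder\nJINA_API_KEY=\n")
print("missing keys:", missing_keys(example))
print("redis reachable:", redis_reachable())
```

A successful TCP connection only shows that something is listening on port 6379; it does not confirm that the listener is Redis or that the API key is valid.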
11 months ago
Inactive