autodoc by context-labs

Toolkit for auto-generating codebase documentation using LLMs

created 2 years ago
2,203 stars

Top 20.9% on sourcepulse

Project Summary

Autodoc is an experimental toolkit for automatically generating codebase documentation using Large Language Models (LLMs) like GPT-4. It indexes Git repositories by traversing files and using LLMs to create documentation, which is stored within the codebase itself. This allows developers to query their codebase for specific information and receive answers with direct code references, aiming to keep documentation synchronized with code changes via CI pipelines.

How It Works

Autodoc performs a depth-first traversal of a Git repository's contents. For each file, it calculates a token count and selects an LLM (currently only OpenAI models are supported) based on cost and context length, prioritizing GPT-4 for better accuracy. The generated documentation is stored locally in the .autodoc folder, which the CLI then uses to answer queries about the codebase.
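
The idea can be pictured with a short sketch. This is not Autodoc's actual implementation: the directory filter, the rough chars/4 token estimate, the context-window figures, and the cheapest-model-that-fits selection policy below are all illustrative assumptions.

    // Sketch: depth-first walk of a repository, picking a model per file
    // by estimated token count (illustrative only, not Autodoc's code).
    import { readdirSync, readFileSync, statSync } from "node:fs";
    import { join } from "node:path";

    // Assumed context windows; real limits depend on the OpenAI model version.
    const MODELS = [
      { name: "gpt-3.5-turbo", contextTokens: 4_096 },
      { name: "gpt-4", contextTokens: 8_192 },
    ];

    // Very rough token estimate (~4 characters per token).
    const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

    // Simplified policy: cheapest model whose context window fits the file.
    const selectModel = (tokens: number) =>
      MODELS.find((m) => tokens < m.contextTokens) ?? null;

    // Depth-first traversal, skipping common non-source directories.
    function walk(dir: string, onFile: (path: string) => void): void {
      for (const entry of readdirSync(dir)) {
        if (entry === ".git" || entry === "node_modules" || entry === ".autodoc") continue;
        const path = join(dir, entry);
        if (statSync(path).isDirectory()) walk(path, onFile);
        else onFile(path);
      }
    }

    walk(process.cwd(), (path) => {
      const tokens = estimateTokens(readFileSync(path, "utf8"));
      const model = selectModel(tokens);
      console.log(`${path}: ~${tokens} tokens -> ${model?.name ?? "too large for one call"}`);
    });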

Quick Start & Requirements

  • Install globally via npm: npm install -g @context-labs/autodoc
  • Requires Node.js v18.0.0+ (v19.0.0+ recommended).
  • Requires an OpenAI API key set as an environment variable: export OPENAI_API_KEY=<YOUR_KEY_HERE>.
  • Indexing command: doc index
  • Querying command: doc q
  • Official documentation: https://github.com/context-labs/autodoc
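
To keep the generated docs in sync with code changes (the CI use case mentioned above), the documented commands can be wired into a small script. Only the CLI commands and the OPENAI_API_KEY variable come from the quick start; the script itself is an assumption, and whether doc index runs fully non-interactively in CI is not covered here.

    // Sketch of automating re-indexing with the documented commands.
    import { execSync } from "node:child_process";

    if (!process.env.OPENAI_API_KEY) {
      throw new Error("OPENAI_API_KEY must be set before indexing");
    }

    // Install the CLI and re-index the repository so .autodoc stays current.
    execSync("npm install -g @context-labs/autodoc", { stdio: "inherit" });
    execSync("doc index", { stdio: "inherit" });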

Highlighted Details

  • Generates documentation stored directly within the codebase.
  • Supports querying the codebase via a CLI tool.
  • Estimates indexing costs before execution.
  • Future support planned for self-hosted models (Llama, Alpaca) and a web version.

Maintenance & Community

  • Active development with a core team.
  • Community channels: Discord, Twitter.
  • Open to contributions.

Licensing & Compatibility

  • License: Not explicitly stated in the README.
  • Compatibility: Primarily targets Node.js environments. Requires OpenAI API access.

Limitations & Caveats

Autodoc is in early development and not production-ready. The README notes that response quality can vary, and a "naive model selection strategy" may use less accurate GPT-3.5 for smaller files. Indexing large projects can be costly, with estimates in the hundreds of dollars.

Health Check

  • Last commit: 11 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 83 stars in the last 90 days
