RepoToTextForLLMs by Doriandarko

Python script for LLM-driven GitHub repo analysis

Created 1 year ago

781 stars

Top 45.0% on SourcePulse

Project Summary

This Python script prepares GitHub repositories for analysis by Large Language Models (LLMs). It extracts the README, the repository structure, and the contents of non-binary files, then produces structured output with pre-formatted prompts to support a thorough evaluation of the repository. It targets developers and researchers working with LLMs.

How It Works

The script maps the repository structure with an iterative traversal rather than a recursive one, so deeply nested repositories do not run into recursion limits. It extracts content only from text files and skips binary files, which keeps processing efficient and the output focused on analyzable data.
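The iterative traversal described above can be sketched with an explicit queue. This is a minimal illustration, not the script's actual code: `walk_iteratively`, the `(path, kind)` entry shape, and the in-memory `FAKE_REPO` are all hypothetical names chosen so the example runs without network access.

```python
from collections import deque
from typing import Callable, Iterable, List, Tuple

# Assumed entry shape: (path, kind), where kind is "dir" or "file".
Entry = Tuple[str, str]


def walk_iteratively(list_dir: Callable[[str], Iterable[Entry]], root: str = "") -> List[str]:
    """Collect every file path under root using a queue instead of recursion."""
    pending = deque([root])  # directories still to be listed
    files: List[str] = []
    while pending:
        current = pending.popleft()
        for path, kind in list_dir(current):
            if kind == "dir":
                pending.append(path)  # defer the subdirectory; no nested call
            else:
                files.append(path)
    return files


# Hypothetical in-memory repository standing in for the GitHub API.
FAKE_REPO = {
    "": [("README.md", "file"), ("src", "dir")],
    "src": [("src/main.py", "file"), ("src/utils", "dir")],
    "src/utils": [("src/utils/io.py", "file")],
}


def fake_list_dir(path: str) -> List[Entry]:
    return FAKE_REPO.get(path, [])


if __name__ == "__main__":
    print(walk_iteratively(fake_list_dir))
```

Because pending directories live in a queue rather than on the call stack, nesting depth is bounded only by memory. In a real run, `list_dir` would wrap whatever call lists a directory through the GitHub API.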

Quick Start & Requirements

Install: pip install PyGithub tqdm
Prerequisites: Python and a GitHub Personal Access Token, set as the GITHUB_TOKEN environment variable.
Usage: Run python repototxt.py and enter the repository URL.
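As a rough sketch of the setup steps above, the following shows one way to read the token from the environment and turn a repository URL into an `owner/name` string. The helper names `require_token` and `parse_repo_url` are hypothetical and are not taken from the script.

```python
import os
from typing import Mapping
from urllib.parse import urlparse


def require_token(env: Mapping[str, str] = os.environ) -> str:
    """Return the GITHUB_TOKEN value, or fail with a clear message."""
    token = env.get("GITHUB_TOKEN", "").strip()
    if not token:
        raise RuntimeError("Set the GITHUB_TOKEN environment variable first.")
    return token


def parse_repo_url(url: str) -> str:
    """Extract 'owner/name' from a GitHub repository URL (assumed format)."""
    parsed = urlparse(url.strip())
    parts = [p for p in parsed.path.split("/") if p]
    if parsed.netloc.lower() not in ("github.com", "www.github.com") or len(parts) < 2:
        raise ValueError(f"Not a GitHub repository URL: {url!r}")
    owner, name = parts[0], parts[1]
    if name.endswith(".git"):
        name = name[: -len(".git")]
    return f"{owner}/{name}"


if __name__ == "__main__":
    print(parse_repo_url("https://github.com/Doriandarko/RepoToTextForLLMs"))
```

The `owner/name` form is the one PyGithub's `get_repo` accepts, so a string like this could be passed to it once a client has been created with the token.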

Highlighted Details

README retrieval for initial insights.
Structured repository traversal without recursion limits.
Selective extraction of text file contents, skipping binaries.
Outputs include analysis prompts for LLM guidance.
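The summary does not say how binaries are detected, so the following is only one common heuristic, shown as an assumption: treat a file as binary if its first bytes contain a NUL byte or are not valid UTF-8. The function name `looks_binary` and the 1024-byte sample size are illustrative choices.

```python
import codecs


def looks_binary(data: bytes, sample_size: int = 1024) -> bool:
    """Heuristic: NUL bytes or invalid UTF-8 in the leading sample imply binary."""
    sample = data[:sample_size]
    if b"\x00" in sample:
        return True
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multi-byte character cut off at the sample edge.
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return True
    return False


if __name__ == "__main__":
    print(looks_binary(b"\x89PNG\r\n\x1a\n\x00\x00"))  # PNG header bytes
    print(looks_binary("plain text, héllo".encode("utf-8")))
```

A heuristic like this misclassifies text stored in other encodings such as UTF-16, so an extension-based skip list is a reasonable alternative or complement.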

Maintenance & Community

Contributions are welcomed via pull requests and issue reporting.

Licensing & Compatibility

Licensed under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

The script requires a GitHub Personal Access Token to run. Its effectiveness depends on the quality and format of the files in the repository.

Health Check

Last Commit: 1 year ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 0

Star History

6 stars in the last 30 days

Explore Similar Projects

Starred by Elie Bursztein (Cybersecurity Lead at Google DeepMind).

codebase-digest by kamilstanuch

CLI tool for LLM-assisted codebase analysis

Created 1 year ago

Updated 1 year ago

mcp-git-ingest by adhikasp

GitHub repository reader and analyzer

Created 1 year ago

Updated 11 months ago

git2txt by addyosmani

CLI tool to convert GitHub repos to text files for LLMs

Created 1 year ago

Updated 1 year ago

RepoToText by JeremiahPetersen

Web app for LLM-based repo analysis

Created 2 years ago

Updated 1 year ago

Starred by Edward Sun (Research Scientist at Meta Superintelligence Lab).

github2file by QuixiAI

CLI tool for extracting GitHub repo code to a single file

Created 1 year ago

Updated 11 months ago

RepoAgent by OpenBMB

LLM-powered tool for repo-level code documentation generation

Created 2 years ago

Updated 1 year ago

rendergit by karpathy

Flatten GitHub repos into a single HTML page

Created 4 months ago

Updated 4 months ago

Starred by Jesse Clark (Cofounder of Marqo), Taranjeet Singh (Cofounder of Mem0), and 1 more.

gpt-repository-loader by mpoon

CLI tool for converting code repos into LLM-friendly format

Created 2 years ago

Updated 1 year ago

Starred by Guritfaq Singh (Cofounder of CodeRabbit), Anton Osika (Cofounder of Lovable), and 2 more.

ai-pr-reviewer by coderabbitai

AI-based code reviewer for GitHub pull requests

Created 2 years ago

Updated 3 weeks ago

Starred by David Cournapeau (Author of scikit-learn), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 7 more.

repomix by yamadashy

CLI tool to pack codebases into AI-friendly formats for LLMs

Created 1 year ago

Updated 14 hours ago

Starred by Joe Walnes (Head of Experimental Projects at Stripe), Travis Fischer (Founder of Agentic), and 2 more.

gitingest by coderamp-labs

CLI tool for LLM-friendly code ingestion from Git repos

Created 1 year ago

Updated 1 week ago

Starred by Hiroshi Shibata (Core Contributor to Ruby), Stas Kelvich (Cofounder of Neon), and 2 more.

vscode-gitlens by gitkraken

VS Code extension supercharges Git within the IDE

Created 9 years ago

Updated 1 day ago
