glimpse by seatedro

CLI tool for LLM context loading from codebases

created 7 months ago
300 stars

Top 89.7% on sourcepulse

Project Summary

Glimpse is a command-line utility designed to efficiently extract and process code from local directories, Git repositories, and web pages, primarily for feeding into Large Language Models (LLMs). It offers parallel processing, tree-view navigation, token counting, and flexible output options, making it valuable for developers and researchers working with LLM-based code analysis or generation.

How It Works

Glimpse leverages Rust's concurrency features for fast, parallel file processing. It respects .gitignore by default and allows custom include/exclude patterns. For token counting, it integrates with OpenAI's tiktoken and HuggingFace's tokenizers, providing estimates for LLM context window usage. Web content is processed by converting HTML to Markdown, with options for traversing linked pages.
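The pipeline described above (filter files by pattern, read them in parallel, estimate token usage) can be sketched in a few lines. This is an illustrative Python sketch, not Glimpse's Rust implementation: the function names, the `DEFAULT_EXCLUDES` list, and the four-characters-per-token heuristic are all assumptions made for the example. It uses plain glob matching rather than real .gitignore semantics, and a real tokenizer such as tiktoken would replace `estimate_tokens`.

```python
import fnmatch
import os
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Hypothetical exclude patterns for illustration; Glimpse's real defaults may differ.
DEFAULT_EXCLUDES = ["*.lock", "*.png", ".git/*", "target/*"]


def is_excluded(rel_path: str, patterns: list[str]) -> bool:
    """Return True if the relative path matches any glob-style exclude pattern."""
    return any(fnmatch.fnmatch(rel_path, pat) for pat in patterns)


def collect_files(root: Path, patterns: list[str]) -> list[Path]:
    """Walk `root` and keep files whose relative path is not excluded."""
    kept = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            rel = path.relative_to(root).as_posix()
            if not is_excluded(rel, patterns):
                kept.append(path)
    return sorted(kept)


def read_file(path: Path) -> tuple[Path, str]:
    """Read one file as text, replacing bytes that are not valid UTF-8."""
    return path, path.read_text(encoding="utf-8", errors="replace")


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: roughly four characters per token."""
    return (len(text) + 3) // 4


def load_context(root: Path, patterns: list[str], threads: int = 4) -> dict[str, int]:
    """Read all kept files in parallel and map relative path -> token estimate."""
    files = collect_files(root, patterns)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        results = list(pool.map(read_file, files))
    return {p.relative_to(root).as_posix(): estimate_tokens(t) for p, t in results}
```

Calling `load_context(Path("."), DEFAULT_EXCLUDES)` returns a per-file token estimate; summing the values gives a rough check against a model's context window. Filtering before reading keeps excluded files from costing any I/O.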

Quick Start & Requirements

  • Install: cargo install glimpse, brew install glimpse, or via Nix.
  • Prerequisites: Rust toolchain for cargo install.
  • Usage: glimpse /path/to/project or glimpse https://example.com.
  • Docs: https://github.com/seatedro/glimpse

Highlighted Details

  • Supports direct processing of GitHub, GitLab, Bitbucket, and other Git repositories by cloning them to a temporary directory.
  • Converts web pages to Markdown, preserving structure, links, code blocks, and optionally traversing linked pages.
  • Can output processed content as a PDF with syntax highlighting, table of contents, and custom headers/footers.
  • Offers two token counting backends: tiktoken (OpenAI) and HuggingFace (any model or local file).
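To make the web-page conversion listed above more concrete, here is a minimal sketch of turning HTML into Markdown while keeping headings, links, and code blocks. It is written in Python with only the standard library and is not how Glimpse itself does it; the `MarkdownConverter` class and `html_to_markdown` function are hypothetical names, and only a small subset of tags is handled.

```python
import re
from html.parser import HTMLParser
from typing import Optional

FENCE = "`" * 3  # Markdown code fence, built this way to keep the example readable


class MarkdownConverter(HTMLParser):
    """Convert a small subset of HTML (h1-h3, p, a, code, pre) to Markdown."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: list = []
        self.href: Optional[str] = None
        self.in_pre = False

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self.parts.append("#" * int(tag[1]) + " ")
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.parts.append("[")
        elif tag == "pre":
            self.in_pre = True
            self.parts.append(FENCE + "\n")
        elif tag == "code" and not self.in_pre:
            self.parts.append("`")

    def handle_endtag(self, tag):
        if tag in ("h1", "h2", "h3", "p"):
            self.parts.append("\n\n")
        elif tag == "a":
            self.parts.append("](" + (self.href or "") + ")")
            self.href = None
        elif tag == "pre":
            self.in_pre = False
            self.parts.append("\n" + FENCE + "\n\n")
        elif tag == "code" and not self.in_pre:
            self.parts.append("`")

    def handle_data(self, data):
        if self.in_pre:
            # Preserve whitespace exactly inside preformatted blocks.
            self.parts.append(data)
            return
        text = re.sub(r"\s+", " ", data)
        # Drop leading space at the start of a block to avoid indented Markdown.
        if not self.parts or self.parts[-1].endswith("\n"):
            text = text.lstrip()
        if text:
            self.parts.append(text)


def html_to_markdown(html: str) -> str:
    """Parse an HTML fragment and return the Markdown rendering."""
    parser = MarkdownConverter()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()
```

Link traversal would sit on top of this: collect each `href` seen in `handle_starttag`, then fetch and convert those pages up to some depth limit.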

Maintenance & Community

  • Developed by seatedro.
  • Community links are not explicitly provided in the README.

Licensing & Compatibility

  • MIT License. Permissive for commercial use and integration with closed-source projects.

Limitations & Caveats

The tool's effectiveness for web content processing, especially complex sites, may vary. While it supports various Git providers, performance on very large repositories might require tuning thread counts or exclude patterns.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull requests (30d): 3
  • Issues (30d): 1
  • Star history: 61 stars in the last 90 days
