glimpse by seatedro

CLI tool for LLM context loading from codebases

Created 1 year ago

352 stars

Top 79.2% on SourcePulse

Project Summary

Glimpse is a command-line utility that extracts and processes content from local directories, Git repositories, and web pages, primarily for feeding into Large Language Models (LLMs). It offers parallel processing, tree-view navigation, token counting, and flexible output options, which makes it useful for developers and researchers working on LLM-based code analysis or generation.

How It Works

Glimpse leverages Rust's concurrency features for fast, parallel file processing. It respects .gitignore by default and allows custom include/exclude patterns. For token counting, it integrates with OpenAI's tiktoken and HuggingFace's tokenizers, providing estimates for LLM context window usage. Web content is processed by converting HTML to Markdown, with options for traversing linked pages.
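The pipeline described above (filter files by ignore patterns, read them in parallel, estimate token usage) can be sketched roughly as follows. This is an illustrative Python sketch, not Glimpse's Rust implementation. The names `DEFAULT_EXCLUDES`, `collect_files`, and `load_context` are hypothetical, the pattern matching uses simple `fnmatch` globbing rather than full .gitignore semantics, and the token estimate is a crude four-characters-per-token heuristic standing in for a real tokenizer such as tiktoken.

```python
import fnmatch
import os
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Hypothetical exclude patterns. Real .gitignore rules (negation, anchoring,
# nested ignore files) are richer than the fnmatch globbing used here.
DEFAULT_EXCLUDES = ["*.pyc", ".git", "node_modules", "target"]


def is_excluded(rel_path, patterns):
    """Return True if the path or any of its components matches a pattern."""
    parts = Path(rel_path).parts
    return any(
        fnmatch.fnmatch(rel_path, pat) or any(fnmatch.fnmatch(p, pat) for p in parts)
        for pat in patterns
    )


def collect_files(root, patterns):
    """Walk `root` and return the files that survive the exclude patterns."""
    root_path = Path(root)
    kept = []
    for dirpath, dirnames, filenames in os.walk(root_path):
        rel_dir = Path(dirpath).relative_to(root_path)
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = sorted(
            d for d in dirnames if not is_excluded(str(rel_dir / d), patterns)
        )
        for name in sorted(filenames):
            rel = rel_dir / name
            if not is_excluded(str(rel), patterns):
                kept.append(root_path / rel)
    return kept


def read_text(path):
    """Read one file, replacing undecodable bytes instead of failing."""
    return path, path.read_text(encoding="utf-8", errors="replace")


def estimate_tokens(text):
    # Assumption: roughly 4 characters per token. This is NOT tiktoken.
    return (len(text) + 3) // 4


def load_context(root, patterns=DEFAULT_EXCLUDES, workers=4):
    """Read all non-excluded files in parallel and total the token estimate."""
    files = collect_files(root, patterns)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        contents = list(pool.map(read_text, files))
    total = sum(estimate_tokens(text) for _, text in contents)
    return contents, total
```

Calling `load_context("/path/to/project")` returns the file contents alongside an estimated token total. Threads suit this sketch because file reads are I/O-bound; a production tool would swap in a real tokenizer for accurate counts.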

Quick Start & Requirements

Install: cargo install glimpse, brew install glimpse, or via Nix.
Prerequisites: Rust toolchain for cargo install.
Usage: glimpse /path/to/project or glimpse https://example.com.
Docs: https://github.com/seatedro/glimpse

Highlighted Details

Supports direct processing of GitHub, GitLab, Bitbucket, and other Git repositories by cloning them to a temporary directory.
Converts web pages to Markdown, preserving structure, links, code blocks, and optionally traversing linked pages.
Can output processed content as a PDF with syntax highlighting, table of contents, and custom headers/footers.
Offers two token counting backends: tiktoken (OpenAI) and HuggingFace (any model or local file).
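To make the HTML-to-Markdown step above concrete, here is a minimal Python sketch built on the standard library's `html.parser`. It is not how Glimpse itself performs the conversion. The class name `MarkdownSketch` and the function `html_to_markdown` are hypothetical, and only headings, paragraphs, links, and code are handled. The collected `links` list shows where a crawler could pick up pages to traverse next.

```python
from html.parser import HTMLParser


class MarkdownSketch(HTMLParser):
    """Convert a small subset of HTML to Markdown (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.out = []        # Markdown fragments, joined at the end
        self.links = []      # hrefs seen, usable as a traversal queue
        self._href = None
        self._in_pre = False

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self.out.append("\n" + "#" * int(tag[1]) + " ")
        elif tag == "p":
            self.out.append("\n")
        elif tag == "a":
            self._href = dict(attrs).get("href")
            self.out.append("[")
        elif tag == "pre":
            self._in_pre = True
            self.out.append("\n```\n")
        elif tag == "code" and not self._in_pre:
            self.out.append("`")

    def handle_endtag(self, tag):
        if tag in ("h1", "h2", "h3", "p"):
            self.out.append("\n")
        elif tag == "a":
            href = self._href or ""
            self.out.append("](%s)" % href)
            if href:
                self.links.append(href)
            self._href = None
        elif tag == "pre":
            self._in_pre = False
            self.out.append("\n```\n")
        elif tag == "code" and not self._in_pre:
            self.out.append("`")

    def handle_data(self, data):
        # Text is passed through unchanged, including inside <pre> blocks.
        self.out.append(data)


def html_to_markdown(html):
    """Return (markdown_text, list_of_links) for an HTML string."""
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    return "".join(parser.out).strip(), parser.links
```

For example, `html_to_markdown('<h1>Title</h1><p>See <a href="https://example.com/docs">docs</a>.</p>')` yields a Markdown heading followed by a paragraph containing an inline link, plus a one-element link list.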

Maintenance & Community

Developed by seatedro.
Community links are not explicitly provided in the README.

Licensing & Compatibility

MIT License, which permits commercial use and integration with closed-source projects.

Limitations & Caveats

Web content processing may be less reliable on complex sites. Glimpse supports various Git providers, but very large repositories may require tuning thread counts or exclude patterns to perform well.


Explore Similar Projects

codefetch by regenrek

codai by meysamhadeli

ingest by sammcj

codebase-digest by kamilstanuch

VectorCode by Davidyz

CodeAsk by woniu9524

llama.vscode by ggml-org

acemcp by qy527145

autodoc by context-labs

shotgun_code by glebkudr

repomix by yamadashy

gitingest by coderamp-labs