infinite-image-browsing by zanllp

Intelligent browser for AI-generated media

Created 3 years ago

1,256 stars

Top 31.1% on SourcePulse

Project Summary

This project offers a high-performance image/video/audio browser for AI art generation workflows, supporting popular UIs (SD-webui, ComfyUI, Fooocus) and standalone operation. It addresses efficient management and rapid retrieval of large image datasets via infinite scrolling, advanced metadata search, and an experimental natural language categorization system, enhancing user productivity.

How It Works

The core architecture uses aggressive caching for millisecond image load times and generates adjustable thumbnails. It extracts and normalizes metadata (prompt, model, Lora) into searchable tags, enabling precise, fuzzy, and Google-like searches. A novel "Walk Mode" flattens directory structures for seamless browsing. An experimental RAG-like feature uses prompt embeddings, clustering, and LLM-generated titles for semantic categorization and natural language querying, powered by OpenAI-compatible APIs.
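
As a rough illustration of the tag normalization described above, the Python sketch below splits a prompt into lowercase tags and pulls out Lora references. It assumes the SD-webui `<lora:name:weight>` and `(tag:weight)` prompt conventions. The names `prompt_to_tags` and `search` are hypothetical and are not the project's API, and the substring match is only a simple stand-in for the project's fuzzy search.

```python
import re
from typing import Dict, List

# Assumed prompt conventions: <lora:name:weight> and (tag:weight).
LORA_RE = re.compile(r"<lora:([^:>]+)(?::[^>]*)?>", re.IGNORECASE)
WEIGHT_RE = re.compile(r":\s*-?\d+(?:\.\d+)?\s*$")


def prompt_to_tags(prompt: str) -> Dict[str, List[str]]:
    """Normalize a raw prompt into de-duplicated, lowercase tag lists."""
    loras = [name.lower() for name in LORA_RE.findall(prompt)]
    text = LORA_RE.sub(",", prompt)  # drop Lora markers from the tag text
    tags: List[str] = []
    for part in text.split(","):
        tag = part.strip().strip("()[]{}").strip()    # remove emphasis brackets
        tag = WEIGHT_RE.sub("", tag).strip().lower()  # remove a trailing ":1.2"
        if tag and tag not in tags:
            tags.append(tag)
    return {"prompt": tags, "lora": loras}


def search(index: Dict[str, Dict[str, List[str]]], query: str) -> List[str]:
    """Return paths whose tags contain the query (hypothetical helper)."""
    q = query.strip().lower()
    if not q:
        return []
    return sorted(
        path
        for path, groups in index.items()
        if any(q in tag for tags in groups.values() for tag in tags)
    )
```

An index here is just a mapping from file path to the dictionary returned by `prompt_to_tags`. A real implementation would more likely store the tags in a database so that lookups stay fast across tens of thousands of images.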

Quick Start & Requirements

Installation options include an SD-webui extension via URL (https://github.com/zanllp/sd-webui-infinite-image-browsing), a standalone Python app, or pre-compiled desktop releases. Performance is boosted by flags like --generate_video_cover and --generate_image_cache. Initial index generation for ~20,000 images takes ~45 seconds. AI features require an OpenAI-compatible API endpoint and key.

Highlighted Details

Performance: Millisecond image display post-caching; adjustable thumbnail resolution (64px-1024px).
Search: Prompt/model/Lora tags, autocomplete, auto-translation, fuzzy search, and custom paths.
Versatility: Operates as SD-webui extension, standalone Python app, or desktop executable.
Browsing: "Walk Mode" flattens directories; file tree preview with basic operations.
AI Search: Experimental natural language categorization and retrieval (RAG-like) for semantic discovery.
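
The "Walk Mode" item above can be pictured with a short Python sketch: walk the directory tree once and expose every media file as a single stream that a scrolling view consumes page by page. This is a minimal sketch under stated assumptions, not the project's implementation. `walk_flat`, `next_page`, and the extension set `MEDIA_EXTS` are hypothetical names chosen for illustration.

```python
import os
from itertools import islice
from typing import Iterator, List

# Assumed set of media extensions; the project's actual list may differ.
MEDIA_EXTS = {".png", ".jpg", ".jpeg", ".webp", ".mp4"}


def walk_flat(root: str) -> Iterator[str]:
    """Lazily yield every media file under root, ignoring folder boundaries."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sorting in place makes the traversal order deterministic
        for name in sorted(filenames):
            if os.path.splitext(name)[1].lower() in MEDIA_EXTS:
                yield os.path.join(dirpath, name)


def next_page(files: Iterator[str], size: int = 50) -> List[str]:
    """Take the next batch from the stream, as an infinite-scroll view might."""
    return list(islice(files, size))
```

Because `walk_flat` is a generator, only the files needed for the current page are read from disk, and an empty list from `next_page` signals that the tree has been exhausted.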

Maintenance & Community

The project is maintained primarily by a single developer. Contributions via pull requests are welcome, especially for i18n, and suggestions go through the issue section. No dedicated community channels are mentioned.

Licensing & Compatibility

The license type is not specified, which is a critical omission. The project supports SD-webui (incl. Stealth), ComfyUI (partially), Fooocus, NovelAI, StableSwarmUI, Invoke.AI, and Pixiv.

Limitations & Caveats

i18n translations may be incomplete, and ComfyUI support is partial. The advanced AI features are experimental and rely on external OpenAI-compatible APIs without mock fallbacks, so API failures surface directly as errors. Compiled Windows executables may trigger antivirus false positives. The unspecified license is a significant adoption blocker.

Health Check

Last Commit

2 days ago

Responsiveness

Inactive

Pull Requests (30d)

Issues (30d)

Star History

14 stars in the last 30 days