PaperBanana (dwzhu-pku): Automating academic illustration generation for AI scientists
Top 9.9% on SourcePulse
PaperBanana automates the creation of academic illustrations for AI scientists, turning raw scientific content into publication-quality diagrams and plots. It is aimed at researchers who want to speed up figure creation for papers and other publications, and it does so through a multi-agent generation framework.
How It Works
The framework employs a reference-driven, multi-agent pipeline. Specialized agents (Retriever, Planner, Stylist, Visualizer, Critic) collaborate to generate illustrations. The Retriever identifies relevant examples, the Planner translates content into descriptions, the Stylist refines aesthetics, the Visualizer creates images, and the Critic iteratively refines outputs. This approach leverages in-context learning and iterative refinement for high-quality, semantically accurate, and aesthetically pleasing results.
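The hand-off between the five agents can be sketched in a few lines of Python. This is a minimal illustration of the control flow described above, not PaperBanana's actual code: every name here (Draft, retrieve, plan, style, visualize, critique, run_pipeline) is hypothetical, retrieval is plain keyword overlap, and each agent is a string-manipulating stub standing in for a model call.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Draft:
    """State passed between the (hypothetical) agents."""
    content: str
    references: List[str] = field(default_factory=list)
    description: str = ""
    image: str = ""
    feedback: List[str] = field(default_factory=list)


def retrieve(draft: Draft, corpus: List[str], k: int = 2) -> None:
    # Retriever stub: rank reference examples by word overlap with the content.
    words = set(draft.content.lower().split())
    ranked = sorted(
        corpus,
        key=lambda ex: len(words & set(ex.lower().split())),
        reverse=True,
    )
    draft.references = ranked[:k]


def plan(draft: Draft) -> None:
    # Planner stub: turn the content into a textual figure description.
    draft.description = (
        f"Figure for '{draft.content}', "
        f"modeled on {len(draft.references)} reference(s)"
    )


def style(draft: Draft) -> None:
    # Stylist stub: append aesthetic guidance to the description.
    draft.description += "; style: consistent palette"


def visualize(draft: Draft) -> None:
    # Visualizer stub: a real system would call an image model here.
    draft.image = f"<image: {draft.description}>"


def critique(draft: Draft) -> Optional[str]:
    # Critic stub: return a revision note, or None when satisfied.
    if "labels" not in draft.description:
        return "add labels to every component"
    return None


def run_pipeline(content: str, corpus: List[str], max_rounds: int = 3) -> Draft:
    draft = Draft(content)
    retrieve(draft, corpus)
    plan(draft)
    style(draft)
    # Visualizer and Critic alternate until the Critic accepts or rounds run out.
    for _ in range(max_rounds):
        visualize(draft)
        note = critique(draft)
        if note is None:
            break
        draft.feedback.append(note)
        draft.description += f"; revision: {note}"
    return draft


if __name__ == "__main__":
    corpus = [
        "pipeline diagram with agents",
        "bar chart of accuracy",
        "agent loop overview",
    ]
    result = run_pipeline("multi-agent pipeline overview", corpus)
    print(result.image)
    print(result.feedback)
```

The loop bound (max_rounds) is an assumption added so the sketch always terminates; the source does not say how many refinement rounds the Critic performs.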
Quick Start & Requirements
Installation involves cloning the repository, setting up Python 3.12 with uv, and installing dependencies via uv pip install -r requirements.txt. Configuration requires API keys for underlying models (e.g., Gemini) and optionally downloading the PaperBananaBench dataset. Users can launch an interactive Streamlit demo (streamlit run demo.py) or utilize the command-line interface (python main.py).
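Before launching the demo or the CLI, it can help to confirm that the interpreter and credentials are in place. The following is a small preflight sketch, not part of the repository: the function name preflight is hypothetical, treating 3.12 as a minimum version is an assumption, and the environment variable name used in the example is a placeholder, since the project may read its API keys from a different variable or from a config file.

```python
import os
import sys
from typing import List, Mapping, Tuple


def preflight(
    version: Tuple[int, int],
    env: Mapping[str, str],
    key_name: str,
    min_version: Tuple[int, int] = (3, 12),  # assumption: 3.12 treated as a minimum
) -> List[str]:
    """Return a list of human-readable problems; empty means ready to run."""
    problems = []
    if tuple(version) < min_version:
        problems.append(
            f"Python {min_version[0]}.{min_version[1]}+ expected, "
            f"found {version[0]}.{version[1]}"
        )
    if not env.get(key_name, "").strip():
        problems.append(f"environment variable {key_name} is not set")
    return problems


if __name__ == "__main__":
    # "GEMINI_API_KEY" is a placeholder name, not confirmed by the project.
    issues = preflight(sys.version_info[:2], os.environ, "GEMINI_API_KEY")
    for issue in issues:
        print("preflight:", issue)
    if not issues:
        print("preflight: ok")
```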
Maintenance & Community
The project is actively supported by community contributions, with several related forks and projects noted. The authors state explicitly that this is not an officially supported Google product and that there are no current plans for commercialization.
Licensing & Compatibility
PaperBanana is released under the Apache-2.0 license. However, Google has filed patents covering the core workflows, and these restrict third-party commercial applications that use similar logic. Open-source research efforts are unaffected.
Limitations & Caveats
The project acknowledges that further development is needed for more reliable generation and for handling diverse, complex scenarios. Google's patent filings restrict third-party commercial use of similar logic, and the project is not an officially supported Google product.