llmsresearch
Agentic framework for automated academic illustration
This project is an open-source implementation and extension of Google Research's PaperBanana that automates the creation of academic figures, diagrams, and statistical plots from text descriptions. It targets AI scientists and researchers who want to generate publication-quality visuals efficiently through an agentic framework powered by Google Gemini.
How It Works
PaperBanana employs a two-phase, multi-agent pipeline featuring five specialized agents: Retriever, Planner, Stylist, Visualizer, and Critic. Phase 1 involves the Retriever selecting relevant examples, the Planner generating a detailed textual description, and the Stylist refining it for visual aesthetics based on NeurIPS-style guidelines. Phase 2 focuses on iterative refinement, where the Visualizer renders an image using Gemini 3 Pro for diagrams or Matplotlib for plots, and the Critic evaluates the output, providing feedback for revised descriptions. This refinement loop repeats up to three times, leveraging Gemini's VLM capabilities for planning and critique.
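The control flow of the two phases can be sketched in a few lines of Python. This is an illustrative sketch only, not the package's actual API: `run_pipeline`, `Critique`, and the toy agent functions are hypothetical names, the agents are plain stand-in functions rather than Gemini calls, and stopping early when the Critic accepts the image is an assumption (the description above only says the loop repeats up to three times).

```python
from dataclasses import dataclass

MAX_ROUNDS = 3  # the refinement loop repeats up to three times


@dataclass
class Critique:
    """Hypothetical container for the Critic's verdict."""
    accepted: bool
    feedback: str


def run_pipeline(source_text, caption, retrieve, plan, style, visualize,
                 critique, max_rounds=MAX_ROUNDS):
    """Run Phase 1 once, then the Phase 2 render/critique loop."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")

    # Phase 1: Retriever -> Planner -> Stylist produce a styled description.
    examples = retrieve(source_text, caption)
    description = style(plan(source_text, caption, examples))

    # Phase 2: Visualizer renders, Critic evaluates, description is revised.
    for round_no in range(1, max_rounds + 1):
        image = visualize(description)
        verdict = critique(image, description)
        # Assumption: stop as soon as the Critic is satisfied.
        if verdict.accepted or round_no == max_rounds:
            break
        # Assumption: feedback is folded in by appending a revision note.
        description = f"{description}\nRevision note: {verdict.feedback}"
    return image, description, round_no


# Toy stand-ins for the five agents (no model calls are made).
def toy_retrieve(source_text, caption):
    return ["reference-example-1"]


def toy_plan(source_text, caption, examples):
    return f"Diagram for '{caption}' guided by {len(examples)} example(s)"


def toy_style(description):
    return description + " [styled]"


def toy_visualize(description):
    return f"<image of: {description}>"


def toy_critique(image, description):
    # Accept only once a revision note has been incorporated.
    if "Revision note" in description:
        return Critique(accepted=True, feedback="")
    return Critique(accepted=False, feedback="label the arrows")


if __name__ == "__main__":
    image, description, rounds = run_pipeline(
        "method section text", "System overview",
        toy_retrieve, toy_plan, toy_style, toy_visualize, toy_critique,
    )
    print(rounds)
    print(description)
```

Passing the agents in as functions keeps the loop independent of any particular model backend, so the same skeleton would apply whether the Visualizer produces a diagram image or a Matplotlib plot.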
Quick Start & Requirements
Install with pip install paperbanana. For development, clone the repository and install with pip install -e ".[dev,google]".
Run paperbanana setup for an interactive wizard to configure the API key, or manually edit the .env file.
Generate diagrams with paperbanana generate --input <text_file> --caption "<description>".
Generate plots with paperbanana plot --data <csv_file> --intent "<description>".
Evaluate diagrams with paperbanana evaluate --generated <img1> --reference <img2> --context <text_file> --caption "<description>".
Highlighted Details
Maintenance & Community
This is an unofficial, community-driven open-source implementation. The README does not name active maintainers, sponsors, or dedicated community channels (such as Discord or Slack) beyond the GitHub repository itself.
Licensing & Compatibility
The project is released under the MIT License. This permissive license allows commercial use and integration into closed-source projects, provided the copyright and license notice are retained in copies of the software.
Limitations & Caveats
This project is an unofficial reimplementation based on a research paper and is not affiliated with or endorsed by the original authors or Google Research. The implementation may differ from the original system described in the paper. Users should exercise discretion.