papervizagent by google-research

Multi-agent framework for automated academic illustration generation

Created 5 months ago

471 stars

Top 63.9% on SourcePulse

Project Summary

PaperVizAgent (formerly PaperBanana) is a reference-driven, multi-agent framework for automating the generation of publication-quality academic illustrations, diagrams, and plots from scientific content. It functions as a creative team of specialized agents, transforming raw text into visually appealing and semantically accurate scientific figures. This benefits researchers by streamlining illustration creation, ensuring aesthetic quality and technical accuracy for academic publications.

How It Works

PaperVizAgent orchestrates five specialized agents: Retriever, Planner, Stylist, Visualizer, and Critic. The Retriever identifies reference diagrams. The Planner translates the source content into figure descriptions using in-context learning. The Stylist refines those descriptions against synthesized academic aesthetic guidelines. The Visualizer renders the descriptions with image generation models. The Critic then drives multi-round iterative refinement with the Visualizer. The project relies on this combination of reference examples and iterative refinement to produce novel, high-quality scientific visualizations.
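The agent flow described above can be pictured with a small, self-contained sketch. This is not the project's code: every name here (FigureState, run_pipeline, the toy word-overlap retrieval, the max_rounds parameter, and the string stand-ins for images) is hypothetical and exists only to show the order of the stages and the Critic-Visualizer loop.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FigureState:
    """Working state handed from one (hypothetical) agent stage to the next."""
    content: str
    references: List[str] = field(default_factory=list)
    description: str = ""
    image: str = ""   # string stand-in for generated image data
    rounds: int = 0   # number of Critic-requested revisions applied


def retriever(state: FigureState, corpus: List[str]) -> FigureState:
    # Toy retrieval: keep corpus entries sharing at least one word with the content.
    words = set(state.content.lower().split())
    state.references = [r for r in corpus if words & set(r.lower().split())]
    return state


def planner(state: FigureState) -> FigureState:
    # Turn content plus retrieved examples into a figure description.
    examples = "; ".join(state.references) or "no references"
    state.description = f"Figure for '{state.content}' (examples: {examples})"
    return state


def stylist(state: FigureState) -> FigureState:
    # Apply a placeholder style guideline to the description.
    state.description += " | style: academic"
    return state


def visualizer(state: FigureState, feedback: Optional[str] = None) -> FigureState:
    # Produce (or regenerate) the image, optionally incorporating Critic feedback.
    note = f" + fix: {feedback}" if feedback else ""
    state.image = f"<image of {state.description}{note}>"
    return state


def critic(state: FigureState) -> Optional[str]:
    # Toy critic: ask for one revision on the first draft, then accept.
    return "increase label size" if state.rounds == 0 else None


def run_pipeline(content: str, corpus: List[str], max_rounds: int = 3) -> FigureState:
    state = FigureState(content=content)
    state = retriever(state, corpus)
    state = planner(state)
    state = stylist(state)
    state = visualizer(state)
    # Critic-Visualizer loop: stop when the critic accepts or the budget runs out.
    while state.rounds < max_rounds:
        feedback = critic(state)
        if feedback is None:
            break
        state.rounds += 1
        state = visualizer(state, feedback)
    return state


if __name__ == "__main__":
    result = run_pipeline(
        "attention mechanism diagram",
        ["transformer attention overview", "bar chart of accuracy"],
    )
    print(result.references, result.rounds, result.image)
```

Passing an empty corpus leaves the reference list empty, so the planner proceeds without examples, which loosely mirrors running without a reference dataset.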

Quick Start & Requirements

Installation uses the uv Python package manager: install uv, install Python 3.12, then run uv pip install -r requirements.txt. API keys for Google, Anthropic, and OpenAI are required and can be supplied through environment variables or configs/model_config.yaml. The PaperBananaBench dataset is optional; without it, the framework still runs but bypasses the Retriever Agent's few-shot capability. Launch the interactive Streamlit demo with streamlit run demo.py, or use the CLI via python main.py.
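Because three provider keys are needed before anything runs, a quick pre-flight check can save a failed launch. The sketch below is an illustration only: the variable names GOOGLE_API_KEY, ANTHROPIC_API_KEY, and OPENAI_API_KEY, the resolve_api_keys helper, and the rule that environment variables win over config values are all assumptions, not something the README confirms. The config argument is a plain mapping standing in for values read from a config file.

```python
import os
from typing import Dict, List, Mapping, Optional

# Assumed variable names; the project may expect different ones.
REQUIRED_KEYS = ("GOOGLE_API_KEY", "ANTHROPIC_API_KEY", "OPENAI_API_KEY")


def resolve_api_keys(
    config: Optional[Mapping[str, str]] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return one value per required key, preferring env over config (assumed order)."""
    env = os.environ if env is None else env
    config = config or {}
    resolved: Dict[str, str] = {}
    missing: List[str] = []
    for name in REQUIRED_KEYS:
        value = env.get(name) or config.get(name)
        if value:
            resolved[name] = value
        else:
            missing.append(name)
    if missing:
        # Fail early with the full list rather than on the first model call.
        raise RuntimeError("Missing API keys: " + ", ".join(missing))
    return resolved


if __name__ == "__main__":
    try:
        resolve_api_keys()
        print("All API keys found.")
    except RuntimeError as err:
        print(err)
```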

Highlighted Details

Multi-Agent Pipeline: Orchestrates specialized agents (Retriever, Planner, Stylist, Visualizer, Critic).
Reference-Driven Learning: Leverages curated examples via generative retrieval.
Iterative Refinement: Employs a Critic-Visualizer loop for progressive quality enhancement.
Style-Aware Generation: Uses synthesized aesthetic guidelines for academic compliance.
Flexible Modes: Supports various experiment modes from basic generation to full pipeline refinement.
Interactive Demo & Visualization: Streamlit interface for generation, refinement, and pipeline evolution tracking.
High-Resolution Output: Supports upscaling to 2K/4K and batch export.

Maintenance & Community

The README does not detail specific contributors, sponsorships, partnerships, or community channels (e.g., Discord, Slack), nor does it mention a roadmap.

Licensing & Compatibility

No software license is specified in the README. The project's disclaimer states that it is for demonstration purposes only and not suitable for production, so commercial use or integration would need further clarification.

Limitations & Caveats

This project is explicitly for demonstration purposes only and is not an officially supported Google product. It is not eligible for the Google Open Source Software Vulnerability Rewards Program and is not intended for production use. The PaperBananaBench dataset is also noted as forthcoming.

Health Check

Last Commit

5 months ago

Responsiveness

Inactive

Pull Requests (30d)

Issues (30d)

Star History

21 stars in the last 30 days