gemini-writer by Doriandarko

Autonomous AI writing agent for creative content generation

Created 5 months ago

279 stars

Top 93.0% on SourcePulse

Project Summary

This project provides gemini-writer, an autonomous AI writing agent powered by Google's Gemini 3 Flash model. It targets long-form creative content such as novels and story collections by automating the planning, writing, and file management involved. It is aimed at writers, researchers, and power users who want the agent to work independently, with real-time feedback on its progress and automatic context compression for long sessions.

How It Works

The agent runs in an agentic loop and uses Gemini's "thinking mode" for reasoning. Three tools (create_project, write_file, and compress_context) manage its workspace and output. Context is compressed automatically once usage reaches 90% of the 1,000,000-token limit (900,000 tokens), which keeps long sessions within the model's context window. This lets the agent plan, write, and iterate on creative tasks on its own while streaming its thought process and generated content in real time.
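
The compression trigger described above reduces to a threshold check. A minimal sketch, assuming the agent keeps a running token count; the names TOKEN_LIMIT, COMPRESS_AT_PERCENT, compression_threshold, and should_compress are hypothetical and not taken from the project's code, and whether the real check is inclusive at exactly 900,000 tokens is also an assumption:

```python
TOKEN_LIMIT = 1_000_000       # context window size stated in the summary
COMPRESS_AT_PERCENT = 90      # compression trigger, as a percentage of the limit


def compression_threshold(limit: int = TOKEN_LIMIT,
                          percent: int = COMPRESS_AT_PERCENT) -> int:
    """Return the token count at which compression would start."""
    # Integer arithmetic avoids floating-point rounding on large limits.
    return limit * percent // 100


def should_compress(token_count: int,
                    limit: int = TOKEN_LIMIT,
                    percent: int = COMPRESS_AT_PERCENT) -> bool:
    """Return True once the running token count reaches the threshold.

    Assumption: the comparison is inclusive (>=).
    """
    return token_count >= compression_threshold(limit, percent)
```

With the stated defaults, compression_threshold() evaluates to 900,000, matching the figure quoted under Highlighted Details.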

Quick Start & Requirements

Installation: Using uv is recommended: install it (curl -LsSf https://astral.sh/uv/install.sh | sh), then run uv pip install -r requirements.txt. Alternatively, run pip install -r requirements.txt.
Prerequisites: Python, uv (optional), and a Google Gemini API key.
Configuration: Create a .env file in the project root containing GEMINI_API_KEY=your-api-key-here. Obtain API keys from https://aistudio.google.com/app/apikey.
Usage:
- Inline prompt: uv run writer.py "Your prompt" or python writer.py "Your prompt"
- Interactive: uv run writer.py or python writer.py
- Recovery: uv run writer.py --recover output/your_project/.context_summary_*.md or python writer.py --recover output/your_project/.context_summary_*.md
Setup Time: Minimal, primarily dependency installation and API key configuration.
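
The three usage modes listed above (inline prompt, interactive, recovery) can be told apart from the command line alone. The following is a sketch of how such argument handling might look with argparse; it is not the project's actual writer.py, and build_parser and select_mode are hypothetical names introduced for illustration:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented invocations (assumed layout)."""
    parser = argparse.ArgumentParser(prog="writer.py")
    # Optional positional prompt: present for inline use, absent for interactive use.
    parser.add_argument("prompt", nargs="?", default=None,
                        help="writing prompt; omit to be asked interactively")
    # Path to a saved context summary to resume from.
    parser.add_argument("--recover", metavar="SUMMARY_FILE", default=None,
                        help="resume from a saved context summary file")
    return parser


def select_mode(argv: list[str]) -> str:
    """Classify an argument list as 'recover', 'inline', or 'interactive'."""
    args = build_parser().parse_args(argv)
    if args.recover:
        return "recover"
    if args.prompt:
        return "inline"
    return "interactive"
```

Here select_mode(["Your prompt"]) classifies the call as inline, an empty argument list as interactive, and any call carrying --recover as recovery.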

Highlighted Details

Autonomous Writing: Agent independently plans and executes creative writing tasks.
Real-Time Streaming: Provides live updates on the agent's thinking, content generation, and tool calls.
Smart Context Management: Automatically compresses context at 900,000 tokens within a 1,000,000-token window.
Recovery Mode: Enables resuming interrupted work from saved context summaries.
Tool Integration: Supports project creation, file writing (create, append, overwrite), and workspace management.
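
The three file-writing modes named above (create, append, overwrite) can be illustrated with a small helper. This is a sketch under assumptions, not the project's write_file tool: the function name write_file_sketch, its parameters, and the choice to raise an error when "create" targets an existing file are all hypothetical.

```python
from pathlib import Path


def write_file_sketch(project_dir: Path, filename: str, content: str,
                      mode: str = "create") -> Path:
    """Write content into project_dir/filename using one of three modes."""
    target = project_dir / filename
    if mode == "create":
        # Assumption: creating a file that already exists is treated as an error.
        if target.exists():
            raise FileExistsError(f"{target} already exists")
        target.write_text(content, encoding="utf-8")
    elif mode == "append":
        # Adds to the end of the file, creating it if it is missing.
        with target.open("a", encoding="utf-8") as handle:
            handle.write(content)
    elif mode == "overwrite":
        # Replaces any existing content.
        target.write_text(content, encoding="utf-8")
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return target
```

Separating "create" from "overwrite" guards against silently replacing a finished chapter, which is one plausible reason for offering both modes.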

Maintenance & Community

Created by Pietro Schirano (@Doriandarko). The README does not specify community channels (e.g., Discord, Slack) or a public roadmap.

Licensing & Compatibility

License: MIT License with an Attribution Requirement.
Commercial Use: Permitted, provided clear attribution is given to Pietro Schirano (@Doriandarko).
Compatibility: Runs in standard Python environments; API usage is subject to Google's terms.

Limitations & Caveats

The agent stops after a maximum of 300 iterations per task, which may constrain exceptionally long or complex projects. It depends on the Google Gemini API and is therefore subject to API availability, terms of service, and usage costs. During complex tasks the agent may appear "stuck"; progress can be checked in the generated project files. Troubleshooting guidance is provided for common API key and permission errors.
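
The 300-iteration cap amounts to a bounded loop around the agent's step function. A minimal sketch, assuming each step reports whether the task is finished; run_agent, the step callback, and MAX_ITERATIONS are hypothetical names rather than the project's own:

```python
from typing import Callable

MAX_ITERATIONS = 300  # per-task cap stated in the summary


def run_agent(step: Callable[[int], bool],
              max_iterations: int = MAX_ITERATIONS) -> tuple[bool, int]:
    """Call step(iteration) until it returns True or the cap is reached.

    Returns (finished, iterations_used).
    """
    for iteration in range(1, max_iterations + 1):
        if step(iteration):
            return True, iteration
    # Cap reached without the step reporting completion.
    return False, max_iterations
```

A caller can use the returned flag to tell a completed task from one that ran out of iterations, for example to decide whether resuming from a saved context summary is worthwhile.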

Health Check

Last Commit

5 months ago

Responsiveness

Inactive

Star History

3 stars in the last 30 days