big-sleep by lucidrains

CLI tool for text-to-image generation

Created 4 years ago
2,573 stars

Top 18.2% on SourcePulse

Project Summary

Big Sleep is a command-line tool and Python library that generates images from text prompts by pairing OpenAI's CLIP with a BigGAN generator. It targets users with GPUs who want to experiment with text-to-image synthesis through a simple interface.

How It Works

The tool uses CLIP to interpret the text prompt and to guide a BigGAN generator toward images that match it. The generator's inputs are adjusted step by step so that the output drifts toward the prompt, which is why the process is described as "dreaming" visuals from natural language descriptions.
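
The guidance loop described above can be illustrated with a deliberately tiny stand-in. This is a sketch under loud assumptions, not Big Sleep's code: `toy_generator` is a hypothetical three-dimensional replacement for BigGAN, the "text embedding" is a hand-picked vector rather than a CLIP output, and gradients are estimated by finite differences instead of backpropagation. It only shows the shape of the idea: score the generated output against the text, then nudge the latent to raise that score.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def toy_generator(latent):
    """Hypothetical stand-in for BigGAN: maps a latent to an 'image embedding'."""
    return [
        math.tanh(latent[0] + 0.5 * latent[1]),
        math.tanh(latent[1] - 0.5 * latent[2]),
        math.tanh(latent[2] + 0.5 * latent[0]),
    ]


def score(latent, text_embedding):
    """Stand-in for the CLIP score: similarity of generated output to the text."""
    return cosine(toy_generator(latent), text_embedding)


def toy_dream(text_embedding, steps=200, lr=0.1, eps=1e-4, seed=0):
    """Gradient ascent on the latent using finite-difference gradients."""
    rng = random.Random(seed)
    latent = [rng.gauss(0, 1) for _ in range(3)]
    history = [score(latent, text_embedding)]
    for _ in range(steps):
        base = score(latent, text_embedding)
        grad = []
        for i in range(len(latent)):
            bumped = list(latent)
            bumped[i] += eps
            grad.append((score(bumped, text_embedding) - base) / eps)
        # Move the latent in the direction that increases the score.
        latent = [z + lr * g for z, g in zip(latent, grad)]
        history.append(score(latent, text_embedding))
    return latent, history


if __name__ == "__main__":
    _, history = toy_dream([0.6, -0.3, 0.7])
    print(f"score before: {history[0]:.3f}, after: {history[-1]:.3f}")
```

In the real tool the generator and the scoring model are large neural networks, which is why a GPU is required; the toy version runs in pure Python only because both are replaced by trivial functions.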

Quick Start & Requirements

  • Install via pip: pip install big-sleep
  • Requires a GPU.
  • Usage: dream "a pyramid made of ice"
  • Advanced usage and code examples are available in the README.

Highlighted Details

  • Supports training on multiple phrases using a "|" delimiter.
  • Allows penalizing specific prompts to steer generation away from unwanted elements.
  • Option to use a larger OpenAI vision model (--larger-model) for potentially improved generations.
  • Can save the best high-scoring image during generation (--save-best).
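
As a rough illustration of how these options combine on the command line, the helper below assembles an argument list for the `dream` command. It is a sketch, not part of Big Sleep: `build_dream_command` is a hypothetical name, and it covers only the "|" delimiter and the two flags named above. Prompt penalties are left out because this summary does not say how they are passed.

```python
import shlex


def build_dream_command(phrases, larger_model=False, save_best=False):
    """Assemble an argument list for the `dream` CLI (hypothetical helper).

    Multiple phrases are joined with the "|" delimiter described above.
    """
    cleaned = [p.strip() for p in phrases if p.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty phrase is required")
    command = ["dream", "|".join(cleaned)]
    if larger_model:
        command.append("--larger-model")
    if save_best:
        command.append("--save-best")
    return command


if __name__ == "__main__":
    cmd = build_dream_command(
        ["a pyramid made of ice", "a glacier at dawn"],
        save_best=True,
    )
    # shlex.join quotes the prompt so the "|" is not treated as a shell pipe.
    print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd)` would launch the tool on a machine where big-sleep is installed and a GPU is available; building the list separately keeps the quoting of the "|" character out of the shell's hands.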

Maintenance & Community

The project is based on work by Ryan Murdock and is hosted on GitHub. Links to the original and simplified notebooks are provided.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README mentions that Big Sleep can sometimes steer off-manifold into noise due to the class-conditioned nature of the GAN. The --max-classes flag is suggested for stability at the cost of expressivity.
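
To make the stability and expressivity trade-off concrete, here is one plausible reading of what capping the number of classes could mean: keep only the strongest few class weights and zero out the rest. This is an assumption for illustration only. The summary does not describe how `--max-classes` is implemented, and `restrict_to_top_classes` is a hypothetical helper, not a Big Sleep function.

```python
def restrict_to_top_classes(class_weights, max_classes):
    """Keep the `max_classes` largest weights and zero the others.

    Illustrative assumption only: fewer active classes means fewer ways to
    blend categories (less expressive) but also fewer ways to drift into
    combinations the generator never saw during training (more stable).
    """
    if max_classes <= 0:
        raise ValueError("max_classes must be positive")
    ranked = sorted(
        range(len(class_weights)),
        key=lambda i: class_weights[i],
        reverse=True,
    )
    keep = set(ranked[:max_classes])
    return [w if i in keep else 0.0 for i, w in enumerate(class_weights)]


if __name__ == "__main__":
    weights = [0.1, 0.7, 0.2, 0.05]
    print(restrict_to_top_classes(weights, 2))
```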

Health Check

  • Last commit: 3 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 4 stars in the last 30 days
