big-sleep by lucidrains

CLI tool for text-to-image generation

Created 5 years ago
2,566 stars

Top 17.8% on SourcePulse

Project Summary

Big Sleep is a command-line tool and Python library for generating images from text prompts using OpenAI's CLIP and a BigGAN. It's designed for users with GPUs who want to experiment with text-to-image synthesis through a simple interface.

How It Works

CLIP scores how well an image matches a text prompt, and Big Sleep uses that score to steer a BigGAN generator toward images that fit the prompt. The tool in effect "dreams" visuals from natural-language descriptions, which makes it a simple way to experiment with text-guided image generation.
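The guidance loop can be illustrated with a toy sketch. This is not Big Sleep's code: the linear `toy_image_embedding` function is a hypothetical stand-in for "generate an image with BigGAN, then embed it with CLIP", the vectors are made up, and the finite-difference gradient replaces the automatic differentiation a real implementation would use. It only shows the general idea of nudging a latent vector so that a similarity score against the text embedding goes up.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def toy_image_embedding(latent, weights):
    """Hypothetical stand-in for generator + image encoder: a plain linear map."""
    return [sum(w * z for w, z in zip(row, latent)) for row in weights]


def dream_toy(text_embedding, weights, steps=200, lr=0.1, eps=1e-4, seed=0):
    """Gradient ascent on the latent to raise similarity with the text embedding."""
    rng = random.Random(seed)
    latent = [rng.uniform(-1.0, 1.0) for _ in weights[0]]
    history = []
    for _ in range(steps):
        score = cosine(toy_image_embedding(latent, weights), text_embedding)
        history.append(score)
        # Finite-difference estimate of d(score)/d(latent[i]) for each entry.
        grad = []
        for i in range(len(latent)):
            bumped = list(latent)
            bumped[i] += eps
            bumped_score = cosine(toy_image_embedding(bumped, weights), text_embedding)
            grad.append((bumped_score - score) / eps)
        # Move the latent in the direction that increases the score.
        latent = [z + lr * g for z, g in zip(latent, grad)]
    return latent, history


# Made-up 3-dimensional example: identity "generator", target along the third axis.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
latent, history = dream_toy([0.0, 0.0, 1.0], identity)
print(f"similarity: {history[0]:.3f} -> {history[-1]:.3f}")
```

In the real tool the latent feeds a pretrained generator and the score comes from CLIP, so each step is far more expensive, which is consistent with the GPU requirement listed below.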

Quick Start & Requirements

  • Install via pip: pip install big-sleep
  • Requires a GPU.
  • Usage: dream "a pyramid made of ice"
  • Advanced usage and code examples are available in the README.
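For scripted runs, the `dream` command can be assembled and launched from Python. The sketch below is a hypothetical wrapper, not part of big-sleep: the function names `build_dream_command` and `run_dream` are invented here, only flags named on this page are used, and it assumes that `--max-classes` takes an integer value.

```python
import shutil
import subprocess


def build_dream_command(prompt, larger_model=False, save_best=False, max_classes=None):
    """Return the argument list for one `dream` run. Nothing is executed here."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    argv = ["dream", prompt]
    if larger_model:
        argv.append("--larger-model")
    if save_best:
        argv.append("--save-best")
    if max_classes is not None:
        # Assumption: the flag is followed by an integer class limit.
        argv += ["--max-classes", str(int(max_classes))]
    return argv


def run_dream(prompt, **options):
    """Run `dream` if it is installed; raises if the command is missing or fails."""
    if shutil.which("dream") is None:
        raise RuntimeError("`dream` not found on PATH; try `pip install big-sleep` first")
    return subprocess.run(build_dream_command(prompt, **options), check=True)


print(build_dream_command("a pyramid made of ice", save_best=True))
```

Passing the prompt as a single list element avoids shell quoting problems with spaces or "|" characters in the text.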

Highlighted Details

  • Supports training on multiple phrases using a "|" delimiter.
  • Allows penalizing specific prompts to steer generation away from unwanted elements.
  • Option to use a larger OpenAI vision model (--larger-model) for potentially improved generations.
  • Can save the best high-scoring image during generation (--save-best).
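One way to picture how several phrases and penalized prompts could combine into a single objective is sketched below. The combination rule (average similarity to the wanted phrases minus average similarity to the penalized ones) is an assumption for illustration, and the `similarity` callable and its values are hypothetical; the README should be consulted for how Big Sleep actually weighs prompts.

```python
def split_phrases(prompt):
    """Split a "|"-delimited prompt into trimmed, non-empty phrases."""
    return [part.strip() for part in prompt.split("|") if part.strip()]


def combined_score(similarity, wanted, penalized=""):
    """Score an image given a callable mapping a phrase to a similarity value.

    Assumed rule: mean similarity to wanted phrases minus mean similarity
    to penalized phrases. Higher is better.
    """
    wanted_phrases = split_phrases(wanted)
    if not wanted_phrases:
        raise ValueError("at least one wanted phrase is required")
    score = sum(similarity(p) for p in wanted_phrases) / len(wanted_phrases)
    unwanted_phrases = split_phrases(penalized)
    if unwanted_phrases:
        score -= sum(similarity(p) for p in unwanted_phrases) / len(unwanted_phrases)
    return score


# Made-up similarity values standing in for a real text-image model.
fake_similarity = {"a red sunset": 0.5, "an ocean": 0.25, "blur": 0.125}.get
print(split_phrases("a red sunset | an ocean"))
print(combined_score(fake_similarity, "a red sunset | an ocean", penalized="blur"))
```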

Maintenance & Community

The project is based on work by Ryan Murdock and is available on GitHub. Links to original and simplified notebooks are provided.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README mentions that Big Sleep can sometimes steer off-manifold into noise due to the class-conditioned nature of the GAN. The --max-classes flag is suggested for stability at the cost of expressivity.
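The trade-off behind a class limit can be shown with a small sketch. It assumes the limit works by keeping only the strongest entries of the class-weight vector that conditions the generator and zeroing the rest; that is a guess at the general idea, not Big Sleep's implementation, and `limit_classes` is a hypothetical helper.

```python
def limit_classes(class_weights, max_classes):
    """Keep the `max_classes` largest weights and zero out all the others."""
    if max_classes < 1:
        raise ValueError("max_classes must be at least 1")
    # Indices of the largest weights, strongest first.
    ranked = sorted(range(len(class_weights)), key=lambda i: class_weights[i], reverse=True)
    keep = set(ranked[:max_classes])
    return [w if i in keep else 0.0 for i, w in enumerate(class_weights)]


print(limit_classes([0.1, 0.7, 0.05, 0.15], max_classes=2))
```

Fewer active classes leave the generator less room to blend concepts, which matches the stated cost in expressivity.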

Health Check

  • Last commit: 4 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 1 star in the last 30 days
