gen-cv by Azure

Vision AI solution accelerator with image generation and manipulation examples

Created 2 years ago

431 stars

Top 68.9% on SourcePulse

Project Summary

This repository provides a collection of examples and accelerators for synthetic image generation, manipulation, and reasoning, leveraging Azure AI services and open-source frameworks. It targets developers and researchers interested in practical applications of Computer Vision, OpenAI, and Stable Diffusion, offering solutions for tasks like video analysis, avatar creation, and image editing.

How It Works

The project integrates Azure Machine Learning, Azure OpenAI Vision, and popular open-source models like Stable Diffusion and Segment Anything. It facilitates fine-tuning of large models on Azure, enables advanced image manipulation through techniques like inpainting and Dreambooth, and explores vector search for managing image embeddings. This approach allows users to harness powerful, pre-trained models within a managed cloud environment for scalable AI-driven image workflows.
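To make the vector search idea concrete, here is a minimal sketch of nearest-neighbor lookup over image embeddings using cosine similarity. It is an illustration under stated assumptions, not the repository's implementation: the file names and three-dimensional vectors are invented toy data, the helper names `cosine_similarity` and `top_k` are hypothetical, and a real setup would likely use model-generated embeddings stored in a managed vector index.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Return the k (name, score) pairs most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings (assumption): real image embeddings have hundreds of dimensions.
index = {
    "cat.png": [0.9, 0.1, 0.0],
    "dog.png": [0.8, 0.3, 0.1],
    "car.png": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]

results = top_k(query, index, k=2)
print([name for name, _ in results])
```

A brute-force scan like this is fine for a few thousand images; larger collections usually call for an approximate nearest-neighbor index.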

Quick Start & Requirements

Install: Clone the repository, create a Conda environment (conda create -n gen-cv python=3.10), activate it (conda activate gen-cv), and install dependencies (pip install -r requirements.txt).
Prerequisites: Python 3.10 and Conda. A GPU is highly recommended for Stable Diffusion image generation. Service parameters and keys must be configured in a .env file.
Resources: Tested on GitHub Codespaces and Azure ML Compute Instance.
Docs: Examples are provided as Jupyter notebooks within the repository.
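The .env step can be sketched as follows: read KEY=VALUE pairs from the file and check that the required settings are present before running a notebook. This is a sketch under assumptions, not the repository's code. The variable names `EXAMPLE_ENDPOINT` and `EXAMPLE_API_KEY` and the helpers `load_env_file` and `require` are hypothetical; the actual keys are whatever the repository's examples expect. In practice a library such as python-dotenv is a common choice for this.

```python
import os
import tempfile


def load_env_file(path):
    """Parse KEY=VALUE lines from a .env-style file into a dict.

    Blank lines, comment lines starting with '#', and lines without '='
    are skipped; matching quotes around a value are removed.
    """
    settings = {}
    with open(path, encoding="utf-8") as handle:
        for raw_line in handle:
            line = raw_line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            value = value.strip()
            if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
                value = value[1:-1]
            settings[key.strip()] = value
    return settings


def require(settings, names):
    """Return the names that are missing or empty, so the caller can fail early."""
    return [name for name in names if not settings.get(name)]


# Demo with a temporary file; the variable names are hypothetical placeholders.
sample = '# service config\nEXAMPLE_ENDPOINT="https://example.invalid/"\nEXAMPLE_API_KEY=\n'
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as tmp:
    tmp.write(sample)
settings = load_env_file(tmp.name)
os.remove(tmp.name)

missing = require(settings, ["EXAMPLE_ENDPOINT", "EXAMPLE_API_KEY"])
print(missing)
```

Checking for missing keys up front gives a clearer error than a failed service call midway through a notebook.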

Highlighted Details

Features fine-tuning for Azure OpenAI Vision for video analysis.
Includes examples for creating avatar videos and interactive avatar experiences.
Demonstrates Stable Diffusion XL integration with Azure Machine Learning.
Showcases background removal, precise inpainting, and custom object/style addition using various models.
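Background removal and inpainting both rest on a per-pixel mask that marks which region to keep or regenerate. The sketch below shows only the compositing step, on a toy 2x3 image built from nested lists. It is an illustration under assumptions: the `composite` helper and the pixel values are hypothetical, and in the repository's examples the mask would come from a segmentation model rather than being written by hand.

```python
def composite(foreground, background, mask):
    """Keep foreground pixels where the mask is 1 and background pixels where it is 0."""
    return [
        [fg if m else bg for fg, bg, m in zip(fg_row, bg_row, mask_row)]
        for fg_row, bg_row, mask_row in zip(foreground, background, mask)
    ]


# Toy RGB pixels (assumption): a red subject in front of dark clutter.
subject = (200, 50, 50)
clutter = (10, 10, 10)
white = (255, 255, 255)

image = [
    [clutter, subject, subject],
    [clutter, subject, clutter],
]
# Hand-written mask; 1 marks subject pixels to keep.
mask = [
    [0, 1, 1],
    [0, 1, 0],
]
blank = [[white] * 3 for _ in range(2)]

result = composite(image, blank, mask)
print(result)
```

Inverting the mask gives the complementary operation: keeping the background and handing the masked region to an inpainting model to fill.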

Maintenance & Community

This project is maintained by Azure. Contributions are welcome via pull requests, and contributors must agree to a Contributor License Agreement (CLA). The project follows the Microsoft Open Source Code of Conduct.

Licensing & Compatibility

The provided README does not explicitly state a license. Because this is a Microsoft project, it may follow Microsoft's open-source policies, but that is an inference, not a stated fact. Confirm the specific license before commercial use or closed-source linking.

Limitations & Caveats

The README does not specify the exact license, which could affect commercial use. A GPU is recommended for some tasks but is not strictly required for every example, so examples run in CPU-only environments may be slow.
