AmuseAI by saddam213

Multimodal generative AI SDK demo

Created 7 months ago

508 stars

Top 60.7% on SourcePulse

Project Summary

This project provides a UI demo for the TensorStack SDK, showcasing multimodal generative AI capabilities. It targets developers and power users who want to integrate image, video, and audio AI models. Its main benefit is a single interface to several computationally intensive model families.

How It Works

AmuseAI has a modular architecture that integrates several AI models and runtimes. It uses OnnxRuntime for hardware-accelerated inference and a Python Inference Runtime for compatibility with the broader ML ecosystem. LoRA adapters can be loaded dynamically for fine-tuned style and character control. Image synthesis uses transformer architectures such as Z-Image and FLUX2, video generation uses WAN and LTX-2, and Speech-to-Text and Text-to-Speech models are also integrated.
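
To illustrate what blending LoRA adapters generally means, here is a minimal sketch in plain Python. It is not the TensorStack SDK's API, and the README does not describe how AmuseAI implements blending. The sketch assumes the standard LoRA formulation, in which each adapter contributes a low-rank update (an "up" matrix times a "down" matrix) that is scaled by a weight and added to a base weight matrix. The names `matmul`, `blend_loras`, `style`, and `character` are hypothetical.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two dense matrices stored as nested lists."""
    if not a or not b or any(len(row) != len(b) for row in a):
        raise ValueError("inner dimensions must match")
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(len(b))) for j in range(cols)]
        for row in a
    ]


def blend_loras(
    base: Matrix, adapters: Sequence[Tuple[float, Matrix, Matrix]]
) -> Matrix:
    """Return base + sum(weight * (up @ down)) over all adapters.

    Each adapter is (weight, up, down), where up is d_out x r and
    down is r x d_in. The base matrix is copied, not modified.
    """
    merged = [row[:] for row in base]
    for weight, up, down in adapters:
        delta = matmul(up, down)  # low-rank update, same shape as base
        if len(delta) != len(merged) or len(delta[0]) != len(merged[0]):
            raise ValueError("adapter shape does not match base weight")
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += weight * value
    return merged


# Hypothetical rank-1 adapters applied to a 2x2 base weight.
base = [[1.0, 0.0], [0.0, 1.0]]
style = (0.5, [[1.0], [0.0]], [[0.0, 2.0]])
character = (0.25, [[0.0], [1.0]], [[4.0, 0.0]])
blended = blend_loras(base, [style, character])
```

Because the base matrix is left untouched, adapters can be re-blended with different weights without reloading the underlying model weights, which is one plausible reason a runtime would support loading them dynamically.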

Quick Start & Requirements

Primary install/run command: Not explicitly stated. Users are directed to download "Amuse v3.2.0" and a "desktop package (version 7.1.0)".
Prerequisites: A FontAwesome Pro v7 License is required for commercial use. Font files must be added to the Fonts directory and set with a "Resource" build action.
Links: Archives available at Hugging Face: https://huggingface.co/TensorStack/Amuse.

Highlighted Details

Migration to .NET 10 framework with latest OnnxRuntime for optimized hardware acceleration.
Python Inference Runtime implementation for seamless interop with the broader ML ecosystem.
Support for dynamic loading and blending of LoRA adapters for style and character control.
Integration of Z-Image and FLUX2 models for high-fidelity image synthesis.
Expansion into generative video using WAN and LTX-2 architectures.
Deployment of Speech-to-Text (STT) and Text-to-Speech (TTS) audio models.

Maintenance & Community

No specific details regarding contributors, sponsorships, community channels (Discord/Slack), or roadmaps are provided in the README.

Licensing & Compatibility

The project's core license is not specified. However, a critical external dependency, FontAwesome Pro v7, requires a license for commercial use, with specific instructions for asset integration.

Limitations & Caveats

Commercial use is contingent upon acquiring the FontAwesome Pro v7 license and correctly integrating its assets. The README does not detail other potential limitations, unsupported platforms, or known bugs.

Health Check

Last Commit

2 days ago

Responsiveness

Inactive

Star History

118 stars in the last 30 days