cosmos-reason1 by nvidia-cosmos

Multimodal LLMs for physical common sense and embodied decisions

Created 1 year ago

922 stars

Top 39.5% on SourcePulse

1 Expert Loves This Project

Yineng Zhang

Inference Lead at SGLang; Research Scientist at Together AI

Project Summary

Cosmos-Reason1 is a suite of multimodal LLMs, ontologies, and benchmarks designed to give AI systems physical common sense and embodied reasoning capabilities. Aimed at researchers and developers working on AI, robotics, and embodied agents, its models generate physically grounded responses through long chain-of-thought reasoning.

How It Works

The models, Cosmos-Reason1-8B and Cosmos-Reason1-56B, are trained in four stages: vision pre-training, general Supervised Fine-Tuning (SFT), Physical AI SFT, and Physical AI reinforcement learning. Ontologies for physical common sense and embodied reasoning define the capabilities being targeted, and custom benchmarks evaluate how well the resulting multimodal LLMs reason about the physical world.
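
The four stages named above can be pictured as a simple ordered pipeline. The following is a minimal sketch only: the stage order comes from the summary, but the `Stage` type, the `run_pipeline` and `fake_train_stage` helpers, and the assumption that each stage starts from the previous stage's checkpoint are hypothetical and are not taken from the repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One training stage (hypothetical type, for illustration only)."""
    name: str
    kind: str  # "pretrain", "sft", or "rl"


# Stage order as listed in the project summary.
STAGES = (
    Stage("vision pre-training", "pretrain"),
    Stage("general SFT", "sft"),
    Stage("Physical AI SFT", "sft"),
    Stage("Physical AI reinforcement learning", "rl"),
)


def run_pipeline(base_checkpoint, train_stage):
    """Run the stages in order.

    Assumption: each stage is initialised from the checkpoint produced
    by the stage before it.
    """
    checkpoint = base_checkpoint
    history = []
    for stage in STAGES:
        checkpoint = train_stage(stage, checkpoint)
        history.append((stage.name, checkpoint))
    return checkpoint, history


def fake_train_stage(stage, checkpoint):
    """Placeholder for real training: only tags the checkpoint name."""
    return f"{checkpoint}+{stage.kind}"


if __name__ == "__main__":
    final, history = run_pipeline("base", fake_train_stage)
    for name, ckpt in history:
        print(f"{name}: {ckpt}")
```

The placeholder trainer does no learning; it only makes the stage ordering visible in the checkpoint names.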

Quick Start & Requirements

Installation and usage details are not provided in the README.
Requires access to the models, which are not directly downloadable from the repository.
Further information is available on the Product Website and through associated Papers.

Highlighted Details

Two multimodal LLMs released: Cosmos-Reason1-8B and Cosmos-Reason1-56B.
Focus on physically grounded responses and embodied decision-making.
Development of ontologies for physical common sense and embodied reasoning.
Creation of benchmarks to evaluate Physical AI reasoning capabilities.

Maintenance & Community

Developed by NVIDIA.
No community links (Discord, Slack, etc.) are provided in the README.

Licensing & Compatibility

Source code is released under the Apache 2.0 License.
Models are released under the NVIDIA Open Model License.
Custom licenses are available on request. Commercial use may be subject to the terms of the NVIDIA Open Model License.

Limitations & Caveats

The README lacks specific instructions for installation, usage, or direct model access, requiring users to consult external resources. Model family details are marked as "Coming Soon."

Health Check

Last Commit

2 months ago

Responsiveness

Inactive

Star History

15 stars in the last 30 days