vibetensor  by NVlabs

AI-generated deep learning system software

Created 1 month ago
525 stars

Top 60.2% on SourcePulse

Project Summary

VibeTensor is a deep learning system generated entirely by AI agents. It offers a PyTorch-inspired runtime with a C++ core and CUDA support. It targets researchers and power users interested in AI-assisted software engineering and in exploring new deep learning system architectures. Its main contribution is to demonstrate that AI agents can generate and validate a complex, coherent software stack, from high-level APIs down to low-level CUDA operations.

How It Works

VibeTensor features a C++20 core for tensor operations and autograd, a CUDA subsystem for GPU acceleration (including a stream-ordered caching allocator and graph capture), and language bindings for Python (via nanobind) and Node.js (via N-API). Its core novelty lies in its AI-driven generation process, where code is proposed and validated by agents through automated builds and tests, minimizing manual intervention. This approach aims for correctness and coherence across the entire stack, from language frontends to GPU kernels.
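
To make the idea of a stream-ordered caching allocator more concrete, here is a minimal toy model in plain Python. It is only a sketch under stated assumptions, not VibeTensor's implementation: the class name ToyCachingAllocator, the 512-byte rounding granularity, and the integer "addresses" are all hypothetical, and no GPU memory is touched. It shows the general technique: freed blocks are cached per (stream, rounded size) and reused only for later requests on the same stream, so the slow backend allocation path is taken less often.

```python
from collections import defaultdict


class ToyCachingAllocator:
    """Toy model of a stream-ordered caching allocator (no real GPU memory)."""

    GRANULARITY = 512  # hypothetical rounding granularity, in bytes

    def __init__(self):
        self._free = defaultdict(list)  # (stream, size) -> cached addresses
        self._live = {}                 # address -> (stream, size)
        self._next_address = 0          # fake address space
        self.backend_allocations = 0    # counts simulated slow-path allocations

    def _round_up(self, nbytes):
        g = self.GRANULARITY
        return max(g, (nbytes + g - 1) // g * g)

    def malloc(self, nbytes, stream=0):
        size = self._round_up(nbytes)
        cached = self._free[(stream, size)]
        if cached:
            # Fast path: reuse a block previously freed on the same stream.
            address = cached.pop()
        else:
            # Slow path: stands in for a real device allocation call.
            address = self._next_address
            self._next_address += size
            self.backend_allocations += 1
        self._live[address] = (stream, size)
        return address

    def free(self, address):
        # Return the block to the cache instead of releasing it.
        stream, size = self._live.pop(address)
        self._free[(stream, size)].append(address)


alloc = ToyCachingAllocator()
a = alloc.malloc(1000, stream=0)  # rounds up to 1024 bytes, slow path
alloc.free(a)
b = alloc.malloc(900, stream=0)   # same stream and rounded size: block is reused
c = alloc.malloc(900, stream=1)   # different stream: a new block is allocated
```

Keying the cache by stream is what makes reuse safe without extra synchronization in this model: work on a single stream executes in order, so a block freed on a stream can be handed to a later request on that same stream.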

Quick Start & Requirements

  • Primary install / run command (editable dev install):
      python -m pip install -U pip build pytest numpy
      export CUDACXX=$(which nvcc)
      CMAKE_BUILD_TYPE=Debug python -m pip install -v -e .[test]
  • Non-default prerequisites and dependencies: Python >= 3.10, CMake >= 3.26, C++20 compiler (GCC/Clang), NVIDIA GPU + CUDA toolkit (CUDA 12+ required; CI uses 13.0.2). CPU-only builds are disabled. Optional for JS/TS: Node.js 22 + npm Node-API headers.
  • Estimated setup time or resource footprint: Requires compilation; specific times not detailed.
  • Links: Research paper link is mentioned but not provided as a URL.
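
Before starting a build, it can help to check the prerequisites listed above. The following is a rough sketch, not a script shipped with the project: the helper names parse_version, tool_version, and check_prerequisites are hypothetical, and the version parsing is a simple heuristic that takes the first dotted number printed by `cmake --version` and `nvcc --version`. It does not check for a C++20 compiler or an NVIDIA GPU.

```python
import re
import shutil
import subprocess
import sys


def parse_version(text):
    """Return the first dotted version number in text as a tuple of ints, or None."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups() if part is not None)


def tool_version(executable, *args):
    """Run a tool and parse its reported version; None if missing or unparsable."""
    path = shutil.which(executable)
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, *args], capture_output=True, text=True, check=False
        )
    except OSError:
        return None
    return parse_version(result.stdout + result.stderr)


def check_prerequisites():
    """Map each prerequisite to True/False using the minimums listed above."""
    cmake = tool_version("cmake", "--version")
    nvcc = tool_version("nvcc", "--version")
    return {
        "python>=3.10": sys.version_info[:3] >= (3, 10),
        "cmake>=3.26": cmake is not None and cmake >= (3, 26),
        "cuda>=12": nvcc is not None and nvcc >= (12,),
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'MISSING or too old'}")
```

Versions are compared as integer tuples rather than strings, so 3.9 correctly sorts below 3.26.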

Highlighted Details

  • Fully AI-generated deep learning system, showcasing agentic software engineering.
  • PyTorch-inspired eager runtime with a C++20 core, CUDA support, and experimental Node.js API.
  • Implements its own tensors, storage, dispatcher, autograd, CUDA runtime, and caching allocator.
  • Extensible via dynamic operator plugins (C ABI), Python overrides, and a Triton bridge.
  • Experimental multi-GPU support with Fabric tensors and a CUTLASS Blackwell ring allreduce plugin (CUDA 13+ required).
  • DLPack import/export for zero-copy interoperation with other frameworks.
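
The zero-copy behavior that DLPack enables can be illustrated without VibeTensor itself. This summary does not give VibeTensor's Python function names, so the sketch below uses NumPy as both producer and consumer; np.from_dlpack is a real NumPy function (available in recent NumPy releases), while the use of NumPy as a stand-in is an assumption made purely for illustration.

```python
import numpy as np

# Producer: any object implementing __dlpack__ / __dlpack_device__.
source = np.arange(6, dtype=np.float32)

# Consumer: wraps the producer's buffer instead of copying it.
view = np.from_dlpack(source)

# Both arrays refer to the same memory, so a write through the
# producer is visible through the consumer.
source[0] = 42.0
shared = np.shares_memory(source, view)
first = float(view[0])
```

A framework that implements the same protocol can exchange tensors with others in this way, which avoids a copy when moving data between libraries on the same device.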

Maintenance & Community

VibeTensor is described as an active research project; APIs and behavior may change without notice. No explicit community links (Discord/Slack) or sponsorship details are provided in the README.

Licensing & Compatibility

Licensed under the Apache License, Version 2.0. Vendored third-party code retains its original licenses. The Apache 2.0 license is generally compatible with commercial use and closed-source linking.

Limitations & Caveats

This repository is released for agentic system research purposes only and should not be used in production. Performance is not competitive with PyTorch; because the system is composed of separately generated components, the stack as a whole may be globally suboptimal. APIs and behavior are subject to change. Experimental features, such as the Node.js API and multi-GPU support, are best-effort.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

530 stars in the last 30 days
