oneflow  by Oneflow-Inc

Deep learning framework for user-friendly, scalable, efficient model development

Created 8 years ago
9,367 stars

Top 5.4% on SourcePulse

Project Summary

OneFlow is a deep learning framework designed for user-friendliness, scalability, and efficiency. It targets researchers and engineers looking to program models with a PyTorch-like API, scale them to n-dimensional parallel execution using Global Tensor, and accelerate deployment via its Graph Compiler.

How It Works

OneFlow uses a Global Tensor abstraction to manage data distributed across multiple devices and nodes, which enables n-dimensional parallelism. Its Graph Compiler optimizes the computation graph for efficient execution, supporting model acceleration and deployment. The approach aims to make distributed training and inference simpler than in traditional frameworks.
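
The idea of one logical tensor spread over several devices can be illustrated without OneFlow itself. What follows is a toy sketch in plain Python, not OneFlow's API: the helper names (split_rows, matmul, simulate_data_parallel) are hypothetical, and the "devices" are simply list entries. It assumes the simplest layout, where the input is split by rows across devices and the weight is replicated on each one, and checks that the concatenated per-device results match a single-device computation.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Nested-list matrix multiply (a stand-in for a per-device kernel)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def split_rows(m: Matrix, num_devices: int) -> List[Matrix]:
    """Cut a logical matrix into contiguous row shards, one per simulated device."""
    size = (len(m) + num_devices - 1) // num_devices  # ceiling division
    return [m[i * size:(i + 1) * size] for i in range(num_devices)]


def simulate_data_parallel(x: Matrix, w: Matrix, num_devices: int) -> Matrix:
    """Split x by rows, replicate w, compute locally, then concatenate the rows."""
    shards = split_rows(x, num_devices)      # each device holds a slice of x
    replicas = [w] * num_devices             # each device holds a full copy of w
    local = [matmul(s, r) for s, r in zip(shards, replicas)]
    return [row for part in local for row in part]


x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
w = [[1.0, 0.0], [0.0, 1.0]]

# The sharded computation reproduces the single-device result.
assert simulate_data_parallel(x, w, 2) == matmul(x, w)
```

In OneFlow the layout is declared on the tensor rather than coded by hand as above; the API Reference is the authority on the exact calls and their signatures.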

Quick Start & Requirements

  • Install (stable, CUDA): python3 -m pip install oneflow
  • Install (nightly, CUDA): python3 -m pip install --pre oneflow -f https://oneflow-staging.oss-cn-beijing.aliyuncs.com/branch/master/cu118
  • Prerequisites: Python 3.7-3.11, a GPU with CUDA arch 60+ (compute capability 6.0 or higher), CUDA Toolkit 10.0+, and NVIDIA driver 440.33+. Docker images are available.
  • Resources: Building from source requires libopenblas-dev, nasm, g++, gcc, python3-pip, cmake, autoconf, libtool.
  • Docs: QUICKSTART, API Reference.
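
Before installing, the Python requirement listed above can be checked with a short script. This is a minimal sketch using only the standard library; the function names (python_supported, oneflow_installed) are hypothetical helpers, and the version bounds are copied from the prerequisites in this summary rather than read from the package metadata. It does not check the GPU, CUDA Toolkit, or driver versions.

```python
import importlib.util
import sys
from typing import Tuple

# Bounds taken from the prerequisites listed above (an assumption of this sketch).
MIN_PYTHON = (3, 7)
MAX_PYTHON = (3, 11)


def python_supported(version: Tuple[int, int]) -> bool:
    """Return True if (major, minor) falls inside the listed Python range."""
    return MIN_PYTHON <= version <= MAX_PYTHON


def oneflow_installed() -> bool:
    """Return True if a top-level 'oneflow' package can be found, without importing it."""
    return importlib.util.find_spec("oneflow") is not None


if __name__ == "__main__":
    current = sys.version_info[:2]
    print(f"Python {current[0]}.{current[1]} in supported range: {python_supported(current)}")
    print(f"oneflow already installed: {oneflow_installed()}")
```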

Highlighted Details

  • PyTorch-like API for ease of use.
  • N-dimensional parallelism via Global Tensor.
  • Graph Compiler for deployment acceleration.
  • Includes Libai for large-scale Transformer models and FlowVision for computer vision tasks.

Maintenance & Community

  • Developed by OneFlow Inc and Zhejiang Lab.
  • Community channels include GitHub issues, QQ group (331883), WeChat, Discord, Twitter, LinkedIn, and Medium.

Licensing & Compatibility

  • License: Apache License 2.0.
  • Permissive license suitable for commercial use and integration into closed-source projects.

Limitations & Caveats

  • Some model zoo links (OneFlow-Models, OneFlow-Benchmark) are marked as outdated.
  • Building from source requires specific system dependencies and careful CMake configuration.

Health Check

  • Last commit: 4 weeks ago.
  • Responsiveness: 1 day.
  • Pull requests (30d): 1.
  • Issues (30d): 0.

Star History

  • 17 stars in the last 30 days.

Explore Similar Projects

Starred by Amanpreet Singh (Cofounder of Contextual AI) and Ross Taylor (Cofounder of General Reasoning; Cocreator of Papers with Code).

torchshard by kaiyuyue

0%
300
PyTorch engine for tensor slicing into parallel shards
Created 4 years ago
Updated 3 months ago
Starred by Yaowei Zheng (Author of LLaMA-Factory), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 1 more.

VeOmni by ByteDance-Seed

3.4%
1k
Framework for scaling multimodal model training across accelerators
Created 5 months ago
Updated 3 weeks ago
Starred by Luca Soldaini (Research Scientist at Ai2), Edward Sun (Research Scientist at Meta Superintelligence Lab), and 4 more.

parallelformers by tunib-ai

0%
790
Toolkit for easy model parallelization
Created 4 years ago
Updated 2 years ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Jiayi Pan (Author of SWE-Gym; MTS at xAI), and 20 more.

alpa by alpa-projects

0.0%
3k
Auto-parallelization framework for large-scale neural network training and serving
Created 4 years ago
Updated 1 year ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Lewis Tunstall (Research Engineer at Hugging Face), and 13 more.

torchtitan by pytorch

0.7%
4k
PyTorch platform for generative AI model training research
Created 1 year ago
Updated 19 hours ago