DCVC by microsoft

Neural video codec for real-time compression

Created 3 years ago
611 stars

Top 53.7% on SourcePulse

Project Summary

DCVC-RT is a neural video codec designed for practical, real-time use, aimed at researchers and developers who need high compression efficiency and low latency. It provides a single model that covers a wide bitrate range, supports rate control, and codes YUV and RGB content in a unified way. Its goal is to surpass traditional codecs such as H.266/VTM in both speed and compression ratio.

How It Works

DCVC-RT reduces operational cost, such as memory I/O and function-call overhead, which its authors identify as the main bottleneck for neural video codec (NVC) speed. It does this by using implicit temporal modeling in place of complex motion modules and by coding a single low-resolution latent representation. This speeds up encoding and decoding without sacrificing compression quality. The codec also uses model integerization so that results are consistent across devices, and a module-bank-based rate control scheme for adaptability.
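
The motivation for integerization can be illustrated with a small sketch. A codec's encoder and decoder generally need to reproduce the same intermediate values, and floating-point results can change with the order of operations, which may differ between devices. The snippet below is not DCVC-RT code; the fixed-point scale SCALE and the helper to_fixed are hypothetical names chosen for illustration.

```python
# Illustration only: why integer arithmetic gives order-independent results.
# SCALE and to_fixed are hypothetical names, not part of DCVC-RT.

SCALE = 1 << 8  # assumed fixed-point scale (8 fractional bits)


def to_fixed(x: float) -> int:
    """Quantize a float to a fixed-point integer."""
    return round(x * SCALE)


values = [0.1, 0.2, 0.3]

# Floating-point addition is not associative, so summation order matters.
float_forward = (values[0] + values[1]) + values[2]
float_reverse = values[0] + (values[1] + values[2])

# Integer addition is exact, so any summation order gives the same result.
fixed = [to_fixed(v) for v in values]
int_forward = (fixed[0] + fixed[1]) + fixed[2]
int_reverse = fixed[0] + (fixed[1] + fixed[2])

print(float_forward == float_reverse)  # False
print(int_forward == int_reverse)      # True
```

The trade-off in a fixed-point scheme like this is quantization error, which a real codec would have to account for when choosing its integer precision.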

Quick Start & Requirements

  • Install: Requires Python 3.12, CUDA 12.6, and PyTorch 2.6. Set up the environment with conda and install dependencies with pip install -r requirements.txt. The C++ bitstream writer and the CUDA kernels require cmake, g++, and ninja-build; build them by running pip install . in src/cpp/ and src/layers/extensions/inference/ respectively.
  • Prerequisites: NVIDIA GPU, CUDA 12.6, Python 3.12.
  • Pretrained Models: Download from provided links and place in ./checkpoints.
  • Testing: Use test_video.py with specified model paths, rate numbers, and dataset configurations.
  • Resources: Configuring CPU frequency scaling is recommended, because arithmetic coding speed depends on CPU performance.
  • Docs: Usage, Test Conditions
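
Before building, it can help to confirm that the local setup matches the versions listed above. The following is a minimal sketch, not part of the repository: check_environment is a hypothetical helper, and it assumes that an exact match on Python 3.12, PyTorch 2.6.x, and CUDA 12.6 is what you want to verify.

```python
import importlib.util
import sys

# Versions listed in the Quick Start above.
REQUIRED_PYTHON = (3, 12)
REQUIRED_TORCH = "2.6."
REQUIRED_CUDA = "12.6"


def check_environment() -> dict:
    """Return a report comparing the local setup with the listed versions."""
    report = {"python_ok": sys.version_info[:2] == REQUIRED_PYTHON}
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed, so the remaining checks cannot run.
        report["torch_installed"] = False
        return report
    import torch

    report["torch_installed"] = True
    report["torch_ok"] = torch.__version__.startswith(REQUIRED_TORCH)
    # torch.version.cuda is None for CPU-only builds.
    report["cuda_ok"] = torch.version.cuda == REQUIRED_CUDA
    report["gpu_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

A False entry does not necessarily mean the codec will fail to run; it only flags a difference from the listed configuration.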

Highlighted Details

  • Achieves 100+ FPS for 1080p and real-time 4K coding.
  • Offers an average 21% bitrate saving over H.266/VTM for 1080p content.
  • Intra-frame codec shows 11.1% bitrate reduction over VTM with over 10x faster decoding.
  • Supports wide bitrate range and dynamic rate control via quantization parameters.
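
To put the headline numbers in concrete terms, the short sketch below converts a frame rate into a per-frame time budget and applies a percentage bitrate saving to a reference bitrate. The helper names and the 5000 kbps reference bitrate are hypothetical, and the 21% figure is an average, so individual sequences may differ.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def bitrate_after_saving(reference_kbps: float, saving_percent: float) -> float:
    """Bitrate left after a percentage saving relative to a reference codec."""
    return reference_kbps * (100.0 - saving_percent) / 100.0


# 100 FPS leaves 10 ms to encode or decode each frame.
print(frame_budget_ms(100))

# Hypothetical 5000 kbps reference stream with an average 21% saving.
print(bitrate_after_saving(5000, 21))
```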

Maintenance & Community

The project is published by Microsoft and builds on the DCVC family of models, with contributions from various researchers. Further details on the DCVC family can be found in the DCVC-family section of the README.

Licensing & Compatibility

The project is released under a license that permits use and modification, with specific trademark guidelines for Microsoft assets. Terms for commercial use or closed-source linking are not explicitly detailed here, so check the repository's license file before relying on either.

Limitations & Caveats

The setup targets CUDA 12.6 and PyTorch 2.6; other versions may or may not work. The README notes that time.time() may lack the precision needed for accurate speed measurements on Windows. Arithmetic coding speed depends on CPU performance, so CPU frequency scaling has to be configured manually.
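
One common way around coarse timer resolution is Python's time.perf_counter(), a high-resolution monotonic clock from the standard library. The sketch below is an assumption about how one might time a codec loop, not the repository's own benchmarking code; measure_fps and fake_codec_step are hypothetical names.

```python
import time


def measure_fps(process_frame, num_frames: int) -> float:
    """Time num_frames calls with a high-resolution monotonic clock."""
    start = time.perf_counter()
    for index in range(num_frames):
        process_frame(index)
    # For GPU work, pending kernels would need to finish before this point
    # (for example via torch.cuda.synchronize()) or the time is understated.
    elapsed = time.perf_counter() - start
    return num_frames / elapsed


def fake_codec_step(index: int) -> int:
    """Stand-in workload; a real test would encode or decode a frame."""
    return sum(range(1000)) + index


fps = measure_fps(fake_codec_step, num_frames=200)
print(f"{fps:.1f} frames per second")
```

Timing many frames in one interval, as done here, also reduces the impact of per-call clock overhead compared with timing each frame separately.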

Health Check

  • Last Commit: 2 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 2
  • Star History: 26 stars in the last 30 days
