jiachenzhu — PyTorch code for a CVPR 2025 research paper
This repository provides the official PyTorch implementation of DynamicTanh (DyT), a novel element-wise operation designed to replace normalization layers in Transformer architectures. It targets researchers and practitioners in deep learning, offering a way to achieve comparable or improved performance with potentially simplified models.
How It Works
DyT replaces standard normalization layers with a learnable, element-wise scaled tanh: DyT(x) = γ · tanh(αx) + β, where α is a learnable scalar and γ, β are learnable per-channel parameters playing the role of a normalization layer's affine transform. This aims to provide the stabilizing benefits of normalization without computing activation statistics, potentially leading to simpler and more efficient models.
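The operation above can be sketched as a small PyTorch module. This is a minimal illustration of the formula, not the repository's exact implementation; the initialization value for α (0.5) is an assumption.

```python
import torch
import torch.nn as nn

class DyT(nn.Module):
    """Sketch of DynamicTanh: DyT(x) = gamma * tanh(alpha * x) + beta.

    alpha is a single learnable scalar; gamma and beta are learnable
    per-channel parameters, analogous to LayerNorm's affine weights.
    """
    def __init__(self, dim: int, init_alpha: float = 0.5):  # init value assumed
        super().__init__()
        self.alpha = nn.Parameter(torch.full((1,), init_alpha))
        self.gamma = nn.Parameter(torch.ones(dim))
        self.beta = nn.Parameter(torch.zeros(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Element-wise: no mean/variance statistics are computed.
        return self.gamma * torch.tanh(self.alpha * x) + self.beta
```

In a Transformer block, an instance like `DyT(dim)` would take the place of `nn.LayerNorm(dim)`.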
Quick Start & Requirements
conda create -n DyT python=3.12
conda activate DyT
conda install pytorch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 pytorch-cuda=12.4 -c pytorch -c nvidia
pip install timm==1.0.15 tensorboard
Highlighted Details
Builds on the timm library and the ConvNeXt repository.
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The provided training commands are for reproducing results on ImageNet-1K; adapting DyT to other tasks or custom models requires following instructions in respective folders or the "HowTo" guide. Computational efficiency results require separate reproduction steps.
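Adapting DyT to a custom model typically means swapping out its normalization layers. A hedged sketch of one way to do this is below; `convert_ln_to_dyt` is a hypothetical helper (not the repo's official API), and the inline `DyT` class is a minimal stand-in for the repository's implementation.

```python
import torch
import torch.nn as nn

class DyT(nn.Module):
    # Minimal stand-in: DyT(x) = gamma * tanh(alpha * x) + beta
    def __init__(self, dim: int, init_alpha: float = 0.5):  # init value assumed
        super().__init__()
        self.alpha = nn.Parameter(torch.full((1,), init_alpha))
        self.gamma = nn.Parameter(torch.ones(dim))
        self.beta = nn.Parameter(torch.zeros(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.gamma * torch.tanh(self.alpha * x) + self.beta

def convert_ln_to_dyt(module: nn.Module) -> nn.Module:
    """Hypothetical helper: recursively replace nn.LayerNorm layers with DyT."""
    for name, child in module.named_children():
        if isinstance(child, nn.LayerNorm):
            # Match DyT's channel dimension to the LayerNorm's normalized shape.
            setattr(module, name, DyT(child.normalized_shape[0]))
        else:
            convert_ln_to_dyt(child)
    return module
```

Note that after such a swap the model generally needs to be retrained; DyT is a training-time replacement for normalization, not a drop-in for pretrained weights.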
Last updated 11 months ago; the project is currently marked inactive.