dropout by facebookresearch

PyTorch implementation for "Dropout Reduces Underfitting" research paper

created 2 years ago
313 stars

Top 87.4% on sourcepulse

Project Summary

This repository provides the official PyTorch implementation for the "Dropout Reduces Underfitting" paper, introducing novel "early dropout" and "late dropout" techniques. It targets researchers and practitioners in deep learning, particularly those working with vision transformers and convolutional networks, aiming to improve model performance by addressing both underfitting and overfitting.

How It Works

The project implements two dropout schedules. "Early dropout" applies dropout only during the initial phase of training and then turns it off, which helps underfitting models reach a lower training loss. "Late dropout" leaves dropout off at the start and turns it on later in training, which improves generalization for models that overfit. Which schedule to use depends on whether the model underfits or overfits under standard training.
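The two schedules can be sketched as a function that returns the dropout probability for a given epoch. This is a minimal illustration, not the repository's code: the function name `dropout_rate`, the mode strings, and the `cutoff_epoch` parameter are hypothetical names chosen for this example, and the hard on/off switch at the cutoff is an assumption.

```python
def dropout_rate(epoch: int, mode: str, base_rate: float, cutoff_epoch: int) -> float:
    """Return the dropout probability to use at a 0-indexed epoch.

    Hypothetical helper for illustration only; the names and the
    step-shaped schedule are assumptions, not the repository's API.
    """
    if mode == "standard":
        # Plain dropout: the same rate for the whole run.
        return base_rate
    if mode == "early":
        # Early dropout: active before the cutoff, off afterwards.
        return base_rate if epoch < cutoff_epoch else 0.0
    if mode == "late":
        # Late dropout: off before the cutoff, active afterwards.
        return 0.0 if epoch < cutoff_epoch else base_rate
    raise ValueError(f"unknown mode: {mode!r}")


# Per-epoch rates for a 5-epoch run with the cutoff at epoch 2.
early_schedule = [dropout_rate(e, "early", 0.1, 2) for e in range(5)]
late_schedule = [dropout_rate(e, "late", 0.1, 2) for e in range(5)]
```

In a training loop, the returned value would be written to each dropout layer at the start of every epoch; how that is wired into a specific model is left out here.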

Quick Start & Requirements

  • Install: Follow instructions in INSTALL.md.
  • Prerequisites: PyTorch, timm library, ConvNeXt codebase. Training commands suggest multi-node (4 nodes, 8 GPUs each) or single-machine (8 GPUs) setups.
  • Resources: Requires significant GPU resources for training.
  • Links: INSTALL.md (link not provided in README), timm library, ConvNeXt codebase.

Highlighted Details

  • Reports improved ImageNet-1K accuracy over baseline training for models including ViT, Mixer, and ConvNeXt.
  • Demonstrates performance gains with both "basic" and "improved" training recipes.
  • Provides example commands for training and evaluation on both multi-node and single-machine configurations.
  • Codebase built upon the established timm and ConvNeXt libraries.

Maintenance & Community

  • Authors are affiliated with Meta AI, UC Berkeley, and MBZUAI.
  • No explicit community links (Discord, Slack) or roadmap are provided in the README.

Licensing & Compatibility

  • License: CC-BY-NC 4.0 (Creative Commons Attribution-NonCommercial 4.0 International).
  • Restrictions: Non-commercial use only. The NC clause rules out use in commercial products, whether open or closed source.

Limitations & Caveats

The CC-BY-NC 4.0 license strictly prohibits commercial use, limiting adoption for many industry applications. The README also implies significant computational resources are needed for training, potentially posing a barrier for users without access to large GPU clusters.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1+ week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 0 stars in the last 90 days
