Research paper repository for the Mish activation function
Mish is a novel, self-regularized, non-monotonic neural activation function designed to improve deep learning model performance across various tasks. It aims to provide a smoother loss landscape and better generalization compared to traditional activations like ReLU and Swish, benefiting researchers and practitioners in computer vision and natural language processing.
How It Works
Mish is defined as Mish(x) = x * tanh(softplus(x)), where softplus(x) = ln(1 + e^x). The formulation is smooth and non-monotonic: small negative inputs produce small negative outputs rather than being zeroed out, so gradients continue to flow through negative values, which can help avoid dead units and vanishing gradients. The self-regularizing property and smooth, continuous nature are hypothesized to contribute to better optimization and more stable training, particularly in deeper networks.
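As a minimal sketch of the definition above (written for illustration, not taken from the repository), Mish can be composed directly from softplus and tanh:

```python
import torch
import torch.nn.functional as F

def mish(x: torch.Tensor) -> torch.Tensor:
    # Mish(x) = x * tanh(softplus(x)) = x * tanh(ln(1 + exp(x)))
    return x * torch.tanh(F.softplus(x))

x = torch.linspace(-5.0, 5.0, steps=11)
print(mish(x))  # small negative values for x < 0, close to identity for large positive x
```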
Quick Start & Requirements
PyTorch (1.9+): import torch.nn as nn; mish = nn.Mish()
TensorFlow: tensorflow_addons.activations.mish (via the TensorFlow Addons package)
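As a rough usage sketch (the layer sizes below are arbitrary placeholders, not from the repository), the built-in PyTorch module drops in wherever ReLU would normally go:

```python
import torch
import torch.nn as nn

# Toy MLP using the built-in activation (requires PyTorch 1.9+).
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.Mish(),
    nn.Linear(256, 10),
)

logits = model(torch.randn(32, 784))
print(logits.shape)  # torch.Size([32, 10])
```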
Maintenance & Community
The project is actively maintained and has seen significant community contributions, with Mish being integrated into numerous popular libraries and frameworks. Links to community discussions and related projects are available in the README.
Licensing & Compatibility
The repository does not explicitly state a license. However, its integration into major frameworks suggests broad compatibility for research and commercial use, though a formal license check is recommended.
Limitations & Caveats
While benchmarks show strong performance, Mish is typically somewhat more expensive to compute than ReLU, since it involves softplus and tanh evaluations. The README also points to several community-developed faster or experimental variants.
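To illustrate the kind of trick such faster variants rely on, here is a hedged sketch of a custom autograd function that uses the analytic derivative in the backward pass instead of letting autograd store every intermediate tensor. This is illustrative only, not the implementation of any specific community project:

```python
import torch
import torch.nn.functional as F

class MishFunction(torch.autograd.Function):
    """Mish with an analytic backward pass; only the input is saved."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x * torch.tanh(F.softplus(x))

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        tsp = torch.tanh(F.softplus(x))
        # d/dx [x * tanh(softplus(x))] = tanh(softplus(x)) + x * sigmoid(x) * (1 - tanh(softplus(x))^2)
        return grad_output * (tsp + x * torch.sigmoid(x) * (1 - tsp * tsp))

x = torch.randn(4, requires_grad=True)
MishFunction.apply(x).sum().backward()
print(x.grad)
```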