Mish by digantamisra98

Research paper for the Mish activation function

Created 6 years ago
1,302 stars

Top 30.6% on SourcePulse

Project Summary

Mish is a novel, self-regularized, non-monotonic neural activation function designed to improve deep learning model performance across various tasks. It aims to provide a smoother loss landscape and better generalization compared to traditional activations like ReLU and Swish, benefiting researchers and practitioners in computer vision and natural language processing.

How It Works

Mish is defined as f(x) = x * tanh(softplus(x)), where softplus(x) = ln(1 + exp(x)). The function is smooth and non-monotonic: unlike ReLU, it allows small negative outputs and nonzero gradients for negative inputs, which can help prevent vanishing gradients. Its self-regularizing property and smooth, continuous nature are hypothesized to contribute to better optimization and more stable training, particularly in deeper networks.
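As a minimal sketch of the definition (using PyTorch's F.softplus and torch.tanh rather than the repository's reference code):

```python
import torch
import torch.nn.functional as F

def mish(x: torch.Tensor) -> torch.Tensor:
    # Mish(x) = x * tanh(softplus(x)), where softplus(x) = ln(1 + exp(x))
    return x * torch.tanh(F.softplus(x))

# Non-monotonicity in action: outputs dip slightly below zero for
# negative inputs instead of being clamped to zero as with ReLU,
# so gradients remain nonzero there.
x = torch.linspace(-4.0, 4.0, steps=9, requires_grad=True)
y = mish(x)
y.sum().backward()
print(y.detach())  # small negative values for x < 0
print(x.grad)      # nonzero gradients even for negative x
```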

Quick Start & Requirements

  • PyTorch: import torch.nn as nn; mish = nn.Mish() (available in PyTorch 1.9+; see the usage sketch after this list)
  • TensorFlow: Available via tensorflow_addons.activations.mish
  • Darknet: Integrated into the framework.
  • Dependencies: Primarily Python with deep learning frameworks such as PyTorch, TensorFlow, or MXNet. CUDA is beneficial for performance.
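A brief usage sketch with the built-in PyTorch module; the layer sizes are illustrative, not from the repository:

```python
import torch
import torch.nn as nn

# nn.Mish ships with PyTorch 1.9+. Layer sizes here are arbitrary.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.Mish(),
    nn.Linear(256, 10),
)

logits = model(torch.randn(32, 784))  # batch of 32 fake inputs
print(logits.shape)                   # torch.Size([32, 10])
```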

Highlighted Details

  • Achieves state-of-the-art results on object detection benchmarks like MS-COCO when combined with architectures like CSP-p7.
  • Demonstrates improved accuracy and smoother loss landscapes compared to ReLU and Swish on CIFAR-10 and MNIST datasets.
  • Offers better performance in deep networks where traditional activations struggle with optimization.
  • Integrated into major deep learning frameworks including PyTorch, TensorFlow, MXNet, and ONNX.

Maintenance & Community

The project is actively maintained and has seen significant community contributions, with Mish being integrated into numerous popular libraries and frameworks. Links to community discussions and related projects are available in the README.

Licensing & Compatibility

The repository does not explicitly state a license. Mish implementations ship in major frameworks under those frameworks' own licenses, but the absence of a stated license here leaves this repository's own reuse terms unclear, so a formal license check is recommended before research or commercial use.

Limitations & Caveats

While benchmarks show strong performance, the computational cost of Mish can be somewhat higher than that of ReLU in certain implementations; a rough way to measure this overhead is sketched below. The README also points to several community-developed faster or experimental variants.
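One way to quantify the overhead is a micro-benchmark against ReLU. This is a hedged sketch using torch.utils.benchmark (F.mish is the built-in implementation available in PyTorch 1.9+); results vary with hardware, tensor shape, and kernel fusion:

```python
import torch
from torch.utils import benchmark

# Compare elementwise Mish vs. ReLU on a large tensor.
x = torch.randn(4096, 4096)

t_mish = benchmark.Timer(
    stmt="F.mish(x)",
    setup="import torch.nn.functional as F",
    globals={"x": x},
)
t_relu = benchmark.Timer(
    stmt="F.relu(x)",
    setup="import torch.nn.functional as F",
    globals={"x": x},
)

print(t_mish.blocked_autorange())
print(t_relu.blocked_autorange())
```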

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 0 stars in the last 30 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Edward Sun (Research Scientist at Meta Superintelligence Lab), and 5 more.

attorch by BobMcDear

0.2% · 576 stars
PyTorch nn module subset, implemented in Python using Triton
Created 2 years ago · Updated 1 month ago
Starred by Guy Gur-Ari (Cofounder of Augment), Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake), and 1 more.

pytorch_image_classification by hysts

0% · 1k stars
PyTorch image classification for various datasets (CIFAR, MNIST, ImageNet)
Created 7 years ago · Updated 3 years ago