Targeted-Dropout by Cohere-Labs-Community

Code for a targeted dropout research paper

Created 7 years ago
254 stars

Top 99.1% on SourcePulse

View on GitHub
Project Summary

This repository provides the companion code for the Targeted Dropout paper, enabling researchers and practitioners to implement and experiment with the technique on their own models. The main benefit is a stronger form of regularization: networks generalize well and remain robust, and they can be pruned after training with little loss in accuracy.

How It Works

Targeted Dropout selectively applies dropout to the units or weights that appear least important, typically the lowest-magnitude ones, rather than to all of them uniformly. Because the network learns to concentrate its capacity in the units that survive, it tolerates aggressive post-hoc pruning, which is reflected in the repository's separate training and pruning code paths. A rough sketch of the idea is shown below.
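As an illustration only, and not the repository's TensorFlow implementation, a targeted weight-dropout step under the magnitude-ranking assumption above might look like the following; the function name, argument names, and default rates are all hypothetical:

```python
import numpy as np

def targeted_weight_dropout(w, targ_rate=0.5, drop_rate=0.5, rng=None):
    """Illustrative sketch of targeted weight dropout (not the repo's code).

    Per column, mark the lowest `targ_rate` fraction of weights by magnitude
    as drop candidates, then zero each candidate independently with
    probability `drop_rate`.
    """
    rng = np.random.default_rng() if rng is None else rng
    w = np.asarray(w, dtype=float)
    magnitude = np.abs(w)

    # Number of candidate weights per column.
    k = int(targ_rate * w.shape[0])
    if k == 0:
        return w

    # Per-column magnitude threshold separating the lowest `targ_rate` fraction.
    threshold = np.sort(magnitude, axis=0)[k - 1]
    is_candidate = magnitude <= threshold

    # Drop each candidate weight with probability `drop_rate`.
    dropped = is_candidate & (rng.random(w.shape) < drop_rate)
    return np.where(dropped, 0.0, w)

# Example: apply one targeted-dropout step to a random weight matrix.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 4))
w_dropped = targeted_weight_dropout(w, targ_rate=0.5, drop_rate=0.5, rng=rng)
```

The point of masking only low-magnitude weights during training is that the network learns to rely on the high-magnitude ones, so the targeted weights can later be pruned away with little effect on accuracy.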

Quick Start & Requirements

  • Primary install / run command: python -m TD.train --hparams=resnet_default
  • Prerequisites: Python 3 and TensorFlow 1.8.
  • The project supports different environments via the --env flag (local, gcp, tpu).
  • Hparams can be specified or overridden using the --hparams and --hparam_override flags (see the example invocation below).
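A minimal example invocation combining the command and flag values listed above; the exact flag syntax (for example --env=local) is an assumption and should be checked against the repository's README:

```bash
# Train the default ResNet configuration on a local machine.
# Flag values come from the bullets above; the --env=local form is assumed.
python -m TD.train --hparams=resnet_default --env=local
```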

Highlighted Details

  • Code supports training and pruning models.
  • Hparam sets like resnet_default are available for quick experimentation.
  • Environment flags allow for flexible deployment on local machines, GCP, or TPUs.

Maintenance & Community

No specific community channels or maintenance details are provided in the README.

Licensing & Compatibility

The license is not specified in the README.

Limitations & Caveats

The project requires TensorFlow 1.8, a release that predates TensorFlow 2.x and is no longer maintained; installing it alongside modern Python versions and current libraries can be difficult.

Health Check

  • Last Commit: 6 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 0 stars in the last 30 days

Explore Similar Projects

Starred by Patrick von Platen (Author of Hugging Face Diffusers; Research Engineer at Mistral), Hanlin Tang (CTO Neural Networks at Databricks; Cofounder of MosaicML), and 1 more.

diffusion by mosaicml

0%
707
Diffusion model training code
Created 2 years ago
Updated 8 months ago
Starred by Sebastian Raschka (Author of "Build a Large Language Model (From Scratch)"), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 3 more.

direct-preference-optimization by eric-mitchell

0.3%
3k
Reference implementation for Direct Preference Optimization (DPO)
Created 2 years ago
Updated 1 year ago
Starred by Jared Palmer (Ex-VP AI at Vercel; Founder of Turborepo; Author of Formik, TSDX), Edward Sun (Research Scientist at Meta Superintelligence Lab), and 7 more.

weak-to-strong by openai

0.1%
3k
Weak-to-strong generalization research paper implementation
Created 1 year ago
Updated 1 year ago
Starred by George Hotz (Author of tinygrad; Founder of the tiny corp, comma.ai), Casper Hansen (Author of AutoAWQ), and 1 more.

GPT2 by ConnorJL

0%
1k
GPT2 training implementation, supporting TPUs and GPUs
Created 6 years ago
Updated 2 years ago