Targeted-Dropout by Cohere-Labs-Community

Code for a targeted dropout research paper

Created 7 years ago
254 stars

Top 99.1% on SourcePulse

View on GitHub
Project Summary

This repository provides the companion code for the Targeted Dropout paper, enabling researchers and practitioners to implement and experiment with the technique on their own models. The main benefit is a stronger form of regularization: networks generalize well and remain robust, and they can be pruned after training with little loss in accuracy.

How It Works

Targeted Dropout selectively applies dropout to the units or weights that appear least important, typically the lowest-magnitude ones, rather than to all of them uniformly. Because the network learns to concentrate its capacity in the units that survive, it tolerates aggressive post-hoc pruning, which is reflected in the repository's separate training and pruning code paths. A rough sketch of the idea is shown below.
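As an illustration only, and not the repository's TensorFlow implementation, a targeted weight-dropout step under the magnitude-ranking assumption above might look like the following; the function name, argument names, and default rates are all hypothetical:

```python
import numpy as np

def targeted_weight_dropout(w, targ_rate=0.5, drop_rate=0.5, rng=None):
    """Illustrative sketch of targeted weight dropout (not the repo's code).

    Per column, mark the lowest `targ_rate` fraction of weights by magnitude
    as drop candidates, then zero each candidate independently with
    probability `drop_rate`.
    """
    rng = np.random.default_rng() if rng is None else rng
    w = np.asarray(w, dtype=float)
    magnitude = np.abs(w)

    # Number of candidate weights per column.
    k = int(targ_rate * w.shape[0])
    if k == 0:
        return w

    # Per-column magnitude threshold separating the lowest `targ_rate` fraction.
    threshold = np.sort(magnitude, axis=0)[k - 1]
    is_candidate = magnitude <= threshold

    # Drop each candidate weight with probability `drop_rate`.
    dropped = is_candidate & (rng.random(w.shape) < drop_rate)
    return np.where(dropped, 0.0, w)

# Example: apply one targeted-dropout step to a random weight matrix.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 4))
w_dropped = targeted_weight_dropout(w, targ_rate=0.5, drop_rate=0.5, rng=rng)
```

The point of masking only low-magnitude weights during training is that the network learns to rely on the high-magnitude ones, so the targeted weights can later be pruned away with little effect on accuracy.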

Quick Start & Requirements

  • Primary install / run command: python -m TD.train --hparams=resnet_default
  • Prerequisites: Python 3 and TensorFlow 1.8.
  • The project supports different environments via the --env flag (local, gcp, tpu).
  • Hparams can be specified or overridden using the --hparams and --hparam_override flags (see the example invocation below).
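A minimal example invocation combining the command and flag values listed above; the exact flag syntax (for example --env=local) is an assumption and should be checked against the repository's README:

```bash
# Train the default ResNet configuration on a local machine.
# Flag values come from the bullets above; the --env=local form is assumed.
python -m TD.train --hparams=resnet_default --env=local
```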

Highlighted Details

  • Code supports training and pruning models.
  • Hparam sets like resnet_default are available for quick experimentation.
  • Environment flags allow for flexible deployment on local machines, GCP, or TPUs.

Maintenance & Community

No specific community channels or maintenance details are provided in the README.

Licensing & Compatibility

The license is not specified in the README.

Limitations & Caveats

The project requires TensorFlow 1.8, a release that predates TensorFlow 2.x and is no longer maintained; installing it alongside modern Python versions and current libraries can be difficult.

Health Check

  • Last Commit: 6 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 0 stars in the last 30 days

Explore Similar Projects

Starred by Patrick von Platen (Author of Hugging Face Diffusers; Research Engineer at Mistral), Hanlin Tang (CTO Neural Networks at Databricks; Cofounder of MosaicML), and 1 more.

diffusion by mosaicml

0%
707
Diffusion model training code
Created 2 years ago
Updated 8 months ago
Starred by Sebastian Raschka (Author of "Build a Large Language Model (From Scratch)"), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 3 more.

direct-preference-optimization by eric-mitchell

0.3%
3k
Reference implementation for Direct Preference Optimization (DPO)
Created 2 years ago
Updated 1 year ago
Starred by Jared Palmer (Ex-VP AI at Vercel; Founder of Turborepo; Author of Formik, TSDX), Edward Sun (Research Scientist at Meta Superintelligence Lab), and 7 more.

weak-to-strong by openai

0.1%
3k
Weak-to-strong generalization research paper implementation
Created 1 year ago
Updated 1 year ago
Starred by George Hotz (Author of tinygrad; Founder of the tiny corp, comma.ai), Casper Hansen (Author of AutoAWQ), and 1 more.

GPT2 by ConnorJL

0%
1k
GPT2 training implementation, supporting TPUs and GPUs
Created 6 years ago
Updated 2 years ago