mt-dnn by namisan

PyTorch package for multi-task deep neural networks research

created 6 years ago
2,253 stars

Top 20.5% on sourcepulse

Project Summary

This PyTorch package implements Multi-Task Deep Neural Networks (MT-DNN) for natural language understanding (NLU) and is aimed at NLP researchers and practitioners. It trains a single model on multiple related tasks simultaneously, building on pre-trained language models such as BERT, with the goal of improving performance and generalization.

How It Works

MT-DNN uses a shared encoder (typically BERT) with task-specific output layers. The core idea is to learn a single representation that benefits from the training signals of multiple NLU tasks. Compared with training one model per task, this approach aims to improve generalization and robustness, as reported in ACL and arXiv publications by Microsoft researchers.
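The shared-encoder design described above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the repository's implementation: the class name `SharedEncoderMultiTask`, the task names `task_a` and `task_b`, the small MLP standing in for BERT, and the random toy data are all assumptions made for the example. Shuffling batches from all tasks and updating on one task per step is one simple way to train on several tasks at once.

```python
import random

import torch
from torch import nn


class SharedEncoderMultiTask(nn.Module):
    """Shared encoder with one output head per task (illustrative only)."""

    def __init__(self, input_dim, hidden_dim, task_num_labels):
        super().__init__()
        # Assumption: a small MLP stands in for a pre-trained encoder such as BERT.
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.Tanh())
        # One classification head per task, keyed by a hypothetical task name.
        self.heads = nn.ModuleDict(
            {task: nn.Linear(hidden_dim, n) for task, n in task_num_labels.items()}
        )

    def forward(self, features, task):
        shared = self.encoder(features)   # representation shared by all tasks
        return self.heads[task](shared)   # task-specific logits


def make_batches(task, num_labels, input_dim, num_batches=4, batch_size=8):
    """Generate random toy batches of (task, features, labels) for one task."""
    return [
        (
            task,
            torch.randn(batch_size, input_dim),
            torch.randint(0, num_labels, (batch_size,)),
        )
        for _ in range(num_batches)
    ]


torch.manual_seed(0)
random.seed(0)

task_num_labels = {"task_a": 3, "task_b": 2}  # hypothetical tasks and label counts
input_dim = 16
model = SharedEncoderMultiTask(input_dim, 32, task_num_labels)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Mix batches from every task; each step updates the shared encoder
# and the head of whichever task the batch belongs to.
batches = []
for task, n in task_num_labels.items():
    batches.extend(make_batches(task, n, input_dim))
random.shuffle(batches)

for task, features, labels in batches:
    optimizer.zero_grad()
    loss = loss_fn(model(features, task), labels)
    loss.backward()
    optimizer.step()

# The same encoder serves both tasks; only the output width differs.
logits_a = model(torch.randn(5, input_dim), "task_a")
logits_b = model(torch.randn(5, input_dim), "task_b")
```

Because every batch passes through `self.encoder`, gradients from all tasks update the same shared parameters, while each head is updated only by batches from its own task.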

Quick Start & Requirements

Highlighted Details

  • Supports fine-tuning pre-trained BERT models for multi-task learning.
  • Includes scripts for reproducing GLUE benchmark results and domain adaptation tasks (SciTail, SNLI).
  • Offers features like SMART regularization, gradient accumulation, and FP16 training for efficiency and robustness.
  • Provides utilities for extracting text embeddings and converting TensorFlow BERT models to PyTorch.
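Of the features listed above, gradient accumulation is the easiest to illustrate in isolation. The sketch below is plain Python on a toy one-parameter model, not code from the package; the function `grad_mse` and the data are hypothetical. It shows that summing scaled gradients over equal-sized micro-batches reproduces the full-batch gradient, which is why the technique can emulate a large batch when memory is limited.

```python
def grad_mse(w, xs, ys):
    """Gradient of the mean squared error for the toy model y_hat = w * x."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


# Hypothetical toy data, roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1, 14.0, 15.9]
w = 0.5

# Reference: gradient computed on the full batch in one pass.
full_grad = grad_mse(w, xs, ys)

# Accumulation: process equal-sized micro-batches and scale each
# micro-batch gradient by 1 / num_micro_batches before summing.
micro_batch_size = 2
num_micro_batches = len(xs) // micro_batch_size
accumulated = 0.0
for i in range(num_micro_batches):
    lo, hi = i * micro_batch_size, (i + 1) * micro_batch_size
    accumulated += grad_mse(w, xs[lo:hi], ys[lo:hi]) / num_micro_batches

# A single optimizer step is taken only after all micro-batches are processed.
learning_rate = 0.01
w_new = w - learning_rate * accumulated
```

The equivalence holds exactly only when the micro-batches are the same size and the loss is a mean over examples; with uneven micro-batches the scaling factor would need to be weighted by batch size.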

Maintenance & Community

The project is associated with Microsoft researchers. Contact information for several key contributors is provided. No explicit community channels like Discord/Slack are mentioned.

Licensing & Compatibility

The README does not explicitly state a license. It references other projects with MIT and Apache 2.0 licenses, but this does not guarantee compatibility. Commercial use would require clarification.

Limitations & Caveats

The project relies on Python 3.6, which is end-of-life. Public model sharing is currently unavailable due to policy changes. Some results may be based on older GLUE datasets, and achieving top leaderboard performance may require task-specific fine-tuning beyond the multi-task refinement.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 5 stars in the last 90 days
