NeuronBlocks by microsoft

NLP DNN toolkit for building end-to-end neural network pipelines

Created 6 years ago
1,454 stars

Top 28.2% on SourcePulse

Project Summary

NeuronBlocks is an NLP deep learning toolkit designed to simplify the creation of end-to-end neural network models for NLP tasks. It targets engineers and researchers seeking to reduce development costs and complexity in model building and training, offering a modular approach akin to playing with Lego bricks.

How It Works

NeuronBlocks comprises a "Block Zoo" of reusable neural network components and a "Model Zoo" of pre-configured NLP models. Users can either start from an existing JSON configuration file in the Model Zoo or build a custom model by combining blocks from the Block Zoo. Because a model is described by its configuration file, blocks can be reused across models, a model can be shared by passing along its configuration, and much of the underlying implementation detail stays hidden from the user.
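
The configuration-driven idea described above can be sketched in a few lines of Python. This is a toy illustration only: the registry, the block names (Lowercase, DropShort, Count), and the JSON keys ("blocks", "name", "params") are hypothetical and are not NeuronBlocks' actual API or configuration schema. The blocks here are plain functions over token lists rather than neural network layers, so the sketch runs without PyTorch.

```python
import json

# Hypothetical registry standing in for a "Block Zoo": name -> block factory.
BLOCK_ZOO = {}


def register_block(name):
    """Decorator that adds a block factory to the registry under `name`."""
    def decorator(factory):
        BLOCK_ZOO[name] = factory
        return factory
    return decorator


@register_block("Lowercase")
def make_lowercase():
    return lambda tokens: [t.lower() for t in tokens]


@register_block("DropShort")
def make_drop_short(min_len=2):
    return lambda tokens: [t for t in tokens if len(t) >= min_len]


@register_block("Count")
def make_count():
    return lambda tokens: len(tokens)


def build_pipeline(config_text):
    """Build a callable pipeline from a JSON description of chained blocks."""
    config = json.loads(config_text)
    layers = []
    for spec in config["blocks"]:
        factory = BLOCK_ZOO[spec["name"]]          # look the block up by name
        layers.append(factory(**spec.get("params", {})))

    def run(data):
        for layer in layers:                       # apply the blocks in order
            data = layer(data)
        return data

    return run


# Hypothetical configuration: the whole "model" is described by this JSON.
CONFIG = """
{
  "blocks": [
    {"name": "Lowercase"},
    {"name": "DropShort", "params": {"min_len": 3}},
    {"name": "Count"}
  ]
}
"""

pipeline = build_pipeline(CONFIG)
result = pipeline(["NeuronBlocks", "is", "a", "Toolkit"])
print(result)  # prints 2: two tokens remain after dropping the short ones
```

Swapping, reordering, or re-parameterizing blocks only requires editing the JSON text, which is the property that makes configuration files a convenient unit for sharing models.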

Quick Start & Requirements

  • Clone the repository, then install the dependencies with pip install -r requirements.txt.
  • Requires Python 3.6+ and PyTorch 0.4.1+.
  • Supports Linux and Windows, CPU and GPU.
  • Quick start examples are available for training, testing, and prediction (interactive/batch modes).
  • The README links to an official tutorial and code documentation.
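
The version requirements listed above can be checked before running anything. The following is a minimal sketch, not part of NeuronBlocks: the helper names (parse_version, meets_minimum, check_environment) are made up for illustration, and the only assumption about PyTorch is that the installed package exposes its version as torch.__version__.

```python
import re
import sys


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple.

    Suffixes such as "+cu117" or ".post2" are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError("unrecognized version string: {!r}".format(text))
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(version, minimum):
    """Return True if `version` is at least `minimum`."""
    v, m = parse_version(version), parse_version(minimum)
    width = max(len(v), len(m))
    # Pad with zeros so that "2.0" and "2.0.0" compare as equal.
    v += (0,) * (width - len(v))
    m += (0,) * (width - len(m))
    return v >= m


def check_environment():
    """Return a list of problems; an empty list means the requirements are met."""
    problems = []
    if sys.version_info < (3, 6):
        problems.append("Python 3.6+ is required")
    try:
        import torch  # only needed to read the installed version
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if not meets_minimum(str(torch.__version__), "0.4.1"):
            problems.append("PyTorch 0.4.1+ is required")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Running the script prints one line per unmet requirement and nothing when the environment satisfies both.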

Highlighted Details

  • Supports a wide range of NLP tasks including sentence classification, sentiment analysis, question answering, and more.
  • Enables model sharing through simple JSON configuration files.
  • Offers platform flexibility, running on both Linux and Windows, CPU and GPU.
  • Includes a model visualizer for architecture debugging and correctness checking.

Maintenance & Community

Developed by the STCA NLP Group at Microsoft. Contributions are welcome. Ongoing work listed by the project includes knowledge distillation, multilingual support, NER, and multi-task training.

Licensing & Compatibility

Licensed under the MIT License. This permissive license allows for commercial use and integration with closed-source projects.

Limitations & Caveats

The project's reference paper was published at EMNLP 2019, so some components or practices may be outdated relative to more recent NLP work. The README does not make clear how broad or how current the "Block Zoo" and "Model Zoo" are.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 0 stars in the last 30 days
