NeuronBlocks by microsoft

NLP DNN toolkit for building end-to-end neural network pipelines

created 6 years ago
1,456 stars

Top 28.7% on sourcepulse

Project Summary

NeuronBlocks is an NLP deep learning toolkit designed to simplify the creation of end-to-end neural network models for NLP tasks. It targets engineers and researchers seeking to reduce development costs and complexity in model building and training, offering a modular approach akin to playing with Lego bricks.

How It Works

NeuronBlocks comprises a "Block Zoo" of reusable neural network components and a "Model Zoo" of pre-configured NLP models. Users can either pick an existing JSON configuration file from the Model Zoo or build a custom model by combining blocks from the Block Zoo. This modular design promotes code reuse and lets models be shared as configuration files, hiding most of the underlying implementation details.
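The config-driven idea can be illustrated with a small, self-contained sketch. Everything in it is hypothetical: the `BLOCK_ZOO` registry, the `register_block` decorator, the block names, and the `"architecture"` schema are invented for illustration and are not NeuronBlocks' actual API or configuration format. Toy text-processing steps stand in for neural layers so the example runs with the standard library alone.

```python
import json

# Hypothetical registry mapping block names to factories.
# This is NOT the NeuronBlocks API; it only illustrates the pattern.
BLOCK_ZOO = {}


def register_block(name):
    """Register a block factory under a name usable from a JSON config."""
    def decorator(factory):
        BLOCK_ZOO[name] = factory
        return factory
    return decorator


@register_block("Lowercase")
def make_lowercase():
    return lambda text: text.lower()


@register_block("Tokenize")
def make_tokenize(sep=" "):
    return lambda text: text.split(sep)


@register_block("Truncate")
def make_truncate(max_len=3):
    return lambda tokens: tokens[:max_len]


def build_pipeline(config_json):
    """Instantiate each configured block and chain them in order."""
    config = json.loads(config_json)
    blocks = [
        BLOCK_ZOO[layer["block"]](**layer.get("params", {}))
        for layer in config["architecture"]
    ]

    def run(x):
        for block in blocks:
            x = block(x)
        return x

    return run


# Illustrative config; the schema is an assumption made for this sketch.
CONFIG = """
{
  "architecture": [
    {"block": "Lowercase"},
    {"block": "Tokenize", "params": {"sep": " "}},
    {"block": "Truncate", "params": {"max_len": 2}}
  ]
}
"""

pipeline = build_pipeline(CONFIG)
result = pipeline("Neural Blocks Compose Easily")
print(result)
```

Because the model is described entirely by the JSON string, swapping or reordering blocks requires no code changes, which is the property the summary attributes to sharing models as configuration files.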

Quick Start & Requirements

  • Clone the repository, then install dependencies with pip install -r requirements.txt.
  • Requires Python 3.6+ and PyTorch 0.4.1+.
  • Supports Linux and Windows, CPU and GPU.
  • Quick start examples are available for training, testing, and prediction (interactive/batch modes).
  • Official tutorial and code documentation are linked.
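The version requirements above can be checked before installing anything else. The following is a minimal sketch; the helper names `parse_version` and `meets_minimum` are made up for this example, and the parsing is deliberately simple (it keeps only the leading numeric components of a version string), so it is not a substitute for a full version parser.

```python
import sys


def parse_version(version):
    """Turn a string such as '1.13.1+cu117' into (1, 13, 1).

    Only the leading digits of each dot-separated piece are kept;
    local build tags and pre-release suffixes are ignored.
    """
    parts = []
    for piece in version.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(version, minimum):
    """Return True if `version` is at least `minimum`."""
    return parse_version(version) >= parse_version(minimum)


# Python 3.6+ is the stated interpreter requirement.
python_ok = sys.version_info >= (3, 6)

# PyTorch 0.4.1+ is the stated framework requirement.
try:
    import torch
    torch_ok = meets_minimum(str(torch.__version__), "0.4.1")
except ImportError:
    torch_ok = False  # PyTorch is not installed in this environment

print("Python OK:", python_ok)
print("PyTorch OK:", torch_ok)
```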

Highlighted Details

  • Supports a wide range of NLP tasks including sentence classification, sentiment analysis, question answering, and more.
  • Enables model sharing through simple JSON configuration files.
  • Offers platform flexibility, running on both Linux and Windows, CPU and GPU.
  • Includes a model visualizer for architecture debugging and correctness checking.

Maintenance & Community

Developed by the STCA NLP Group at Microsoft. Contributions are welcome. Ongoing work includes knowledge distillation, multilingual support, NER, and multi-task training.

Licensing & Compatibility

Licensed under the MIT License. This permissive license allows for commercial use and integration with closed-source projects.

Limitations & Caveats

The project's reference paper was published at EMNLP 2019, so some components or practices may be outdated relative to more recent NLP methods. The README does not make clear how broad or how recent the "Block Zoo" and "Model Zoo" are.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 3 stars in the last 90 days
