fastNLP by fastnlp

NLP framework for reducing boilerplate code in NLP projects

Created 7 years ago
3,138 stars

Top 15.3% on SourcePulse

Project Summary

fastNLP is a modular, extensible NLP framework that reduces the engineering boilerplate in user projects, such as data-processing loops and training cycles. It targets NLP practitioners and researchers who want a streamlined workflow for tasks like text classification, and it can run the same training code on several deep learning backends.

How It Works

fastNLP provides a high-level API for data handling, model training, and evaluation. It abstracts away routine engineering work so that users can focus on model logic. Key components include DataSet and DataBundle for data management, and Trainer and Evaluator for the training and evaluation loops. Distributed training and mixed precision (fp16) are supported out of the box. The modular design and backend abstraction make it compatible with PyTorch, PaddlePaddle, and Jittor.
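To make the idea of a Trainer-style abstraction concrete, here is a minimal sketch in plain Python. It is not fastNLP's API: `MiniTrainer`, `make_step`, and the toy data are hypothetical names invented for illustration, and the only assumption carried over from the summary is that the caller supplies the model logic while the trainer owns the epoch and batch loop.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Batch = Sequence[Tuple[float, float]]


class MiniTrainer:
    """Hypothetical stand-in for a Trainer-style class (not fastNLP's Trainer)."""

    def __init__(self, step_fn: Callable[[Batch], float],
                 batches: Sequence[Batch], n_epochs: int = 1) -> None:
        self.step_fn = step_fn          # user-supplied model logic
        self.batches = list(batches)    # pre-batched training data
        self.n_epochs = n_epochs
        self.history: List[float] = []  # mean loss per epoch

    def run(self) -> List[float]:
        # The loop the user no longer has to write by hand.
        for _ in range(self.n_epochs):
            losses = [self.step_fn(batch) for batch in self.batches]
            self.history.append(sum(losses) / len(losses))
        return self.history


def make_step(lr: float = 0.05) -> Tuple[Dict[str, float], Callable[[Batch], float]]:
    """Build one SGD step for the toy model y = w * x (illustrative only)."""
    state = {"w": 0.0}

    def step(batch: Batch) -> float:
        loss, grad = 0.0, 0.0
        for x, y in batch:
            err = state["w"] * x - y
            loss += err * err
            grad += 2.0 * err * x
        state["w"] -= lr * grad / len(batch)
        return loss / len(batch)

    return state, step


# Toy data drawn from y = 2x, split into two batches.
batches = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
state, step = make_step(lr=0.05)
trainer = MiniTrainer(step, batches, n_epochs=5)
history = trainer.run()
```

The design point is the separation of concerns: swapping the step function changes the model, while the loop, bookkeeping, and (in a real framework) device placement stay in one place.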

Quick Start & Requirements

Highlighted Details

  • Supports PyTorch, PaddlePaddle, and Jittor backends.
  • Built-in support for fp16, multi-GPU, and ZeRO optimization.
  • cache_results decorator for efficient data preprocessing.
  • Trainer and Evaluator classes simplify training and evaluation loops.
  • apply_field and apply_field_more for efficient data transformation.
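The caching and field-transformation ideas in the list above can be sketched in plain Python. This does not reproduce fastNLP's own `cache_results` or `apply_field`: `cache_to_disk`, `apply_to_field`, the `refresh` keyword, and the sample rows are hypothetical, chosen only to show the general pattern of pickling a preprocessing result and mapping a function over one field.

```python
import os
import pickle
import tempfile
from functools import wraps


def cache_to_disk(cache_path):
    """Hypothetical decorator: pickle the result on first call, reload it later.

    The cache is keyed only by file path, so arguments are ignored on a hit.
    """
    def decorator(func):
        @wraps(func)
        def wrapper(*args, refresh=False, **kwargs):
            if not refresh and os.path.exists(cache_path):
                with open(cache_path, "rb") as fh:
                    return pickle.load(fh)
            result = func(*args, **kwargs)
            with open(cache_path, "wb") as fh:
                pickle.dump(result, fh)
            return result
        return wrapper
    return decorator


def apply_to_field(rows, func, field_name, new_field_name):
    """Hypothetical helper: apply func to one field of every row.

    Returns new rows with the result stored under new_field_name.
    """
    return [{**row, new_field_name: func(row[field_name])} for row in rows]


calls = {"n": 0}  # counts how often the expensive function really runs
cache_file = os.path.join(tempfile.mkdtemp(), "prep.pkl")


@cache_to_disk(cache_file)
def preprocess():
    calls["n"] += 1
    rows = [{"raw_words": "Hello fastNLP"},
            {"raw_words": "reduce boilerplate code"}]
    # Tokenize by whitespace into a new "words" field.
    return apply_to_field(rows, str.split, "raw_words", "words")


first = preprocess()   # computes and writes the cache file
second = preprocess()  # served from the cache file
```

Passing `refresh=True` to the wrapped function in this sketch forces recomputation, which is useful after the preprocessing code changes.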

Maintenance & Community

The project is currently in incubation. Further community and maintenance details are not explicitly provided in the README.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Versions 1.0.0 and later use a redesigned architecture that is incompatible with older versions, so existing fastNLP users must adjust their code when upgrading. The README describes the project as "currently still in incubation."

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 5 stars in the last 30 days
