NLP pipeline for Chinese and English, using TensorFlow
This repository provides a Deep Learning NLP Pipeline implemented in TensorFlow, offering a suite of tools for Chinese and English text processing. It targets developers and researchers needing foundational NLP capabilities like segmentation, POS tagging, NER, and dependency parsing, with the ability to train custom models.
How It Works
Most modules in the pipeline are built on TensorFlow. Segmentation uses a linear-chain CRF via CRF++, while POS tagging and NER are implemented with LSTM, Bi-LSTM, and LSTM-CRF networks. Dependency parsing uses an arc-standard transition system with a feed-forward neural network. The project also includes a Seq2Seq model with attention for summarization and CNNs for document classification. Pre-trained models are available for Chinese, with support for English POS tagging.
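To make the arc-standard system mentioned above concrete, here is a minimal sketch in plain Python. It is not taken from the deepnlp source: the function name `arc_standard_parse` and the hand-written transition sequence are illustrative assumptions. In a parser of the kind described, a feed-forward network would choose each transition from features of the stack and buffer; here the transitions are supplied by hand so that only the stack and buffer mechanics are shown.

```python
from typing import List, Tuple

Arc = Tuple[int, int]  # (head index, dependent index); index 0 is ROOT


def arc_standard_parse(n_words: int, transitions: List[str]) -> List[Arc]:
    """Apply SHIFT / LEFT-ARC / RIGHT-ARC transitions and return the arcs.

    Hypothetical helper for illustration only, not part of deepnlp.
    Words are numbered 1..n_words; 0 is an artificial ROOT node.
    """
    stack: List[int] = [0]
    buffer: List[int] = list(range(1, n_words + 1))
    arcs: List[Arc] = []

    for t in transitions:
        if t == "SHIFT":
            # Move the next word from the buffer onto the stack.
            if not buffer:
                raise ValueError("SHIFT with an empty buffer")
            stack.append(buffer.pop(0))
        elif t == "LEFT-ARC":
            # Top of stack becomes head of the second item; drop the second item.
            if len(stack) < 2 or stack[-2] == 0:
                raise ValueError("LEFT-ARC needs two items and a non-ROOT dependent")
            head, dep = stack[-1], stack[-2]
            arcs.append((head, dep))
            del stack[-2]
        elif t == "RIGHT-ARC":
            # Second item becomes head of the top of stack; drop the top.
            if len(stack) < 2:
                raise ValueError("RIGHT-ARC needs two items on the stack")
            head, dep = stack[-2], stack[-1]
            arcs.append((head, dep))
            stack.pop()
        else:
            raise ValueError(f"unknown transition: {t}")
    return arcs


# "She eats fish": 1=She, 2=eats, 3=fish
demo = arc_standard_parse(
    3, ["SHIFT", "SHIFT", "LEFT-ARC", "SHIFT", "RIGHT-ARC", "RIGHT-ARC"]
)
print(demo)  # [(2, 1), (2, 3), (0, 2)]
```

Each arc is a (head, dependent) pair, so the result reads as: "eats" heads "She", "eats" heads "fish", and ROOT heads "eats". Arc labels are omitted to keep the sketch short.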
Quick Start & Requirements
pip install deepnlp
Maintenance & Community
The project's README states that the deepnlp library was archived by the end of 2020 and only supports TensorFlow up to version 1.13.
Licensing & Compatibility
The repository does not explicitly state a license. Compatibility with commercial or closed-source projects is not specified.
Limitations & Caveats
The project is archived and only supports older versions of TensorFlow (up to 1.13). Pre-trained models for English POS tagging and domain-specific NER must be downloaded manually. The README marks TextCNN as "WIP" (work in progress).
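Because of the TensorFlow 1.13 ceiling, it can help to check the environment before installing. The following is a minimal sketch using only the Python standard library. The helper names `tf_version_supported` and `installed_tf_version` are hypothetical, and the sketch assumes TensorFlow was installed under the distribution name `tensorflow` (it does not look for variants such as `tensorflow-gpu`).

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# Upper bound taken from the README note above (TensorFlow up to 1.13).
MAX_SUPPORTED_TF: Tuple[int, int] = (1, 13)


def tf_version_supported(version: str) -> bool:
    """Return True if a version string such as '1.13.1' is at or below 1.13.x."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) <= MAX_SUPPORTED_TF


def installed_tf_version() -> Optional[str]:
    """Return the installed 'tensorflow' distribution version, or None if absent."""
    try:
        return metadata.version("tensorflow")
    except metadata.PackageNotFoundError:
        return None


found = installed_tf_version()
if found is None:
    print("TensorFlow is not installed in this environment")
elif tf_version_supported(found):
    print(f"TensorFlow {found} is within the supported range")
else:
    print(f"TensorFlow {found} is newer than 1.13 and may not work with deepnlp")
```

Only the major and minor components are compared, so any 1.13.x patch release passes the check.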