awesome-chinese-ner by taishan1994

Resource list for Chinese NER

Created 4 years ago

772 stars

Top 44.5% on SourcePulse

Project Summary

This repository is a curated collection of resources for Chinese Named Entity Recognition (NER). It is aimed at researchers and practitioners in Natural Language Processing (NLP) and links to recent papers, tools, datasets, pre-trained models, and surveys on Chinese NER and broader information extraction tasks. Its main benefit is a single place to track new work and find practical resources in this specialized NLP domain.

How It Works

The repository is a curated list that aggregates links to academic papers (with arXiv and official publication links), GitHub repositories for code and models, and relevant datasets. It covers a wide range of approaches, from traditional BiLSTM-CRF models to recent methods based on large language models (LLMs) and prompt-based learning. Resources are organized by model type, dataset, and related task, which makes them easier to find.

Quick Start & Requirements

This repository is a collection of links and has no installation or execution command of its own. To set up and run a specific tool or model, users need to follow the link to the individual paper, code repository, or dataset. Prerequisites vary by resource and may include Python, a deep learning framework (PyTorch or TensorFlow), specific libraries, and hardware such as GPUs for model training or inference.

Highlighted Details

Extensive coverage of recent research, including papers from 2023-2024 on LLM-based NER and prompt engineering.
Links to various Chinese NER datasets such as MSRA, Weibo, CLUENER2020, and medical-specific datasets.
A broad list of Chinese pre-trained language models including ChineseBERT, MacBERT, ERNIE, and ZEN.
Resources for related tasks such as relation extraction, event extraction, and general information extraction.
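
As a hedged illustration of what working with NER datasets like those listed above can involve: some are distributed with character-level BIO tags (`B-` begins an entity, `I-` continues it, `O` is outside any entity), though the format differs between datasets and should be checked for each one. The sketch below is not taken from the repository; the function name `bio_to_spans`, the sample sentence, and the tag set are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical helper, not part of any linked project.
def bio_to_spans(chars: List[str], tags: List[str]) -> List[Tuple[str, str, int, int]]:
    """Convert character-level BIO tags into (text, type, start, end) spans.

    The end index is exclusive. An I- tag that does not continue an open
    entity of the same type is ignored.
    """
    spans = []
    start, etype = None, None
    # A trailing "O" sentinel flushes an entity that ends at the last character.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.append(("".join(chars[start:i]), etype, start, i))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


# Assumed sample: "张三在北京" ("Zhang San is in Beijing") with made-up tags.
sample_chars = list("张三在北京")
sample_tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC"]
print(bio_to_spans(sample_chars, sample_tags))
# → [('张三', 'PER', 0, 2), ('北京', 'LOC', 3, 5)]
```

Datasets that use other schemes (for example BIOES tags or JSON span annotations) would need a different loader, so the documentation of each linked dataset takes precedence over this sketch.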

Maintenance & Community

The repository is maintained by taishan1994. It aggregates links from various sources, including academic conferences (ACL, EMNLP, COLING, AAAI) and platforms such as arXiv and GitHub. No community channels (e.g., Discord, Slack) are mentioned for this repository.

Licensing & Compatibility

The licensing for individual resources (papers, code, datasets) varies. Users must consult the licenses of each linked project or dataset. Compatibility for commercial use or closed-source linking depends entirely on the licenses of the specific tools and datasets accessed via the links.

Limitations & Caveats

This repository is a curated list and does not provide implementations or a unified interface. Users must navigate to the individual linked resources, which may have their own setup complexities, dependencies, or licensing restrictions. Because NLP evolves rapidly, some links or resources may become outdated.

Health Check

Last Commit

1 year ago

Responsiveness

1 day

Pull Requests (30d)

Issues (30d)

Star History

1 star in the last 30 days