German-NLP by adbar

German NLP resource list for open-access tools

created 7 years ago
492 stars

Top 63.6% on sourcepulse

Project Summary

This repository is a curated, community-driven list of open-access, open-source, and off-the-shelf resources and tools specifically for German Natural Language Processing (NLP). It aims to provide a comprehensive and usable catalog for researchers, developers, and anyone working with German language data, prioritizing maintained and user-friendly options.

How It Works

The project functions as a living document, organized into categories covering corpora, frameworks, treebanks, deep learning models, annotation standards, and linguistic processing tasks from preprocessing to semantic analysis. It favors resources that are actively maintained and readily usable, with a bias towards practicality and ease of integration.

Quick Start & Requirements

This is a curated list, not a software package. No installation or execution commands are applicable. The resources themselves will have their own requirements.

Highlighted Details

  • Extensive categorization of German NLP resources, from historical corpora to modern LLMs.
  • Includes specialized datasets for sentiment analysis, named entity recognition, and more.
  • Lists numerous frameworks and tools for NLP tasks such as tokenization, lemmatization, and parsing.
  • Features a wide array of treebanks and deep learning models tailored for German.

Maintenance & Community

The list is community-maintained, with contributions and suggestions actively welcomed via pull requests. A contributors list is available.

Licensing & Compatibility

The repository itself is not licensed as a software package. The licensing of individual resources listed within the repository varies and must be checked on a per-resource basis.

Limitations & Caveats

The list's quality and comprehensiveness depend on community contributions; some categories may be less developed than others. The project explicitly states a bias towards usability and user-friendliness, which might exclude some technically valuable but less accessible resources.

Health Check

  • Last commit: 9 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 15 stars in the last 90 days
