awesome-bangla by banglakit

Bangla NLP tools, datasets, and resources

created 8 years ago
546 stars

Top 59.3% on sourcepulse

Project Summary

This repository is a curated list of tools, datasets, and resources for Bangla (Bengali) computing, aimed primarily at Natural Language Processing (NLP) researchers and hobbyists. It serves as a central index for finding the components needed to build Bangla language technology.

How It Works

The collection is organized into categories such as Typing Tools and Keyboards, Libraries, Corpora and Datasets, NLP Tools, OCR/HTR, Speech to Text, Text to Speech, and others. Each entry provides a brief description and links to relevant projects, libraries, or datasets, facilitating easy navigation and access to Bangla-specific language resources.
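The category-and-entry layout described above can be explored programmatically. The following is a minimal sketch, assuming the list follows the common "awesome list" Markdown convention of `##` category headings followed by `- [name](url) - description` bullets; that layout is an assumption rather than something confirmed here, and `group_entries`, `SAMPLE`, and every entry name and URL in it are hypothetical placeholders.

```python
import re
from collections import defaultdict

# Hypothetical excerpt in the usual "awesome list" Markdown layout.
# The entry names and URLs are placeholders, not real entries from the list.
SAMPLE = """\
## Typing Tools and Keyboards
- [example-keyboard](https://example.com/keyboard) - Placeholder phonetic keyboard.

## Corpora and Datasets
- [example-corpus](https://example.com/corpus) - Placeholder POS-tagged corpus.
- [example-speech](https://example.com/speech) - Placeholder speech dataset.
"""

# A category is any heading of level 2 or deeper.
HEADING = re.compile(r"^#{2,}\s+(.*\S)\s*$")
# An entry is a bullet whose first element is a Markdown link,
# optionally followed by "- description" or ": description".
ENTRY = re.compile(r"^\s*[-*]\s+\[([^\]]+)\]\(([^)]+)\)\s*(?:[-:]\s*)?(.*)$")


def group_entries(markdown):
    """Group link entries under the most recent heading (assumed layout)."""
    groups = defaultdict(list)
    category = None
    for line in markdown.splitlines():
        heading = HEADING.match(line)
        if heading:
            category = heading.group(1)
            continue
        entry = ENTRY.match(line)
        # Entries that appear before any heading are skipped.
        if entry and category is not None:
            name, url, description = entry.groups()
            groups[category].append(
                {"name": name, "url": url, "description": description.strip()}
            )
    return dict(groups)


if __name__ == "__main__":
    for category, entries in group_entries(SAMPLE).items():
        print(f"{category}: {len(entries)} entries")
```

To try this against the real list, pass the text of the repository's README to `group_entries` in place of `SAMPLE`. Entries that deviate from the assumed bullet format would be skipped silently, so the counts should be treated as approximate.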

Quick Start & Requirements

This is a curated list, not a runnable project. Each listed tool or library must be installed and configured individually. Requirements vary by resource; depending on the tool, they typically involve a Python, Java, C++, JavaScript, or R environment.

Highlighted Details

  • Extensive coverage of Bangla NLP resources, including numerous datasets for POS tagging, speech, handwriting, and sentiment analysis.
  • A wide array of input methods and phonetic parsers for Bangla text entry and conversion.
  • Includes multiple OCR and HTR solutions, alongside speech processing tools (STT/TTS).
  • Features several Bangla language models and word embedding projects.

Maintenance & Community

The list is open for contributions, encouraging community involvement in expanding the collection. Links to relevant research centers and font providers are included.

Licensing & Compatibility

Licenses vary significantly across the listed resources, ranging from permissive (MIT, Apache) to more restrictive ones. Users must verify the license of each individual tool or dataset for compatibility with their intended use, especially for commercial applications.

Limitations & Caveats

As a curated list, the project itself does not provide direct functionality. The quality, maintenance status, and licensing of individual listed resources are the responsibility of their respective creators. Some listed projects may be outdated or unmaintained.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 7 stars in the last 90 days
