Japanese LLM list: models, benchmarks, datasets
This repository serves as a comprehensive, community-curated catalog of Japanese Large Language Models (LLMs) and related evaluation benchmarks. It aims to provide researchers, developers, and enthusiasts with an organized overview of the rapidly evolving Japanese LLM landscape, facilitating discovery and comparison of models, datasets, and evaluation methodologies.
How It Works
The project compiles information from various sources, including academic papers, public releases, and community contributions, to create detailed tables and descriptions of Japanese LLMs. It categorizes models by architecture, parameter count, training data, developer, and license, offering a structured approach to understanding the ecosystem. It also lists and describes various Japanese LLM evaluation benchmarks and datasets.
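As a rough illustration of the kind of structured record the catalog implies, a single entry could be modeled as below. This is a hypothetical sketch; the field names are not the repository's actual schema.

```python
from dataclasses import dataclass

@dataclass
class ModelEntry:
    """One catalog row; fields mirror the categories described above (illustrative only)."""
    name: str
    developer: str
    architecture: str      # e.g. "Llama", "GPT-NeoX", "Mistral"
    parameters_b: float    # parameter count in billions
    training_data: str     # brief description of the pretraining corpus
    license: str           # e.g. "Apache-2.0", "CC BY-NC-SA 4.0"

# Example use: filter entries whose licenses are commonly considered commercial-friendly.
PERMISSIVE = {"Apache-2.0", "MIT"}

def commercial_candidates(entries: list[ModelEntry]) -> list[ModelEntry]:
    return [e for e in entries if e.license in PERMISSIVE]
```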
Quick Start & Requirements
This repository is a curated list and does not require installation or execution. Users can browse the README for information on specific models and benchmarks.
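While the list itself needs no setup, a typical next step is loading one of the catalogued models from the Hugging Face Hub. A minimal sketch follows; the model ID is only an example of a listed model, and you should substitute one whose license fits your use case.

```python
# Requires: pip install transformers torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "llm-jp/llm-jp-13b-v1.0"  # example catalogued model; replace as needed

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" needs the accelerate package; drop it to load on CPU.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "日本の首都は"  # "The capital of Japan is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```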
Maintenance & Community
The project is actively maintained by the llm-jp community, with contributions from numerous researchers and organizations in Japan. Information is updated regularly, and users are encouraged to contribute via GitHub Issues.
Licensing & Compatibility
Licenses vary significantly across the listed models, ranging from permissive MIT and Apache 2.0 to more restrictive non-commercial licenses (e.g., CC BY-NC-SA 4.0) and custom terms of use. Users must carefully check the specific license for each model before use, especially for commercial applications.
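One way to do a first-pass license check is to read the model's Hub metadata programmatically, as in the sketch below (the model ID is illustrative). Note that Hub tags can lag or differ from the actual terms, so always read the license text itself before commercial use.

```python
# Requires: pip install huggingface_hub
from huggingface_hub import HfApi

api = HfApi()
info = api.model_info("llm-jp/llm-jp-13b-v1.0")  # example model ID

# The license usually appears among the repo tags, e.g. "license:apache-2.0".
license_tags = [t for t in info.tags if t.startswith("license:")]
print(license_tags)
```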
Limitations & Caveats
The README explicitly states that the content is not guaranteed to be complete or accurate and may change without notice. Some information may be based on speculation or individual interpretation. Users should verify details independently.