InternLM-techreport by InternLM

Technical report for a 104B-parameter multilingual LLM

created 2 years ago
905 stars

Top 41.0% on sourcepulse

Project Summary

This repository hosts the technical report for InternLM, a 104B-parameter multilingual large language model developed by Shanghai AI Lab and SenseTime. It is aimed at researchers and developers evaluating high-performance LLMs, particularly for Chinese-language applications. The report claims state-of-the-art results on a range of benchmarks, outperforming existing open-source models and, on comprehensive exams, ChatGPT.

How It Works

InternLM is a 104B-parameter multilingual foundation model pre-trained on 1.6T tokens through a multi-phase progressive process, then fine-tuned to align with human preferences. The team also built Uniscale-LLM, a training system designed for efficient large language model training. The stated goal is well-rounded capability across knowledge, comprehension, math, and coding without relying on external tools.
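To put the two headline figures in proportion, the sketch below computes the ratio of pre-training tokens to parameters and a rough training-compute estimate. The estimate uses the common 6 × parameters × tokens rule of thumb for dense transformers, which is an assumption brought in here for illustration and is not a figure from the report. The function names are hypothetical.

```python
# Figures quoted in the summary above.
PARAMS = 104e9    # 104B parameters
TOKENS = 1.6e12   # 1.6T pre-training tokens


def tokens_per_parameter(tokens: float, params: float) -> float:
    """Number of pre-training tokens seen per model parameter."""
    return tokens / params


def approx_training_flops(tokens: float, params: float) -> float:
    """Rough training compute via the 6 * N * D rule of thumb.

    Assumption: a dense transformer, counting forward and backward
    passes. This is an order-of-magnitude estimate only and does not
    come from the InternLM report.
    """
    return 6.0 * params * tokens


ratio = tokens_per_parameter(TOKENS, PARAMS)
flops = approx_training_flops(TOKENS, PARAMS)

print(f"{ratio:.1f} tokens per parameter")
print(f"{flops:.2e} approximate training FLOPs")
```

Under these assumptions the model saw roughly 15 tokens per parameter. The compute figure says nothing about hardware utilisation or the multi-phase schedule, which only the report itself describes.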

Quick Start & Requirements

The repository primarily contains a technical report (PDF) detailing the model's architecture, training, and evaluation. It does not provide model weights or inference code, so there is nothing to install or run. Anyone who wants to use the model must obtain it through a separate distribution.

Highlighted Details

  • Achieves state-of-the-art performance on comprehensive exams like MMLU, AGIEval, C-Eval, and GAOKAO-Bench.
  • Outperforms ChatGPT on these benchmarks, according to the report.
  • Exhibits strong capabilities in understanding Chinese language and culture.
  • Pre-trained on 1.6T tokens with a multi-phase progressive process.

Maintenance & Community

This repository hosts a technical report rather than a code project and does not appear to be actively maintained. It is associated with the InternLM Team.

Licensing & Compatibility

The technical report does not specify a license for the model. The license for the report itself is not confirmed here; it is likely permissive enough to allow distribution and citation, but check the repository before redistributing it.

Limitations & Caveats

This repository contains only the technical report. It provides no model weights, code, or inference API, so users who want to run the model need to find a separate distribution.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 90 days
