InternLM-techreport by InternLM

Technical report for a 104B-parameter multilingual LLM

Created 2 years ago

902 stars

Top 40.1% on SourcePulse

3 Experts Love This Project

Yaowei Zheng

Author of LLaMA-Factory

Ying Sheng

Coauthor of SGLang

Junyang Lin

Core Maintainer at Alibaba Qwen

Project Summary

This repository hosts the technical report for InternLM, a 104B-parameter multilingual large language model developed by Shanghai AI Lab and SenseTime. It is aimed at researchers and developers interested in high-performance LLMs, particularly for Chinese-language applications. The report claims state-of-the-art results on several benchmarks, outperforming existing open-source models and, on comprehensive exams, ChatGPT.

How It Works

InternLM is a 104B-parameter multilingual foundation model pre-trained on 1.6T tokens with a multi-phase progressive process, then fine-tuned to align with human preferences. The project also introduces Uniscale-LLM, an efficient training system for large language models. The stated goal is well-rounded capability across knowledge, comprehension, math, and coding without relying on external tools.
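
To give a sense of scale for the two figures above, here is a minimal back-of-the-envelope sketch. It assumes the common 6 * N * D rule of thumb for dense-transformer training compute (N parameters, D tokens). That rule is an assumption brought in for illustration; this summary does not state a FLOP count, and the function names are hypothetical.

```python
# Rough scale estimate from the two figures quoted above.
# ASSUMPTION: the 6 * N * D rule of thumb for dense-transformer training
# compute; the summary itself does not report a FLOP count.

PARAMS = 104e9   # 104B parameters (from the summary)
TOKENS = 1.6e12  # 1.6T pre-training tokens (from the summary)


def approx_training_flops(params: float, tokens: float) -> float:
    """Approximate training FLOPs as 6 * parameters * tokens."""
    return 6.0 * params * tokens


def tokens_per_parameter(params: float, tokens: float) -> float:
    """Ratio of pre-training tokens to model parameters."""
    return tokens / params


if __name__ == "__main__":
    flops = approx_training_flops(PARAMS, TOKENS)
    ratio = tokens_per_parameter(PARAMS, TOKENS)
    print(f"approx. training compute: {flops:.2e} FLOPs")
    print(f"tokens per parameter:     {ratio:.1f}")
```

Under that assumption the estimate comes to roughly 1e24 FLOPs, with about 15 training tokens per parameter. Treat both as order-of-magnitude figures only.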

Quick Start & Requirements

The repository primarily contains a technical report (PDF) detailing the model's architecture, training, and evaluation. It does not provide direct model weights or inference code. Access to the model itself would require separate distribution or implementation.

Highlighted Details

Reports state-of-the-art performance on the comprehensive exam benchmarks MMLU, AGIEval, C-Eval, and GAOKAO-Bench.
Reports better results than ChatGPT on those same benchmarks.
Exhibits strong capabilities in understanding Chinese language and culture.
Pre-trained on 1.6T tokens with a multi-phase progressive process.

Maintenance & Community

This repository holds only a technical report and does not appear to be actively maintained as a code project. It is associated with the InternLM Team.

Licensing & Compatibility

The licensing for the model itself is not specified in this technical report. The report itself is likely under a permissive license allowing distribution and citation.

Limitations & Caveats

This repository only contains the technical report and does not provide access to the model weights, code, or an inference API. Users interested in using the model would need to find its separate distribution.

Health Check

Last Commit: 2 years ago

Responsiveness: Inactive

Star History

1 star in the last 30 days