FengZhengLLM  by SCIR-TG

Aerospace LLM for knowledge dissemination and research

created 8 months ago
516 stars

Top 61.6% on sourcepulse

Project Summary

"FengZheng" is a large language model specifically trained for the aerospace domain, targeting young enthusiasts and professionals. It aims to enhance the dissemination and understanding of aerospace knowledge through AI, offering improved factual accuracy and question-answering capabilities compared to similarly sized open-source models.

How It Works

FengZheng employs a two-stage training process: knowledge injection and format alignment. Knowledge injection uses curated aerospace documents from the web, books, and academic papers, filtered by vocabulary and deduplicated. Format alignment then fine-tunes the model on instruction-following and conversational data, including data generated by closed-source models and specific instructions for retrieval-augmented generation (RAG). A novel "knowledge injection supervised fine-tuning" strategy is introduced, involving self-supervised data augmentation and curriculum learning based on model perplexity on domain documents to improve efficiency and mitigate the "alignment tax."
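
The data-preparation and curriculum steps described above can be illustrated with a minimal sketch. It is not the project's actual pipeline: the term list `AEROSPACE_TERMS`, the `min_hits` threshold, the exact-hash deduplication, and the easy-first ordering are all assumptions made for illustration, and the English regex tokenization is a stand-in (the model itself focuses on Chinese text). Token log-probabilities are assumed to come from whatever model is being scored.

```python
import hashlib
import math
import re

# Hypothetical domain vocabulary; the real filter list is not given in the source.
AEROSPACE_TERMS = {"satellite", "orbit", "rocket", "launch", "spacecraft"}


def filter_and_dedup(docs, min_hits=2):
    """Keep documents containing at least `min_hits` distinct domain terms,
    dropping exact duplicates after case and whitespace normalization."""
    seen, kept = set(), []
    for text in docs:
        tokens = re.findall(r"[a-z]+", text.lower())
        if len(set(tokens) & AEROSPACE_TERMS) < min_hits:
            continue  # not enough domain vocabulary
        digest = hashlib.sha256(" ".join(tokens).encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # duplicate of a document already kept
        seen.add(digest)
        kept.append(text)
    return kept


def perplexity(token_logprobs):
    """Perplexity is exp of the negative mean token log-probability."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def curriculum_order(doc_logprobs, easy_first=True):
    """Return document ids sorted by perplexity.

    Assumption: low perplexity means 'easy'. The source does not state
    which direction the curriculum runs, so it is left as a parameter.
    """
    scored = {doc_id: perplexity(lp) for doc_id, lp in doc_logprobs.items()}
    return sorted(scored, key=scored.get, reverse=not easy_first)
```

In practice a near-duplicate method (for example MinHash) would likely replace the exact hash, and the log-probabilities would be computed by the base model over each domain document.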

Quick Start & Requirements

  • Install: Set up a Conda environment (conda create -n fz_bench python==3.10, conda activate fz_bench, pip install -r requirements.txt).
  • Prerequisites: Python 3.10.
  • Evaluation: Scripts eval_single_point.py and eval_factual_long.py are provided for model evaluation.
  • Demo: An online demo is available; access requires the invitation code scir.

Highlighted Details

  • Outperforms similarly sized open-source models in aerospace knowledge coverage and question-answering benchmarks.
  • Integrated with "Satellite Encyclopedia" for public use, serving over 7,000 Q&A sessions.
  • Features a RAG module for enhanced accuracy, including query enhancement, expansion, and decomposition.
  • Developed by Harbin Institute of Technology (HIT-SCIR-TG).
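
The query expansion and decomposition steps mentioned for the RAG module can be pictured with a toy sketch. Nothing here reflects the project's real implementation: the `SYNONYMS` table, the split-on-"and" decomposition rule, and the word-overlap retriever are hypothetical stand-ins chosen so that the example runs without external dependencies.

```python
import re

# Hypothetical synonym table used for query expansion.
SYNONYMS = {"rocket": ["launch vehicle"], "satellite": ["spacecraft"]}


def decompose(query):
    """Split a compound question into sub-questions on ' and ' or ';'."""
    parts = re.split(r"\s+and\s+|;", query.rstrip("?"))
    return [p.strip() for p in parts if p.strip()]


def expand(query):
    """Return the query plus variants with known terms swapped for synonyms."""
    variants = [query]
    for term, alternatives in SYNONYMS.items():
        if term in query.lower():
            variants += [
                re.sub(term, alt, query, flags=re.IGNORECASE)
                for alt in alternatives
            ]
    return variants


def retrieve(query, corpus, k=1):
    """Rank passages by word overlap with the query.

    This is a placeholder for a real retriever (BM25, dense embeddings, ...).
    """
    q = set(re.findall(r"\w+", query.lower()))
    ranked = sorted(
        corpus,
        key=lambda p: len(q & set(re.findall(r"\w+", p.lower()))),
        reverse=True,
    )
    return ranked[:k]


def rag_context(query, corpus, k=1):
    """Decompose, expand, retrieve, and merge passages without duplicates."""
    context = []
    for sub_question in decompose(query):
        for variant in expand(sub_question):
            for passage in retrieve(variant, corpus, k):
                if passage not in context:
                    context.append(passage)
    return context
```

The merged passages would then be placed in the prompt ahead of the user's question; a production system would more plausibly use the language model itself to rewrite and decompose queries.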

Maintenance & Community

  • Project primarily developed by students from HIT-SCIR-TG.
  • Collaboration with "Satellite Encyclopedia" community.
  • No explicit links to community channels (Discord/Slack) or roadmap provided in the README.

Licensing & Compatibility

  • License: Apache 2.0.
  • Restrictions: Resources are for academic research only; commercial use is strictly prohibited.

Limitations & Caveats

The current model primarily focuses on Chinese language and basic aerospace knowledge. Multilingual capabilities and deeper integration into aerospace scientific research are planned for future versions. The project disclaims legal responsibility for model output accuracy due to factors like computation, randomness, and quantization.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 2 stars in the last 90 days
