Aerospace LLM for knowledge dissemination and research
"FengZheng" is a large language model specifically trained for the aerospace domain, targeting young enthusiasts and professionals. It aims to enhance the dissemination and understanding of aerospace knowledge through AI, offering improved factual accuracy and question-answering capabilities compared to similarly sized open-source models.
How It Works
FengZheng employs a two-stage training process: knowledge injection and format alignment. Knowledge injection uses curated aerospace documents from the web, books, and academic papers, filtered by vocabulary and deduplicated. Format alignment then fine-tunes the model on instruction-following and conversational data, including data generated by closed-source models and specific instructions for retrieval-augmented generation (RAG). A novel "knowledge injection supervised fine-tuning" strategy is introduced, involving self-supervised data augmentation and curriculum learning based on model perplexity on domain documents to improve efficiency and mitigate the "alignment tax."
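The perplexity-based curriculum described above can be sketched as follows. This is a minimal illustration, assuming documents are ranked easy-to-hard by the model's per-token loss and split into training stages; the function names and staging scheme are assumptions for illustration, not the project's actual implementation:

```python
import math

def perplexity(nll_per_token):
    """Perplexity from the mean negative log-likelihood per token."""
    return math.exp(nll_per_token)

def build_curriculum(docs, nlls, num_stages=3):
    """Order documents easy-to-hard by model loss and split into stages.

    docs: list of domain documents.
    nlls: per-document mean negative log-likelihood under the current model
          (lower NLL = lower perplexity = "easier" for the model).
    Returns a list of stages, each a list of documents, trained in order.
    """
    # Sort documents by the model's loss on them (ascending = easiest first).
    ranked = [d for _, d in sorted(zip(nlls, docs), key=lambda p: p[0])]
    stage_size = math.ceil(len(ranked) / num_stages)
    return [ranked[i:i + stage_size] for i in range(0, len(ranked), stage_size)]
```

Training then proceeds stage by stage, so the model sees documents it already models well before the hardest domain material, which is one common way to reduce the alignment tax of aggressive fine-tuning.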
Quick Start & Requirements
conda create -n fz_bench python==3.10
conda activate fz_bench
pip install -r requirements.txt
The scripts eval_single_point.py and eval_factual_long.py are provided for model evaluation.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The current model primarily focuses on Chinese language and basic aerospace knowledge. Multilingual capabilities and deeper integration into aerospace scientific research are planned for future versions. The project disclaims legal responsibility for model output accuracy due to factors like computation, randomness, and quantization.