LLM for astronomy research
StarWhisper is a series of large language models (LLMs) tailored for astronomy, with model sizes from 7B to 72B parameters and language, time-series, and multimodal capabilities. Developed with support from the National Astronomical Observatories and Zhijiang Laboratory, it is intended as an AI tool for astronomical data processing, particularly for projects like the SITIAN survey, integrating astronomical knowledge and exploring multimodal solutions to specific challenges.
How It Works
The project takes a data flywheel approach: training methods are refined on cleaned and corrected scientific and popular-science data to improve the models' astrophysics, coding, and agent capabilities. It has released technical reports on three specialized models: StarWhisper Pulsar, which uses multimodal LLMs for pulsar identification and is reported as state of the art; StarWhisper LC, which classifies light curves via transfer learning and LLMs; and StarWhisper Telescope, which uses LLM agents for telescope control workflows and has been applied to the SITIAN project.
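As a rough illustration of what one round of a data flywheel might look like, the sketch below cleans a batch of candidate question-answer pairs, passes them through a reviewer that corrects or rejects them, and merges the accepted ones into the training set. This is not StarWhisper's actual pipeline: `Sample`, `clean`, `correct`, `flywheel_round`, and `toy_reviewer` are hypothetical names, and the cleaning and review rules are placeholder assumptions.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    question: str
    answer: str
    verified: bool = False


def clean(samples):
    """Drop samples with an empty field and exact (case-insensitive) duplicates."""
    seen = set()
    kept = []
    for s in samples:
        q, a = s.question.strip(), s.answer.strip()
        key = (q.lower(), a.lower())
        if q and a and key not in seen:
            seen.add(key)
            kept.append(s)
    return kept


def correct(samples, reviewer):
    """Keep samples the reviewer accepts; the reviewer returns a fixed answer or None."""
    accepted = []
    for s in samples:
        fixed = reviewer(s)
        if fixed is not None:
            accepted.append(Sample(s.question, fixed, verified=True))
    return accepted


def flywheel_round(train_set, candidates, reviewer):
    """One round: clean the candidates, correct them, and grow the training set."""
    accepted = correct(clean(candidates), reviewer)
    return clean(train_set + accepted)


def toy_reviewer(sample):
    # Hypothetical rule: reject answers too short to be useful.
    answer = sample.answer.strip()
    return answer if len(answer) >= 10 else None


train = [Sample("What is a pulsar?", "A rotating, magnetized neutron star.", True)]
candidates = [
    Sample("What is a light curve?", "Brightness of an object measured over time."),
    Sample("What is a light curve?", "Brightness of an object measured over time."),
    Sample("What is redshift?", "z"),
    Sample("", "An answer with no question."),
]
train = flywheel_round(train, candidates, toy_reviewer)
print(len(train))
```

In a real setting the reviewer would be a human annotator or a stronger model, and each round would end with a fine-tuning run on the enlarged training set; both are omitted here.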
Quick Start & Requirements
LLM_Data directory.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The project is actively under development, with a to-do list including further fine-tuning for scientific data, reinforcement learning from human feedback, and the development of an astronomical knowledge graph to mitigate hallucinations. Open-sourcing of multimodal fine-tuning weights is pending.