ChatGenTitle is a fine-tuned LLaMA model that generates academic paper titles from abstracts. It is aimed at researchers and students who want a faster alternative to writing titles by hand.
How It Works
The project fine-tunes LLaMA models on a large dataset of arXiv paper titles and abstracts. It uses LoRA (Low-Rank Adaptation), which freezes the base model weights and trains only small low-rank adapter matrices. This cuts the compute and training time needed compared with full model fine-tuning, so high-quality title generation is achievable with fewer resources.
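To make the savings concrete, here is a minimal sketch of the parameter arithmetic behind LoRA. For a weight matrix of shape d_out x d_in, full fine-tuning trains every entry, while LoRA trains two factors of shapes d_out x r and r x d_in. The 4096 x 4096 layer size and rank 8 below are illustrative assumptions, not values taken from the project.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a LoRA adapter on the same matrix.

    The update is B @ A, where B is (d_out x rank) and A is (rank x d_in).
    """
    return rank * (d_out + d_in)


if __name__ == "__main__":
    # Assumed sizes for illustration only.
    d_out, d_in, rank = 4096, 4096, 8

    full = full_params(d_out, d_in)
    lora = lora_params(d_out, d_in, rank)

    print(f"full fine-tuning: {full:,} parameters")   # 16,777,216
    print(f"LoRA (r={rank}): {lora:,} parameters")     # 65,536
    print(f"fraction trained: {lora / full:.2%}")      # 0.39%
```

Under these assumptions the adapter trains well under 1% of the parameters of the layer it modifies, which is where the reduction in memory and training time comes from.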
Quick Start & Requirements
- Installation: The LoRA weights are published on HuggingFace and only work on top of a base LLaMA model. The README gives no step-by-step installation instructions; the usual approach is to load the LoRA weights onto a compatible LLaMA checkpoint.
- Prerequisites: Requires a LLaMA base model (e.g., LLaMA-7B, LLaMA-13B). Fine-tuning was performed on A100 GPUs.
- Online Demo: A free online demo is available.
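Since the README does not spell out the loading steps, the following is only a sketch of one common way to apply LoRA weights to a base checkpoint, using the `transformers` and `peft` libraries (with `accelerate` installed for `device_map="auto"`). The prompt template, the instruction wording, and the paths passed to `load_model` are assumptions for illustration; the project's actual prompt format and repository names may differ.

```python
# Assumption: an Alpaca-style instruction template. The project's real
# prompt format is not given in the README and may differ.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{abstract}\n\n"
    "### Response:\n"
)

# Hypothetical instruction text, chosen for this sketch.
DEFAULT_INSTRUCTION = "Generate a title for the following paper abstract."


def build_prompt(abstract: str, instruction: str = DEFAULT_INSTRUCTION) -> str:
    """Fill the (assumed) template with an abstract."""
    return PROMPT_TEMPLATE.format(instruction=instruction, abstract=abstract.strip())


def load_model(base_model_path: str, lora_weights_path: str):
    """Load a base LLaMA checkpoint and attach LoRA weights to it.

    Both paths are supplied by the caller: a local directory or a
    HuggingFace repository ID that you have access to.
    """
    # Imported lazily so build_prompt works without the heavy dependencies.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_path)
    base = AutoModelForCausalLM.from_pretrained(
        base_model_path, torch_dtype=torch.float16, device_map="auto"
    )
    model = PeftModel.from_pretrained(base, lora_weights_path)
    model.eval()
    return tokenizer, model


def generate_title(tokenizer, model, abstract: str, max_new_tokens: int = 32) -> str:
    """Generate a title and return only the newly generated text."""
    import torch

    inputs = tokenizer(build_prompt(abstract), return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the model's continuation is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

A typical call would be `tokenizer, model = load_model(base_path, lora_path)` followed by `generate_title(tokenizer, model, abstract_text)`, where `base_path` and `lora_path` point to a LLaMA checkpoint and one of the project's LoRA releases.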
Highlighted Details
- Fine-tuned on a dataset derived from Cornell-University/arxiv, containing millions of arXiv paper abstracts and metadata.
- Offers multiple LoRA model versions (e.g., LLaMa-Lora-7B-3, LLaMa-Lora-7B-cs-6-new) trained on different subsets of arXiv data.
- Provides comparisons of title generation quality against ChatGPT (GPT-3.5) and GPT-4.
- Includes a daily updated feed of LLM-related papers from arXiv.
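The README does not describe how the per-subset training data was built. As a rough sketch, assuming records shaped like the public arXiv metadata snapshot (one JSON object per line with `title`, `abstract`, and space-separated `categories` fields), abstract-to-title pairs for a category-specific subset could be assembled as follows. The function names, the instruction string, and the output keys are hypothetical.

```python
import json
from typing import Iterable, Iterator

# Hypothetical instruction text for this sketch.
INSTRUCTION = "Generate a title for the following paper abstract."


def normalize(text: str) -> str:
    """Collapse newlines and repeated spaces, which are common in arXiv metadata."""
    return " ".join(text.split())


def make_pairs(lines: Iterable[str], category_prefix: str = "") -> Iterator[dict]:
    """Yield instruction-tuning examples from JSON-lines metadata.

    If category_prefix is set (for example "cs."), keep only papers that
    list at least one category starting with that prefix.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        categories = record.get("categories", "").split()
        if category_prefix and not any(c.startswith(category_prefix) for c in categories):
            continue
        yield {
            "instruction": INSTRUCTION,
            "input": normalize(record["abstract"]),
            "output": normalize(record["title"]),
        }


# Two made-up records standing in for lines of the metadata file.
SAMPLE_LINES = [
    json.dumps({
        "title": "A Toy  Title\n  About Language Models",
        "abstract": "  We describe a toy\nlanguage model. ",
        "categories": "cs.CL cs.LG",
    }),
    json.dumps({
        "title": "A Toy Title About Galaxies",
        "abstract": "We describe a toy galaxy.",
        "categories": "astro-ph.GA",
    }),
]

if __name__ == "__main__":
    for pair in make_pairs(SAMPLE_LINES, category_prefix="cs."):
        print(pair["output"])  # A Toy Title About Language Models
```

Filtering by category prefix is one plausible way to obtain a computer-science-only subset; the project may have used different criteria.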
Maintenance & Community
- Models are open-sourced on HuggingFace.
- The project aims to continuously crawl and process arXiv papers for research support.
- Users are encouraged to ask questions and open PRs.
Licensing & Compatibility
- Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).
- Strictly for research use only. Commercial use and use in actual paper writing are prohibited. The LLaMA base models also have a non-commercial use restriction.
Limitations & Caveats
The project states explicitly that the models are for research purposes only and must not be used to write actual papers. The underlying LLaMA models are also restricted to non-commercial use. Together, these restrictions rule out commercial deployment and use on papers intended for publication.