ChatGenTitle by WangRongsheng

Paper title generator fine-tuned on LLaMA using arXiv data

created 2 years ago

842 stars

Top 43.2% on sourcepulse

Project Summary

ChatGenTitle is a LLaMA model fine-tuned to generate academic paper titles from abstracts. It targets researchers and students who want a faster alternative to drafting titles by hand.

How It Works

The project fine-tunes LLaMA models on a large dataset of arXiv paper titles and abstracts. It uses LoRA (Low-Rank Adaptation), which trains small low-rank update matrices while the base weights stay frozen, so fine-tuning needs far less compute and training time than updating the full model. This makes high-quality title generation achievable with fewer resources.
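
To make the resource saving concrete, the sketch below counts trainable parameters for a single weight matrix under full fine-tuning and under LoRA. It assumes a 4096 x 4096 projection (the hidden size of LLaMA-7B) and a LoRA rank of 8. The rank and the helper names are illustrative assumptions, not values taken from the project.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in matrix W is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters when W is frozen and only the low-rank factors
    B (d_out x rank) and A (rank x d_in) of the update B @ A are trained."""
    return rank * (d_out + d_in)


# Assumed shapes: one 4096 x 4096 attention projection, LoRA rank 8.
D_OUT, D_IN, RANK = 4096, 4096, 8

full = full_finetune_params(D_OUT, D_IN)
lora = lora_params(D_OUT, D_IN, RANK)

print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (rank {RANK}): {lora:,} trainable parameters")
print(f"LoRA trains {100 * lora / full:.2f}% of the matrix")
```

Under these assumptions LoRA trains well under 1% of the parameters in each adapted matrix, which is where the lower memory and compute cost comes from.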

Quick Start & Requirements

Installation: LoRA model weights are available on HuggingFace and require a base LLaMA model. The README does not give specific installation instructions; typical usage involves loading the LoRA weights onto a compatible LLaMA checkpoint.
Prerequisites: Requires a LLaMA base model (e.g., LLaMA-7B, LLaMA-13B). Fine-tuning was performed on A100 GPUs.
Online Demo: A free online demo is available.
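
Since the README gives no installation steps, the following is only a rough sketch of how a LoRA adapter is commonly attached to a base checkpoint with the `transformers` and `peft` libraries. The Alpaca-style prompt template, the instruction wording, the function names, and both path arguments are assumptions made for illustration; the project may use a different prompt and loading procedure.

```python
def build_prompt(abstract: str) -> str:
    """Build an Alpaca-style prompt (assumed format, not confirmed by the project)."""
    return (
        "Below is an instruction that describes a task, paired with an input "
        "that provides further context. Write a response that appropriately "
        "completes the request.\n\n"
        "### Instruction:\nGenerate a title for the following paper abstract.\n\n"
        f"### Input:\n{abstract.strip()}\n\n"
        "### Response:\n"
    )


def generate_title(abstract: str, base_model_path: str, lora_path: str) -> str:
    """Load a base LLaMA checkpoint, attach LoRA weights, and generate a title.

    base_model_path and lora_path are placeholders supplied by the caller.
    """
    # Imported lazily so build_prompt works without these packages installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_model_path)
    base = AutoModelForCausalLM.from_pretrained(base_model_path)
    model = PeftModel.from_pretrained(base, lora_path)  # attach the LoRA adapter
    model.eval()

    inputs = tokenizer(build_prompt(abstract), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=32)
    # Keep only the tokens generated after the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

A call would look like `generate_title(abstract, "path/to/llama-7b", "path/to/lora-weights")`, where both paths are hypothetical and must point to a LLaMA checkpoint and one of the published LoRA adapters that you have obtained yourself.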

Highlighted Details

Fine-tuned on a dataset derived from Cornell-University/arxiv, containing millions of arXiv paper abstracts and metadata.
Offers multiple LoRA model versions (e.g., LLaMa-Lora-7B-3, LLaMa-Lora-7B-cs-6-new) trained on different subsets of arXiv data.
Provides comparisons of title generation quality against ChatGPT (GPT-3.5) and GPT-4.
Includes a daily updated feed of LLM-related papers from arXiv.
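
As an illustration of how subset-specific training data could be derived from the arXiv metadata, the sketch below converts metadata records (one JSON object per line, with `title`, `abstract`, and a space-separated `categories` field) into instruction-tuning pairs and keeps only one category family. The instruction wording, the `cs.` filter, the function names, and the two sample records are made up for this example; the project's actual preprocessing is not documented here.

```python
import json
from typing import Iterable, Iterator

INSTRUCTION = "Generate a title for the following paper abstract."  # assumed wording


def normalize(text: str) -> str:
    """Collapse the hard line breaks and repeated spaces found in arXiv metadata."""
    return " ".join(text.split())


def to_training_pairs(
    lines: Iterable[str], category_prefix: str = "cs."
) -> Iterator[dict]:
    """Yield instruction/input/output records for papers in the chosen categories."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        categories = record.get("categories", "").split()
        if not any(c.startswith(category_prefix) for c in categories):
            continue  # paper is outside the requested subset
        yield {
            "instruction": INSTRUCTION,
            "input": normalize(record["abstract"]),
            "output": normalize(record["title"]),
        }


# Two made-up records standing in for lines of the metadata file.
sample = [
    json.dumps({"id": "0000.00001", "title": "A Toy\n  Title",
                "abstract": "  We study\n toys. ", "categories": "cs.CL cs.LG"}),
    json.dumps({"id": "0000.00002", "title": "Other",
                "abstract": "Physics.", "categories": "hep-th"}),
]
pairs = list(to_training_pairs(sample))
print(pairs)
```

Changing `category_prefix` is one simple way to produce the kind of per-subset training sets that the different model versions suggest.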

Maintenance & Community

Models are open-sourced on HuggingFace.
The project aims to continuously crawl and process arXiv papers for research support.
Users are encouraged to ask questions and open PRs.

Licensing & Compatibility

Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).
For research use only. Commercial use and use in actual paper writing are prohibited. The LLaMA base models also carry a non-commercial use restriction.

Limitations & Caveats

The project states explicitly that the models are for research purposes only and must not be used to write actual papers. The underlying LLaMA models are likewise restricted to non-commercial use, which limits ChatGenTitle's practical use for publication.
