ChatGenTitle by WangRongsheng

Paper title generator fine-tuned on LLaMA using arXiv data

created 2 years ago · 842 stars · Top 43.2% on sourcepulse

Project Summary

ChatGenTitle is a fine-tuned LLaMA model designed to generate academic paper titles based on abstracts. It targets researchers and students seeking to improve their paper titling process, offering a faster and more intelligent alternative to manual title creation.

How It Works

The project fine-tunes LLaMA models using a large dataset of arXiv paper titles and abstracts. It leverages LoRA (Low-Rank Adaptation) for efficient fine-tuning, significantly reducing computational requirements and training time compared to full model fine-tuning. This approach allows for high-quality title generation with fewer resources.
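The README does not publish the training script, but the approach described above can be sketched with the widely used `transformers` + `peft` stack. The prompt wording and LoRA hyperparameters below are illustrative Alpaca-LoRA-style choices, not the project's actual settings:

```python
def build_training_example(title: str, abstract: str) -> dict:
    """Pair an arXiv abstract (input) with its title (target) in an
    instruction-style format; the exact prompt wording is an assumption."""
    prompt = (
        "Below is a paper abstract. Write a concise academic title for it.\n\n"
        f"### Abstract:\n{abstract}\n\n### Title:\n"
    )
    return {"prompt": prompt, "completion": title}


def make_lora_model(base_model_name: str):
    """Wrap a LLaMA checkpoint with trainable LoRA adapters via `peft`.
    Heavy imports are kept inside the function so the helper above runs
    without GPU libraries installed."""
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    base = AutoModelForCausalLM.from_pretrained(base_model_name)
    config = LoraConfig(
        r=8,                                  # low-rank adapter dimension
        lora_alpha=16,
        target_modules=["q_proj", "v_proj"],  # attention projections, as in Alpaca-LoRA
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )
    return get_peft_model(base, config)       # only the LoRA weights stay trainable
```

Because only the low-rank adapter matrices receive gradients, the trainable parameter count drops by several orders of magnitude relative to full fine-tuning, which is what makes single-node A100 training feasible.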

Quick Start & Requirements

  • Installation: LoRA model weights are available on HuggingFace and require a base LLaMA model for use. Specific installation instructions are not detailed in the README, but typically involve loading the LoRA weights onto a compatible LLaMA checkpoint.
  • Prerequisites: Requires a LLaMA base model (e.g., LLaMA-7B, LLaMA-13B). Fine-tuning was performed on A100 GPUs.
  • Online Demo: An online, free-to-use version is available.
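Since the README gives no exact loading steps, inference presumably follows the standard `peft` adapter pattern: load the base LLaMA checkpoint, then apply the LoRA weights on top. The model paths and the prompt format below are placeholders, not verified identifiers from the project:

```python
def extract_title(decoded: str, marker: str = "### Title:") -> str:
    """Keep only the first line the model produced after the title marker."""
    return decoded.split(marker)[-1].strip().splitlines()[0]


def generate_title(abstract: str,
                   base_model: str = "path/to/llama-7b-hf",       # placeholder path
                   adapter: str = "path/to/chatgentitle-lora"):   # placeholder path
    """Apply the released LoRA weights to a LLaMA checkpoint and sample a title.
    Heavy imports stay inside the function so the helper above runs standalone."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_model)
    base = AutoModelForCausalLM.from_pretrained(base_model)
    model = PeftModel.from_pretrained(base, adapter)  # attach LoRA without merging

    prompt = ("Below is a paper abstract. Write a concise academic title for it.\n\n"
              f"### Abstract:\n{abstract}\n\n### Title:\n")
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=32)
    return extract_title(tokenizer.decode(output[0], skip_special_tokens=True))
```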

Highlighted Details

  • Fine-tuned on a dataset derived from Cornell-University/arxiv, containing millions of arXiv paper abstracts and metadata.
  • Offers multiple LoRA model versions (e.g., LLaMa-Lora-7B-3, LLaMa-Lora-7B-cs-6-new) trained on different subsets of arXiv data.
  • Provides comparisons of title generation quality against ChatGPT (GPT-3.5) and GPT-4.
  • Includes a daily-updated feed of LLM-related papers from arXiv.
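The Cornell-University/arxiv metadata snapshot ships as JSON lines with `title`, `abstract`, and space-separated `categories` fields (assuming that snapshot's published schema). A sketch of extracting the title/abstract training pairs, with a category filter mirroring the cs-only model variants:

```python
import json


def iter_title_abstract_pairs(jsonl_lines, category_prefix=None):
    """Yield (title, abstract) pairs from arXiv metadata records.
    Field names follow the Kaggle/Cornell-University arXiv snapshot schema;
    `category_prefix` (e.g. "cs") keeps only records with a matching category."""
    for line in jsonl_lines:
        record = json.loads(line)
        if category_prefix is not None:
            categories = record.get("categories", "").split()
            if not any(c.startswith(category_prefix) for c in categories):
                continue
        title = " ".join(record["title"].split())        # collapse internal line breaks
        abstract = " ".join(record["abstract"].split())
        yield title, abstract
```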

Maintenance & Community

  • Models are open-sourced on HuggingFace.
  • The project aims to continuously crawl and process arXiv papers for research support.
  • Users are encouraged to ask questions and open PRs.

Licensing & Compatibility

  • Licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
  • Strictly for research use only. Commercial use and use in actual paper writing are prohibited. The LLaMA base models also have a non-commercial use restriction.

Limitations & Caveats

The project explicitly states that the models are for research purposes only and cannot be used for actual paper writing. The underlying LLaMA models are also restricted to non-commercial use, limiting the practical application of ChatGenTitle for publication.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 6 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Jeff Hammerbacher (Cofounder of Cloudera), and 3 more:

LLaMA-Adapter by OpenGVLab
  • Efficient fine-tuning for instruction-following LLaMA models
  • 6k stars · created 2 years ago · updated 1 year ago

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), John Yang (Author of SWE-bench, SWE-agent), and 13 more:

stanford_alpaca by tatsu-lab
  • Instruction-following LLaMA model training and data generation
  • 30k stars · created 2 years ago · updated 1 year ago