this-word-does-not-exist  by turtlesoupy

GPT-2 fine-tune for generating nonexistent words, definitions, and examples

created 5 years ago
1,021 stars

Top 37.3% on sourcepulse

Project Summary

This project fine-tunes a GPT-2 variant to generate made-up words, together with definitions and example sentences for them. It is aimed at researchers and developers interested in creative language generation and in what fine-tuned language models can do. The main payoff is a lexicon of invented words with plausible meanings.

How It Works

The project fine-tunes a GPT-2 architecture on a custom dataset of words and definitions. A "forward model" generates a definition for a given word, and an "inverse model" generates a word from a given definition. Together, the two models can invent a word from scratch, define a word the user makes up, or coin a word for a definition the user supplies.
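
The difference between the two directions can be pictured as a difference in how each training example is laid out as text. The following is a minimal sketch of that idea only: the `SEP` marker and the helper functions are hypothetical, and the special tokens and field layout the project actually uses are not given in this summary.

```python
# Hypothetical separator; the project's real special tokens are not documented here.
SEP = " <sep> "


def forward_example(word, definition, example):
    """Lay out one entry for a forward (word -> definition) model."""
    return SEP.join([word, definition, example])


def inverse_example(word, definition):
    """Lay out one entry for an inverse (definition -> word) model."""
    return SEP.join([definition, word])


def forward_prompt(word):
    """Prefix given at generation time; the model would continue with a definition."""
    return word + SEP


# A single illustrative dictionary entry (invented for this sketch).
entry = {
    "word": "glooberyblipboop",
    "definition": "a word that does not exist",
    "example": "Nobody could spell glooberyblipboop.",
}

forward_text = forward_example(**entry)
inverse_text = inverse_example(entry["word"], entry["definition"])

print(forward_text)
print(inverse_text)
print(repr(forward_prompt("glooberyblipboop")))
```

Under this assumed layout, a left-to-right language model trained on the first form learns to continue a word with its definition, while one trained on the second form learns to continue a definition with a word.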

Quick Start & Requirements

  • Install: Clone the repository and install dependencies from cpu_deploy_environment.yml.
  • Pre-trained Models: Download blacklist.pickle.gz, forward-dictionary-model-v1.tar.gz, and inverse-dictionary-model-v1.tar.gz.
  • Usage:
    from title_maker_pro.word_generator import WordGenerator

    # Load both models and the blacklist on CPU, without quantization
    word_generator = WordGenerator(
      device="cpu",
      forward_model_path="path/to/forward-dictionary-model-v1.tar.gz",
      inverse_model_path="path/to/inverse-dictionary-model-v1.tar.gz",
      blacklist_path="path/to/blacklist.pickle.gz",
      quantize=False,
    )

    # A new word from scratch
    print(word_generator.generate_word())

    # A definition for a word you make up
    print(word_generator.generate_definition("glooberyblipboop"))

    # A new word made up from a definition
    print(word_generator.generate_word_from_definition("a word that does not exist"))
    
  • Demo: https://www.thisworddoesnotexist.com
  • Twitter Bot: https://twitter.com/robo_define
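
The blacklist.pickle.gz file is presumably used to reject generated words that already exist. The sketch below shows one way such a filter could work with a gzip-compressed pickle. It assumes the file holds a plain iterable of strings, which this summary does not confirm; `load_word_set`, `is_novel`, and the demo file are hypothetical stand-ins, not the project's API.

```python
import gzip
import os
import pickle
import tempfile


def load_word_set(path):
    """Read a gzip-compressed pickle and return its words as a lowercase set.

    Assumes the pickle contains an iterable of strings (an assumption for this sketch).
    """
    with gzip.open(path, "rb") as f:
        words = pickle.load(f)
    return {w.lower() for w in words}


def is_novel(candidate, known_words):
    """True if the candidate, ignoring case and outer whitespace, is not a known word."""
    return candidate.strip().lower() not in known_words


# Build a tiny stand-in file so the sketch runs without the real download.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo_blacklist.pickle.gz")
with gzip.open(demo_path, "wb") as f:
    pickle.dump(["apple", "turtle", "soup"], f)

known = load_word_set(demo_path)
candidates = ["Turtle", "glooberyblipboop", "soup"]
novel = [c for c in candidates if is_novel(c, known)]
print(novel)  # ['glooberyblipboop']
```

Unpickling can execute arbitrary code, so pickle files should only be loaded from sources you trust.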

Highlighted Details

  • Generates words, definitions, and example sentences.
  • Includes a forward model (word -> definition) and inverse model (definition -> word).
  • Provides scripts for extracting definitions from Apple dictionaries and Urban Dictionary.
  • Offers a website development setup with aiohttp-devtools.

Licensing & Compatibility

  • The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is undetermined.

Limitations & Caveats

The README describes the model only as a "variant of GPT-2" and does not specify the exact GPT-2 version or the fine-tuning details, so it likely does not reflect more recent GPT architectures. The lack of a clear license is a significant caveat for adoption.

Health Check

  • Last commit: 3 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 4 stars in the last 90 days
