This repository provides practical examples for the book "Natural Language Processing with TensorFlow 2 and Machine Learning," covering topics from logistic regression to BERT and GPT-3. It's designed for learners and practitioners of NLP using TensorFlow.
How It Works
The project offers hands-on code implementations for various NLP tasks, including text classification, similarity, chatbot development, and fine-tuning advanced models like BERT and GPT-3. It leverages TensorFlow 2 for building and training these models, providing a structured learning path aligned with the accompanying book.
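To make the starting point of that learning path concrete, here is a minimal, dependency-free sketch of logistic regression for text classification. It is not code from the repository, which builds its models with TensorFlow 2. Every name here (tokenize, build_vocab, vectorize, train, predict_proba) and the four toy sentences are made up for illustration.

```python
import math

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()

def build_vocab(texts):
    """Map each distinct token to a column index, in first-seen order."""
    vocab = {}
    for text in texts:
        for token in tokenize(text):
            vocab.setdefault(token, len(vocab))
    return vocab

def vectorize(text, vocab):
    """Bag-of-words count vector; tokens outside the vocabulary are ignored."""
    vec = [0.0] * len(vocab)
    for token in tokenize(text):
        if token in vocab:
            vec[vocab[token]] += 1.0
    return vec

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(features, labels, epochs=200, lr=0.5):
    """Fit weights and bias with batch gradient descent on the log loss."""
    n, d = len(features), len(features[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, labels):
            # Prediction error for one example: p(y=1 | x) - y
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict_proba(text, vocab, weights, bias):
    """Probability that the text belongs to the positive class."""
    x = vectorize(text, vocab)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

# Toy, made-up training data (1 = positive, 0 = negative).
TEXTS = ["great fun movie", "loved this great film",
         "boring bad movie", "bad awful film"]
LABELS = [1, 1, 0, 0]

vocab = build_vocab(TEXTS)
weights, bias = train([vectorize(t, vocab) for t in TEXTS], LABELS)
```

Calling predict_proba("great film", vocab, weights, bias) should give a probability above 0.5, and "bad movie" one below 0.5. A TensorFlow 2 version would typically express the same model as a single dense layer with a sigmoid activation.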
Quick Start & Requirements
- Installation: Docker is recommended for environment setup.
- Build image:
bash build_jupyter_for_cpu.sh
or bash build_jupyter_for_gpu.sh
- Run Jupyter (served on port 8889):
bash exec_jupyter_for_cpu.sh
or bash exec_jupyter_for_gpu.sh
- Prerequisites: Docker (version 19.03+ recommended), Python 3.6 (if not using Docker).
- Resources: The GPT-2 checkpoint is available via direct download:
wget https://github.com/NLP-kr/tensorflow-ml-nlp-tf2/releases/download/v1.0/gpt_ckpt.zip -O gpt_ckpt.zip
- Documentation: Colab notebooks for chapters 7 and 8 are hosted in a separate repository.
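The Docker steps above could be wrapped in a small helper like the sketch below. It assumes that `docker --version` prints text of the form "Docker version 19.03.12, build ...", and that the build and run scripts sit in the current directory. The function names (parse_docker_version, docker_is_recent_enough, script_names, run_quick_start, download_checkpoint) are hypothetical and not part of the repository. It avoids Python 3.7+ features so that it also runs on Python 3.6.

```python
import re
import shutil
import subprocess
import urllib.request

MIN_DOCKER = (19, 3)  # prerequisites recommend Docker 19.03+
CKPT_URL = ("https://github.com/NLP-kr/tensorflow-ml-nlp-tf2/"
            "releases/download/v1.0/gpt_ckpt.zip")

def parse_docker_version(output):
    """Extract (major, minor) from text like 'Docker version 19.03.12, build ...'.

    The output format is an assumption; returns None if no version is found.
    """
    match = re.search(r"(\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

def docker_is_recent_enough(output, minimum=MIN_DOCKER):
    """True if the parsed version is at least the recommended minimum."""
    version = parse_docker_version(output)
    return version is not None and version >= minimum

def script_names(device):
    """Return the (build, run) script names for 'cpu' or 'gpu'."""
    if device not in ("cpu", "gpu"):
        raise ValueError("device must be 'cpu' or 'gpu'")
    return ("build_jupyter_for_%s.sh" % device,
            "exec_jupyter_for_%s.sh" % device)

def run_quick_start(device="cpu"):
    """Check Docker, then build the image and start Jupyter (port 8889)."""
    if shutil.which("docker") is None:
        raise RuntimeError("docker was not found on PATH")
    result = subprocess.run(["docker", "--version"],
                            stdout=subprocess.PIPE, universal_newlines=True)
    if not docker_is_recent_enough(result.stdout):
        print("Warning: Docker 19.03+ is recommended")
    for script in script_names(device):
        # check=True stops at the first script that exits with an error
        subprocess.run(["bash", script], check=True)

def download_checkpoint(url=CKPT_URL, dest="gpt_ckpt.zip"):
    """Fetch the GPT-2 checkpoint archive, mirroring the wget command."""
    urllib.request.urlretrieve(url, dest)
    return dest
```

Nothing runs on import. Calling run_quick_start("gpu") from the repository root would build the GPU image and launch Jupyter, and download_checkpoint() would save gpt_ckpt.zip to the working directory.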
Highlighted Details
- Covers a broad spectrum of NLP techniques, from foundational models to state-of-the-art architectures.
- Includes practical examples for building chatbots and fine-tuning large language models.
- Provides Docker support for reproducible and consistent development environments.
- Offers guidance on setting up Python 3.6 environments via Anaconda.
Maintenance & Community
- Pull requests are welcome. Consult the Wiki first, then post issues and questions in the GitHub Issues section.
- Authors include ChangWookJun, Taekyoon, JungHyun Cho, and Ryan S. Shin.
Licensing & Compatibility
- The repository does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
- The GPT-2 checkpoint can no longer be downloaded from Dropbox; it must be fetched manually (see the wget command under Resources).
- The repository is tied to a specific book, and its standalone utility without the book's context may be limited.
- No explicit license is provided, which may impact commercial adoption.