nlp-resume-parser by hxu296

GPT-3-based resume parser as a REST API

created 3 years ago

266 stars

Top 96.9% on sourcepulse

Project Summary

This project exposes a REST API that parses resume PDFs into structured JSON, using GPT-3 for natural language understanding. It targets developers and HR professionals who want to automate resume screening and data extraction, and aims to turn unstructured resume content into structured fields at a low per-resume cost.

How It Works

The system uses GPT-3's text-davinci-002 engine to interpret resume content extracted from PDFs. It first converts each PDF to text with pdftotext, then sends that text to the OpenAI API, which returns the content organized into predefined JSON fields. Because the model interprets free-form text rather than matching fixed templates, the approach is meant to cope with varied resume layouts and to extract fields such as job titles, education, and project details.
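The pipeline described above can be sketched roughly as follows. This is an illustrative outline, not the project's actual code: the field list, the prompt wording, and the function names are all hypothetical, and the model call is passed in as a plain callable (`complete`) so that any OpenAI client wrapper could be plugged in. Only the `pdftotext.PDF` usage reflects that package's documented interface.

```python
import json
from typing import Callable

# Hypothetical field list for illustration; the project's real schema may differ.
FIELDS = ["name", "email", "education", "job_experience", "projects"]


def pdf_to_text(path: str) -> str:
    """Extract plain text from a PDF using the pdftotext package."""
    import pdftotext  # imported lazily; requires the Poppler libraries

    with open(path, "rb") as f:
        pages = pdftotext.PDF(f)
        return "\n\n".join(pages)


def build_prompt(resume_text: str) -> str:
    """Build a prompt asking the model for a JSON object (assumed wording)."""
    field_list = ", ".join(FIELDS)
    return (
        "Extract the following fields from the resume below and answer "
        f"with a single JSON object whose keys are: {field_list}.\n\n"
        f"Resume:\n{resume_text}\n\nJSON:"
    )


def parse_resume(resume_text: str, complete: Callable[[str], str]) -> dict:
    """Send the prompt through `complete` and normalize the JSON reply."""
    raw = complete(build_prompt(resume_text))
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("model reply was not a JSON object")
    # Keep only the expected fields; fields the model omitted become None.
    return {field: data.get(field) for field in FIELDS}
```

In a real deployment, `complete` would wrap a request to the text-davinci-002 engine and return the generated text. Keeping it injectable also makes the parsing logic easy to exercise with a canned reply, without spending API credits.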

Quick Start & Requirements

Install Python 3 and pip3.
Install pdftotext dependencies.
Clone the repository and run ./build.sh.
Obtain an OpenAI API Key and set it in a .env file or as an environment variable.
Run ./run.sh to start the Flask server on localhost:5001.
macOS users require Xcode or GCC tools and Homebrew for Python installation.
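As an illustration of the API key step, the sketch below shows one way a Python service could resolve the key from either source. It is not taken from the repository: the variable name `OPENAI_API_KEY`, the precedence of the environment over the .env file, and the helper names are assumptions.

```python
import os
from pathlib import Path

KEY_NAME = "OPENAI_API_KEY"  # assumed variable name


def read_env_file(path) -> dict:
    """Parse simple KEY=VALUE lines from a .env file, if it exists."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_api_key(env_path=".env") -> str:
    """Return the key from the environment, falling back to the .env file."""
    key = os.environ.get(KEY_NAME) or read_env_file(env_path).get(KEY_NAME)
    if not key:
        raise RuntimeError(f"{KEY_NAME} is not set in the environment or {env_path}")
    return key
```

Failing fast with a clear error at startup is usually preferable to letting the first parse request fail with an authentication error from the API.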

Highlighted Details

Parses common resume fields including personal information, education, job experience, and project experience.
Estimates parsing cost at $0.01 per 500 tokens using text-davinci-002.
The README claims strong out-of-the-box results and suggests that fine-tuning GPT-3 could improve accuracy further.
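The cost figure above lends itself to a quick back-of-the-envelope calculation. The sketch below applies the stated rate of $0.01 per 500 tokens; the function names are hypothetical, and the character-based token estimate is only a common rule of thumb for English text (about four characters per token), not an exact tokenizer count.

```python
import math

USD_PER_500_TOKENS = 0.01  # rate quoted for text-davinci-002


def estimate_cost_usd(total_tokens: int) -> float:
    """Estimate the cost of one parse from its total token count.

    The total should include both the prompt and the generated reply,
    since both are billed.
    """
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 500 * USD_PER_500_TOKENS


def rough_token_count(text: str) -> int:
    """Approximate tokens as ~4 characters each (heuristic, an assumption)."""
    return math.ceil(len(text) / 4)
```

Under this rate, a resume whose prompt and reply total 1,500 tokens would cost about three cents to parse.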

Maintenance & Community

No specific information on contributors, community channels, or roadmap is provided in the README.

Licensing & Compatibility

The README does not specify a license, so the terms for commercial use or closed-source linking are unclear.

Limitations & Caveats

The project requires an OpenAI API key, and each parse incurs API charges. The README attributes the absence of a live demo to these costs. It does not document specific limitations on PDF formats or parsing accuracy beyond the general capabilities of GPT-3.

Health Check

Last commit: 2 years ago

Responsiveness: Inactive


Star History

10 stars in the last 90 days