ResumeParser by OmkarPathak

Local AI-powered resume parser for privacy-conscious data extraction

Created 7 years ago
313 stars

Top 86.3% on SourcePulse

View on GitHub
Project Summary

A privacy-focused resume parser that leverages local Large Language Models (LLMs) to extract structured data and generate insights from resumes. It targets engineers and researchers needing efficient, cost-effective, and secure resume analysis, offering automated professional summaries and key strength identification without external API dependencies.

How It Works

This project uses the Qwen2.5-1.5B-Instruct LLM, quantized to q4_k_m, for local inference via the llama-cpp-python library. This approach prioritizes speed and accuracy for structured data extraction while keeping the footprint small (the model is roughly 1 GB). Running entirely locally eliminates API costs and enhances data privacy by keeping sensitive resume information on the user's machine.
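
A minimal sketch of this local-inference pattern, assuming llama-cpp-python's chat API is called directly; the model path, prompt wording, and output handling below are illustrative, not the project's actual code:

```python
from llama_cpp import Llama

# Load the quantized GGUF model for fully local inference.
# The path is hypothetical; the project downloads and stores the model itself.
llm = Llama(
    model_path="models/qwen2.5-1.5b-instruct-q4_k_m.gguf",
    n_ctx=2048,  # matches the context limit noted under Limitations & Caveats
)

resume_text = open("resume.txt", encoding="utf-8").read()

# Ask the model to return machine-readable JSON only.
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "Extract resume fields and reply with JSON only."},
        {"role": "user", "content": "Resume:\n" + resume_text},
    ],
    temperature=0.0,
)
print(response["choices"][0]["message"]["content"])
```

Because the model is prompted to reply with JSON only, the output can be consumed directly by downstream code without any cloud API.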

Quick Start & Requirements

  • Primary Install: Clone the repository, set up a Python virtual environment, install dependencies (pip install -r requirements.txt), download the AI model (python download_model.py qwen; see the sketch after this list), and run the Django development server (python resume_parser/manage.py runserver). The GUI is then accessible at http://127.0.0.1:8000/.
  • Docker: Use docker-compose up --build for an integrated setup.
  • Portable App (macOS): Execute ./build_mac.sh to create a standalone executable.
  • Prerequisites: Minimum 4GB RAM (8GB recommended), ~2GB disk space, and a modern CPU (AVX2 support recommended). GPU acceleration (NVIDIA CUDA or macOS Metal) is optional but supported.
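
For illustration only, the model download performed by download_model.py might look roughly like the following if the GGUF file is fetched from Hugging Face; the repository id and filename are assumptions, and the project's own script remains the authoritative path:

```python
from huggingface_hub import hf_hub_download

# Hypothetical download of the quantized Qwen2.5-1.5B-Instruct GGUF file (~1 GB).
# repo_id and filename are assumed here; check download_model.py for the real values.
model_path = hf_hub_download(
    repo_id="Qwen/Qwen2.5-1.5B-Instruct-GGUF",
    filename="qwen2.5-1.5b-instruct-q4_k_m.gguf",
    local_dir="models",
)
print(f"Model saved to {model_path}")
```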

Highlighted Details

  • AI-powered extraction of key fields: Name, Email, Mobile, Skills, Education, Experience, and Company names.
  • Automated generation of Professional Summaries and identification of Key Strengths.
  • Outputs strictly formatted JSON for straightforward integration into other systems (see the example shape after this list).
  • Zero API costs and enhanced data privacy due to 100% local execution.
  • Cross-platform compatibility (Windows, macOS, Linux) with a portable build option.
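
The project defines its own schema, but based on the fields listed above, the strict JSON output could plausibly take a shape like this (all field names and values here are hypothetical):

```python
import json

# Hypothetical example of the structured output; the keys mirror the extracted
# fields listed above but are not taken from the project's actual schema.
example = {
    "name": "Jane Doe",
    "email": "jane.doe@example.com",
    "mobile": "+1-555-0100",
    "skills": ["Python", "Django", "SQL"],
    "education": ["B.S. Computer Science"],
    "experience": ["Backend Engineer, 3 years"],
    "companies": ["Example Corp"],
    "professional_summary": "Backend engineer with a focus on data pipelines.",
    "key_strengths": ["API design", "Data modeling"],
}
print(json.dumps(example, indent=2))
```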

Maintenance & Community

No specific details regarding contributors, sponsorships, community channels (like Discord/Slack), or roadmaps were provided in the README.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: The permissive MIT license generally allows for commercial use and integration into closed-source applications without significant restrictions.

Limitations & Caveats

The LLM's context window is limited to 2048 tokens, requiring input text to be truncated to approximately 1500 characters. GPU acceleration (Metal/CUDA) is optional and may require explicit configuration, with the system defaulting to CPU execution for stability.
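
A small sketch of how these constraints might be handled when calling llama-cpp-python directly; the character limit mirrors the figure above, and the GPU offload setting is illustrative:

```python
from llama_cpp import Llama

MAX_CHARS = 1500  # keeps the prompt within the 2048-token context window


def load_model(model_path: str, use_gpu: bool = False) -> Llama:
    """Load the quantized model; CPU by default, optional GPU offload."""
    return Llama(
        model_path=model_path,
        n_ctx=2048,
        # -1 offloads all layers to Metal/CUDA when a GPU-enabled build is installed;
        # 0 keeps everything on the CPU, which is the stable default.
        n_gpu_layers=-1 if use_gpu else 0,
    )


def truncate_resume(text: str) -> str:
    """Trim resume text so the prompt fits in the limited context window."""
    return text[:MAX_CHARS]
```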

Health Check

  • Last Commit: 16 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 30 days

Explore Similar Projects

Starred by John Resig (Author of jQuery; Chief Software Architect at Khan Academy), Sasha Rush (Research Scientist at Cursor; Professor at Cornell Tech), and 2 more.

llmparser by kyang6

  • 428 stars
  • LLM tool for structured data extraction and classification
  • Created 2 years ago; updated 2 years ago