ResumeParser by OmkarPathak

Local AI-powered resume parser for privacy-conscious data extraction

Created 7 years ago
313 stars

Top 86.3% on SourcePulse

View on GitHub
Project Summary

A privacy-focused resume parser that leverages local Large Language Models (LLMs) to extract structured data and generate insights from resumes. It targets engineers and researchers needing efficient, cost-effective, and secure resume analysis, offering automated professional summaries and key strength identification without external API dependencies.

How It Works

This project uses the Qwen2.5-1.5B-Instruct LLM, quantized to q4_k_m, for local inference via the llama-cpp-python library. This approach prioritizes speed and accuracy for structured data extraction while keeping the footprint small (the model is roughly 1 GB). Running entirely locally eliminates API costs and enhances data privacy by keeping sensitive resume information on the user's machine.
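
A minimal sketch of this local-inference pattern, assuming llama-cpp-python's chat API is called directly; the model path, prompt wording, and output handling below are illustrative, not the project's actual code:

```python
from llama_cpp import Llama

# Load the quantized GGUF model for fully local inference.
# The path is hypothetical; the project downloads and stores the model itself.
llm = Llama(
    model_path="models/qwen2.5-1.5b-instruct-q4_k_m.gguf",
    n_ctx=2048,  # matches the context limit noted under Limitations & Caveats
)

resume_text = open("resume.txt", encoding="utf-8").read()

# Ask the model to return machine-readable JSON only.
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "Extract resume fields and reply with JSON only."},
        {"role": "user", "content": "Resume:\n" + resume_text},
    ],
    temperature=0.0,
)
print(response["choices"][0]["message"]["content"])
```

Because the model is prompted to reply with JSON only, the output can be consumed directly by downstream code without any cloud API.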

Quick Start & Requirements

  • Primary Install: Clone the repository, set up a Python virtual environment, install dependencies (pip install -r requirements.txt), download the AI model (python download_model.py qwen; see the sketch after this list), and run the Django development server (python resume_parser/manage.py runserver). The GUI is then accessible at http://127.0.0.1:8000/.
  • Docker: Use docker-compose up --build for an integrated setup.
  • Portable App (macOS): Execute ./build_mac.sh to create a standalone executable.
  • Prerequisites: Minimum 4GB RAM (8GB recommended), ~2GB disk space, and a modern CPU (AVX2 support recommended). GPU acceleration (NVIDIA CUDA or macOS Metal) is optional but supported.
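
For illustration only, the model download performed by download_model.py might look roughly like the following if the GGUF file is fetched from Hugging Face; the repository id and filename are assumptions, and the project's own script remains the authoritative path:

```python
from huggingface_hub import hf_hub_download

# Hypothetical download of the quantized Qwen2.5-1.5B-Instruct GGUF file (~1 GB).
# repo_id and filename are assumed here; check download_model.py for the real values.
model_path = hf_hub_download(
    repo_id="Qwen/Qwen2.5-1.5B-Instruct-GGUF",
    filename="qwen2.5-1.5b-instruct-q4_k_m.gguf",
    local_dir="models",
)
print(f"Model saved to {model_path}")
```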

Highlighted Details

  • AI-powered extraction of key fields: Name, Email, Mobile, Skills, Education, Experience, and Company names.
  • Automated generation of Professional Summaries and identification of Key Strengths.
  • Outputs strictly formatted JSON for straightforward integration into other systems (see the example shape after this list).
  • Zero API costs and enhanced data privacy due to 100% local execution.
  • Cross-platform compatibility (Windows, macOS, Linux) with a portable build option.
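
The project defines its own schema, but based on the fields listed above, the strict JSON output could plausibly take a shape like this (all field names and values here are hypothetical):

```python
import json

# Hypothetical example of the structured output; the keys mirror the extracted
# fields listed above but are not taken from the project's actual schema.
example = {
    "name": "Jane Doe",
    "email": "jane.doe@example.com",
    "mobile": "+1-555-0100",
    "skills": ["Python", "Django", "SQL"],
    "education": ["B.S. Computer Science"],
    "experience": ["Backend Engineer, 3 years"],
    "companies": ["Example Corp"],
    "professional_summary": "Backend engineer with a focus on data pipelines.",
    "key_strengths": ["API design", "Data modeling"],
}
print(json.dumps(example, indent=2))
```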

Maintenance & Community

No specific details regarding contributors, sponsorships, community channels (like Discord/Slack), or roadmaps were provided in the README.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: The permissive MIT license generally allows for commercial use and integration into closed-source applications without significant restrictions.

Limitations & Caveats

The LLM's context window is limited to 2048 tokens, requiring input text to be truncated to approximately 1500 characters. GPU acceleration (Metal/CUDA) is optional and may require explicit configuration, with the system defaulting to CPU execution for stability.
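
A small sketch of how these constraints might be handled when calling llama-cpp-python directly; the character limit mirrors the figure above, and the GPU offload setting is illustrative:

```python
from llama_cpp import Llama

MAX_CHARS = 1500  # keeps the prompt within the 2048-token context window


def load_model(model_path: str, use_gpu: bool = False) -> Llama:
    """Load the quantized model; CPU by default, optional GPU offload."""
    return Llama(
        model_path=model_path,
        n_ctx=2048,
        # -1 offloads all layers to Metal/CUDA when a GPU-enabled build is installed;
        # 0 keeps everything on the CPU, which is the stable default.
        n_gpu_layers=-1 if use_gpu else 0,
    )


def truncate_resume(text: str) -> str:
    """Trim resume text so the prompt fits in the limited context window."""
    return text[:MAX_CHARS]
```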

Health Check

  • Last Commit: 16 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 30 days

Explore Similar Projects

Starred by John Resig (Author of jQuery; Chief Software Architect at Khan Academy), Sasha Rush (Research Scientist at Cursor; Professor at Cornell Tech), and 2 more.

llmparser by kyang6

  • 428 stars
  • LLM tool for structured data extraction and classification
  • Created 2 years ago; updated 2 years ago