hiring-agent  by interviewstreet

AI agent evaluates and scores resumes from PDFs

Created 10 months ago
1,089 stars

Top 34.6% on SourcePulse

Project Summary

Hiring Agent is an AI-powered resume-to-score pipeline. It extracts structured data from PDF resumes, enriches that data with GitHub signals, and outputs a fair, explainable evaluation. The tool is designed for technical recruiters, hiring managers, and researchers who need a resume analysis pipeline.

How It Works

The Hiring Agent employs a multi-stage pipeline. It first converts PDF resumes into a Markdown-like text format using PyMuPDF. Subsequently, it leverages Large Language Models (LLMs) with Jinja templates to parse specific resume sections (Basics, Work, Education, Skills, Projects) into a structured JSON Resume object. An enrichment step fetches GitHub profile and repository data, classifying projects and using an LLM to select the top seven contributions. Finally, an evaluator.py module applies strict scoring rules, incorporating fairness constraints, bonus points, and deductions, to produce a comprehensive evaluation with evidence.
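The final scoring stage can be pictured with a minimal sketch. This is not the project's evaluator.py; the class names, category names, and the 100-point cap below are assumptions made for illustration. It only shows the general idea of combining per-category scores with bonuses and deductions while keeping the evidence for each adjustment.

```python
from dataclasses import dataclass, field


@dataclass
class Adjustment:
    """One bonus (positive points) or deduction (negative points) plus its evidence."""
    points: float
    evidence: str


@dataclass
class Evaluation:
    # Hypothetical structure: category name -> points awarded in that category.
    category_scores: dict[str, float]
    adjustments: list[Adjustment] = field(default_factory=list)
    max_score: float = 100.0  # assumed cap, not taken from the project

    def total(self) -> float:
        """Sum category scores and adjustments, clamped to [0, max_score]."""
        raw = sum(self.category_scores.values())
        raw += sum(a.points for a in self.adjustments)
        return max(0.0, min(self.max_score, raw))

    def report(self) -> list[str]:
        """Return one readable line per adjustment so the score stays explainable."""
        return [f"{a.points:+g}: {a.evidence}" for a in self.adjustments]


if __name__ == "__main__":
    evaluation = Evaluation(
        category_scores={"open_source": 20, "projects": 25, "experience": 30},
        adjustments=[
            Adjustment(+5, "maintains a repository with external contributors"),
            Adjustment(-3, "project links could not be verified"),
        ],
    )
    print(evaluation.total())
    print("\n".join(evaluation.report()))
```

Clamping the total keeps stacked bonuses or deductions from pushing a score outside the valid range, and storing evidence next to each adjustment is one simple way to make the result auditable.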

Quick Start & Requirements

  • Installation: Clone the repository, create and activate a Python 3.11+ virtual environment, and install dependencies via pip install -r requirements.txt.
  • Prerequisites: Python 3.11.13 is pinned. An LLM backend is required: either Ollama (install and run ollama serve) or Google Gemini (requires an API key). Ollama users must pull desired models (e.g., ollama pull gemma3:4b). A GITHUB_TOKEN environment variable is optional but recommended for improved GitHub API rate limits.
  • Configuration: Copy .env.example to .env and set LLM_PROVIDER (ollama/gemini), DEFAULT_MODEL, and GEMINI_API_KEY if applicable. config.py controls DEVELOPMENT_MODE for caching and CSV export.
  • Links: GitHub Repository
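
As a rough illustration of the configuration step, the sketch below validates the environment variables named above (LLM_PROVIDER, DEFAULT_MODEL, GEMINI_API_KEY, GITHUB_TOKEN). The Settings class, the load_settings function, and the fallback defaults are hypothetical and are not the project's config.py. The sketch assumes the values from .env have already been exported into the process environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    llm_provider: str
    default_model: str
    gemini_api_key: Optional[str]
    github_token: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read and validate settings from a mapping of environment variables."""
    # Assumed default: fall back to the local Ollama backend.
    provider = env.get("LLM_PROVIDER", "ollama").strip().lower()
    if provider not in {"ollama", "gemini"}:
        raise ValueError(f"LLM_PROVIDER must be 'ollama' or 'gemini', got {provider!r}")

    api_key = env.get("GEMINI_API_KEY") or None
    if provider == "gemini" and not api_key:
        raise ValueError("GEMINI_API_KEY is required when LLM_PROVIDER=gemini")

    return Settings(
        llm_provider=provider,
        # Assumed default model, borrowed from the `ollama pull` example above.
        default_model=env.get("DEFAULT_MODEL", "gemma3:4b"),
        gemini_api_key=api_key,
        # Optional: only used to raise GitHub API rate limits.
        github_token=env.get("GITHUB_TOKEN") or None,
    )
```

Passing the environment as a mapping keeps the function easy to test: call load_settings({"LLM_PROVIDER": "gemini"}) and it raises a ValueError because no API key is present.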

Highlighted Details

  • Automated PDF-to-Markdown conversion via PyMuPDF.
  • LLM-driven, template-based extraction into JSON Resume format.
  • GitHub enrichment pipeline classifies projects and selects top contributions.
  • Evaluation framework enforces fairness constraints and provides explainable scores.
  • Supports both local Ollama and cloud Gemini LLM providers.
  • Development mode facilitates iteration with caching and CSV output.
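
The selection of top contributions can be approximated with a simple heuristic. According to the summary, the project uses an LLM for this step; the sketch below swaps in a plain star-count ranking instead, purely to show the shape of the data flow. The function name is hypothetical. The fork and stargazers_count keys follow the repository objects returned by the GitHub REST API.

```python
from typing import Any

TOP_N = 7  # the summary mentions seven selected contributions


def select_top_repos(repos: list[dict[str, Any]], top_n: int = TOP_N) -> list[dict[str, Any]]:
    """Drop forks, then keep the top_n repositories ranked by star count."""
    # Classify: treat non-fork repositories as the candidate's own work.
    originals = [r for r in repos if not r.get("fork", False)]
    # Rank: most-starred first; missing counts are treated as zero.
    ranked = sorted(originals, key=lambda r: r.get("stargazers_count", 0), reverse=True)
    return ranked[:top_n]
```

A star-based ranking is deterministic and cheap, but it ignores signals such as commit authorship or relevance to the role, which is presumably why an LLM is used in the actual pipeline.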

Maintenance & Community

The repository includes a CONTRIBUTING.md file detailing contribution guidelines. Specific community channels (e.g., Discord, Slack), roadmap links, or notable maintainer information are not detailed in this README excerpt.

Licensing & Compatibility

The project is licensed under the MIT license. This permissive license allows commercial use and integration into closed-source projects, provided the copyright and license notice are retained.

Limitations & Caveats

The system requires Python 3.11+. Its accuracy depends on the quality of the LLM responses and the clarity of the input PDF resumes. GitHub enrichment relies on a discoverable GitHub username in the resume. The DEVELOPMENT_MODE flag, which enables caching and CSV export, suggests the project is still under active development.
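
To make the username dependency concrete, here is a minimal sketch of how a GitHub username might be discovered in resume text. The function name and the regular expression are assumptions, not the project's actual logic. The pattern follows GitHub's username rules: alphanumeric characters and single hyphens, no leading or trailing hyphen, at most 39 characters.

```python
import re
from typing import Optional

# Capture the path segment right after "github.com/" as the username.
# \b avoids matching inside longer hostnames such as "mygithub.com".
_GITHUB_PROFILE = re.compile(
    r"\bgithub\.com/([A-Za-z0-9](?:[A-Za-z0-9]|-(?=[A-Za-z0-9])){0,38})",
    re.IGNORECASE,
)


def find_github_username(resume_text: str) -> Optional[str]:
    """Return the first GitHub username linked in the text, or None if absent."""
    match = _GITHUB_PROFILE.search(resume_text)
    return match.group(1) if match else None
```

If this returns None, there is nothing to enrich, so a resume that mentions GitHub work without a profile link would be scored without those signals.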

Health Check

  • Last commit: 1 week ago
  • Responsiveness: Inactive
  • Pull requests (30d): 17
  • Issues (30d): 15

Star History

946 stars in the last 30 days
