OCRAutoScore by vkgo

AI-powered tool for automated exam grading

Created 3 years ago

426 stars

Top 69.4% on SourcePulse

Project Summary

OCRAutoScore is an automated exam-grading system built on Optical Character Recognition (OCR) and deep learning models. It targets educators and researchers who want to automate text recognition and scoring for several question types, including fill-in-the-blank, multiple-choice, and essay questions.

How It Works

The system is modular, with a separate model for each task. YOLOv8 handles question segmentation, locating the different question types within an exam paper. Fill-in-the-blank questions combine PaddleOCR for initial text recognition with CLIP for semantic similarity checking, which is intended to improve accuracy on inputs that are hard for OCR alone. Multiple-choice questions rely on single-character recognition with SpinalNet and WaveMix models. Essays are scored by a fine-tuned DeBERTaV3-large model (MSPLM). Mathematical formulas are recognized by a Counting-Aware Network (CAN) adapted from existing research.
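One way the OCR-plus-similarity combination for fill-in-the-blank answers could work is sketched below. This is a minimal illustration, not the repository's code: the function `grade_blank`, the fallback rule, and both thresholds are assumptions, and the embeddings are plain lists of floats standing in for whatever vectors a CLIP model would produce for the answer image and the expected answer text.

```python
import math
from dataclasses import dataclass
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class BlankResult:
    correct: bool
    method: str   # "ocr" or "similarity"
    score: float  # OCR confidence or cosine similarity, depending on method


def grade_blank(
    ocr_text: str,
    ocr_confidence: float,
    expected: str,
    image_embedding: Sequence[float],
    answer_embedding: Sequence[float],
    min_ocr_confidence: float = 0.9,  # assumed threshold, not from the README
    min_similarity: float = 0.8,      # assumed threshold, not from the README
) -> BlankResult:
    """Hypothetical rule: trust a confident OCR reading, otherwise compare embeddings."""
    if ocr_confidence >= min_ocr_confidence:
        matches = ocr_text.strip().lower() == expected.strip().lower()
        return BlankResult(matches, "ocr", ocr_confidence)
    # Low-confidence OCR: fall back to image/text embedding similarity.
    similarity = cosine_similarity(image_embedding, answer_embedding)
    return BlankResult(similarity >= min_similarity, "similarity", similarity)


if __name__ == "__main__":
    print(grade_blank("Paris ", 0.97, "paris", [1.0, 0.0], [0.0, 1.0]))
    print(grade_blank("Pans", 0.40, "paris", [0.6, 0.8], [0.6, 0.8]))
```

Keeping the two checks behind one function makes it easy to tune or replace the fallback rule without touching the OCR step; the actual project may combine the two signals differently.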

Quick Start & Requirements

Installation: Clone the repository and install dependencies via pip. Specific model weights may need to be downloaded separately from Hugging Face or PaddlePaddle.
Prerequisites: Python 3.6+, PyTorch 1.10.2+, PaddlePaddle, Transformers, OpenCV, and YOLOv8; CUDA is optional, for GPU acceleration.
Setup: Download the pre-trained models; some modules may also require training custom models on specific datasets (e.g., EMNIST for character recognition, CROHME for formulas).
Documentation: Detailed explanations and usage instructions are provided within the README and specific module directories.
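Before running any module, it can help to confirm that the prerequisites listed above are importable. The following sketch does this with the standard library only. The import names (`torch`, `paddle`, `transformers`, `cv2`, `ultralytics`) are the usual ones for these packages but are an assumption here, since the summary does not list them, and the helpers `find_missing` and `python_ok` are hypothetical names, not part of the project.

```python
import importlib.util
import sys
from typing import Dict, List, Tuple

# Assumed import names for the prerequisites named in this summary.
REQUIRED_MODULES: Dict[str, str] = {
    "PyTorch": "torch",
    "PaddlePaddle": "paddle",
    "Transformers": "transformers",
    "OpenCV": "cv2",
    "YOLOv8 (ultralytics package)": "ultralytics",
}


def find_missing(modules: Dict[str, str]) -> List[str]:
    """Return the labels of top-level modules that cannot be located."""
    return [
        label
        for label, name in modules.items()
        if importlib.util.find_spec(name) is None
    ]


def python_ok(minimum: Tuple[int, int] = (3, 6)) -> bool:
    """Check the interpreter against the minimum Python version stated above."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    if not python_ok():
        print("Python 3.6 or newer is required.")
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed prerequisites are importable.")
```

This only checks that the packages can be found; it does not verify versions, GPU support, or the presence of downloaded model weights.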

Highlighted Details

Combines PaddleOCR with CLIP for fill-in-the-blank grading, with the stated aim of high accuracy on cases that are difficult for OCR alone.
Utilizes YOLOv8 for effective segmentation of exam papers into major question sections.
Implements SpinalNet and WaveMix for single-character recognition, trained on EMNIST.
Features a CAN-based model for mathematical formula recognition, trained on the CROHME dataset.
Includes an essay scoring module (MSPLM) based on DeBERTaV3-large, with custom loss functions.
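The modules above suggest a pipeline in which segmented regions are routed to a grader that matches their question type. A minimal sketch of that routing follows. The `Region` type, the label strings, and `route_regions` are hypothetical, and the graders are stand-in callables rather than the project's models.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Region:
    kind: str                        # hypothetical label, e.g. "blank", "choice", "essay"
    box: Tuple[int, int, int, int]   # (x1, y1, x2, y2) in page pixels


Grader = Callable[[Region], float]


def route_regions(
    regions: List[Region], graders: Dict[str, Grader]
) -> Tuple[List[Tuple[Region, float]], List[Region]]:
    """Send each region to the grader registered for its kind.

    Returns (scored, unhandled): scored pairs each region with its score,
    unhandled collects regions whose kind has no registered grader.
    """
    scored: List[Tuple[Region, float]] = []
    unhandled: List[Region] = []
    for region in regions:
        grader = graders.get(region.kind)
        if grader is None:
            unhandled.append(region)
        else:
            scored.append((region, grader(region)))
    return scored, unhandled


if __name__ == "__main__":
    # Stand-in graders; real ones would call the OCR, character, or essay models.
    graders = {"blank": lambda r: 1.0, "choice": lambda r: 0.0}
    page = [
        Region("blank", (10, 10, 200, 40)),
        Region("choice", (10, 60, 200, 90)),
        Region("formula", (10, 110, 200, 160)),
    ]
    scored, unhandled = route_regions(page, graders)
    print(len(scored), "scored;", len(unhandled), "unhandled")
```

A dictionary of graders keeps each model independent, which matches the modular layout described in this summary; how the repository actually wires its modules together is not stated.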

Maintenance & Community

The project is hosted on GitHub under the vkgo organization. Specific community channels or active development status are not explicitly detailed in the README.

Licensing & Compatibility

The README does not explicitly state a license. The project integrates various libraries with their own licenses, which may impose restrictions on commercial use or redistribution.

Limitations & Caveats

The README notes that the essay scoring model (MSPLM) did not reproduce the original paper's results on the ASAP dataset, possibly because of implementation differences or dataset limitations. The project appears to be a collection of modules developed for a specific academic project (the README describes it as an integration repository for a college student innovation project), and the README does not fully describe its overall integration or production readiness.
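Essay scoring results on ASAP are commonly reported as quadratic weighted kappa (QWK), which measures agreement between human and model scores while penalizing large disagreements more heavily. This summary does not say which metric the README uses, so the sketch below is only an assumption about how a reproduction gap like the one mentioned above could be measured. The function name and signature are illustrative.

```python
from collections import Counter
from typing import Sequence


def quadratic_weighted_kappa(
    human: Sequence[int], model: Sequence[int], min_score: int, max_score: int
) -> float:
    """Quadratic weighted kappa between two lists of integer scores."""
    if len(human) != len(model) or len(human) == 0:
        raise ValueError("score lists must be non-empty and of equal length")
    n_levels = max_score - min_score + 1
    if n_levels < 2:
        raise ValueError("need at least two score levels")
    if any(not (min_score <= s <= max_score) for s in list(human) + list(model)):
        raise ValueError("score outside the declared range")

    n_items = len(human)
    observed = Counter(zip(human, model))  # joint counts of (human, model) pairs
    human_hist = Counter(human)            # marginal counts per human score
    model_hist = Counter(model)            # marginal counts per model score

    numerator = 0.0
    denominator = 0.0
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            weight = (i - j) ** 2 / (n_levels - 1) ** 2
            expected = human_hist[i] * model_hist[j] / n_items
            numerator += weight * observed[(i, j)]
            denominator += weight * expected

    if denominator == 0.0:
        # Both raters gave one identical score to every essay.
        return 1.0
    return 1.0 - numerator / denominator


if __name__ == "__main__":
    human_scores = [1, 2, 3, 4, 4, 2]
    model_scores = [1, 2, 3, 3, 4, 1]
    print(round(quadratic_weighted_kappa(human_scores, model_scores, 1, 4), 3))
```

A value of 1.0 means perfect agreement, and values near 0 mean agreement no better than chance given each rater's score distribution.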
