OneChart by LingyvKong

AI model for extracting structured data from charts

Created 1 year ago
255 stars

Top 98.8% on SourcePulse

Summary

OneChart is the official codebase for a chart structural extraction method presented at ACM Multimedia 2024. The method introduces a single auxiliary token to make the extraction of structured information from charts more reliable. The project targets researchers and developers working on visual question answering, document information extraction, and chart understanding, and it converts visual chart data into structured formats such as Python dictionaries.
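To make the "structured formats like Python dictionaries" idea concrete, here is a minimal sketch of how a dictionary-formatted string could be parsed and flattened for downstream use. The field names ("title", "values") and the nesting of category and series are assumptions made for illustration, not the model's documented output schema, and parse_chart_dict and to_rows are hypothetical helpers.

```python
import ast


def parse_chart_dict(raw_text):
    """Parse a dictionary-formatted string into a Python dict.

    ast.literal_eval only evaluates literals, so it is safer than eval
    for text produced by a model.
    """
    parsed = ast.literal_eval(raw_text.strip())
    if not isinstance(parsed, dict):
        raise ValueError("expected a dictionary-formatted string")
    return parsed


def to_rows(chart):
    """Flatten {"values": {category: {series: number}}} into row tuples.

    The "values" layout is a hypothetical schema used only for this sketch.
    """
    rows = []
    for category, series_values in chart.get("values", {}).items():
        for series, value in series_values.items():
            rows.append((category, series, float(value)))
    return rows


# Hand-written sample text standing in for extracted chart data.
sample = (
    '{"title": "Quarterly sales", '
    '"values": {"Q1": {"north": 12.5, "south": 9}, '
    '"Q2": {"north": 14, "south": 11.2}}}'
)

chart = parse_chart_dict(sample)
rows = to_rows(chart)
for row in rows:
    print(row)
```

Flattening into rows is one possible design choice; it makes the data easy to load into a table or compare against ground truth, but the nested dictionary can equally be used as-is.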

How It Works

OneChart's core idea is summarized by the paper title, "Purify the Chart Structural Extraction via One Auxiliary Token." The algorithm is not described in detail here, but the approach adds a single auxiliary token to the model's processing pipeline to guide and refine the extraction of chart structure, which may improve accuracy and robustness over methods that lack such a token. The project builds on the Vary codebase and its initial weights, which suggests a foundation in existing large vision-language model architectures.

Quick Start & Requirements

  • Primary Install: Clone the repository and change into OneChart_code/. Create and activate a conda environment (conda create -n onechart python=3.10 -y, then conda activate onechart). Install the dependencies by running pip install -e . followed by pip install -r requirements.txt and pip install ninja.
  • Prerequisites: CUDA 11.8+, PyTorch 2.0.1, Python 3.10. Requires downloading pre-trained weights.
  • Demo: A web demo is hosted on Hugging Face (kppkkp/OneChart), and a local demo script is provided at vary/demo/run_opt_v1.py.
  • Links: Hugging Face model: https://huggingface.co/kppkkp/OneChart.
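The prerequisites listed above can be checked before installing the project. The following is a small sketch, not part of the OneChart codebase: the function names are hypothetical, and treating PyTorch 2.0.1 as an exact requirement is a strict reading of the listed version that may be more conservative than necessary.

```python
import sys

# Values taken from the prerequisites listed above.
REQUIRED_PYTHON = (3, 10)
REQUIRED_TORCH = "2.0.1"
MIN_CUDA = (11, 8)


def parse_version(text):
    """Turn '11.8' or '2.0.1+cu118' into a tuple of ints, dropping any local suffix."""
    core = text.split("+")[0]
    return tuple(int(part) for part in core.split(".") if part.isdigit())


def check_environment(python_version, torch_version, cuda_version):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if tuple(python_version[:2]) != REQUIRED_PYTHON:
        problems.append("Python 3.10 is expected")
    if torch_version is None:
        problems.append("PyTorch is not installed")
    elif parse_version(torch_version) != parse_version(REQUIRED_TORCH):
        problems.append("PyTorch 2.0.1 is expected, found " + torch_version)
    if cuda_version is None:
        problems.append("no CUDA-enabled PyTorch build detected")
    elif parse_version(cuda_version) < MIN_CUDA:
        problems.append("CUDA 11.8 or newer is expected, found " + cuda_version)
    return problems


if __name__ == "__main__":
    try:
        import torch
        torch_version, cuda_version = torch.__version__, torch.version.cuda
    except ImportError:
        torch_version, cuda_version = None, None
    found = check_environment(sys.version_info, torch_version, cuda_version)
    for line in found or ["environment matches the listed prerequisites"]:
        print(line)
```

Run inside the activated onechart environment, the script prints either the mismatches it found or a single confirmation line. It checks only the CUDA version that PyTorch was built against, not the installed driver.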

Highlighted Details

  • Official code for ACM Multimedia 2024 Oral paper (3.97% acceptance rate).
  • Supports quick trial via Hugging Face demo and a local demo script.
  • Includes benchmark data and an evaluation tool (ChartSE).
  • Training scripts are provided, leveraging DeepSpeed for distributed training.

Maintenance & Community

The project's authors come from multiple institutions. No community channels (such as Discord or Slack) or roadmap links are provided. The code was released in September 2024.

Licensing & Compatibility

The data, code, and checkpoints are explicitly stated to be "intended and licensed for research use only." Use is further restricted by the license agreements of Vary and OPT, on which the project is built. The project may therefore inherit non-commercial or copyleft-style terms from those base projects, so commercial use or integration into closed-source products is doubtful without further clarification from the authors.

Limitations & Caveats

The research-only license is a significant barrier to commercial adoption. The README also notes that the project builds on Vary and OPT, so it likely inherits dependencies and license terms from those projects that are not fully detailed here.

Health Check

  • Last Commit: 7 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 1 star in the last 30 days