baibaiAIGC  by poleHansen

De-AIing Chinese academic documents

Created 2 weeks ago

New!

298 stars

Top 89.1% on SourcePulse

GitHubView on GitHub
Project Summary

Summary

baibaiAIGC addresses the challenge of reducing AIGC-generated traces in Chinese academic papers and technical documents. It provides a structured, multi-round rewriting process to reduce AI markers while preserving original meaning, terminology, and academic style. Offering Web, Script, and Chat Skill interfaces, it targets users needing iterative refinement of AI-assisted content.

How It Works

The core methodology employs a strict two-round sequential rewriting process (1 -> 2). Documents are segmented into chunks, processed via an external OpenAI-compatible API, and reassembled to maintain original paragraph structure. This ensures systematic processing of long documents, prevents new fact introduction, and preserves original terminology, logic, and academic tone.

Quick Start & Requirements

  • Installation: Python dependencies (pip install -r requirements.txt), Web frontend (cd app && npm install).
  • Prerequisites: Python 3.x, Node.js/npm. Requires an OpenAI-compatible API endpoint (key, model, base URL) for Web/Script modes, configurable via environment variables or CLI.
  • Usage Modes:
    • Web Mode: Run backend (python scripts/web_app.py) and frontend (cd app && npm run dev:web).
    • Script API Mode: Use scripts/run_aigc_round.py with specified parameters.
    • Dialogue Skill Mode: Integrate via SKILL.md without manual API setup.
  • Input: Place .txt or .docx files in origin/.
  • Links: SKILL.md, references/usage.md, references/checklist.md.

Highlighted Details

  • Supports .txt and .docx files, with utilities for Word document extraction and rebuilding.
  • Features a rigid two-round sequential rewriting workflow (1 then 2).
  • Processes long documents via chunking (default 850 chars), respecting paragraph and sentence boundaries.
  • Offers distinct interfaces: local Web UI, command-line Script API, and integrated Chat Skill.

Maintenance & Community

Acknowledges feedback from the "linuxdo (linux.do) community". No specific community channel links or maintainer details are provided.

Licensing & Compatibility

No license information is specified in the README. This omission may hinder commercial use or integration.

Limitations & Caveats

Long documents require sequential, chunk-based processing; single-pass rewriting is unsupported. Dialogue Skill mode may be unstable for lengthy inputs. The two-round sequence is fixed. Focus is on stylistic refinement, not content alteration for detection evasion.

Health Check
Last Commit

1 week ago

Responsiveness

Inactive

Pull Requests (30d)
2
Issues (30d)
3
Star History
299 stars in the last 16 days

Explore Similar Projects

Feedback? Help us improve.