thesis-docx  by the-shy123456

AI-powered thesis and dissertation document formatter

Created 1 month ago
288 stars

Top 91.0% on SourcePulse

Project Summary

This project provides an AI-powered "skill" that automates the revision and formatting of thesis and dissertation Word documents. It targets students and researchers who must follow strict academic formatting guidelines. It enforces consistent styling, fixes common document issues, and generates supporting diagrams and code snippets.

How It Works

The tool follows a fixed, AI-driven workflow that prioritizes stability and accuracy. It first verifies that the Word automation environment is available, then audits the OOXML structure to uncover hidden formatting inconsistencies. Formatting changes are applied in a defined order: institutional requirements come first, and existing styles are preserved wherever no specific rule applies. The process ends by exporting a PDF for page-by-page review, so that every modification is validated.
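
The precedence described above (institutional rules first, existing styles kept where no rule applies) can be illustrated with a small Python sketch. This is not code from the repository: the function name `resolve_format`, the property keys, and the sample values are all hypothetical and chosen only to show the ordering.

```python
from typing import Dict


def resolve_format(existing: Dict[str, str], required: Dict[str, str]) -> Dict[str, str]:
    """Merge formatting properties for one style.

    Hypothetical helper: institutional requirements override existing
    values, and any property without a rule keeps its existing value.
    """
    resolved = dict(existing)   # start from what the document already uses
    resolved.update(required)   # institutional rules take priority
    return resolved


if __name__ == "__main__":
    # Sample values are illustrative, not taken from any real guideline.
    existing_heading = {"font": "Calibri", "size_pt": "14", "spacing_before_pt": "12"}
    school_rules = {"font": "SimSun", "size_pt": "16"}
    print(resolve_format(existing_heading, school_rules))
```

In this sketch `spacing_before_pt` survives unchanged because no rule mentions it, while `font` and `size_pt` are replaced by the required values.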

Quick Start & Requirements

  • Installation: Clone the repository: git clone https://github.com/the-shy123456/thesis-docx.git. No further file extraction is needed.
  • Prerequisites:
    • Windows operating system with desktop Microsoft Word installed.
    • Python environment with python-docx and lxml libraries.
    • Node.js with mmdc or npx for Mermaid diagram rendering.
  • Quick Start Commands:
    • Check Word COM/DOM availability: powershell -ExecutionPolicy Bypass -File scripts/check_word_com.ps1 -Json
    • Audit OOXML: python scripts/audit_docx_ooxml.py .\draft.docx --output_json .\draft.audit.json --output_txt .\draft.audit.txt
    • Normalize styles (dry-run): powershell -ExecutionPolicy Bypass -File scripts/normalize_word_styles.ps1 -InputPath .\draft.docx -AuditOnly
    • Export PDF: powershell -ExecutionPolicy Bypass -File scripts/export_word_pdf.ps1 -DocPath .\draft.docx -PdfPath .\draft.audit.pdf
  • Documentation: SKILL.md, README_EN.md, references/script-usage.md, references/paper-format-workflow.md.
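
The four quick-start commands above could be chained from a small Python driver. The sketch below only assembles the command lines exactly as listed and runs them in order. The helper names `quick_start_commands` and `run_all` are hypothetical, and the stop-on-failure behavior assumes the scripts return a non-zero exit code on error, which the summary does not confirm.

```python
import subprocess
from typing import List


def quick_start_commands(docx: str) -> List[List[str]]:
    """Build the four quick-start commands for one document path."""
    # Derive the output names the same way the listed examples do
    # (draft.docx -> draft.audit.json / draft.audit.txt / draft.audit.pdf).
    stem = docx[:-5] if docx.lower().endswith(".docx") else docx
    ps = ["powershell", "-ExecutionPolicy", "Bypass", "-File"]
    return [
        ps + ["scripts/check_word_com.ps1", "-Json"],
        ["python", "scripts/audit_docx_ooxml.py", docx,
         "--output_json", stem + ".audit.json",
         "--output_txt", stem + ".audit.txt"],
        ps + ["scripts/normalize_word_styles.ps1", "-InputPath", docx, "-AuditOnly"],
        ps + ["scripts/export_word_pdf.ps1", "-DocPath", docx,
              "-PdfPath", stem + ".audit.pdf"],
    ]


def run_all(docx: str) -> None:
    """Run each step in order (Windows with desktop Word required)."""
    for cmd in quick_start_commands(docx):
        # check=True raises CalledProcessError on a non-zero exit code;
        # whether the scripts signal failure this way is an assumption.
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so this is safe anywhere.
    for cmd in quick_start_commands(r".\draft.docx"):
        print(" ".join(cmd))
```

Calling `run_all` only makes sense from the repository root on a machine that meets the prerequisites; the printed form is useful for checking the command lines first.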

Highlighted Details

  • Automates revision of thesis/dissertation Word documents, including unified styling for text, headings, captions, and references.
  • Fixes common problems with the table of contents, page numbering, section breaks, cross-references, and figure/table numbering.
  • Generates Mermaid diagrams based on real data and produces LaTeX-formatted code snippets or pseudocode.
  • Audits OOXML for subtle, hidden problems involving styleId, firstLineChars, titlePg, REF field display values, and section-level headers/footers.
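
To make the OOXML audit item concrete, here is a minimal Python sketch of the kind of inspection it implies. It is not the repository's `audit_docx_ooxml.py`: it uses only the standard library (`zipfile`, `xml.etree.ElementTree`) rather than python-docx or lxml, the function names are hypothetical, and it only counts three of the listed attributes in `word/document.xml`.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML main namespace used by word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def audit_document_xml(docx_bytes: bytes) -> dict:
    """Collect paragraph style ids and count firstLineChars / titlePg usage."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    style_ids = set()
    first_line_chars = 0
    for para in root.iter(f"{{{W}}}p"):
        style = para.find("w:pPr/w:pStyle", NS)
        if style is not None:
            style_ids.add(style.get(f"{{{W}}}val"))
        indent = para.find("w:pPr/w:ind", NS)
        if indent is not None and indent.get(f"{{{W}}}firstLineChars") is not None:
            first_line_chars += 1

    sections = list(root.iter(f"{{{W}}}sectPr"))
    title_pg = sum(1 for s in sections if s.find("w:titlePg", NS) is not None)
    return {
        "style_ids": sorted(style_ids),
        "paragraphs_with_firstLineChars": first_line_chars,
        "sections": len(sections),
        "sections_with_titlePg": title_pg,
    }


def demo_docx() -> bytes:
    """Build a tiny in-memory package with just enough XML for the demo."""
    xml = (
        f'<w:document xmlns:w="{W}"><w:body>'
        '<w:p><w:pPr><w:pStyle w:val="Heading1"/></w:pPr></w:p>'
        '<w:p><w:pPr><w:ind w:firstLineChars="200"/></w:pPr></w:p>'
        '<w:sectPr><w:titlePg/></w:sectPr>'
        '</w:body></w:document>'
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", xml)
    return buf.getvalue()


if __name__ == "__main__":
    print(audit_document_xml(demo_docx()))
```

For a real file, the bytes would come from `open("draft.docx", "rb").read()`. Checking REF field display values and section-level headers/footers would need further parsing that this sketch does not attempt.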

Maintenance & Community

The project is hosted on GitHub and mentions the "LINUX DO Community". No specific details on active contributors, sponsorships, or roadmap are provided in the README.

Licensing & Compatibility

The project is released under the MIT License. This license is permissive and generally compatible with commercial use and closed-source linking.

Limitations & Caveats

The tool is explicitly designed for Windows environments with desktop Microsoft Word. It emphasizes a page-by-page PDF review as essential for confirming completion, implying that automated export alone should not be considered a final validation step. The core functionality relies on the stability and accessibility of Word's COM/DOM interface.

Health Check

  • Last Commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 180 stars in the last 30 days
