advisor-ledger  by the-hidden-fish

A system for preserving the integrity of community-edited documents

Created 1 week ago


1,095 stars

Top 34.5% on SourcePulse

Project Summary

This project provides an immutable, auditable mirror for anonymously editable online documents, such as community-maintained "Advisor Red Flags Notes," which are vulnerable to silent content deletion. By capturing every edit and deletion in Git history and rendering a live view that preserves removed content, it offers researchers and power users a robust defense against censorship and data loss in collaborative information repositories.

How It Works

An automated pipeline, triggered by a systemd timer every two minutes, fetches the source Google Doc. It normalizes content, generates deterministic snapshots and paragraph-level deltas based on content hashes, and runs a local LLM to flag PII, personal attacks, or suppressive edits. The system then renders a live index.html view via GitHub Pages, preserving deleted content as "ghost paragraphs" with timestamps, ensuring a complete, auditable history.
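
The normalize-then-hash step can be illustrated with a minimal Python sketch. It is not taken from the repository: the function names (normalize, paragraph_hash, delta), the choice of SHA-256, and the whitespace rules are all assumptions made for illustration.

```python
import hashlib
import re
import unicodedata


def normalize(text: str) -> list[str]:
    """Split a document into paragraphs with stable Unicode form and whitespace."""
    text = unicodedata.normalize("NFC", text).replace("\r\n", "\n")
    paragraphs = re.split(r"\n\s*\n", text)          # blank lines separate paragraphs
    cleaned = (" ".join(p.split()) for p in paragraphs)  # collapse inner whitespace
    return [p for p in cleaned if p]


def paragraph_hash(paragraph: str) -> str:
    """Content hash used as the paragraph's identity (SHA-256 is an assumption)."""
    return hashlib.sha256(paragraph.encode("utf-8")).hexdigest()


def delta(old: list[str], new: list[str]) -> dict[str, list[str]]:
    """Report paragraphs whose hashes appear in only one of the two snapshots."""
    old_by_hash = {paragraph_hash(p): p for p in old}
    new_by_hash = {paragraph_hash(p): p for p in new}
    return {
        "added": [p for h, p in new_by_hash.items() if h not in old_by_hash],
        "removed": [p for h, p in old_by_hash.items() if h not in new_by_hash],
    }


before = normalize("Alpha  line.\n\nBeta line.\n\n\nGamma line.")
after = normalize("Alpha line.\n\nGamma line.\n\nDelta line.")
changes = delta(before, after)  # only "Beta line." removed, only "Delta line." added
```

Because paragraphs are compared by hash after normalization, the extra space in "Alpha  line." and the extra blank line produce no delta, which is one way such a design could keep noise out of the history.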

Quick Start & Requirements

  • Setup involves configuring the systemd timer to execute Python scripts in scripts/.
  • Prerequisites include git, flock, Google Drive API access, a local LLM, and a Python environment.
  • The rendered view is hosted on GitHub Pages.
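
With a two-minute timer, a slow run could overlap the next one. The summary lists flock as a prerequisite but does not say how it is used; the sketch below shows one plausible guard written in Python with fcntl.flock. The helper name single_instance and the lock-file path are hypothetical, and the sketch is Unix-only.

```python
import fcntl
from contextlib import contextmanager


@contextmanager
def single_instance(lock_path: str):
    """Try to hold an exclusive, non-blocking lock for one pipeline run.

    Yields True if the lock was acquired, False if another run still holds it.
    """
    with open(lock_path, "w") as handle:
        try:
            fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            yield False  # a previous run is still active; the caller should skip
            return
        try:
            yield True
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
```

A timer-invoked script could wrap its work in `with single_instance("/tmp/advisor-ledger.lock") as acquired:` and return early when acquired is False, so a skipped tick is simply picked up by the next one.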

Highlighted Details

  • Maintains a complete, immutable Git history of all edits, including deletions, from an anonymously editable source.
  • Employs paragraph-level diffing based on content hashes to accurately track changes and minimize noise.
  • Integrates a local LLM for automated review of changes, flagging potential PII, personal attacks, and suppressive deletions.
  • Renders a live view that preserves deleted content as "ghost paragraphs" with timestamps.
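
The "ghost paragraph" idea can be sketched as follows. This is an assumption about how such a view might be built, not the project's renderer: the Paragraph class, the ghost CSS class, and the markup are hypothetical, and timestamps are assumed to be in UTC.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from html import escape
from typing import Optional


@dataclass
class Paragraph:
    text: str
    removed_at: Optional[datetime] = None  # None means still present in the source


def render(paragraphs: list[Paragraph]) -> str:
    """Render live paragraphs normally and removed ones as struck-through ghosts."""
    parts = []
    for p in paragraphs:
        body = escape(p.text)  # never trust mirrored text as HTML
        if p.removed_at is None:
            parts.append(f"<p>{body}</p>")
        else:
            # Assumes removed_at is already expressed in UTC.
            stamp = p.removed_at.strftime("%Y-%m-%d %H:%M UTC")
            parts.append(
                f'<p class="ghost" title="removed {stamp}"><del>{body}</del></p>'
            )
    return "\n".join(parts)


page = render([
    Paragraph("Still here <today>"),
    Paragraph("Deleted note", datetime(2024, 1, 2, 3, 4, tzinfo=timezone.utc)),
])
```

Keeping removed paragraphs in document order, rather than in a separate log, lets a reader see where a deletion happened as well as when.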

Maintenance & Community

  • No specific details on contributors, sponsorships, community channels, or a public roadmap were found in the provided README.

Licensing & Compatibility

  • Pipeline automation code (scripts/) is Public Domain (CC0). Mirrored content (snapshots/, deltas/, docs/) retains original author rights.
  • Acts as an "observational mirror"; edits should be made to the original Google Doc.

Limitations & Caveats

Functionality depends on the original Google Doc's availability and API access. LLM review effectiveness is model-dependent. As an observational mirror, it passively reflects changes without controlling the source document.

Health Check

  • Last commit: 4 days ago
  • Responsiveness: Inactive
  • Pull requests (30d): 3
  • Issues (30d): 23
  • Star history: 1,121 stars in the last 8 days
