RigorPilot-Skills  by lllllllama

Agent skills for rigorous deep learning research

Created 2 months ago
412 stars

Top 70.6% on SourcePulse

Project Summary

Summary

RigorPilot Skills provides agent skills that keep AI-assisted deep learning research grounded, reproducible, and auditable. It targets researchers and engineers and prioritizes scientific rigor over score optimization.

How It Works

The project splits work into two lanes. The "Trusted Lane" covers reproduction, setup, analysis, and safe debugging, with an emphasis on scientific meaning and comparability. The "Explore Lane" covers researcher-authorized, candidate-only exploration, and keeps changes auditable and bounded. Both lanes follow the same core principles: avoid blind score chasing, do not claim novelty lightly, and maintain comparability. Evidence outputs such as SCIENTIFIC_CHANGELOG.md and COMPARABILITY_REPORT.md are central to the workflow.
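The separation between the two lanes can be pictured as a path check. What follows is a minimal sketch under stated assumptions, not the project's implementation: the function name `is_write_allowed`, the lane labels, and the rule that each lane may write only under its own directories are hypothetical. Only the directory names (repro_outputs/, explore_outputs/, and so on) come from the README summary.

```python
from pathlib import PurePosixPath

# Hypothetical mapping of lanes to the output directories they may write to.
# The directory names follow the README summary; the mapping is an assumption.
LANE_DIRS = {
    "trusted": {"repro_outputs", "analysis_outputs", "train_outputs", "debug_outputs"},
    "explore": {"explore_outputs"},
}


def is_write_allowed(lane: str, relative_path: str) -> bool:
    """Return True if `relative_path` sits under a directory owned by `lane`."""
    parts = PurePosixPath(relative_path).parts
    # Reject empty paths, absolute paths, and any parent-directory traversal.
    if not parts or parts[0] == "/" or ".." in parts:
        return False
    # Unknown lanes own no directories, so every write is refused.
    return parts[0] in LANE_DIRS.get(lane, set())
```

With this rule, an exploratory run can write explore_outputs/run1/metrics.json but is refused for anything under repro_outputs/, which is one way to keep candidate changes away from trusted baselines.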

Quick Start & Requirements

The primary installation method uses npx:

  • Full skill set: npx skills add lllllllama/rigorpilot-skills --all
  • Trusted reproduction entrypoint: npx skills add lllllllama/rigorpilot-skills --skill ai-research-reproduction
  • Explicit exploration entrypoint: npx skills add lllllllama/rigorpilot-skills --skill ai-research-explore

Local development or project-scoped installs can be managed with the Python script scripts/install_skills.py. The core skills do not mandate specific hardware such as GPUs, though deep learning experiments will typically need them.

Highlighted Details

  • Output Directories: Organizes artifacts into distinct directories like repro_outputs/, analysis_outputs/, train_outputs/, debug_outputs/, and explore_outputs/.
  • Research Evidence Artifacts: Promotes detailed documentation including SCIENTIFIC_CHANGELOG.md, COMPARABILITY_REPORT.md, REPRODUCIBILITY_NOTES.md, NOVELTY_CLAIM.md, ABLATION_PLAN.md, and EXPERIMENT_LEDGER.md.
  • Campaign Inputs: For exploration, a research_campaign.json or research_campaign.yaml file is the preferred way to define tasks, datasets, evaluation sources, and SOTA references.
  • Core Principles: Emphasizes scientific meaning, comparability, reproducibility, collaborator control, and auditable workflow boundaries.
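As an illustration of how the evidence artifacts listed above might be audited, here is a minimal sketch. It assumes the six files sit directly inside a single directory and that an empty file counts as missing. Neither assumption is confirmed by the README, and `missing_artifacts` is a hypothetical helper, not part of the project.

```python
from pathlib import Path

# Artifact names as listed in the README summary.
EVIDENCE_ARTIFACTS = (
    "SCIENTIFIC_CHANGELOG.md",
    "COMPARABILITY_REPORT.md",
    "REPRODUCIBILITY_NOTES.md",
    "NOVELTY_CLAIM.md",
    "ABLATION_PLAN.md",
    "EXPERIMENT_LEDGER.md",
)


def missing_artifacts(output_dir: str) -> list[str]:
    """List evidence artifacts that are absent or empty in `output_dir`."""
    root = Path(output_dir)
    missing = []
    for name in EVIDENCE_ARTIFACTS:
        path = root / name
        # Assumption: an empty file carries no evidence, so treat it as missing.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing
```

A call such as `missing_artifacts("explore_outputs")` returns the names still to be written, in the order listed above. Whether every run is expected to produce all six files is not stated in the source.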

Maintenance & Community

The README provides no details on notable contributors, sponsorships, or community channels (e.g., Discord, Slack).

Licensing & Compatibility

The README does not specify a license, so no compatibility notes for commercial use or closed-source linking are available.

Limitations & Caveats

The run-train skill is a bounded training monitor, not a long-running scheduler. Trusted reproduction avoids silent semantic changes. Helper skills are deliberately narrow in scope, and exploratory work must remain isolated from trusted baselines. The ai-research-explore skill is a governed tool, not an open-ended autonomous research agent.
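The difference between a bounded monitor and a long-running scheduler can be shown with a small sketch. This is a generic illustration, not the run-train skill's code: `monitor_bounded`, its parameters, and the returned status strings are all hypothetical.

```python
import time
from typing import Callable


def monitor_bounded(
    poll: Callable[[], bool],
    max_checks: int = 10,
    interval_s: float = 0.0,
) -> str:
    """Poll `poll()` at most `max_checks` times; never loop indefinitely."""
    for _ in range(max_checks):
        # `poll` is a caller-supplied check that returns True once training is done.
        if poll():
            return "finished"
        time.sleep(interval_s)
    # The check budget is spent: hand control back instead of waiting on.
    return "budget_exhausted"
```

The fixed check budget is the point of the design. When the budget runs out, the monitor reports that fact and returns, rather than taking on the job of a scheduler.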

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 339 stars in the last 30 days
