research-units-pipeline-skills by WILLOSCAR

Agentic research pipelines for evidence-based paper generation

Created 6 months ago

481 stars

Top 63.0% on SourcePulse

Project Summary

Summary

This repository introduces "research units" as semantic execution units for pipelines, addressing the brittleness of script-only approaches and the hollowness of documentation-only methods. It employs an evidence-first methodology, enforcing structured intermediate artifacts to ensure verifiable research outputs. The system targets researchers and engineers seeking reproducible, robust, and rigorously grounded research workflows.

How It Works

The core methodology, "Skills-First + Decomposed Pipeline + Evidence First," structures research into discrete, verifiable steps ("skills"). Each skill defines explicit inputs, outputs, acceptance criteria, and guardrails. Execution proceeds via granular "units" (UNITS.csv), enabling precise error localization and pipeline resumption. An "evidence-first" mandate requires generating structured artifacts (e.g., paper notes, context packs) before prose composition, ensuring content is grounded and auditable.
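The unit-level resumption described above can be illustrated with a minimal Python sketch. The summary does not give the actual schema of UNITS.csv, so the column names (unit_id, skill, status), the status values, and the skill names below are all hypothetical, chosen only to show how a per-unit status makes it possible to localize a failure and resume from it.

```python
import csv
import io

# Hypothetical UNITS.csv contents; the real columns and values are not
# specified in the summary and are assumed here for illustration only.
UNITS_CSV = """unit_id,skill,status
U001,retrieve-papers,DONE
U002,confirm-outline,DONE
U003,gather-evidence,TODO
U004,write-draft,TODO
"""


def load_units(text):
    """Parse the unit table into a list of dicts, one per unit, in file order."""
    return list(csv.DictReader(io.StringIO(text)))


def next_unit(units):
    """Return the first unit that is not DONE, or None if all are finished."""
    for unit in units:
        if unit["status"] != "DONE":
            return unit
    return None


def run_pipeline(units, runners):
    """Run pending units in order and stop at the first failure.

    `runners` maps a skill name to a callable that returns True on success.
    Returns the id of the failing unit, or None if every unit completed.
    Completed units keep status DONE, so a later call skips them and
    resumes at the unit that failed.
    """
    for unit in units:
        if unit["status"] == "DONE":
            continue
        ok = runners[unit["skill"]](unit)
        unit["status"] = "DONE" if ok else "BLOCKED"
        if not ok:
            return unit["unit_id"]
    return None
```

Because each unit carries its own status, a failed run points at one row rather than at the whole pipeline, and rerunning after a fix starts from that row.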

Quick Start & Requirements

Primary Command: Start the codex tool with the following command: codex --sandbox workspace-write --ask-for-approval never
Prerequisites: The codex tool and network access.
Setup: The README claims "30 seconds to get started from zero to PDF."
Key Documentation: English README (README.en.md), Skills Index (SKILL_INDEX.md), Skill/Pipeline Standard (SKILLS_STANDARD.md). Example pipeline: pipelines/arxiv-survey-latex.pipeline.md.

Highlighted Details

Semantic Skills: Formalizes research steps with defined inputs, outputs, acceptance criteria, and guardrails for verifiable execution.
Evidence-First Methodology: Enforces structured artifact creation (notes, context packs) prior to prose generation, preventing superficial content.
Reproducible & Recoverable Execution: Pipelines composed of discrete "units" (UNITS.csv) allow targeted fixes and resumption from failure points.
Automated Quality Gates: A strict mode halts execution when acceptance criteria are unmet and produces a detailed report for debugging and improvement.
End-to-End Survey Generation: An example pipeline (arxiv-survey-latex.pipeline.md) orchestrates paper retrieval, outline confirmation, evidence gathering, writing, and PDF compilation.
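The combination of semantic skills, the evidence-first rule, and strict quality gates can be sketched as follows. This is an illustrative model only, not the repository's implementation: the Skill class, the GateFailure exception, the check_gate function, and the artifact names (paper_notes, context_pack) are assumptions introduced for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """Hypothetical skill record: declared inputs, outputs, and acceptance criteria."""
    name: str
    inputs: list
    outputs: list
    # Each criterion is a (description, check) pair; check takes the artifact dict.
    acceptance: list = field(default_factory=list)


class GateFailure(Exception):
    """Raised in strict mode; carries the list of unmet criteria as the report."""

    def __init__(self, skill_name, unmet):
        super().__init__(f"{skill_name}: {len(unmet)} unmet criteria")
        self.skill_name = skill_name
        self.unmet = unmet


def check_gate(skill, artifacts, strict=True):
    """Return the unmet criteria for a skill; raise GateFailure if strict and any exist.

    Missing inputs are reported first, which models the evidence-first rule:
    a writing skill cannot pass its gate until its evidence artifacts exist.
    Acceptance checks run only when all declared inputs are present.
    """
    unmet = [f"missing input: {name}" for name in skill.inputs if name not in artifacts]
    if not unmet:
        unmet = [desc for desc, check in skill.acceptance if not check(artifacts)]
    if unmet and strict:
        raise GateFailure(skill.name, unmet)
    return unmet


# Example: a writing skill that requires evidence artifacts before any prose.
write_draft = Skill(
    name="write-draft",
    inputs=["paper_notes", "context_pack"],
    outputs=["draft"],
    acceptance=[("paper_notes is non-empty", lambda a: len(a["paper_notes"]) > 0)],
)
```

Calling check_gate(write_draft, {"paper_notes": ["note"]}) raises GateFailure because context_pack is absent, while passing strict=False returns the same list of unmet criteria without halting.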

Maintenance & Community

The project welcomes issues and is actively developing features such as multi-CLI collaboration and multi-agent design integration. The README provides no community links (Discord, Slack) and no contributor details.

Licensing & Compatibility

No license information is specified in the provided documentation. This absence is a significant consideration for adoption.

Limitations & Caveats

The system is marked as Work In Progress (WIP), with multi-agent capabilities still under development. It depends on the external codex tool, whose availability and requirements are not detailed. Until a license is specified, the terms for use and redistribution remain unclear, which may block adoption.

Health Check

Last Commit: 1 week ago
Responsiveness: Inactive

Star History: 22 stars in the last 30 days