PrathamLearnsToCode
Code generation agent for research papers
Top 43.2% on SourcePulse
This project addresses the common challenge of implementing machine learning papers, where crucial details like hyperparameters are often vaguely specified or omitted, leading to significant detective work for researchers and engineers. It provides an agent skill that transforms an arXiv paper URL into a citation-anchored, verifiable code implementation, saving users time and increasing confidence in their reproductions. The target audience includes ML practitioners who need to quickly and accurately implement research papers.
How It Works
The core innovation lies in "citation anchoring" and "ambiguity auditing." Every line of generated code is linked to the specific section or equation in the source paper it implements. Before generating code, the system audits implementation choices, classifying them as SPECIFIED, PARTIALLY_SPECIFIED, or UNSPECIFIED. Unlike naive code generation, it explicitly flags unspecified choices with comments and lists common alternatives, ensuring transparency and preventing silent assumptions. Appendices, footnotes, and figure captions are treated as primary sources.
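The audit step described above can be pictured as a small record per implementation choice. The following is a minimal sketch, not the project's actual code: the names `SpecStatus`, `AuditEntry`, and `render_comment`, the record layout, and the sample values are all assumptions made for illustration. Only the three status labels come from the description.

```python
from dataclasses import dataclass, field
from enum import Enum


class SpecStatus(Enum):
    """The three audit classes named in the description."""
    SPECIFIED = "SPECIFIED"
    PARTIALLY_SPECIFIED = "PARTIALLY_SPECIFIED"
    UNSPECIFIED = "UNSPECIFIED"


@dataclass
class AuditEntry:
    """One implementation choice (hypothetical record layout)."""
    name: str
    status: SpecStatus
    citation: str = ""      # section or equation reference, if the paper has one
    value: object = None    # value used in the generated code
    alternatives: list = field(default_factory=list)


def render_comment(entry: AuditEntry) -> str:
    """Format the comment that would accompany the generated line."""
    if entry.status is SpecStatus.SPECIFIED:
        # Fully specified choices carry only their citation anchor.
        return f"# {entry.citation}: {entry.name} = {entry.value}"
    # Anything else is flagged explicitly and lists alternatives.
    tag = f"[{entry.status.value}]"
    alts = ", ".join(str(a) for a in entry.alternatives)
    return f"# {tag} {entry.name} = {entry.value}; common alternatives: {alts}"


# Illustrative entries only; the sections and values are made up.
audit = [
    AuditEntry("num_layers", SpecStatus.SPECIFIED, citation="§3.1", value=6),
    AuditEntry("dropout", SpecStatus.UNSPECIFIED, value=0.1, alternatives=[0.0, 0.2]),
]
for entry in audit:
    print(render_comment(entry))
```

Keeping the status on the record, rather than inferring it from a missing citation, means a partially specified choice can hold both a citation and a list of alternatives.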
Quick Start & Requirements
Install the skill with npx skills add PrathamLearnsToCode/paper2code/skills/paper2code.
Then start claude (or your preferred agent) and run /paper2code <arxiv_url> [--framework <framework>] [--mode <mode>].
Highlighted Details
Citation anchoring: generated code cites the paper passage it implements (e.g., §3.2 — "We apply layer normalization...").
Ambiguity flagging: choices are marked [UNSPECIFIED], [PARTIALLY_SPECIFIED], or [ASSUMPTION], with explanations and alternative suggestions.
Output structure: README.md, REPRODUCTION_NOTES.md (ambiguity audit), requirements.txt, source code (src/), configuration files (configs/base.yaml), and a pedagogical walkthrough.ipynb.
File roles: model.py maps classes to paper sections, REPRODUCTION_NOTES.md details ambiguity, base.yaml centralizes cited hyperparameters, and walkthrough.ipynb offers runnable sanity checks.
Maintenance & Community
The provided README does not contain specific details regarding notable contributors, sponsorships, community channels (like Discord or Slack), or a public roadmap.
Licensing & Compatibility
The license type is not explicitly stated in the README, which is a critical omission for due diligence. Compatibility for commercial use or closed-source linking cannot be determined without a specified license.
Limitations & Caveats
This tool does not guarantee implementation correctness; it faithfully translates the paper, meaning errors or vagueness in the paper will be reflected in the code. It will not invent details or hyperparameters, instead flagging them as [UNSPECIFIED] with common alternatives. The tool does not handle dataset downloads, training infrastructure setup (e.g., distributed training, experiment tracking), or the implementation of baseline methods; it focuses solely on the core contribution described in the paper.
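To make the "flag rather than invent" behavior concrete, here is a hedged sketch of what a citation-anchored function with a flagged choice might look like. It is written in plain Python for illustration and is not output from the tool: the section reference, the choice of `eps` as the unspecified detail, and the listed alternatives are all assumptions.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize one feature vector to zero mean and unit variance.

    Hypothetical citation anchor: §3.2, "We apply layer normalization..."
    """
    # [UNSPECIFIED] eps: the imagined paper gives no value, so it is
    # exposed as a parameter instead of being silently hard-coded.
    # Default 1e-5 is an assumption; alternatives seen in practice: 1e-6, 1e-12.
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)  # biased (population) variance
    return [(v - mean) / math.sqrt(var + eps) for v in x]


out = layer_norm([1.0, 2.0, 3.0, 4.0])
print([round(v, 3) for v in out])
```

A reader reproducing the paper can see from the comment alone that `eps` is a guess and can try the alternatives. The flag does not make the guess correct, which is the limitation stated above.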