Curated list of articles on why data science projects fail
Top 66.2% on sourcepulse
This repository is a curated collection of articles and resources on the common reasons data science and machine learning projects fail. It is a reference for practitioners, managers, and researchers who want to avoid pitfalls in AI/ML initiatives. The project groups failure points into organizational, technical, and product-related categories.
How It Works
The project compiles links to articles, blog posts, and research papers that analyze data science project failures. It categorizes these failures into broad themes such as organizational issues (leadership, employees, infrastructure), intermediate concerns (legal, privacy, bias, security), product planning (business value, specification), project execution (data, modeling), and ongoing product management (operations). This structured approach helps users quickly identify common failure modes and understand their root causes.
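The themes listed above can be pictured as a small lookup table. The following is a minimal sketch in Python, assuming the theme and sub-theme names given in this summary; the dictionary layout and the `find_theme` helper are hypothetical illustrations, not part of the repository, which contains only links.

```python
from typing import Dict, List, Optional

# Theme and sub-theme names follow the summary above; the structure itself
# is an illustrative assumption, not something the repository ships.
FAILURE_TAXONOMY: Dict[str, List[str]] = {
    "organizational issues": ["leadership", "employees", "infrastructure"],
    "intermediate concerns": ["legal", "privacy", "bias", "security"],
    "product planning": ["business value", "specification"],
    "project execution": ["data", "modeling"],
    "ongoing product management": ["operations"],
}


def find_theme(sub_theme: str) -> Optional[str]:
    """Return the broad theme that contains the given sub-theme, or None."""
    wanted = sub_theme.strip().lower()
    for theme, sub_themes in FAILURE_TAXONOMY.items():
        if wanted in sub_themes:
            return theme
    return None
```

For example, `find_theme("privacy")` returns `"intermediate concerns"`, while a sub-theme that is not listed returns `None`.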
Quick Start & Requirements
This is a static collection of links and does not require installation or execution. Users can browse the README for categorized links to external resources.
Maintenance & Community
The project is maintained by xLaszlo. Users are encouraged to suggest additional articles via the Issues tab. The author also shares updates on Twitter (@xLaszlo) and links to their company blog (hypergolic.co.uk).
Licensing & Compatibility
The repository is a collection of links to external resources. Licensing of the linked content varies by source.
Limitations & Caveats
The README notes a "notable absence of any concern about domain experts and any collaboration with them" in the collected failures, suggesting this is a potential blind spot in the current categorization. The project is a curated list and does not provide tools or methodologies for active failure prevention.
4 years ago
Inactive