PaddleHelix by PaddlePaddle

Bio-computing toolkit for applying deep learning to drug discovery and more

created 4 years ago
1,077 stars

Top 35.8% on sourcepulse

Project Summary

PaddleHelix is a bio-computing platform that applies deep learning to drug discovery, vaccine design, and precision medicine. It offers large-scale pre-trained models for compounds and proteins, applications such as molecular property prediction and drug-target affinity prediction, and tools for RNA design and drug-drug synergy prediction. It targets researchers and developers in the life sciences and pharmaceutical industries.

How It Works

PaddleHelix is built on PaddlePaddle, a deep learning framework. Its models are first pre-trained at large scale on compound and protein datasets to learn general-purpose representations. On top of this, the platform provides specialized applications: molecular property prediction, drug-target interaction prediction with graph neural networks and transformers, and RNA secondary structure prediction with linear-time algorithms. It also incorporates geometry-aware models and multi-task learning to improve accuracy and generalizability.
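
To make the graph neural network idea concrete, here is a minimal sketch of one round of message passing on a toy molecular graph, followed by a sum readout. It is not PaddleHelix code and does not use its API. The function names `message_passing_step` and `readout`, the one-hot atom features, and the ethanol example are all illustrative assumptions. Real models add learned weight matrices and nonlinearities at each step.

```python
# One round of sum-aggregation message passing on a toy molecular graph.
# Illustrative only: these helpers are hypothetical and are not part of PaddleHelix.


def message_passing_step(node_features, edges):
    """Return new atom vectors: each atom's own vector plus the sum of its neighbours' vectors."""
    # Build an undirected adjacency list from the bond list.
    neighbours = {i: [] for i in range(len(node_features))}
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)

    updated = []
    for i, own in enumerate(node_features):
        aggregated = list(own)
        for j in neighbours[i]:
            aggregated = [x + y for x, y in zip(aggregated, node_features[j])]
        updated.append(aggregated)
    return updated


def readout(node_features):
    """Sum-pool all atom vectors into a single molecule-level vector."""
    return [sum(column) for column in zip(*node_features)]


# Heavy atoms of ethanol (C-C-O) with one-hot features [is_carbon, is_oxygen].
atoms = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
bonds = [(0, 1), (1, 2)]

hidden = message_passing_step(atoms, bonds)
molecule_vector = readout(hidden)
print(hidden)           # [[2.0, 0.0], [2.0, 1.0], [1.0, 1.0]]
print(molecule_vector)  # [5.0, 2.0]
```

After one step, each atom's vector reflects its immediate bonded neighbours. A property-prediction model would feed a molecule-level vector like `molecule_vector` into a final regression or classification layer.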

Quick Start & Requirements

  • Installation requires PaddlePaddle. Detailed installation guides and tutorials are available.
  • Specific applications may have additional dependencies, including GPU support for certain models.
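
Before running any application, it can help to confirm that the prerequisites are importable. The sketch below assumes the import names are `paddle` for PaddlePaddle and `pahelix` for PaddleHelix. Verify these against the official installation guide, because the listing above does not state them. The helper `missing_modules` is a hypothetical name used only for illustration.

```python
import importlib.util

# Assumed import names (not stated in the listing): "paddle" and "pahelix".
REQUIRED_MODULES = ["paddle", "pahelix"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

This only checks that the packages can be located. It does not verify versions or GPU support, which some applications may need.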

Highlighted Details

  • Features HelixFold3, a biomolecular structure prediction model comparable to AlphaFold3, with open-source code for non-commercial use.
  • Includes HelixDock for protein-ligand structure prediction and HelixGEM-2, a top-ranked molecular property prediction network.
  • Offers MSA-free protein structure prediction (HelixFold-Single) that predicts structures in seconds.
  • Provides online services for structure prediction and drug design on the PaddleHelix platform.

Maintenance & Community

  • Work from the project has been published in Nature Machine Intelligence, Bioinformatics, and other peer-reviewed venues.
  • Frequent updates and releases of new models and applications indicate active development.

Licensing & Compatibility

  • Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).
  • This license restricts commercial use and requires sharing modifications under the same terms.

Limitations & Caveats

The primary license prohibits commercial use, so commercial users must seek alternative licensing or use an API instead. Some advanced features, such as HelixFold3, are available for commercial applications through a paid API.

Health Check

  • Last commit: 4 days ago
  • Responsiveness: Inactive
  • Pull requests (30d): 5
  • Issues (30d): 0

Star History

  • 23 stars in the last 90 days
