cnvkit by etal

A toolkit for detecting copy number variants in DNA sequencing

Created 11 years ago
602 stars

Top 54.3% on SourcePulse

Summary

CNVkit is a command-line toolkit and Python library for genome-wide detection of copy number variants (CNVs) and alterations from high-throughput DNA sequencing data. It is aimed at researchers and bioinformaticians who need to identify copy number changes in sequenced samples.

How It Works

CNVkit is modular: its functionality is available through both a command-line interface and a Python library. It processes sequencing data to infer copy number changes genome-wide. Supported segmentation algorithms include Circular Binary Segmentation (CBS), which depends on R's DNAcopy package.
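To make the idea behind circular binary segmentation more concrete, here is a minimal pure-Python sketch of its core step: scanning every arc of a sequence of log2 ratios for the one whose mean differs most from the rest. This is not CNVkit's or DNAcopy's implementation. The function name `best_circular_split` and the toy data are hypothetical, and the sketch leaves out the variance scaling, permutation-based significance test, and recursion that a real CBS implementation performs.

```python
import math
from typing import Sequence


def best_circular_split(values: Sequence[float]) -> tuple[int, int, float]:
    """Return (start, end, z) for the arc values[start:end] whose mean
    differs most from the mean of all remaining values.

    Illustrative only: z is an unscaled two-sample statistic, with no
    noise estimate and no significance test.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")

    # Prefix sums let us compute the sum of any arc in constant time.
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
    total = prefix[-1]

    best = (0, 0, 0.0)
    for start in range(n):
        for end in range(start + 1, n + 1):
            k = end - start
            if k == n:
                continue  # the arc covers everything, nothing to compare
            inside = prefix[end] - prefix[start]
            outside = total - inside
            diff = inside / k - outside / (n - k)
            z = diff / math.sqrt(1.0 / k + 1.0 / (n - k))
            if abs(z) > abs(best[2]):
                best = (start, end, z)
    return best


# Toy log2 ratios: a neutral region, a gained region, then neutral again.
toy = [0.0] * 10 + [1.0] * 10 + [0.0] * 10
start, end, z = best_circular_split(toy)
print(start, end)  # 10 20
```

On noise-free data like this, the best-scoring arc coincides with the altered region. Real data is noisy, which is why production implementations add a significance test before accepting a split.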

Quick Start & Requirements

Installation is recommended via Conda (conda create -n cnvkit cnvkit, then source activate cnvkit) or pip (pip install cnvkit). Installing from source is also supported (git clone ..., then pip install -e .). CNVkit requires Python 3.10 or later. The CBS segmentation algorithm additionally needs R's DNAcopy package. Key Python dependencies are Biopython, Reportlab, Matplotlib, NumPy, SciPy, Pandas, pyfaidx, pysam, and pyvcf. Full documentation is at http://cnvkit.readthedocs.io.
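As a rough way to check an environment against the requirements above, the following sketch reports the Python version, any of the listed dependencies that cannot be imported, and whether an Rscript executable is on the PATH. The helper names are hypothetical and the script is not part of CNVkit. It also does not verify that DNAcopy itself is installed in R, only that R appears to be available.

```python
import importlib.util
import shutil
import sys

# Listed dependency -> top-level module name used to import it.
DEPENDENCIES = {
    "Biopython": "Bio",
    "Reportlab": "reportlab",
    "Matplotlib": "matplotlib",
    "NumPy": "numpy",
    "SciPy": "scipy",
    "Pandas": "pandas",
    "pyfaidx": "pyfaidx",
    "pysam": "pysam",
    "pyvcf": "vcf",
}


def python_is_supported(version=sys.version_info, minimum=(3, 10)) -> bool:
    """True if the (major, minor) version meets the stated minimum."""
    return tuple(version[:2]) >= minimum


def missing_modules(modules: dict[str, str]) -> list[str]:
    """Return the names whose module cannot be found by the import system."""
    return [
        name
        for name, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    print("Python 3.10+:", python_is_supported())
    print("Missing dependencies:", missing_modules(DEPENDENCIES) or "none")
    # CBS needs R; this only checks that Rscript is on the PATH.
    print("Rscript found:", shutil.which("Rscript") is not None)
```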

Highlighted Details

  • Accessible via DNAnexus app (https://platform.dnanexus.com/app/cnvkit_batch) and Galaxy tool (https://testtoolshed.g2.bx.psu.edu/view/etal/cnvkit).
  • Docker containers available on Docker Hub (https://registry.hub.docker.com/r/etal/cnvkit/) and BioContainers (Quay) for reproducible execution.
  • Continuous integration via GitHub Actions tests against Python 3.10 through 3.14.

Maintenance & Community

Questions are handled on Biostars (https://www.biostars.org/t/CNVkit/); bug reports and feature requests go to GitHub issues (https://github.com/etal/cnvkit/issues/). Development follows CONTRIBUTING.md and uses pre-commit hooks, Makefiles, Docker, and GitHub Actions.

Licensing & Compatibility

Licensed under Apache 2.0, a permissive license that allows commercial use and integration into closed-source projects.

Limitations & Caveats

The README does not explicitly list project limitations. When CNVkit is installed without Conda, the R package required for CBS segmentation (DNAcopy) must be installed separately.

Health Check

  • Last commit: 2 days ago
  • Responsiveness: Inactive
  • Pull requests (30d): 16
  • Issues (30d): 67
  • Star history: 4 stars in the last 30 days
