GEMMA by genetics-statistics

CLI tool for genome-wide association studies using linear mixed models

created 13 years ago
366 stars

Top 78.1% on sourcepulse

Project Summary

GEMMA is a software toolkit for fast application of linear mixed models (LMMs) and related models to genome-wide association studies (GWAS) and other large-scale datasets. It gives researchers and bioinformaticians efficient tools for correcting population structure, estimating heritability, and fitting multi-marker models.

How It Works

GEMMA implements univariate and multivariate linear mixed models (LMMs) for association tests, efficiently correcting for population structure and sample non-exchangeability. It also offers a Bayesian sparse linear mixed model (BSLMM) for phenotype prediction and multi-marker modeling. Variance components can be estimated either from raw data (HE regression or the REML AI algorithm) or from summary statistics (the MQS algorithm).
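
To make the idea of an LMM association test concrete, here is a minimal Python sketch. It is not GEMMA's code. It assumes the usual model in which the phenotype covariance is proportional to lam*K + I, where K is a relatedness matrix, and it treats the variance ratio `lam` as known (in practice it is estimated from the data). The function name `lmm_snp_effect` and the simulated inputs are hypothetical. The sketch shows why an eigendecomposition of K makes per-SNP tests cheap: after rotating the data, generalized least squares reduces to weighted least squares with diagonal weights.

```python
import numpy as np

def lmm_snp_effect(y, x, K, lam):
    """Estimate a SNP effect by GLS under Var(y) proportional to lam*K + I.

    y: (n,) phenotype, x: (n,) genotype of one SNP,
    K: (n, n) symmetric relatedness matrix,
    lam: variance ratio (assumed known in this sketch).
    """
    # Eigendecomposition K = U diag(d) U^T; it can be computed once
    # and reused for every SNP.
    d, U = np.linalg.eigh(K)
    # In the rotated basis the covariance is diagonal: lam*d_i + 1.
    w = 1.0 / (lam * d + 1.0)
    # Design matrix: intercept plus the SNP.
    X = np.column_stack([np.ones(len(y)), x])
    Xr, yr = U.T @ X, U.T @ y
    # Weighted least squares in the rotated basis.
    XtWX = Xr.T @ (w[:, None] * Xr)
    XtWy = Xr.T @ (w * yr)
    coef = np.linalg.solve(XtWX, XtWy)
    return coef[1]


# Toy usage with simulated data (all values are illustrative).
rng = np.random.default_rng(0)
n, p = 50, 200
G = rng.integers(0, 3, size=(n, p)).astype(float)  # genotypes coded 0/1/2
Gc = G - G.mean(axis=0)
K = Gc @ Gc.T / p                                  # centered relatedness matrix
y = rng.normal(size=n)
beta_hat = lmm_snp_effect(y, G[:, 0], K, lam=1.0)
```

The rotated estimate matches a direct GLS solve with the full n x n covariance matrix, but only the cheap weighted step has to be repeated per SNP.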

Quick Start & Requirements

  • Installation: Precompiled binaries, Docker, Conda, Homebrew, or Guix.
  • Prerequisites: None explicitly stated for precompiled binaries. Compilation from source may benefit from specialized C++ compilers and numerical libraries.
  • Resources: Docker images available for macOS, Windows, and Linux (amd64 & arm64).
  • Links: INSTALL.md, GEMMA manual, Tutorial

Highlighted Details

  • Fast association tests using univariate LMMs.
  • Multivariate LMMs for jointly analyzing multiple phenotypes.
  • BSLMM for polygenic modeling and phenotype prediction.
  • Variance component estimation partitioned by SNP functional categories.
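
As an illustration of variance component estimation from raw data, the following Python sketch implements a basic Haseman-Elston (HE) regression. It is a simplified, single-component version written for this summary, not GEMMA's implementation: it does not partition SNPs into functional categories, it assumes every SNP is polymorphic, and the function names `standardized_relatedness` and `he_regression` are hypothetical. The idea is that, for standardized phenotypes, the expected product y_i*y_j for two different individuals is proportional to their relatedness K_ij, so the regression slope estimates the proportion of variance explained.

```python
import numpy as np

def standardized_relatedness(G):
    """Relatedness matrix from an n x p genotype matrix.

    Each SNP is centered and scaled to unit variance
    (assumes no monomorphic SNPs, otherwise the scaling divides by zero).
    """
    Z = (G - G.mean(axis=0)) / G.std(axis=0)
    return Z @ Z.T / G.shape[1]

def he_regression(y, K):
    """Haseman-Elston estimate of the variance explained by K."""
    ys = (y - y.mean()) / y.std()
    iu = np.triu_indices(len(y), k=1)   # all pairs with i < j
    prod = np.outer(ys, ys)[iu]         # phenotype products y_i * y_j
    k = K[iu]                           # matching relatedness values
    # Least-squares slope through the origin of y_i*y_j on K_ij.
    return float(k @ prod / (k @ k))


# Toy simulation (illustrative values only): every SNP has a small effect.
rng = np.random.default_rng(1)
n, p, h2 = 600, 300, 0.5
G = rng.binomial(2, 0.5, size=(n, p)).astype(float)
K = standardized_relatedness(G)
Z = (G - G.mean(axis=0)) / G.std(axis=0)
g = Z @ rng.normal(size=p) * np.sqrt(h2 / p)     # genetic component
y = g + rng.normal(size=n) * np.sqrt(1.0 - h2)   # add residual noise
h2_hat = he_regression(y, K)
```

With these settings `h2_hat` should land near the simulated value of 0.5, with sampling noise that shrinks as the number of individuals grows.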

Maintenance & Community

  • Development has moved to PanGEMMA (as of Dec 2024).
  • Community support via GEMMA Google Group.
  • Bug reporting via GitHub issues.
  • Contributions encouraged via pull requests.

Licensing & Compatibility

  • License: GNU General Public License (GPL).
  • Compatibility: The GPL is a copyleft license, so using GEMMA in closed-source commercial products requires careful consideration of what counts as a derivative work. Bundled libraries are covered by the LGPL and the Boost Software License.

Limitations & Caveats

  • Main software development has transitioned to PanGEMMA, so this repository may receive fewer updates and less support.
  • The GPL license may impose significant restrictions on commercial use or integration into proprietary software.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 9 stars in the last 90 days
