Python library for multi-omics analysis
Top 46.8% on SourcePulse
OmicVerse is a Python library designed for comprehensive multi-omics analysis, specifically targeting bulk, single-cell, and spatial RNA sequencing data. It aims to provide a unified framework for researchers to integrate and analyze diverse transcriptomic datasets, facilitating deeper insights across different biological contexts. The library is particularly beneficial for bioinformaticians and computational biologists working with complex RNA-seq data.
How It Works
OmicVerse is built upon a data framework leveraging pandas, anndata, numpy, and muData. A key algorithmic contribution is the BulkTrajBlend algorithm, which combines Beta-Variational Autoencoders for deconvolution with graph neural networks for community discovery. This approach is designed to interpolate and restore continuity in single-cell RNA-seq data, addressing "omission" cells and improving data completeness. The library also includes a research submodule (omicverse.llm.dr) that utilizes large language models for automated report generation from user queries, including web search retrieval and synthesis.
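To make the Beta-Variational Autoencoder idea concrete, the following is a minimal sketch of the generic beta-VAE objective (reconstruction error plus a beta-weighted KL term against a standard normal prior). It is not OmicVerse's or BulkTrajBlend's implementation; the function names, the squared-error reconstruction term, and the default beta value are assumptions chosen for illustration.

```python
import math


def gaussian_kl(mu, logvar):
    """KL divergence between N(mu, exp(logvar)) and N(0, 1), summed over latent dims."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))


def beta_vae_loss(x, x_hat, mu, logvar, beta=4.0):
    """Generic beta-VAE loss: squared reconstruction error + beta * KL.

    x, x_hat   : input and reconstruction (equal-length sequences of floats)
    mu, logvar : mean and log-variance of the approximate posterior
    beta       : weight on the KL term (hypothetical default; beta = 1 is a plain VAE)
    """
    reconstruction = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return reconstruction + beta * gaussian_kl(mu, logvar)


if __name__ == "__main__":
    # Toy values only; a real model would produce these from encoder/decoder networks.
    print(beta_vae_loss([0.0], [1.0], [1.0], [0.0], beta=4.0))
```

Raising beta above 1 penalizes the latent code more strongly for departing from the prior, which is the property that distinguishes a beta-VAE from a standard VAE.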
Quick Start & Requirements
Installation can be done via conda (conda install omicverse -c conda-forge) or pip (pip install -U omicverse). PyTorch must be installed first. Additional dependencies may include scanpy, tdigest, peft, datasets, accelerate, chromadb, and langchain_community, depending on the chosen LLM or vector store functionalities. Detailed installation guides for Windows, Linux, and macOS are available.
Highlighted Details
The BulkTrajBlend algorithm offers a novel approach to data imputation and continuity restoration in scRNA-seq data.
Maintenance & Community
The project is actively maintained, with a primary contact listed as Zehua Zeng. Contributing guidelines are available, and the project is promoted via WeChat Official Accounts.
Licensing & Compatibility
OmicVerse is licensed under GPL-3.0. This license is copyleft: derivative works that are distributed must also be licensed under GPL-3.0, which has implications for integration into closed-source commercial products.
Limitations & Caveats
GPL-3.0 does not prohibit commercial use, but its strong copyleft provisions can prevent OmicVerse from being integrated into proprietary software that is distributed. Some advanced LLM features require API keys (e.g., OpenAI, Tavily) and additional dependencies.
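Before running the LLM features, a quick check for the API keys can save a failed run. This sketch assumes the keys are supplied through the environment variables OPENAI_API_KEY and TAVILY_API_KEY; those names follow the providers' usual conventions and are not confirmed here as what OmicVerse itself reads. The helper name missing_api_keys is hypothetical.

```python
import os

# Assumed variable names (provider convention, not confirmed for OmicVerse).
ASSUMED_KEY_VARS = {
    "OpenAI": "OPENAI_API_KEY",
    "Tavily": "TAVILY_API_KEY",
}


def missing_api_keys(env=None):
    """Return the providers whose assumed key variable is unset or empty."""
    env = os.environ if env is None else env
    return [name for name, var in ASSUMED_KEY_VARS.items() if not env.get(var)]


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        print("Missing API keys for:", ", ".join(missing))
    else:
        print("All assumed API keys are set.")
```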