espnet
SDK for managing pretrained ESPnet models, including Hugging Face models
Top 98.6% on SourcePulse
This repository provides utilities for managing and downloading pretrained models for ESPnet, a popular end-to-end speech processing toolkit. It targets researchers and developers working with Automatic Speech Recognition (ASR), Text-to-Speech (TTS), and speech separation, offering a streamlined way to access and utilize state-of-the-art models.
How It Works
The library acts as a centralized hub for ESPnet models, supporting models hosted on both Zenodo and Hugging Face. Users can download and unpack a model by specifying a Hugging Face ID, a Zenodo URL, or a tag listed in table.csv. For long audio in speech enhancement/separation, it supports segment-wise processing with configurable segment_size and hop_size.
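The two ideas above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the library's own implementation: download_model wraps the ModelDownloader class that ships with espnet_model_zoo, while segment_bounds is a hypothetical helper that only shows how a segment_size and hop_size (assumed here to be in seconds, with a sample rate fs introduced for the example) could split a long signal into overlapping windows. How the library merges the processed segments is not covered.

```python
from typing import List, Tuple


def download_model(name: str) -> dict:
    """Download and unpack a model given a Hugging Face ID, a Zenodo URL,
    or a tag listed in table.csv."""
    # Imported lazily so the segmenting helper below runs without the package.
    from espnet_model_zoo.downloader import ModelDownloader

    downloader = ModelDownloader()  # default cache directory
    # Returns a dict mapping names to the paths of the unpacked files.
    return downloader.download_and_unpack(name)


def segment_bounds(
    num_samples: int, fs: int, segment_size: float, hop_size: float
) -> List[Tuple[int, int]]:
    """Hypothetical helper: (start, end) sample indices of overlapping windows.

    Assumes segment_size and hop_size are given in seconds and fs in Hz.
    """
    seg = int(segment_size * fs)
    hop = int(hop_size * fs)
    if seg <= 0 or hop <= 0:
        raise ValueError("segment_size and hop_size must be positive")
    if hop > seg:
        raise ValueError("hop_size larger than segment_size would leave gaps")
    if num_samples <= 0:
        return []

    bounds = []
    start = 0
    while True:
        # The last window is truncated at the end of the signal.
        end = min(start + seg, num_samples)
        bounds.append((start, end))
        if end >= num_samples:
            break
        start += hop
    return bounds


if __name__ == "__main__":
    # 5 s of 16 kHz audio, 2 s windows advanced by 1 s.
    for start, end in segment_bounds(5 * 16000, 16000, 2.0, 1.0):
        print(start, end)
```

Calling download_model requires network access and the espnet_model_zoo package; segment_bounds is pure Python and can be tried on its own.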
Quick Start & Requirements
pip install torch espnet_model_zoo
Highlighted Details
Command-line tools (espnet_model_zoo_query, espnet_model_zoo_download) for model management.
Maintenance & Community
table.csv.
Licensing & Compatibility
Limitations & Caveats
ModelDownloader API usage is required.
2 years ago
Inactive