sourcepulse

Search results

294 results for "conversational speech model"

Showing 1 - 25 of 294 repos

	Repository	Description	Stars	Stars 7d Δ	Stars 7d %	PRs 7d Δ	Created	Response rate	Last active
1	awesome-speech-enhancement (WenzheLiu-Speech)	A curated list of speech enhancement, dereverberation, and speech separation resources (papers, code, tools). Covers traditional and neural ...	1k (Top 50%)	3	0.3%	0	5y ago	Inactive	1y ago
2	whisper (openai)	General-purpose speech recognition model. It performs multilingual speech recognition, speech translation, and language identification.	86k (Top 1%)	377	0.4%	0	2y ago	Inactive	1mo ago
3	csm-mlx (senstella)	Text-to-speech model implemented in MLX, Apple's machine learning framework. It supports context input, quantization, and streaming.	367	1	0.3%	0	4mo ago	Inactive	2mo ago
4	local-talking-llm (vndee)	Build a local voice assistant with speech-to-text (Whisper), LLM (Ollama, Llama-2), and text-to-speech (Bark). Supports voice-based interact...	530	5	0.9%	0	1y ago	1 week	2mo ago
5	Meta-voicebox (SpeechifyInc)	Implementation of Voicebox, a text-guided multilingual speech generation model. It performs zero-shot TTS, noise removal, and style conversi...	583	0	0%	0	2y ago	Inactive	2y ago
6	ChatTTS (2noise)	Generative speech model for daily dialogue, optimized for conversational TTS. It supports multiple speakers and fine-grained prosodic contro...	37k (Top 1%)	80	0.2%	0	1y ago	1 day	3w ago
7	Step-Audio (stepfun-ai)	Open-source framework for intelligent speech interaction. It supports multilingual conversations, voice cloning, and controllable speech syn...	4k (Top 25%)	7	0.2%	1	5mo ago	1 day	1mo ago
8	Freeze-Omni (VITA-MLLM)	Speech-to-speech dialogue model built on a frozen LLM. It features chunk-wise streaming input, AR-based speech output, and state prediction....	334	1	0.3%	0	9mo ago	1 day	2mo ago
9	QuickAgent (gkamradt)	Voice bot demo using Text-To-Speech, Speech-To-Text, and a language model to have a conversation with a user. Utilizes streaming.	371	1	0.3%	0	1y ago	1 week	1y ago
10	LLaSA_training (zhenye234)	Text-to-speech model trained on 250k hours of speech data. It uses a unified tokenizer for both speech (X-codec2) and text (LLaMA).	595	2	0.3%	0	6mo ago	1 week	3mo ago
11	csm (SesameAILabs)	Speech generation model that generates RVQ audio codes from text and audio inputs. It employs a Llama backbone and an audio decoder.	14k (Top 5%)	46	0.3%	0	5mo ago	1 week	2mo ago
12	SpeechGPT (0nutation)	Speech Large Language Models capable of perceiving and generating multi-modal content following multi-modal human instructions. Includes dat...	1k (Top 50%)	2	0.1%	0	2y ago	1 day	1y ago
13	Voila (maitrix-org)	Voice-language foundation models for real-time, low-latency voice interaction. It supports ASR, TTS, and voice translation across six langua...	429	1	0.2%	0	4mo ago	Inactive	2mo ago
14	pheme (PolyAI-LDN)	Framework for efficient, conversational TTS model training and inference. Uses semantic/acoustic token separation and MaskGit-style parallel...	260	0	0%	0	1y ago	1 week	1y ago
15	SpeechGPT-2.0-preview (OpenMOSS)	End-to-end speech dialogue model trained on millions of hours of speech data. It features low-latency response and natural, human-like speec...	347	1	0.3%	0	6mo ago	Inactive	6mo ago
16	ASR-LLM-TTS (ABexit)	Voice interaction framework using SenseVoice ASR, Qwen2.5 LLM, and TTS (CosyVoice, pyttsx3, edgeTTS). Includes voiceprint recognition and K...	876 (Top 50%)	10	1.1%	0	8mo ago	1+ week	5mo ago
17	LLaMA-Omni (ictnlp)	Speech-language model built upon Llama-3. It supports low-latency and high-quality speech interactions, generating both text and speech.	3k (Top 25%)	3	0.1%	0	10mo ago	1 day	2mo ago
18	talking_avatar (bornfree)	A ThreeJS-powered virtual human that uses Azure APIs for speech synthesis. Can be combined with a chat model for an interactive avatar.	369	2	0.6%	0	2y ago	Inactive	1mo ago
19	twewy-discord-chatbot (RuolinZheng08)	Discord chatbot using a fine-tuned conversational model. The model is trained on a character's lines and hosted on Hugging Face's Model Hub....	316	1	0.3%	0	4y ago	1 day	2y ago
20	Qwen2-Audio (QwenLM)	Large-scale audio-language model for audio analysis and voice chat. It accepts audio inputs and performs audio analysis or textual responses...	2k (Top 25%)	8	0.4%	0	1y ago	1 day	3mo ago
21	ZipVoice (k2-fsa)	Fast, high-quality zero-shot TTS with flow matching. Supports voice cloning, multi-lingual, and dialogue generation.	336	19	5.9%	1	1mo ago	Inactive	3d ago
22	smart-turn (pipecat-ai)	Open-source audio turn detection model. Uses Wav2Vec2-BERT to determine when a voice agent should respond to human speech. Supports English....	840 (Top 50%)	18	2.2%	0	4mo ago	1 day	1w ago
23	mini-omni2 (gpt-omni)	Omni-interactive model that understands image, audio, and text inputs. Features real-time voice output and flexible interaction.	2k (Top 25%)	4	0.2%	0	9mo ago	1 week	6mo ago
24	vits (jaywalnut310)	End-to-end text-to-speech model using variational inference, normalizing flows, and adversarial training. Includes a stochastic duration pre...	8k (Top 10%)	15	0.2%	0	4y ago	Inactive	1y ago
25	INTERSPEECH-2023-24-Papers (DmitryRyumin)	A curated list of speech and language processing papers from INTERSPEECH 2023 & 2024, covering ASR, speech synthesis, and more. Includes cod...	678	0	0%	0	2y ago	Inactive	7mo ago