Stars
yu-haoyuan / CosyVoice
Forked from FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
FlashCosyVoice: A lightweight vLLM implementation built from scratch for CosyVoice.
[TMLR'23] Contrastive Search Is What You Need For Neural Text Generation
Zero-shot voice conversion and singing voice conversion, with real-time support
A generative speech model for daily dialogue.
Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation
This package contains the original 2012 AlexNet code.
One command to start a streaming ASR server.
FSA/FST algorithms, differentiable, with PyTorch compatibility.
ComfyUI custom node for FunAudioLLM, including CosyVoice2, SenseVoice and InspireMusic
ComfyUI custom node for FunAudioLLM, including CosyVoice and SenseVoice
A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
C/C++ port of FunASR's SenseVoice model
Text Normalization & Inverse Text Normalization
peilongchencc / My-GLM-4-Voice
Forked from zai-org/GLM-4-Voice
Notes on deploying GLM-4-Voice on Ubuntu
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"