OmniVoice ComfyUI Node

ComfyUI custom node for OmniVoice TTS and voice cloning.

Upstream project:

https://github.com/k2-fsa/OmniVoice

Download the model here:

https://huggingface.co/k2-fsa/OmniVoice

These nodes cover only the core model forward pass, so the nodes in this repo are already final. I don’t like cluttering ComfyUI with unnecessary node mappings, so updates will be limited to bug fixes or truly urgent and necessary new nodes, which I will create if needed.

Warning

⚠️ WARNING: HF Transformers 5.3 or above is required. ⚠️

Check what your ComfyUI environment is using:

pip list | grep transformer

Why check first: other models and libraries in the same environment might still depend heavily on HF Transformers 4.5x, so upgrading can break them.
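If grep is not available (for example on Windows), the same check can be done from Python. This is a minimal sketch that reads the installed version through the standard library; the helper names `parse_major_minor` and `meets_minimum` are made up for illustration, and the (5, 3) minimum is taken from the warning above.

```python
from importlib import metadata

MINIMUM = (5, 3)  # minimum from the warning above


def parse_major_minor(version: str) -> tuple[int, int]:
    """Return (major, minor) from a version string such as '5.3.0' or '4.57.1'."""
    parts = version.split(".")
    major = int(parts[0])
    minor_text = parts[1] if len(parts) > 1 else "0"
    # Keep only the leading digits of the minor part, so '3rc1' parses as 3.
    digits = ""
    for ch in minor_text:
        if not ch.isdigit():
            break
        digits += ch
    return major, int(digits or 0)


def meets_minimum(version: str, minimum: tuple[int, int] = MINIMUM) -> bool:
    """True when the version is at least the given (major, minor) minimum."""
    return parse_major_minor(version) >= minimum


if __name__ == "__main__":
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        print("transformers is not installed in this environment")
    else:
        status = "OK" if meets_minimum(installed) else "too old for this node"
        print(f"transformers {installed}: {status}")
```

Run it with the same Python interpreter that ComfyUI uses, otherwise it reports the version from a different environment.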

Install

Direct manual clone:

cd ComfyUI/custom_nodes
git clone https://github.com/komikndr/omnivoice_comfy
cd omnivoice_comfy
pip install -r requirements.txt

ComfyUI manager:

comfy node install omnivoice_comfy

Put the OmniVoice weights in ComfyUI/models/tts/omnivoice/.

Expected layout:

ComfyUI/
  models/
    tts/
      omnivoice/
        model.safetensors
        audio_tokenizer.safetensors

You only need to place the two .safetensors files in the folder above. The node already includes the required tokenizer and config assets.
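To confirm the layout before starting ComfyUI, a small script can look for the two files. This is a sketch, not part of the node: `missing_weights` is a hypothetical helper, and it assumes you pass the path of your ComfyUI root folder.

```python
from pathlib import Path

# The two weight files named in the expected layout above.
REQUIRED_WEIGHTS = ("model.safetensors", "audio_tokenizer.safetensors")


def missing_weights(comfy_root: str) -> list[str]:
    """Return the required weight files that are absent from models/tts/omnivoice/."""
    model_dir = Path(comfy_root) / "models" / "tts" / "omnivoice"
    return [name for name in REQUIRED_WEIGHTS if not (model_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_weights("ComfyUI")  # adjust to your ComfyUI root
    print("all weights found" if not missing else f"missing: {missing}")
```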

Nodes

OmniVoice Loader

Loads:

OmniVoice Model
Audio Tokenizer Model

The loader builds a local runtime snapshot from the embedded config assets and the two selected weight files.
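For readers curious what a symlinked snapshot looks like in practice, here is a rough sketch. It is not the loader's actual code: the function name `build_snapshot`, its arguments, and the choice to link every file under its own name are all assumptions made for illustration.

```python
import os
from pathlib import Path


def build_snapshot(snapshot_dir, sources):
    """Symlink every file in `sources` into `snapshot_dir` under its own name.

    Hypothetical helper: `sources` would be the config assets plus the two
    selected weight files.
    """
    snapshot = Path(snapshot_dir)
    snapshot.mkdir(parents=True, exist_ok=True)
    for src in map(Path, sources):
        link = snapshot / src.name
        if link.is_symlink() or link.exists():
            link.unlink()  # refresh a stale entry from an earlier run
        os.symlink(src.resolve(), link)
    return snapshot
```

Linking instead of copying means the large weight files are not duplicated on disk, while a loader that expects a single model folder still sees one.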

OmniVoice TTS

Inputs:

text for the target speech
optional instruct
optional ref_audio and ref_text for voice cloning

If you use ref_audio, you must also provide ref_text.
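The rule above can be written as a small check. This is an illustrative sketch only: `is_cloning_request` is a hypothetical function, not the node's own validation code, and it only encodes the one rule stated here.

```python
from typing import Optional


def is_cloning_request(ref_audio: Optional[object] = None,
                       ref_text: Optional[str] = None) -> bool:
    """Return True for voice cloning, False for plain TTS.

    Raises ValueError when ref_audio is given without a usable ref_text.
    """
    if ref_audio is None:
        return False  # plain TTS, no reference needed
    if ref_text is None or not ref_text.strip():
        raise ValueError("ref_audio was provided without ref_text")
    return True
```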

Notes

Whisper auto-transcription is disabled, so voice cloning requires ref_text.
For voice cloning, install https://github.com/yuvraj108c/ComfyUI-Whisper or a similar workflow/pipeline that auto-transcribes the source audio, because OmniVoice requires its transcript. You can transcribe a 3-second clip by hand, but that gets tedious in batch processing.
The node uses files from ComfyUI/models/tts/omnivoice/ and builds a symlinked runtime snapshot.
If symlink creation fails on your system, use a full HuggingFace-style OmniVoice folder instead.
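If you are not sure whether your system allows symlinks (Windows often needs Developer Mode or elevated rights), a quick probe like the one below can tell you before you load the node. It is a sketch with a hypothetical helper name, `can_symlink`, and it only tests the directory you give it.

```python
import os
import tempfile


def can_symlink(directory: str) -> bool:
    """Try to create a throwaway symlink inside `directory` and report success."""
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        target = os.path.join(tmp, "target.txt")
        link = os.path.join(tmp, "link.txt")
        with open(target, "w") as fh:
            fh.write("probe")
        try:
            os.symlink(target, link)
        except (OSError, NotImplementedError):
            return False  # no privilege, or the filesystem lacks symlink support
        return os.path.islink(link)


if __name__ == "__main__":
    print("symlinks supported:", can_symlink(tempfile.gettempdir()))
```

For a meaningful answer, point it at a directory on the same drive as your ComfyUI models folder, since support can differ between filesystems.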

LLM Disclaimer

This repo was built with the help of Qwen 3.5 9B, with embeddinggemma-300m used to store the original code in a vector store for fast retrieval (most of my coding time was wasted on searching the code repo).
