tts : add support for Orpheus #12476

ggerganov · 2025-03-20T08:11:43Z

HF: https://huggingface.co/collections/canopylabs/orpheus-tts-67d9ea3f6c05a941c06ad9d2

These TTS models look like good candidates for support. To add them, we need to implement the SNAC audio codec: https://github.com/hubertsiuzdak/snac/

Sample implementation using Python-based inference of SNAC: https://github.com/isaiahbjork/orpheus-tts-local

Similar model support (OuteTTS): #10784
It can be used as a reference for how to implement this.
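
For anyone picking this up, here is a rough sketch of the kind of bookkeeping the codec side needs before audio can be decoded. It assumes (this is not stated in this thread and should be checked against the Python reference above) that the LLM emits codes in frames of 7, that position `p` in a frame is offset by `p * 4096`, and that the 7 positions map onto three hierarchical layers as 1 coarse, 2 medium and 4 fine codes. The function name and constants are made up for illustration.

```python
from typing import List, Sequence

FRAME_SIZE = 7        # assumption: 1 coarse + 2 medium + 4 fine codes per frame
CODEBOOK_SIZE = 4096  # assumption: per-position offset used by the LLM vocabulary


def deinterleave_frames(codes: Sequence[int]) -> List[List[int]]:
    """Split a flat, frame-interleaved code list into three codec layers.

    Hypothetical helper: the position-to-layer mapping below is an assumed
    layout, not something confirmed by the issue text.
    """
    n_frames = len(codes) // FRAME_SIZE  # an incomplete trailing frame is dropped
    layer_1: List[int] = []
    layer_2: List[int] = []
    layer_3: List[int] = []
    for f in range(n_frames):
        frame = codes[f * FRAME_SIZE:(f + 1) * FRAME_SIZE]
        # Remove the per-position offset so every code is a plain codebook index.
        local = [c - pos * CODEBOOK_SIZE for pos, c in enumerate(frame)]
        layer_1.append(local[0])
        layer_2.extend([local[1], local[4]])
        layer_3.extend([local[2], local[3], local[5], local[6]])
    return [layer_1, layer_2, layer_3]
```

The three returned lists would then be handed to the codec decoder, which is the part that actually has to be implemented in ggml.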

leoflowers · 2025-03-20T14:32:04Z

Howdy! I'd like to give this issue a try.

LostRuins · 2025-03-21T04:26:45Z

It would be awesome if we could get xcodec. We already have plenty of tts but no ttmusic yet. Everything is a llama model and snac/wavtokenizer/xcodec the vocoding is all that's missing #11467

ggerganov · 2025-03-21T07:22:56Z

> Everything is a llama model and snac/wavtokenizer/xcodec the vocoding is all that's missing

Technically yes, though we have to improve the way these codecs are implemented and supported in general. We should probably be able to have a single GGUF file containing both the LLM and the codec, instead of separate files, and to create either separate decoder / vocoder contexts from a single model or, alternatively, a combined decoder+vocoder context. At least that's my general idea for supporting multi-modal cases, though I'm still figuring it out.

> we already have plenty of tts

llama.cpp has only OuteTTS support via WavTokenizer. It would be nice to have at least one more TTS, so we can find some common patterns which would help to implement the above.

> the vocoding is all that's missing

Mostly yes, but we also have to figure out how to do audio streaming. I am not sure yet how it works, but Orpheus supports real-time streaming, so working on it should help us understand it.
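
One generic way to think about streaming, sketched here only as a starting point: group the LLM's output tokens into fixed-size frames as they arrive and hand overlapping windows of frames to the vocoder, so audio can be produced before generation finishes. This is not a description of how Orpheus actually streams. The frame size of 7, the window of 4 frames and the function name are all assumptions made for illustration, and the vocoder call itself is left out.

```python
from typing import Iterable, Iterator, List


def sliding_windows(
    tokens: Iterable[int],
    frame_size: int = 7,     # assumption: tokens per audio frame
    window_frames: int = 4,  # assumption: frames decoded together for context
) -> Iterator[List[int]]:
    """Yield overlapping token windows as soon as enough tokens have arrived.

    Each yielded window holds `window_frames` complete frames; consecutive
    windows advance by exactly one frame.
    """
    window = frame_size * window_frames
    buf: List[int] = []
    for tok in tokens:
        buf.append(tok)
        if len(buf) == window:
            yield list(buf)       # copy, so the caller can keep the window
            del buf[:frame_size]  # slide forward by one frame
```

A caller would decode each window with the codec and keep only the audio for the newest part of it, which keeps latency at roughly one window while memory stays bounded.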

scalar27 · 2025-03-29T14:52:22Z

This works pretty well on my Mac M1 running two instances of llama-server and fastrtc.
https://github.com/PkmX/orpheus-chat-webui

ggerganov added the "good first issue" and "tts" labels Mar 20, 2025

jamorphy mentioned this issue Mar 21, 2025:
(draft) tts: Orpheus support #12487 (Draft)

This was referenced Mar 21, 2025:
suppport for orpheus mmwillet/TTS.cpp#9 (Open)
tts : add support for SparkTTS #12495 (Closed)

giladgd mentioned this issue Mar 23, 2025:
feat: initial TTS support withcatai/node-llama-cpp#446 (Open)
