What latency do you have when running through a full conversational pipeline? Not just TTS

mikefein · November 19, 2025, 4:02pm

I need to understand the latency I should be able to achieve for an STT-LLM-TTS turn

InworldAI · November 19, 2025, 4:04pm

Hey Mike,

The end-to-end latency profile (audio input → audio output) depends on a variety of factors, but you should be able to achieve something in the following range:

STT: ~100-300ms
LLM: 500ms-2s (depends on model)
TTS: ~200-400ms (streaming)

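Total: ~1-3 seconds for a full conversational turn. We recommend streaming TTS to reduce perceived latency, since playback can begin as soon as the first audio chunk arrives rather than after the whole utterance has been synthesized.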
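
As a rough sanity check on that total, the per-stage figures above can simply be added up. The snippet below is a minimal sketch in Python, assuming the three stages run strictly one after another with no overlap; the names `STAGES_MS` and `total_range_ms` are illustrative only and not part of any SDK.

```python
# Per-stage latency ranges in milliseconds, copied from the estimates above.
STAGES_MS = {
    "STT": (100, 300),
    "LLM": (500, 2000),
    "TTS": (200, 400),
}


def total_range_ms(stages):
    """Sum the best-case and worst-case latency across sequential stages."""
    low = sum(lo for lo, _ in stages.values())
    high = sum(hi for _, hi in stages.values())
    return low, high


low, high = total_range_ms(STAGES_MS)
print(f"Sequential turn: {low}-{high} ms")  # prints "Sequential turn: 800-2700 ms"
```

The 800-2700 ms result lines up with the ~1-3 second estimate. In a real pipeline the stages may overlap (for example when LLM output is streamed into TTS), so treat this sum as an upper-bound style estimate and measure your own setup.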
