I’ve been testing the models on your playground and I’m trying to get a better sense of how they compare. How do they differ beyond the price point?
Hey Ichabod, here’s a high-level comparison of the two models:
| Feature | inworld-tts-1 | inworld-tts-max |
|---|---|---|
| Latency | Low (~200-400ms) | Higher |
| Expressiveness | High | Higher |
| Real-time | Yes | No |
| Use case | Games, apps, chat | Audiobooks, content |
Essentially, for now, if you have a real-time use case, we recommend inworld-tts-1 unless you specifically need the extra expressiveness or latency isn’t a concern for you.
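
If it helps, here’s a rough sketch of that rule of thumb in Python. The function name and its parameters are made up for illustration and aren’t part of any Inworld SDK; only the two model IDs come from the comparison in this thread.

```python
def choose_tts_model(real_time: bool, needs_extra_expressiveness: bool = False) -> str:
    """Pick a model ID following the rule of thumb in this thread.

    Hypothetical helper: the name and parameters are assumptions,
    not an official API.
    """
    # Real-time use cases default to the lower-latency model,
    # unless the extra expressiveness is worth the added latency.
    if real_time and not needs_extra_expressiveness:
        return "inworld-tts-1"
    # Offline content (audiobooks, narration) or expressiveness-first cases.
    return "inworld-tts-max"


if __name__ == "__main__":
    print(choose_tts_model(real_time=True))
    print(choose_tts_model(real_time=False))
```

You would then pass the returned ID wherever your request specifies the model.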