Kitten TTS V0.8 Released: New State-of-the-Art Super-Tiny TTS Model Under 25 MB

20 February 2026

Kitten ML · developer

Kitten TTS V0.8 marks a significant milestone for on-device speech synthesis. The release includes three models with 80M, 40M, and 14M parameters, the smallest of which comes in under 25 MB, making on-device TTS viable for edge devices, mobile applications, and resource-constrained environments. With weights and code released as open source under the permissive Apache 2.0 license, practitioners can integrate expressive, low-latency TTS directly into local LLM pipelines.
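As a rough guide to how parameter count relates to download size, weight storage is approximately the number of parameters times the bytes used per weight. The sketch below is a back-of-the-envelope estimate only: the precisions listed are assumptions for illustration, the release's actual storage format is not stated here, and the estimate ignores compression and non-weight data such as voice embeddings.

```python
# Rough on-disk size estimate: parameters x bytes per weight.
# The precisions below are illustrative assumptions, not the
# storage format used by the actual release.
BYTES_PER_WEIGHT = {"fp32": 4, "fp16": 2, "int8": 1}


def estimated_size_mb(num_params: int, precision: str) -> float:
    """Approximate weight size in megabytes (1 MB = 10**6 bytes)."""
    return num_params * BYTES_PER_WEIGHT[precision] / 1_000_000


def fits_budget(num_params: int, precision: str, budget_mb: float = 25.0) -> bool:
    """True if the estimated weight size is within the given budget."""
    return estimated_size_mb(num_params, precision) <= budget_mb


if __name__ == "__main__":
    models = {"80M": 80_000_000, "40M": 40_000_000, "14M": 14_000_000}
    for name, params in models.items():
        for precision in BYTES_PER_WEIGHT:
            size = estimated_size_mb(params, precision)
            print(f"{name} @ {precision}: ~{size:.0f} MB")
```

Under these assumptions, the size budget a model meets depends as much on weight precision as on parameter count, which is why quantized variants matter for edge targets.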

The tiny footprint is not meant to come at the cost of quality: the release is billed as state of the art in the ultra-small TTS category. For local LLM deployments, this addresses a real gap: inference engines like llama.cpp excel at text generation, but reliable on-device speech output has lagged behind. Kitten TTS makes it possible to build fully local AI assistants with voice output and no cloud dependency.
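One common way to wire a streaming text generator to a TTS engine is to buffer generated text until a sentence boundary and synthesize one sentence at a time, so speech can start before the full reply is finished. The following is a minimal sketch of that buffering step. The `synthesize` callable is a hypothetical placeholder for whatever TTS call is used; it is not the Kitten TTS API, and the sentence-splitting rule is deliberately naive.

```python
import re
from typing import Callable, Iterable, Iterator

# Naive rule: a sentence ends at ., ! or ? followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences from an iterable of text fragments."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last is a finished sentence;
        # the last part may still be growing, so keep it buffered.
        for sentence in parts[:-1]:
            if sentence.strip():
                yield sentence.strip()
        buffer = parts[-1]
    # Flush whatever remains once the stream ends.
    if buffer.strip():
        yield buffer.strip()


def speak_stream(
    chunks: Iterable[str],
    synthesize: Callable[[str], bytes],  # hypothetical TTS callable
) -> list[bytes]:
    """Synthesize each sentence as soon as it is complete."""
    return [synthesize(sentence) for sentence in sentences_from_stream(chunks)]
```

In a real assistant, the fragments would come from the local LLM's token stream, and each returned audio buffer would be queued for playback rather than collected in a list.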

This is particularly valuable for privacy-sensitive applications, offline-first systems, and deployments where bandwidth or latency constraints make cloud TTS impractical. The community response (1,016 upvotes) reflects strong demand for this capability.

Source: r/LocalLLaMA · Relevance: 10/10