Kokoro TTS — Local AI

Voice Studio

Type any text or speak into the mic — hear it voiced by one of 60+ AI voices across 9 languages, running entirely on local infrastructure.

Kokoro TTS · Kokoro-FastAPI · Whisper STT · FastAPI · 60+ Voices · 9 Languages · 100% Local

Choose a Voice

Text to Speech

Speak & Convert

0 / 2500 characters

Speed: 1.00×

Download

Speak — your words, their voice

Press Record, speak, then Stop. Audio is sent to the backend where Whisper transcribes it locally — then Kokoro re-synthesises the text in your chosen voice.

Click to start

Language

Transcription

Your speech will appear here after recording…

Download

How it works

Step 1

Pick a Voice

60+ voices across 9 languages served by Kokoro TTS running locally on the home server.
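As a rough illustration of how a voice picker might organise that list, the sketch below groups voice IDs by language. It assumes Kokoro's voice IDs follow a prefix convention in which the first letter names the language and the second the gender (for example `af_` for an American English female voice). The prefix table, the `group_voices` helper, and the sample IDs are illustrative assumptions, not taken from this page.

```python
from collections import defaultdict

# Assumed prefix convention: first letter = language, second letter = gender.
LANGUAGE_BY_PREFIX = {
    "a": "American English",
    "b": "British English",
    "e": "Spanish",
    "f": "French",
    "h": "Hindi",
    "i": "Italian",
    "j": "Japanese",
    "p": "Brazilian Portuguese",
    "z": "Mandarin Chinese",
}


def group_voices(voice_ids):
    """Group voice IDs by language, using the first letter of the prefix."""
    groups = defaultdict(list)
    for voice_id in voice_ids:
        prefix, _, _name = voice_id.partition("_")
        # Unknown or malformed prefixes fall into a catch-all bucket.
        language = LANGUAGE_BY_PREFIX.get(prefix[:1], "Other")
        groups[language].append(voice_id)
    # Sort languages and the voices within each language for stable display.
    return {lang: sorted(ids) for lang, ids in sorted(groups.items())}
```

Called with a list such as `["bm_george", "af_bella"]`, the helper returns one sorted list per language, which maps directly onto grouped options in a dropdown.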

Step 2

Text to Speech

Type any text. The FastAPI backend proxies the request to Kokoro-FastAPI's OpenAI-compatible /v1/audio/speech endpoint.
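A minimal sketch of what such a proxied request could look like, using only the Python standard library. The base URL, the `build_speech_request` and `synthesize` helper names, and the `mp3` response format are assumptions for illustration. The 2500-character cap mirrors the counter shown in the UI, and the JSON fields follow the OpenAI-style speech request shape that Kokoro-FastAPI is described as accepting.

```python
import json
import urllib.request

# Assumed address of the local Kokoro-FastAPI instance; adjust to your setup.
KOKORO_URL = "http://localhost:8880/v1/audio/speech"
MAX_CHARS = 2500  # matches the character counter shown in the UI


def build_speech_request(text, voice, speed=1.0):
    """Validate the input and build a POST request for the speech endpoint."""
    text = text.strip()
    if not text:
        raise ValueError("text is empty")
    if len(text) > MAX_CHARS:
        raise ValueError(f"text exceeds {MAX_CHARS} characters")
    payload = {
        "model": "kokoro",
        "input": text,
        "voice": voice,
        "response_format": "mp3",  # assumed output format
        "speed": speed,
    }
    return urllib.request.Request(
        KOKORO_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, voice, speed=1.0):
    """Send the request and return the raw audio bytes."""
    request = build_speech_request(text, voice, speed)
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.read()
```

Keeping request construction separate from the network call means the validation can be exercised without a running Kokoro server.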

Step 3

Speak & Convert

The browser's MediaRecorder API captures audio; Whisper (faster-whisper) transcribes it in the backend, then Kokoro re-synthesises the text in your chosen voice.
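The backend half of this step can be sketched as a two-stage pipeline: transcribe, then synthesise. The helper names, the `base` model size, and the CPU/int8 settings below are assumptions, not details from this page. The transcription and synthesis stages are passed in as callables so the flow can be tried without a model or a running Kokoro server.

```python
from typing import Callable, Iterable, Optional, Tuple


def join_segments(texts: Iterable[str]) -> str:
    """Join transcribed segment texts into one string, dropping empty ones."""
    return " ".join(t.strip() for t in texts if t.strip())


def transcribe_with_faster_whisper(
    audio_path: str, language: Optional[str] = None, model_size: str = "base"
) -> str:
    """Transcribe an audio file locally with faster-whisper."""
    # Imported lazily so the rest of this sketch works without the package.
    from faster_whisper import WhisperModel

    # A real backend would load the model once at startup, not per request.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(audio_path, language=language)
    return join_segments(segment.text for segment in segments)


def speak_and_convert(
    audio_path: str,
    voice: str,
    transcribe: Callable[[str, Optional[str]], str],
    synthesize: Callable[[str, str], bytes],
    language: Optional[str] = None,
) -> Tuple[str, bytes]:
    """Transcribe the recording, then voice the transcript with `voice`."""
    text = transcribe(audio_path, language)
    if not text:
        raise ValueError("no speech detected in recording")
    return text, synthesize(text, voice)
```

In a deployment, `transcribe` would be `transcribe_with_faster_whisper` and `synthesize` would be whatever function posts the text to the Kokoro speech endpoint. Returning the transcript alongside the audio lets the page fill in the Transcription box.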

Step 4

100% Local

No external APIs. Kokoro-FastAPI and Whisper run on the home server. Zero data leaves the network.