🎤 Praxy Voice

Open-source Hindi · Telugu · Tamil · English TTS — with code-mix and voice cloning.

Built on Chatterbox + IndicF5 + a Haiku-driven native-script transliteration preprocessor. Code: github.com/praxelhq/praxy · Model: Praxel/praxy-voice-r6 · Paper: arXiv (link soon).

⏱️ First synth takes 30–60 s — Modal spins up a GPU container on cold start. Subsequent generations are ~3–5 s. Be patient on the first click.

How to use

  1. Pick a voice from the dropdown (or clone your own in 📤 Clone my own voice below).
  2. Type or paste text in the language matching the voice — Hindi text for a Hindi voice, etc.
  3. Code-mix is automatic: write "मैंने WhatsApp पे message किया" and the model pronounces WhatsApp the way Indians say it (vaa-ts-ay-p), not the American way.
  4. Click Generate.

Tip: for natural code-mix output, write the way you'd actually message a friend — meeting, weekend, CEO, WhatsApp, coffee are all fair game.

🎙️ Voice

Pre-made voices use commercial-grade reference clips. Pick "Use my own" to clone a voice from your own recording.

Upload a clean 8–15 s clip of someone speaking in the language you want the output in. Paste the exact transcript (word-for-word) in the box below.

Recording tips

  • Quiet room, single speaker, natural pace.
  • Phone audio works fine. WhatsApp voice notes work great.
  • One paragraph, ~10 s of audio.
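
The length guideline above (a clip of 8–15 s) can be checked before uploading. The following is a minimal sketch, not part of Praxy Voice itself: the function names and constants are hypothetical, and it assumes the clip is an uncompressed WAV file, so other formats would need converting first.

```python
import wave

# Bounds taken from the upload guideline (8-15 s); the names are illustrative.
MIN_SECONDS = 8.0
MAX_SECONDS = 15.0


def clip_duration_seconds(path):
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        # Duration is the number of audio frames divided by frames per second.
        return wav.getnframes() / wav.getframerate()


def is_usable_reference(path):
    """Check whether the clip length falls inside the recommended range."""
    return MIN_SECONDS <= clip_duration_seconds(path) <= MAX_SECONDS
```

This only checks length. Background noise, multiple speakers, and transcript accuracy still need a listen.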

Sample reference text — record yourself reading one of these, then paste it back as the transcript:

Lang    | Sample text to read
Hindi   | मेरा नाम राहुल है। मैं मुंबई में रहता हूँ और एक सॉफ्टवेयर इंजीनियर हूँ।
Telugu  | నా పేరు రాహుల్. నేను ముంబైలో ఉంటాను, సాఫ్ట్‌వేర్ ఇంజినీర్‌గా పని చేస్తున్నాను.
Tamil   | என் பெயர் ராகுல். நான் மும்பையில் வசிக்கிறேன், சாஃப்ட்வேர் இன்ஜினியராக வேலை செய்கிறேன்.
English | My name is Rahul. I live in Mumbai and work as a software engineer.

About code-mix: when you mix English words into Hindi/Telugu/Tamil, the model auto-transliterates them to native-script phonetic spelling (WhatsApp → व्हाट्सऐप) before synthesis. This matches how Bollywood subtitles, news tickers, and native Indian speakers actually write code-switched messages, so the output lands closer to natural Indian English than to American pronunciation.
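
To make the transliteration step concrete, here is a toy sketch of the idea. It is not the actual preprocessor, which is described above as Haiku-driven. This version swaps in a small hand-written lookup table, and every name in it (`LEXICON_HI`, `transliterate_code_mix`) is hypothetical. Only the WhatsApp spelling comes from this page; the spelling for "message" is an assumed example.

```python
import re

# Hypothetical mini-lexicon mapping lowercase English words to Devanagari
# phonetic spellings. The real system does not use a fixed table like this.
LEXICON_HI = {
    "whatsapp": "व्हाट्सऐप",  # spelling shown on this page
    "message": "मैसेज",       # assumed spelling, for illustration only
}

# Matches runs of Latin letters; native-script text is left untouched.
LATIN_WORD = re.compile(r"[A-Za-z]+")


def transliterate_code_mix(text, lexicon):
    """Replace known English words with their native-script spellings."""
    def replace(match):
        word = match.group(0)
        # Words missing from the lexicon pass through unchanged.
        return lexicon.get(word.lower(), word)

    return LATIN_WORD.sub(replace, text)
```

Calling `transliterate_code_mix("मैंने WhatsApp पे message किया", LEXICON_HI)` rewrites both English words into Devanagari before the text would reach the synthesiser. A fixed table cannot cover open vocabulary, which is presumably why the real pipeline uses a model for this step.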

Privacy: uploaded reference clips are processed in-memory and not stored. Generated audio is not logged.