🎤 Praxy Voice
Open-source Hindi · Telugu · Tamil · English TTS — with code-mix and voice cloning.
Built on Chatterbox + IndicF5 + a Haiku-driven native-script transliteration preprocessor. Code: github.com/praxelhq/praxy · Model: Praxel/praxy-voice-r6 · Paper: arXiv (link soon).
How to use
- Pick a voice from the dropdown (or clone your own in 📤 Clone my own voice below).
- Type or paste text in the language matching the voice — Hindi text for a Hindi voice, etc.
- Code-mix is automatic: write "मैंने WhatsApp पे message किया" and the model pronounces WhatsApp the way Indians say it (vaa-ts-ay-p), not the American way.
- Click Generate.
Tip: for natural code-mix output, write the way you'd actually message a friend — meeting, weekend, CEO, WhatsApp, coffee are all fair game.
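The "text must match the voice" rule above can be sanity-checked before you click Generate. The sketch below is only an illustration and is not part of the Praxy codebase: `SCRIPT_RANGES`, `dominant_language`, and `matches_voice` are hypothetical names, and it assumes that any Devanagari text is Hindi (Devanagari is also used for other languages). English words mixed into a sentence do not change the result, so code-mixed input still counts as the native language.

```python
# Hypothetical helper, not taken from the Praxy repository.
# Unicode block ranges for the three Indic scripts the demo supports.
SCRIPT_RANGES = {
    "Hindi": (0x0900, 0x097F),   # Devanagari block (assumed to mean Hindi here)
    "Telugu": (0x0C00, 0x0C7F),  # Telugu block
    "Tamil": (0x0B80, 0x0BFF),   # Tamil block
}


def dominant_language(text: str) -> str:
    """Return the Indic language with the most characters, else "English"."""
    counts = {lang: 0 for lang in SCRIPT_RANGES}
    for ch in text:
        cp = ord(ch)
        for lang, (lo, hi) in SCRIPT_RANGES.items():
            if lo <= cp <= hi:
                counts[lang] += 1
                break
    best = max(counts, key=counts.get)
    # No Indic characters at all: treat the text as English.
    return best if counts[best] > 0 else "English"


def matches_voice(text: str, voice_language: str) -> bool:
    """True when the text's script agrees with the selected voice's language."""
    return dominant_language(text) == voice_language
```

For example, `matches_voice("मैंने WhatsApp पे message किया", "Hindi")` is true even though most of the letters are Latin, because the presence of Devanagari marks Hindi as the base language.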
Pre-made voices use commercial-grade reference clips. Pick "Use my own" to clone any voice.
Upload a clean 8–15 s clip of a single speaker talking in the language you want the output in. Paste the exact, word-for-word transcript in the box below.
Recording tips
- Quiet room, single speaker, natural pace.
- Phone audio works fine. WhatsApp voice notes work great.
- One paragraph, ~10 s of audio.
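If you want to check a recording against the 8–15 s guidance before uploading, a small script is enough. This is a minimal sketch that assumes the clip has already been saved as an uncompressed WAV file (phone recordings and WhatsApp voice notes usually need converting first); `clip_duration_seconds` and `is_usable_reference` are hypothetical helper names, not functions from the Praxy code.

```python
import wave

# Bounds taken from the upload guidance above.
MIN_SECONDS = 8.0
MAX_SECONDS = 15.0


def clip_duration_seconds(path: str) -> float:
    """Duration of a WAV file in seconds (frame count / sample rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_usable_reference(path: str) -> bool:
    """True when the clip length falls inside the recommended 8-15 s window."""
    return MIN_SECONDS <= clip_duration_seconds(path) <= MAX_SECONDS
```

The check covers length only; it says nothing about background noise or the number of speakers.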
Sample reference text — record yourself reading one of these, then paste it back as the transcript:
| Lang | Sample text to read |
|---|---|
| Hindi | मेरा नाम राहुल है। मैं मुंबई में रहता हूँ और एक सॉफ्टवेयर इंजीनियर हूँ। |
| Telugu | నా పేరు రాహుల్. నేను ముంబైలో ఉంటాను, సాఫ్ట్వేర్ ఇంజినీర్గా పని చేస్తున్నాను. |
| Tamil | என் பெயர் ராகுல். நான் மும்பையில் வசிக்கிறேன், சாஃப்ட்வேர் இன்ஜினியராக வேலை செய்கிறேன். |
| English | My name is Rahul. I live in Mumbai and work as a software engineer. |
| 🎙️ Voice | ✍️ Text to synthesise |
|---|---|
About code-mix: when you mix English words into Hindi/Telugu/Tamil text, the model automatically transliterates them into a native-script phonetic spelling (WhatsApp → व्हाट्सऐप) before synthesis. This matches how Bollywood subtitles, news tickers, and native Indian speakers actually write code-switched messages, so the output sounds closer to natural Indian English than to American pronunciation.
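To make the preprocessing step concrete, here is a deliberately simplified sketch of swapping Latin-script words for native-script spellings before synthesis. It is an assumption-laden stand-in: the real preprocessor is described as Haiku-driven, whereas this version uses a tiny hand-written lookup table. `LEXICON_HI` and `transliterate_code_mix` are hypothetical names, the WhatsApp spelling comes from the note above, and the spelling for "message" is only an illustrative guess.

```python
import re

# Hypothetical lookup table standing in for the model-driven transliterator.
LEXICON_HI = {
    "whatsapp": "व्हाट्सऐप",  # spelling quoted in the code-mix note above
    "message": "मैसेज",       # illustrative spelling, not taken from the project
}

# Runs of ASCII letters are treated as English words.
LATIN_WORD = re.compile(r"[A-Za-z]+")


def transliterate_code_mix(text: str, lexicon: dict = LEXICON_HI) -> str:
    """Replace known English words with native-script spellings.

    Words missing from the lexicon are left unchanged.
    """
    def swap(match: re.Match) -> str:
        word = match.group(0)
        return lexicon.get(word.lower(), word)

    return LATIN_WORD.sub(swap, text)
```

Calling `transliterate_code_mix("मैंने WhatsApp पे message किया")` replaces both English words and leaves the Devanagari text untouched. A fixed table cannot cover open vocabulary, which is presumably why the project uses a model for this step.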
Privacy: uploaded reference clips are processed in-memory and not stored. Generated audio is not logged.