#tts
#voice-cloning
In comparison, WhisperX (with GPU) was much slower and had slightly poorer diarization, but slightly better transcription. `uvx --python 3.9 --index https://download.pytorch.org/whl/cu121 whisperx --diarize --lang en --hf_token $HUGGINGFACE_TOKEN`
`<cough>...`, `<laugh>...`, etc. But at ~$15/hr of output, it's too expensive. Ref #tts
#voice-cloning
#voice-cloning
#voice-cloning
#chatgpt
#voice-cloning
#github
#tts
#voice-cloning
#future
#tts
#voice-cloning
#tts
#voice-cloning
#tts
#voice-cloning
#tts
#voice-cloning
#future
#voice-cloning
#chatgpt
#tts
#voice-cloning
#tts
#voice-cloning
#voice-cloning
#voice-cloning
#ai-coding-tools
#voice-cloning
#tts
#voice-cloning
#embeddings
#voice-cloning
unoti/voice-embeddings, retkowsky/audio_embeddings, pyannote/embedding (for speaker similarity), and more.
#future
#voice-cloning
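
A rough sketch of how speaker similarity is typically scored once you have embeddings from any of the models above: compare two fixed-length vectors by cosine similarity. The vectors below are made-up toy values, not output from a real model, and `cosine_similarity` is a hypothetical helper, not part of any of those libraries.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embedding has zero norm")
    return dot / (norm_a * norm_b)


# Toy 4-dimensional vectors (assumed for illustration only).
# Real speaker embeddings have hundreds of dimensions.
original_voice = [0.12, -0.40, 0.88, 0.05]
cloned_voice = [0.10, -0.35, 0.90, 0.07]
other_speaker = [-0.70, 0.20, 0.10, 0.65]

print("original vs clone:", cosine_similarity(original_voice, cloned_voice))
print("original vs other:", cosine_similarity(original_voice, other_speaker))
```

A higher score for the clone than for an unrelated speaker suggests the clone is closer to the target voice. Any pass/fail threshold would need tuning per embedding model.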
#future
#voice-cloning
#future
#huggingface
#voice-cloning
#voice-cloning
#voice-cloning
#voice-cloning
#voice-cloning
#voice-cloning
#ai-coding-tools
#voice-cloning
#embeddings
#future
#tts
#voice-cloning
#document-conversion
#tts
#voice-cloning