python asks it to use uv.
In comparison, WhisperX (with GPU) was much slower, had slightly poorer diarization, and slightly better transcription.
uvx --python 3.9 --index https://download.pytorch.org/whl/cu121 whisperx --diarize --lang en --hf_token $HUGGINGFACE_TOKEN
<cough>..., <laugh>..., etc. But at ~$15/hr of output, it's too expensive. Ref
[inaudible].
#embeddings #speech-to-text #voice-cloning
unoti/voice-embeddings,
retkowsky/audio_embeddings,
pyannote/embedding (for speaker similarity),
and more.
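
Speaker similarity with embeddings like these usually comes down to comparing two vectors, often with cosine similarity. A minimal sketch of that comparison, assuming each clip has already been turned into a fixed-length vector by one of the models above. The function name and the toy vectors are hypothetical stand-ins, not output from any of these models:

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real speaker embeddings (hypothetical values).
speaker_a_clip1 = [0.9, 0.1, 0.3]
speaker_a_clip2 = [0.8, 0.2, 0.35]
speaker_b_clip1 = [-0.2, 0.9, -0.4]

same = cosine_similarity(speaker_a_clip1, speaker_a_clip2)
different = cosine_similarity(speaker_a_clip1, speaker_b_clip1)
print(f"same speaker: {same:.3f}, different speaker: {different:.3f}")
```

Scores near 1 suggest the same voice and lower scores suggest different voices. Any cut-off between the two depends on the model and would need tuning on real clips.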