How to generate voice from text using fish-speech?
Text to Voice
Clone repo
git clone git@github.com:fishaudio/fish-speech.git
cd fish-speech
Prepare the environment
conda create -n fish-speech python=3.10
conda activate fish-speech
Install the CLI for downloading models from the Hugging Face Hub
pip install -U "huggingface_hub[cli]"
Download fish-speech-1.4 model
Check the fish-speech docs first to confirm that 1.4 is still the current version
huggingface-cli download fishaudio/fish-speech-1.4 --local-dir checkpoints/fish-speech-1.4/
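Before moving on, it can help to confirm that the download really contains the decoder weights used later in the audio generation step. The snippet below is only a minimal sketch and not part of fish-speech: `check_checkpoint` is a hypothetical helper name, and the file name is simply the one this guide passes to `tools/vqgan/inference.py`, so it may differ for other model versions.

```shell
# Hypothetical helper (not part of fish-speech): verify that a checkpoint
# directory contains the decoder file this guide passes to inference.py.
check_checkpoint() {
  dir="$1"
  # Assumption: file name copied from the inference command in this guide.
  decoder="firefly-gan-vq-fsq-8x1024-21hz-generator.pth"
  if [ -f "$dir/$decoder" ]; then
    echo "ok: $dir/$decoder"
  else
    echo "missing: $dir/$decoder" >&2
    return 1
  fi
}
```

After pasting the function into your shell, run `check_checkpoint checkpoints/fish-speech-1.4` from the repository root. A non-zero exit status means the download should be repeated.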
Install dependencies in the root of the project
pip3 install -e .
Generate codes_0.npy file from text
python tools/llama/generate.py --text "Your text here" --checkpoint-path "checkpoints/fish-speech-1.4"
Generate audio from codes_0.npy file
python tools/vqgan/inference.py -i "codes_0.npy" --checkpoint-path "checkpoints/fish-speech-1.4/firefly-gan-vq-fsq-8x1024-21hz-generator.pth"
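The two commands above always run in the same order, so they can be chained in a small wrapper. This is a sketch under a few assumptions: `run` and `text_to_audio` are hypothetical names, the paths are the ones used in this guide, and the first step is assumed to write `codes_0.npy` into the current directory as described above. By default it only prints the commands (a dry run), and the printed lines do not show quoting.

```shell
# Sketch: chain the two steps from this guide. Run from the repository root.
# DRY_RUN defaults to 1 (only print the commands); set DRY_RUN=0 to execute.
CKPT_DIR="checkpoints/fish-speech-1.4"
DECODER="$CKPT_DIR/firefly-gan-vq-fsq-8x1024-21hz-generator.pth"

# Print a command line, then execute it unless DRY_RUN is 1.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" != "1" ]; then
    "$@"
  fi
}

# Hypothetical wrapper: text -> codes_0.npy -> audio.
# The second step only runs if the first one succeeded.
text_to_audio() {
  run python tools/llama/generate.py --text "$1" --checkpoint-path "$CKPT_DIR" &&
  run python tools/vqgan/inference.py -i "codes_0.npy" --checkpoint-path "$DECODER"
}
```

With the functions defined, `text_to_audio "Your text here"` prints both commands, and `DRY_RUN=0 text_to_audio "Your text here"` runs them.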
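Install sox, which provides the play command (yay is an Arch Linux AUR helper; use your own package manager on other systems)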
yay -S sox
Play the generated audio
play fake.wav
Other TTS tools
Besides fish-speech, other text-to-speech tools include Festival, eSpeak, and Google Text-to-Speech (gTTS), among others.
Install festival
yay -S festival festival-english
and use it
echo "Hello world" | festival --tts
Install espeak
yay -S espeak
and use it
espeak "Hello world"
Install Google Text-to-Speech (gTTS)
pip install gtts
and use it
gtts-cli "Hello, this is a Google TTS test." --lang en --output output.mp3
mpv output.mp3
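To compare the three tools above on the same sentence, a small dispatcher can be handy. This is a rough sketch, not an established tool: `speak` is a hypothetical function name, the commands are exactly the ones from this guide, and `DRY_RUN` defaults to 1 so nothing is executed until you opt in.

```shell
# Hypothetical dispatcher over the tools listed above.
# DRY_RUN defaults to 1 (only report what would run); set DRY_RUN=0 to speak.
speak() {
  backend="$1"
  text="$2"
  # Reject anything that is not one of the three tools from this guide.
  case "$backend" in
    festival|espeak|gtts) ;;
    *) echo "unknown backend: $backend" >&2; return 2 ;;
  esac
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "would speak with $backend: $text"
    return 0
  fi
  case "$backend" in
    festival) printf '%s\n' "$text" | festival --tts ;;
    espeak)   espeak "$text" ;;
    gtts)     gtts-cli "$text" --lang en --output output.mp3 && mpv output.mp3 ;;
  esac
}
```

For example, `DRY_RUN=0 speak espeak "Hello world"` speaks through espeak, while `speak gtts "Hello world"` only reports which backend would be used.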