Linux command
piper-tts command
Network
Copy a command, then replace file names, directories, or parameters as needed.
Common examples
Download
python3 -m piper.download_voices [en_US-lessac-medium]
Synthesize text from standard input
echo "[Hello world.]" | piper --model [en_US-lessac-medium] --output-file [out.wav]
Synthesize text passed as an argument
piper --model [en_US-lessac-medium] --output-file [greeting.wav] -- "[This is a test.]"
Stream raw audio
echo "[Hi there.]" | piper --model [en_US-lessac-medium] --output-raw | aplay -r [22050] -f S16_LE -t raw -
Select a speaker
piper --model [de_DE-thorsten-medium] --speaker [0] --output-file [de.wav] -- "[Guten Tag.]"
Use GPU
piper --model [en_US-lessac-medium] --cuda --output-file [gpu.wav] -- "[Running on GPU.]"
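The single-sentence examples above can be extended to a small batch job. The following is a minimal sketch, not part of piper itself: the helper names `out_name` and `synthesize_lines` and the `line_NNN.wav` naming scheme are invented for illustration, and it assumes `piper` is on the PATH and the voice has already been downloaded.

```shell
#!/bin/sh
# Hypothetical helper: build a numbered output name such as clips/line_003.wav.
out_name() {
    # $1 = output directory, $2 = 1-based line number
    printf '%s/line_%03d.wav\n' "$1" "$2"
}

# Hypothetical helper: synthesize each non-empty line of a text file
# into its own WAV file. Uses only --model and --output-file.
synthesize_lines() {
    # $1 = text file, $2 = output directory, $3 = voice name
    mkdir -p "$2" || return 1
    n=0
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        [ -n "$line" ] || continue
        n=$((n + 1))
        # piper reads the text from the pipe, not from the loop's input file.
        printf '%s\n' "$line" |
            piper --model "$3" --output-file "$(out_name "$2" "$n")"
    done < "$1"
}
```

A call such as `synthesize_lines script.txt clips en_US-lessac-medium` would write `clips/line_001.wav`, `clips/line_002.wav`, and so on. Each iteration starts a new piper process and reloads the model, so this approach suits short scripts rather than interactive use.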
Description
piper is an offline neural text-to-speech engine that runs VITS voice models exported to ONNX. Installing the piper-tts Python package provides the piper command, which reads text from standard input (or from the file given by --input-file) and writes 16-bit PCM audio either as a WAV file or as a raw stream on standard output. Voices are distributed separately and downloaded with python3 -m piper.download_voices. Models cover many languages and accents, and several are multi-speaker; use --speaker to pick a voice by index. Phonemization is performed by an embedded espeak-ng, and raw phonemes written inline in double square brackets (for example `[[ bˈætmæn ]]`) are accepted for fine-grained pronunciation control. Output quality and latency depend on the model variant (_x_low_, _low_, _medium_, _high_). For interactive use, piper is typically wrapped behind a daemon so that the model is loaded once instead of on every invocation.
Parameters
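Because the raw stream is plain 16-bit PCM, its playing time follows directly from its size. The sketch below shows the arithmetic; the function name `pcm_duration_ms` is hypothetical, and it assumes mono audio (one channel, 2 bytes per sample) at the sample rate passed to the player, such as the 22050 Hz used in the aplay example.

```shell
#!/bin/sh
# Hypothetical helper: milliseconds of audio contained in a raw PCM stream.
# Assumes 16-bit samples (2 bytes each) and a single channel.
pcm_duration_ms() {
    # $1 = size of the stream in bytes, $2 = sample rate in Hz
    echo $(( $1 * 1000 / ($2 * 2) ))
}
```

The byte count can be taken from the stream itself, for example by piping the `--output-raw` output into `wc -c` and passing the result to `pcm_duration_ms` together with the voice's sample rate.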
- -m, --model _voice_
- Voice model identifier (e.g. _en_US-lessac-medium_) or path to an ONNX file.
- -f, --output-file _file_
- Write the synthesized WAV audio to _file_.
- --output-raw
- Write raw 16-bit PCM audio to stdout (for piping into a player).
- --input-file _file_
- Read input text from _file_ instead of stdin.
- --data-dir _dir_
- Directory where downloaded voice files are stored.
- --speaker _n_
- Select speaker index for multi-speaker voices.
- --sentence-silence _seconds_
- Seconds of silence inserted between sentences.
- --volume _factor_
- Output volume multiplier (1.0 is unchanged).
- --cuda
- Use the CUDA execution provider (requires onnxruntime-gpu).
- --json-input
- Read JSON objects from stdin instead of plain text.
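The options above combine freely, so a small wrapper can make the optional ones easier to toggle. This is a sketch under stated assumptions: the function `piper_cmd` and the environment variables `PIPER_SPEAKER`, `PIPER_SILENCE`, and `PIPER_VOLUME` are invented for this example and are not read by piper itself. The function only prints the resulting command line so it can be inspected first.

```shell
#!/bin/sh
# Hypothetical wrapper: print a piper command line for the given settings.
# PIPER_SPEAKER, PIPER_SILENCE and PIPER_VOLUME are names made up for this
# sketch; each one, when set, adds the matching option from the list above.
piper_cmd() {
    # $1 = voice name or path to an ONNX file, $2 = output WAV file
    set -- piper --model "$1" --output-file "$2"
    if [ -n "${PIPER_SPEAKER:-}" ]; then
        set -- "$@" --speaker "$PIPER_SPEAKER"
    fi
    if [ -n "${PIPER_SILENCE:-}" ]; then
        set -- "$@" --sentence-silence "$PIPER_SILENCE"
    fi
    if [ -n "${PIPER_VOLUME:-}" ]; then
        set -- "$@" --volume "$PIPER_VOLUME"
    fi
    # Print for inspection; arguments containing spaces are not re-quoted.
    printf '%s\n' "$*"
}
```

For example, `PIPER_VOLUME=0.8 piper_cmd en_US-lessac-medium out.wav` prints a command ending in `--volume 0.8`. Replacing the final `printf` with `"$@"` would run the command instead, with the text still supplied on standard input.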
FAQ
What is the piper-tts command used for?
It converts text to speech offline. piper reads text from standard input (or from the file given by --input-file), synthesizes it with a VITS voice model exported to ONNX, and writes 16-bit PCM audio to a WAV file or as a raw stream to standard output. Voices are downloaded separately with python3 -m piper.download_voices.
How do I run a basic piper-tts example?
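Download a voice with `python3 -m piper.download_voices [en_US-lessac-medium]`, then synthesize with `echo "[Hello world.]" | piper --model [en_US-lessac-medium] --output-file [out.wav]`. Replace the bracketed placeholders with your own voice name, text, and file name.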
What does -m, --model _voice_ do in piper-tts?
Voice model identifier (e.g. _en_US-lessac-medium_) or path to an ONNX file.