Linux command

piper-tts command

Network

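Values in square brackets are placeholders. After copying a command, drop the brackets and replace the file names, directories, or parameters as needed.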

Common examples

Download a voice

python3 -m piper.download_voices [en_US-lessac-medium]

Synthesize text from standard input

echo "[Hello world.]" | piper --model [en_US-lessac-medium] --output-file [out.wav]

Synthesize text given as an argument

piper --model [en_US-lessac-medium] --output-file [greeting.wav] -- "[This is a test.]"

Stream raw audio

echo "[Hi there.]" | piper --model [en_US-lessac-medium] --output-raw | aplay -r [22050] -f S16_LE -t raw -

Select a speaker

piper --model [de_DE-thorsten-medium] --speaker [0] --output-file [de.wav] -- "[Guten Tag.]"

Use GPU

piper --model [en_US-lessac-medium] --cuda --output-file [gpu.wav] -- "[Running on GPU.]"
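
The streaming example above passes a fixed rate to aplay, but the rate has to match the voice being used. The following is a minimal sketch for looking it up instead of hard-coding it. It assumes the voice's JSON config is stored next to the model (for example as en_US-lessac-medium.onnx.json) and contains a field written as "sample_rate": followed by an integer; voice_sample_rate is a hypothetical helper name, not part of piper.

```shell
# Hypothetical helper: print the sample rate stored in a Piper voice config.
# Assumption: the JSON config has a field written as "sample_rate": <integer>.
voice_sample_rate() {
    sed -n 's/.*"sample_rate"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Example use (not run here; the config path is an assumption):
#   rate=$(voice_sample_rate en_US-lessac-medium.onnx.json)
#   echo "Hi there." | piper --model en_US-lessac-medium --output-raw |
#       aplay -r "$rate" -f S16_LE -t raw -
```

The sed pattern is deliberately simple and is not a JSON parser; if the config layout differs from the assumption, a real JSON tool would be the safer choice.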

Description

piper is an offline neural text-to-speech engine that runs VITS voice models exported to ONNX. Installing the piper-tts Python package provides the piper command, which reads text from standard input (or from the file given by --input-file) and writes 16-bit PCM audio either as a WAV file or as a raw stream on standard output. Voices are distributed separately and are downloaded with python3 -m piper.download_voices. Models cover many languages and accents, and several are multi-speaker; use --speaker to pick a voice. Phonemization is performed by an embedded espeak-ng, and inline phoneme overrides written in double brackets (for example `[[ bˈætmæn ]]`) are accepted for fine-grained pronunciation control. Output quality and latency depend on the model variant (_x_low_, _low_, _medium_, _high_). For interactive use, piper is typically wrapped behind a daemon so that the model is loaded once instead of on every invocation.
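
Because piper reads text on standard input and writes one WAV per invocation, a text file can be turned into one audio file per line with a small loop. This is a sketch only: synth_lines and the line-NNN.wav naming are hypothetical, and every iteration starts piper again and reloads the model, which is exactly the cost a long-running daemon avoids.

```shell
# Hypothetical helper: synthesize each non-empty line of a text file into its own WAV.
# Usage: synth_lines VOICE TEXT_FILE OUT_DIR
# Prints the number of files written.
synth_lines() {
    model=$1 input=$2 outdir=$3
    mkdir -p "$outdir" || return 1
    n=0
    while IFS= read -r line || [ -n "$line" ]; do
        [ -n "$line" ] || continue          # skip blank lines
        n=$((n + 1))
        out=$(printf '%s/line-%03d.wav' "$outdir" "$n")
        # piper takes the text on stdin and writes a single WAV file.
        printf '%s\n' "$line" | piper --model "$model" --output-file "$out" || return 1
    done < "$input"
    echo "$n"
}

# Example use (not run here):
#   synth_lines en_US-lessac-medium script.txt wav-out
```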

Parameters

-m, --model _voice_
Voice model identifier (e.g. _en_US-lessac-medium_) or path to an ONNX file.
-f, --output-file _file_
Write the synthesized WAV audio to _file_.
--output-raw
Write raw 16-bit PCM audio to stdout (for piping into a player).
--input-file _file_
Read input text from _file_ instead of stdin.
--data-dir _dir_
Directory where downloaded voice files are stored.
--speaker _n_
Select speaker index for multi-speaker voices.
--sentence-silence _seconds_
Seconds of silence inserted between sentences.
--volume _factor_
Output volume multiplier (1.0 is unchanged).
--cuda
Use the CUDA execution provider (requires onnxruntime-gpu).
--json-input
Read JSON objects from stdin instead of plain text.
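
Several of the options above can be combined in one call. The wrapper below is an illustrative sketch: narrate_file is a hypothetical name, and the values 0.5 and 0.8 are arbitrary choices, not recommended defaults. It reads text from a file, lengthens the pause between sentences, lowers the volume, and looks for voices in a chosen directory.

```shell
# Hypothetical wrapper: read a text file into a WAV with longer pauses
# between sentences and reduced volume.
# Usage: narrate_file VOICE TEXT_FILE OUT_WAV [VOICE_DIR]
narrate_file() {
    voice=$1 text=$2 wav=$3 voices=${4:-.}   # VOICE_DIR defaults to the current directory
    piper --model "$voice" \
          --data-dir "$voices" \
          --input-file "$text" \
          --sentence-silence 0.5 \
          --volume 0.8 \
          --output-file "$wav"
}

# Example use (not run here):
#   narrate_file en_US-lessac-medium notes.txt notes.wav ~/piper-voices
```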

FAQ

What is the piper-tts command used for?

piper-tts provides piper, an offline neural text-to-speech engine. It reads text from standard input or a file, synthesizes speech with a downloaded VITS voice model in ONNX format, and writes 16-bit PCM audio to a WAV file or to standard output for streaming.

How do I run a basic piper-tts example?

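Download a voice with `python3 -m piper.download_voices [en_US-lessac-medium]`, then synthesize a sentence with `echo "[Hello world.]" | piper --model [en_US-lessac-medium] --output-file [out.wav]`. Replace the bracketed placeholders with your own voice name, text, and file names.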

What does -m, --model _voice_ do in piper-tts?

It selects the voice to use: either a voice model identifier (e.g. _en_US-lessac-medium_) or the path to an ONNX file.