Linux command

faster-whisper command

After copying a command, replace the file names, directories, or parameters as needed.

Common examples

Transcribe an audio file

faster-whisper [audio.mp3]

Transcribe with a specific model

faster-whisper [audio.mp3] --model [large-v3]

Transcribe with a language hint

faster-whisper [audio.mp3] --language [en]

Output as SRT subtitles

faster-whisper [audio.mp3] --output_format srt

Translate to English

faster-whisper [audio.mp3] --task translate

Save output to a directory

faster-whisper [audio.mp3] --output_dir [/path/to/output]

Transcribe with word timestamps

faster-whisper [audio.mp3] --word_timestamps true
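To make the SRT example above more concrete, here is a minimal Python sketch of how timed segments map onto SRT cues. The helper names `format_srt_timestamp` and `segments_to_srt` are hypothetical and written for illustration; they are not part of faster-whisper or any wrapper, and the exact text a given CLI writes may differ in details such as line wrapping.

```python
def format_srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Render (start, end, text) tuples as numbered SRT cues."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{format_srt_timestamp(start)} --> {format_srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(cues)


if __name__ == "__main__":
    # Made-up segments, only to show the layout.
    demo = [(0.0, 2.5, " Hello there."), (2.5, 5.0, " Welcome back.")]
    print(segments_to_srt(demo))
```

Each cue is an index, a `start --> end` line with millisecond precision, and the text, which is the layout subtitle players expect from an `.srt` file.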

Description

faster-whisper is a reimplementation of OpenAI's Whisper on CTranslate2, a fast inference engine for Transformer models. It transcribes up to 4x faster than the original Whisper while using less memory. All Whisper model sizes are supported; larger models are more accurate but slower.

The compute type controls numeric precision: int8 is the fastest and most memory-efficient, float16 is a good balance on GPU, and float32 gives the highest precision. Voice activity detection (VAD) filtering skips silent sections, which speeds up transcription and can improve accuracy. The language is detected automatically, but specifying it avoids the detection overhead.

The base library is installed via `pip install faster-whisper` and provides the Python API only. For CLI usage, install a wrapper such as `pip install faster-whisper-cli` or `pip install whisper-ctranslate2`; the executable name depends on the wrapper (whisper-ctranslate2, for example, installs a `whisper-ctranslate2` command). Pre-converted CTranslate2 models for the standard sizes are downloaded automatically on first use, so no manual conversion is needed. GPU acceleration requires an NVIDIA GPU with the CUDA libraries installed.
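Since the base package exposes only a Python API, a short sketch of that API may help. It assumes `faster-whisper` is installed and uses its `WhisperModel` class and `transcribe` method; the helper `pick_compute_type` and the wrapper function `transcribe` are hypothetical names introduced here, and the compute-type choice simply follows the guidance in the description rather than any library default.

```python
from typing import List, Optional, Tuple


def pick_compute_type(device: str) -> str:
    """Pick a precision per the guidance above (illustrative helper)."""
    # int8 keeps memory use low on CPU; float16 is the usual GPU balance.
    return "float16" if device == "cuda" else "int8"


def transcribe(
    audio_path: str,
    model_size: str = "small",
    device: str = "cpu",
    language: Optional[str] = None,
) -> Tuple[str, List[Tuple[float, float, str]]]:
    """Return (language, [(start, end, text), ...]) for one audio file."""
    # Imported lazily so pick_compute_type works without the package.
    from faster_whisper import WhisperModel

    model = WhisperModel(
        model_size, device=device, compute_type=pick_compute_type(device)
    )
    # transcribe() returns a lazy generator plus an info object;
    # decoding only happens while the segments are iterated.
    segments, info = model.transcribe(
        audio_path, language=language, beam_size=5, vad_filter=True
    )
    return info.language, [(s.start, s.end, s.text) for s in segments]
```

Calling `transcribe("audio.mp3", language="en")` would return the language code and a list of timed segments. Passing `language=None` lets the library detect the language, at the cost of the detection step mentioned above.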

Parameters

--model _SIZE_
Model size: tiny, base, small, medium, large-v1, large-v2, large-v3 (default: small).
--language _LANG_
Language code (en, de, fr, etc.). Omit to auto-detect.
--task _TASK_
Task: transcribe or translate.
--output_format _FORMAT_
Output format: txt, vtt, srt, tsv, json, all.
--output_dir _DIR_
Output directory for results.
--word_timestamps _BOOL_
Include word-level timestamps.
--device _DEVICE_
Device: cpu, cuda, auto (default: auto).
--compute_type _TYPE_
Compute type: int8, float16, float32 (default: int8 on CPU).
--beam_size _N_
Beam search size (default: 5).
--vad_filter _BOOL_
Enable voice activity detection filter (uses Silero VAD).
--initial_prompt _TEXT_
Optional text to provide as initial prompt for the decoder.
--threads _N_
Number of CPU threads.
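When the command is driven from a script, the flags above can be assembled programmatically. The sketch below is one possible way to do that in Python. It assumes the installed wrapper exposes a `faster-whisper` executable that accepts exactly the flags listed above; `build_command` is a hypothetical helper, not part of any package.

```python
import shlex


def build_command(audio, *, model=None, language=None, task=None,
                  output_format=None, output_dir=None,
                  word_timestamps=False, vad_filter=False):
    """Assemble an argv list using only the flags documented above."""
    cmd = ["faster-whisper", audio]
    options = {
        "--model": model,
        "--language": language,
        "--task": task,
        "--output_format": output_format,
        "--output_dir": output_dir,
    }
    # Only pass flags the caller actually set, so CLI defaults apply.
    for flag, value in options.items():
        if value is not None:
            cmd += [flag, str(value)]
    # Boolean flags take an explicit value, as in the examples above.
    if word_timestamps:
        cmd += ["--word_timestamps", "true"]
    if vad_filter:
        cmd += ["--vad_filter", "true"]
    return cmd


if __name__ == "__main__":
    argv = build_command("talk.mp3", model="large-v3", output_format="srt")
    print(shlex.join(argv))
    # To execute it: subprocess.run(argv, check=True)
```

Building a list rather than a single string avoids shell quoting problems with file names that contain spaces; `shlex.join` is used only to print a copy-pasteable form.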

FAQ

What is the faster-whisper command used for?

It transcribes speech in audio files to text and can translate it to English. faster-whisper is a reimplementation of OpenAI's Whisper on CTranslate2, a fast inference engine for Transformer models, and runs up to 4x faster than the original Whisper while using less memory.

How do I run a basic faster-whisper example?

Run `faster-whisper [audio.mp3]` in a terminal, replacing `[audio.mp3]` with the path to your audio file, then add flags such as `--model` or `--language` as needed.

What does --model _SIZE_ do in faster-whisper?

It selects the Whisper model size: tiny, base, small, medium, large-v1, large-v2, or large-v3 (default: small). Larger models are more accurate but slower.