Linux command
faster-whisper command
After copying a command, replace the file names, directories, or parameters as needed.
Common examples
Transcribe an audio file
faster-whisper [audio.mp3]
Transcribe with a specific model
faster-whisper [audio.mp3] --model [large-v3]
Transcribe with language hint
faster-whisper [audio.mp3] --language [en]
Output as SRT subtitles
faster-whisper [audio.mp3] --output_format srt
Translate to English
faster-whisper [audio.mp3] --task translate
Save output to directory
faster-whisper [audio.mp3] --output_dir [/path/to/output]
Transcribe with word timestamps
faster-whisper [audio.mp3] --word_timestamps true
Description
faster-whisper is a reimplementation of OpenAI's Whisper built on CTranslate2, a fast inference engine for Transformer models. It transcribes up to 4x faster than the original Whisper while using less memory. All Whisper model sizes are supported; larger models are more accurate but slower.
The compute type controls numeric precision: int8 is the fastest and most memory-efficient, float16 is a good balance on GPU, and float32 gives the highest precision. Voice activity detection (VAD) filtering skips silent sections, which can improve both speed and accuracy. The language is detected automatically, but specifying it avoids the detection overhead.
The base library is installed via `pip install faster-whisper` and provides a Python API only. For CLI usage, install a wrapper such as `pip install faster-whisper-cli` or `pip install whisper-ctranslate2`. Models already converted to the CTranslate2 format are downloaded automatically on first use. GPU acceleration requires NVIDIA's CUDA libraries (cuBLAS and cuDNN).
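Since the base library only exposes a Python API, the description above can also be tried without a CLI wrapper. The following is a minimal sketch, not a reference implementation: `pick_compute_type` and `transcribe` are hypothetical helper names, and the device-to-precision mapping simply mirrors the guidance above (int8 on CPU, float16 on GPU). It assumes `faster-whisper` is installed and that the audio path you pass exists.

```python
from __future__ import annotations

from typing import Optional


def pick_compute_type(device: str) -> str:
    """Map a device name to the compute type suggested in the description."""
    # Assumption: float16 on CUDA, int8 everywhere else (fastest, least memory).
    return "float16" if device == "cuda" else "int8"


def transcribe(
    audio_path: str,
    model_size: str = "small",
    device: str = "cpu",
    language: Optional[str] = None,
) -> list[tuple[float, float, str]]:
    """Transcribe one file and return (start, end, text) tuples."""
    # Imported lazily so pick_compute_type works without the library installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(
        model_size,
        device=device,
        compute_type=pick_compute_type(device),
    )
    # vad_filter=True skips silent sections; language=None lets the model
    # detect the language, at the cost of a detection pass.
    segments, info = model.transcribe(
        audio_path,
        language=language,
        vad_filter=True,
        beam_size=5,
    )
    print(f"Language: {info.language} (p={info.language_probability:.2f})")
    # `segments` is a generator: the transcription runs as it is consumed.
    return [(seg.start, seg.end, seg.text) for seg in segments]
```

Calling `transcribe("audio.mp3", language="en")` would return the timed segments for that file. Passing `device="cuda"` switches the sketch to float16, provided the GPU libraries are available.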
Parameters
- --model _SIZE_
  Model size: tiny, base, small, medium, large-v1, large-v2, large-v3 (default: small).
- --language _LANG_
  Language code (en, de, fr, etc.); auto-detected when not specified.
- --task _TASK_
  Task: transcribe or translate.
- --output_format _FORMAT_
  Output format: txt, vtt, srt, tsv, json, all.
- --output_dir _DIR_
  Output directory for results.
- --word_timestamps _BOOL_
  Include word-level timestamps.
- --device _DEVICE_
  Device: cpu, cuda, auto (default: auto).
- --compute_type _TYPE_
  Compute type: int8, float16, float32 (default: int8 on CPU).
- --beam_size _N_
  Beam search size (default: 5).
- --vad_filter _BOOL_
  Enable the voice activity detection filter (uses Silero VAD).
- --initial_prompt _TEXT_
  Optional text passed to the decoder as an initial prompt.
- --threads _N_
  Number of CPU threads.
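When the command is driven from a script, the flags above can be assembled programmatically. The sketch below is one possible approach under stated assumptions: `build_command` is a hypothetical helper, the executable name `faster-whisper` and the lowercase `true`/`false` spelling are taken from the examples on this page, and your installed wrapper may expect a different name or boolean spelling.

```python
import shlex


def build_command(audio, **options):
    """Return an argv list for the CLI, one `--flag value` pair per option."""
    argv = ["faster-whisper", str(audio)]
    for name, value in options.items():
        if value is None:
            continue  # unset options fall back to the CLI defaults
        if isinstance(value, bool):
            # Assumption: booleans are spelled as in the examples on this page.
            value = "true" if value else "false"
        argv += [f"--{name}", str(value)]
    return argv


cmd = build_command(
    "audio.mp3",
    model="large-v3",
    language="en",
    output_format="srt",
    vad_filter=True,
)
# Prints: faster-whisper audio.mp3 --model large-v3 --language en --output_format srt --vad_filter true
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` as is. Keeping it as a list rather than a string avoids quoting problems with paths that contain spaces.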
FAQ
How do I run a basic faster-whisper example?
Run `faster-whisper [audio.mp3]` in a terminal, replacing the placeholder with the path to your audio file, then add flags such as --model or --language as needed.