
Linux command

vcftools command

Files

After copying a command, replace the file names, directories, or parameters as needed.

Common examples

Filter a VCF file by chromosome

vcftools --vcf [input.vcf] --chr [chr1] --recode --out [output]

Calculate allele frequency

vcftools --vcf [input.vcf] --freq --out [output]

Extract specific individuals

vcftools --vcf [input.vcf] --keep [individuals.txt] --recode --out [output]

Filter by minimum quality score

vcftools --vcf [input.vcf] --minQ [30] --recode --out [output]

Calculate depth statistics

vcftools --vcf [input.vcf] --depth --out [output]

Filter by minor allele frequency

vcftools --vcf [input.vcf] --maf [0.05] --recode --out [output]

Read compressed VCF

vcftools --gzvcf [input.vcf.gz] --freq --out [output]
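
The `--keep` example above needs a text file with one sample ID per line. The sketch below shows one way to build it from the VCF header, assuming an uncompressed VCF whose `#CHROM` line lists sample IDs from column 10 onward. The file `example.vcf` and the sample names S1 to S3 are made up for illustration, and the vcftools call runs only if the tool is installed.

```shell
# Create a tiny example VCF (hypothetical data) so the snippet is self-contained.
printf '##fileformat=VCFv4.2\n' > example.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\n' >> example.vcf
printf 'chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t0/0\t1/1\n' >> example.vcf

# Sample IDs are columns 10 onward of the #CHROM header line.
grep -m1 '^#CHROM' example.vcf | cut -f10- | tr '\t' '\n' > all_samples.txt

# Keep the first two samples (edit this list as needed).
head -n 2 all_samples.txt > individuals.txt

# Run vcftools only if it is installed.
if command -v vcftools >/dev/null 2>&1; then
  vcftools --vcf example.vcf --keep individuals.txt --recode --out subset
fi
```

Editing `all_samples.txt` by hand and saving the result as `individuals.txt` works just as well; the same file format is accepted by `--remove`.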

Description

VCFtools is a suite of utilities for analyzing Variant Call Format (VCF) and Binary Call Format (BCF) files, the standard formats for storing genomic sequence variations. It provides comprehensive tools for filtering, manipulating, and computing statistics from variant data. The tool supports filtering variants by quality scores, allele frequencies, missing data, genomic regions, and individual samples. It calculates population genetics statistics including allele frequencies, nucleotide diversity, Fst, linkage disequilibrium, and relatedness measures. VCFtools can convert between formats, compare VCF files, and extract subsets of data for downstream analysis. Output files use the prefix specified by --out with appropriate extensions for each analysis type.
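
As a rough illustration of the filtering and output naming described above, several filters can be combined in one invocation. The sketch assumes a small made-up file `demo.vcf` and a hypothetical prefix `demo_filtered`; with `--recode`, the filtered VCF is expected at `<prefix>.recode.vcf`, and vcftools also writes a log to `<prefix>.log`.

```shell
# Tiny hypothetical VCF so the snippet runs on its own.
{
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n'
  printf 'chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t0/0\n'
  printf 'chr1\t200\t.\tC\tT\t10\tPASS\t.\tGT\t0/1\t1/1\n'
  printf 'chr2\t300\t.\tG\tA\t60\tPASS\t.\tGT\t0/1\t0/1\n'
} > demo.vcf

prefix=demo_filtered
# --recode writes the filtered VCF to <prefix>.recode.vcf
recoded="${prefix}.recode.vcf"

if command -v vcftools >/dev/null 2>&1; then
  # Keep chr1 sites that pass the quality threshold, retaining all INFO fields.
  vcftools --vcf demo.vcf --chr chr1 --minQ 30 \
    --recode --recode-INFO-all --out "$prefix"
  # Count the variant records that survived the filters.
  grep -vc '^#' "$recoded"
else
  echo "vcftools not found; would write $recoded"
fi
```

Filters given together are applied jointly, so a site must satisfy every one of them to appear in the recoded file.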

Parameters

--vcf _file_
Input VCF file (v4.0, v4.1, or v4.2).
--gzvcf _file_
Input compressed (gzipped) VCF file.
--bcf _file_
Input BCF2 format file.
--out _prefix_
Output file prefix. Results are written to prefix.extension.
--recode
Output a new VCF file after applying filters.
--recode-INFO-all
Retain all INFO fields in recoded output.
--chr _name_
Process only variants on specified chromosome.
--keep _file_
Retain only individuals listed in file (one ID per line).
--remove _file_
Remove individuals listed in file.
--maf _float_
Keep only sites with a minor allele frequency greater than or equal to this value.
--minQ _float_
Keep only sites with a quality (QUAL) value at or above this threshold.
--freq
Calculate allele frequencies.
--depth
Calculate mean depth per individual.
--relatedness
Calculate pairwise relatedness statistics.
--hap-r2
Calculate linkage disequilibrium statistics using phased haplotypes.
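
The per-individual output of `--depth` can feed back into `--remove`. The sketch below assumes the `.idepth` file is tab-separated with the columns `INDV`, `N_SITES`, and `MEAN_DEPTH`; the file contents and the threshold of 10 are made up for illustration, so check the header of a real output file before relying on the column positions.

```shell
# Hypothetical .idepth content; a real file would come from:
#   vcftools --vcf input.vcf --depth --out output
printf 'INDV\tN_SITES\tMEAN_DEPTH\n' > output.idepth
printf 'S1\t1000\t24.5\n' >> output.idepth
printf 'S2\t1000\t6.2\n' >> output.idepth
printf 'S3\t998\t18.9\n' >> output.idepth

# Skip the header and list individuals whose mean depth is below 10
# (the threshold is arbitrary and only for illustration).
awk -F'\t' 'NR > 1 && $3 < 10 { print $1 }' output.idepth > low_depth.txt
cat low_depth.txt   # prints: S2
```

The resulting `low_depth.txt` has one ID per line, so it can be passed to `--remove` in a later filtering run.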

FAQ

What is the vcftools command used for?

VCFtools analyzes Variant Call Format (VCF) and Binary Call Format (BCF) files: it filters variants and samples, computes population genetics statistics, and writes results to files named after the --out prefix.

How do I run a basic vcftools example?

Run `vcftools --vcf [input.vcf] --chr [chr1] --recode --out [output]` in a terminal, then adjust the file names, chromosome, and flags for your data.

What does --vcf _file_ do in vcftools?

It specifies the uncompressed input VCF file (v4.0, v4.1, or v4.2) to read.