tabix command

Copy a command, then replace the bracketed file names, directories, or parameters as needed.

Common examples

Index a VCF file

tabix -p vcf [file.vcf.gz]

Index a BED file

tabix -p bed [file.bed.gz]

Index a GFF file

tabix -p gff [file.gff.gz]

Query a region

tabix [file.vcf.gz] [chr1:1000000-2000000]

Query with header output

tabix -h [file.vcf.gz] [chr1:1000000-2000000]

List chromosomes in index

tabix -l [file.vcf.gz]

Query regions from file

tabix -R [regions.bed] [file.vcf.gz]

Create CSI index

tabix -C -p vcf [file.vcf.gz]
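The indexing commands above write the index next to the data file, using the data file name plus a .tbi or .csi suffix. A script that wants to avoid re-indexing could check for that file first. The following is only a sketch: the function name has_tabix_index is hypothetical and not part of tabix.

```shell
# Hypothetical helper (not part of tabix): succeed if a .tbi or .csi
# index sits next to the given data file.
has_tabix_index() {
    [ -f "$1.tbi" ] || [ -f "$1.csi" ]
}
```

One possible use is `has_tabix_index file.vcf.gz || tabix -p vcf file.vcf.gz`, which builds the index only when none is present.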

Description

tabix is a generic indexer for TAB-delimited genome position files. It builds an index that allows fast retrieval of the data lines overlapping a given genomic region. Input files must be sorted by position and compressed with bgzip. The index is written next to the data file with a .tbi or .csi suffix and gives random access to the compressed data without decompressing the whole file, which keeps region lookups practical on large genomic datasets in bioinformatics pipelines. Typical inputs are VCF variant files, BED annotation files, and GFF/GTF gene annotation files. Region queries use 1-based inclusive coordinates in the form chr:start-end.
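The position-sorting requirement can be met with standard tools before compression. The sketch below assumes the sequence name is in column 1 and the start position in column 2 (as in VCF and BED), and that header lines start with '#'. The function name sort_by_position is hypothetical.

```shell
# Hypothetical helper: position-sort a TAB-delimited file whose sequence
# name is in column 1 and whose start position is in column 2.
# Lines starting with '#' are treated as header lines and kept on top.
sort_by_position() {
    # Emit header/comment lines first, in their original order.
    grep '^#' "$1" || true
    # Then sort data lines by sequence name, then numerically by start.
    grep -v '^#' "$1" | LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2n
}
```

The sorted stream could then be compressed and indexed, for example with `sort_by_position in.vcf | bgzip -c > in.sorted.vcf.gz` followed by `tabix -p vcf in.sorted.vcf.gz`. For GFF-style files, where the start is in column 4, the sort keys would need adjusting.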

Options

-p, --preset _format_
Input format preset for indexing: gff, bed, sam, or vcf (default: gff). Do not combine with -s, -b, -e, -c, or -0.
-s, --sequence _col_
Column of sequence name (default: 1).
-b, --begin _col_
Column of start position (default: 4).
-e, --end _col_
Column of end position (default: 5).
-S, --skip-lines _n_
Skip the first n lines of the data file (default: 0).
-c, --comment _char_
Skip lines starting with character (default: #).
-0, --zero-based
Positions are 0-based half-open.
-C, --csi
Create CSI index instead of TBI.
-f, --force
Overwrite existing index.
-h, --print-header
Print header lines with output.
-H, --only-header
Print only header/meta lines.
-l, --list-chroms
List sequence names stored in the index file.
-r, --reheader _file_
Replace the header with the content of file.
-R, --regions _file_
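Restrict output to the regions listed in file, which can be a BED file or a TAB-delimited file with CHROM, POS, and optional POS_TO columns (1-based, inclusive).
-T, --targets _file_
Similar to -R, but the entire input is read sequentially and regions not listed in file are skipped.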
-m, --min-shift _INT_
Set minimal interval size for CSI indices to 2^INT (default: 14).
-D
Do not download index file before opening (remote files only).
--separate-regions
When several regions are queried, insert a comment line with the region name before each group of records in the output.
--cache _INT_
Set BGZF block cache size in megabytes (default: 10).
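The -0 option matters because BED-style files store intervals as 0-based half-open, while region strings on the command line are 1-based inclusive. The BED interval chr1 999 2000 therefore corresponds to the region chr1:1000-2000. The sketch below shows that conversion with awk. The function name bed_to_regions is hypothetical, and it assumes a plain three-or-more-column BED file without track or browser lines.

```shell
# Hypothetical helper: turn BED intervals (0-based, half-open) into
# region strings in the 1-based inclusive chr:start-end form.
# Lines starting with '#' and lines with fewer than 3 fields are skipped.
bed_to_regions() {
    awk -F '\t' '!/^#/ && NF >= 3 { printf "%s:%d-%d\n", $1, $2 + 1, $3 }' "$1"
}
```

This is for illustration only: when the regions already live in a BED file, passing it with -R avoids the manual conversion.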

FAQ

What is the tabix command used for?

tabix indexes position-sorted, bgzip-compressed TAB-delimited genome files such as VCF, BED, and GFF/GTF, so that the lines overlapping a region can be retrieved without decompressing the whole file.

How do I run a basic tabix example?

Run `tabix -p vcf [file.vcf.gz]` in a terminal, replacing the bracketed placeholder with the path to your bgzip-compressed VCF file. The index is written next to the input with a .tbi suffix.

What does -p, --preset _format_ do in tabix?

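It selects the input format used for indexing, which sets the sequence, start, and end columns for you. Valid values are gff, bed, sam, and vcf.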