mashtree command
These commands involve pipes, overwriting, or deletion; confirm paths and arguments before running them.
Common examples
Fastest method
mashtree --numcpus [12] [*.fastq.gz] [*.fasta] > [mashtree.dnd]
Most accurate method
mashtree --mindepth 0 --numcpus [12] [*.fastq.gz] [*.fasta] > [mashtree.dnd]
Add bootstrap confidence values (options after the bare -- are passed on to mashtree)
mashtree_bootstrap.pl --reps [100] --numcpus [12] [*.fastq.gz] -- --min-depth 0 > [mashtree.bootstrap.dnd]
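Each example above redirects stdout with `>`, which truncates the target file before the command runs, so a failed run can destroy an earlier tree. The following is a minimal sketch of one way to guard against that; `write_tree_safely` is a hypothetical helper, not part of mashtree.

```shell
# write_tree_safely OUTPUT COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND with stdout captured in a temporary file
# and moves it to OUTPUT only if the command succeeded and OUTPUT does not
# already exist.
write_tree_safely() {
  out=$1
  shift
  if [ -e "$out" ]; then
    echo "refusing to overwrite existing file: $out" >&2
    return 1
  fi
  tmp=$(mktemp "${out}.XXXXXX") || return 1
  if "$@" > "$tmp"; then
    mv "$tmp" "$out"
  else
    rm -f "$tmp"
    echo "command failed; $out was not created" >&2
    return 1
  fi
}

# Example call (assumes mashtree is on PATH and matching input files exist):
# write_tree_safely mashtree.dnd mashtree --numcpus 12 *.fastq.gz *.fasta
```

The temporary file sits next to the target, so the final `mv` stays on one filesystem.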
Description
mashtree rapidly builds distance trees from genome sequences using MinHash sketching. It estimates pairwise distances between genomes from their shared k-mers, using Mash internally for the sketch-based distance calculation, and then constructs a neighbor-joining tree. Because it compares sketches rather than full sequences, it is suitable for thousands of genomes. Input can be FASTA or FASTQ, optionally gzip-compressed (.gz). Output is in Newick format (.dnd), compatible with tree visualization tools. Note that mashtree creates distance trees, not phylogenetic trees: they show similarity relationships, not evolutionary history.
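To make the distance step more concrete, here is a rough sketch of how Mash turns an estimated Jaccard index j (the fraction of k-mers two sketches share) into a distance: D = -1/k * ln(2j / (1 + j)). The function name `mash_distance` is hypothetical. The default k = 21 is Mash's default k-mer size, and treating it as the value used in a given mashtree run is an assumption.

```shell
# mash_distance JACCARD [K]
# Hypothetical helper: prints the Mash distance for an estimated Jaccard index.
# K defaults to 21 (Mash's default k-mer size; assumed here).
mash_distance() {
  awk -v j="$1" -v k="${2:-21}" 'BEGIN {
    if (j <= 0) { print "1.000000"; exit }   # no shared k-mers: maximum distance
    d = -1 / k * log(2 * j / (1 + j))        # log() is the natural logarithm
    if (d > 1) d = 1                         # clamp to the [0, 1] range
    if (d <= 0) d = 0                        # also normalizes negative zero
    printf "%.6f\n", d
  }'
}

mash_distance 0.5 21   # prints 0.019308
mash_distance 1        # identical sketches: prints 0.000000
```

Sketches that share half their k-mers are therefore only about 0.02 apart, which is why small Jaccard differences still separate closely related genomes.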
Parameters
- --numcpus _n_: Number of CPU threads to use for parallel processing
- --mindepth _n_: Minimum k-mer depth; k-mers seen fewer times are discarded. With 0, mashtree picks the cutoff itself, which is slower but most accurate
- --genomesize _size_: Estimated genome size, used for sketch calculations
- --truncLength _n_: Number of characters of each input file name to keep
- --outtree _file_: Write the tree to this file instead of stdout
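The options above can be combined in one call. The sketch below assembles such a command and, when `DRY_RUN` is set, prints it instead of running it. `run_mashtree`, `DRY_RUN`, and `NUMCPUS` are hypothetical names introduced for illustration, and the genome size of 5000000 is only an example value.

```shell
# run_mashtree OUTTREE FILE...
# Hypothetical wrapper: builds a mashtree call from the options listed above.
# NUMCPUS overrides the detected thread count; DRY_RUN prints without running.
run_mashtree() {
  outtree=$1
  shift
  # Use nproc where available, otherwise fall back to a single thread.
  cpus=${NUMCPUS:-$(nproc 2>/dev/null || echo 1)}
  set -- mashtree --numcpus "$cpus" --mindepth 0 \
    --genomesize 5000000 --outtree "$outtree" "$@"
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$@"      # show the command for review
  else
    "$@"           # run it; requires mashtree on PATH
  fi
}

# Print the command that would be run for two example inputs:
DRY_RUN=1 run_mashtree tree.dnd sample1.fastq.gz sample2.fasta
```

Writing with --outtree rather than `>` avoids shell redirection in the call itself.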
FAQ
What is the mashtree command used for?
It builds a neighbor-joining distance tree from a set of genomes (FASTA or FASTQ, optionally gzip-compressed), using Mash's MinHash sketches to estimate pairwise distances, and writes the tree in Newick format. The result shows similarity relationships, not evolutionary history.
How do I run a basic mashtree example?
Run `mashtree --numcpus [12] [*.fastq.gz] [*.fasta] > [mashtree.dnd]` in a terminal, replacing the bracketed placeholders with your own thread count, input files, and output path.
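After a run, a quick sanity check on the Newick output can catch an empty or truncated file. The sketch below is one possible check: it verifies that parentheses balance and that a terminating semicolon is present, then reports the leaf count as commas plus one. `check_newick` and `count_char` are hypothetical helpers, and the count assumes leaf names contain no commas or parentheses.

```shell
# count_char CHAR FILE: number of times CHAR occurs in FILE.
count_char() { tr -cd "$1" < "$2" | wc -c | tr -d ' '; }

# check_newick FILE
# Hypothetical helper: basic structural check of a Newick tree file.
check_newick() {
  file=$1
  open=$(count_char '(' "$file")
  close=$(count_char ')' "$file")
  commas=$(count_char ',' "$file")
  if [ "$open" -ne "$close" ]; then
    echo "unbalanced parentheses"
    return 1
  fi
  if ! grep -q ';' "$file"; then
    echo "missing terminating semicolon"
    return 1
  fi
  # Every internal node with n children contributes n - 1 commas.
  echo "leaves: $((commas + 1))"
}

# Demonstration on a small hand-written tree (not real mashtree output):
sample=$(mktemp)
printf '(a:0.1,b:0.2,(c:0.3,d:0.4):0.5);\n' > "$sample"
check_newick "$sample"   # prints "leaves: 4"
rm -f "$sample"
```

The reported leaf count should match the number of input files given to mashtree.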
What does --numcpus _n_ do in mashtree?
It sets the number of CPU threads mashtree uses for parallel processing.