
Linux command

datamash command

Text

These commands involve pipes, overwriting, or deletion; confirm paths and arguments before running them.

Common examples

Example: compute the maximum, minimum, mean, and median of a single column of numbers

seq 3 | datamash max 1 min 1 mean 1 median 1

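Example: compute the mean of a column of decimal numbers in a locale whose decimal separator is a comma (tr converts each '.' to ',')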

echo -e '1.0\n2.5\n3.1' | tr '.' ',' | datamash mean 1

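Example: compute the mean with the output rounded to a given number of decimal places (replace [decimals] with a number)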

echo -e '1\n2\n3' | datamash -R [decimals] mean 1

Example: compute the mean while skipping NA and NaN values

echo -e '1\n2\nNa\n3\nNaN' | datamash --narm mean 1

Description

datamash performs basic numeric, textual, and statistical operations on tabular text from the command line. It is designed for quick data analysis that would otherwise require a script or statistical software, and supports operations such as sum, mean, median, standard deviation, and variance. Input is read from standard input; fields are separated by a tab by default, by whitespace with -W, or by a delimiter given with -t. The tool can group rows by one or more fields and compute aggregate statistics for each group, similar to SQL's GROUP BY. datamash is part of the GNU project and is well suited to one-liners in pipelines that explore CSV files, log data, or other tabular text. Its textual operations include counting unique values, collapsing a group's values into a list, and picking a random value from a group.
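The grouping behaviour described above can be sketched with a small pipeline. The sample rows are made up for illustration; the fields are tab-separated to match datamash's default, and -s sorts the input before grouping, which -g relies on.

```shell
# Hypothetical sample data: "name<TAB>score" rows, deliberately unsorted.
# -s      sort the input so rows of the same group are adjacent
# -g 1    group by field 1
# sum 2 mean 2    aggregate field 2 within each group
printf 'a\t1\nb\t5\na\t3\nb\t7\n' | datamash -s -g 1 sum 2 mean 2
```

For this input the output should contain one line per group: a with sum 4 and mean 2, and b with sum 12 and mean 6.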

Options

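-R, --round _digits_
Round numeric output to the specified number of decimal places
--narm
Skip NA and NaN values
-t, --field-separator _char_
Use _char_ as the field separator instead of a tab
-g, --group _fields_
Group by the specified comma-separated fields (the input must be sorted, or use -s)
-H, --headers
Treat the first input line as column headers and print a header line in the output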
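As a rough sketch of how these options combine, the pipeline below runs them together on a small comma-separated input. The column names and values are invented for illustration, and -s (sort before grouping) is an extra option not listed above.

```shell
# Hypothetical CSV with a header row; "NA" marks a missing score.
# -t,     use a comma as the field separator
# -H      first line is a header; also print a header in the output
# -s      sort the data rows before grouping
# -g 1    group by the first field (team)
# --narm  skip the NA value instead of failing on it
# -R 2    round the means to two decimal places
printf 'team,score\nred,1\nblue,4\nred,2\nblue,NA\nred,2\n' |
  datamash -t, -H -s -g 1 --narm -R 2 mean 2
```

Because the NA row is skipped, the mean for blue is computed from the single value 4, while the mean for red is computed from 1, 2, and 2.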
