Linux command
csvjoin command
Files
After copying, replace the file names, directories, or parameters as needed.
Common examples
Join two CSV files
csvjoin -c [id] [file1.csv] [file2.csv]
Join on different column names
csvjoin -c "[id1,id2]" [file1.csv] [file2.csv]
Perform left outer join
csvjoin --left -c [id] [file1.csv] [file2.csv]
Perform right outer join
csvjoin --right -c [id] [file1.csv] [file2.csv]
Perform full outer join
csvjoin --outer -c [id] [file1.csv] [file2.csv]
Join three files, naming the key column in each (one name per file, in file order)
csvjoin -c "[id1,id2,id3]" [file1.csv] [file2.csv] [file3.csv]
Description
csvjoin is part of csvkit and performs SQL-style joins on CSV files. It combines rows from two or more files by matching values in a key column, much like a JOIN in a database. The default is an inner join, which returns only rows whose key appears in every file. With two files, --left keeps every row of the first file, --right keeps every row of the second, and --outer keeps every row of both; fields with no matching data are left empty. When the key column has a different name in each file, pass the names to -c as a comma-separated list, one per file, in the order the files are given. That list is not a composite key: csvjoin matches on a single column per file.
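To make the join types concrete, here is a minimal Python sketch that uses only the standard library. It illustrates the inner, left, right, and outer semantics described above and is not csvkit's implementation. The function name `join_rows`, the sample data, and the column names are hypothetical, and the sketch assumes the two inputs share no column names.

```python
import csv
import io


def join_rows(left_text, right_text, left_key, right_key, how="inner"):
    """Join two CSV texts on one key column per file.

    how is one of "inner", "left", "right", "outer".
    Assumes the two inputs share no column names (a simplification).
    """
    left = list(csv.DictReader(io.StringIO(left_text)))
    right = list(csv.DictReader(io.StringIO(right_text)))
    left_cols = list(left[0]) if left else []
    right_cols = list(right[0]) if right else []

    # Index right-hand rows by key so each left row can find its matches.
    index = {}
    for row in right:
        index.setdefault(row[right_key], []).append(row)

    blank_left = dict.fromkeys(left_cols, "")
    blank_right = dict.fromkeys(right_cols, "")
    matched_keys = set()
    out = []

    for row in left:
        matches = index.get(row[left_key], [])
        if matches:
            matched_keys.add(row[left_key])
            out.extend({**row, **m} for m in matches)
        elif how in ("left", "outer"):
            # Unmatched left row: keep it and fill right-hand columns with "".
            out.append({**row, **blank_right})

    if how in ("right", "outer"):
        # Unmatched right rows: keep them and fill left-hand columns with "".
        for row in right:
            if row[right_key] not in matched_keys:
                out.append({**blank_left, **row})
    return out


# Hypothetical sample data: id 2 matches, id 1 and person_id 3 do not.
people = "id,name\n1,Ana\n2,Bo\n"
orders = "person_id,item\n2,pen\n3,ink\n"

for how in ("inner", "left", "right", "outer"):
    rows = join_rows(people, orders, "id", "person_id", how)
    print(how, [(r["id"], r["name"], r["person_id"], r["item"]) for r in rows])
```

With this sample data, the inner join yields one row, the left and right joins yield two rows each, and the outer join yields three. The row order of the sketch is a design shortcut and may differ from what csvjoin prints.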
Options
- -c _COLUMN_, --columns _COLUMN_
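- Column name or index to join on. Give one value if the key column is the same in every file, or a comma-separated list with one value per file, in file order, if it differs. If omitted, the files are joined sequentially, row by row, without matching.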
- --left
- Perform a left outer join (keep all rows from first file).
- --right
- Perform a right outer join (keep all rows from second file).
- --outer
- Perform a full outer join (keep all rows from both files).
- -d _CHAR_, --delimiter _CHAR_
- Field delimiter (default: comma).
- -e _ENCODING_, --encoding _ENCODING_
- Input file encoding.
- -I, --no-inference
- Disable type inference when parsing the input.
- -H, --no-header-row
- The input files have no header row.
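The way a comma-separated `-c` value lines up with the input files can be sketched in a few lines of Python. This is a hypothetical helper, `resolve_join_columns`, written for illustration only; it is not csvkit's code, and raising an error on a count mismatch is an assumption of the sketch.

```python
def resolve_join_columns(spec, file_count):
    """Map a -c style value to one join column per input file."""
    names = [name.strip() for name in spec.split(",")]
    if len(names) == 1:
        # A single name is used as the key column in every file.
        return names * file_count
    if len(names) != file_count:
        # Assumption of this sketch: reject a list that does not match the file count.
        raise ValueError("give one column, or exactly one column per file")
    return names


print(resolve_join_columns("id", 2))       # same key column in both files
print(resolve_join_columns("id1,id2", 2))  # id1 for the first file, id2 for the second
```

Read this way, a list of two names with two files selects one key column in each file rather than a two-column key.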