Linux command
parquet-tools command
Files
Copy a command, then replace file names, directories, or parameters as needed.
Common examples
Show schema
parquet-tools schema [file.parquet]
View data
parquet-tools cat [file.parquet]
Show metadata
parquet-tools meta [file.parquet]
View first N rows
parquet-tools head -n [10] [file.parquet]
Show row count
parquet-tools rowcount [file.parquet]
Convert to JSON
parquet-tools cat --json [file.parquet]
Show column index info
parquet-tools column-index [file.parquet]
Dump specific columns
parquet-tools cat --columns [col1,col2] [file.parquet]
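The commands above can be chained into a quick first look at a file. The following is a minimal shell sketch, not part of parquet-tools itself: the function name `inspect_parquet` and the `PARQUET_TOOLS` variable are hypothetical, and it assumes the installed parquet-tools build provides the `schema`, `rowcount`, and `head -n` forms listed on this page.

```shell
# Hypothetical helper; PARQUET_TOOLS lets you point at a differently named binary.
PARQUET_TOOLS="${PARQUET_TOOLS:-parquet-tools}"

# Usage: inspect_parquet FILE [ROWS]
# Prints the schema, the row count, and the first ROWS rows (default 10).
inspect_parquet() {
    if [ ! -f "$1" ]; then
        echo "error: '$1' is not a regular file" >&2
        return 1
    fi
    # Stop at the first subcommand that fails.
    "$PARQUET_TOOLS" schema "$1" &&
        "$PARQUET_TOOLS" rowcount "$1" &&
        "$PARQUET_TOOLS" head -n "${2:-10}" "$1"
}
```

After sourcing the snippet, a call such as `inspect_parquet file.parquet 5` runs the three subcommands in order and stops if any of them fails.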
Description
parquet-tools inspects Apache Parquet files. Parquet is a columnar storage format used in big data systems. The schema subcommand shows column names, types, and nesting, so you can understand a file's structure without reading its data. The cat and head subcommands print the actual rows, and JSON output makes them easy to pass to other tools. The meta subcommand reports compression, encoding, and statistics, along with the row groups and column chunks that make up the file's physical layout. Files written by Spark, Hive, and other systems can be examined the same way, which makes the tool useful for debugging data pipelines.
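When debugging a pipeline, it is often useful to compare row counts across all output files of a job. The sketch below loops over a directory and prints whatever the `rowcount` subcommand reports for each file. The function name `rowcounts_in_dir` and the `PARQUET_TOOLS` variable are hypothetical, and the exact text printed by `rowcount` depends on the installed parquet-tools build.

```shell
# Hypothetical helper; PARQUET_TOOLS lets you point at a differently named binary.
PARQUET_TOOLS="${PARQUET_TOOLS:-parquet-tools}"

# Usage: rowcounts_in_dir [DIR]
# Prints "<file>: <rowcount output>" for every .parquet file in DIR (default: .).
rowcounts_in_dir() {
    dir="${1:-.}"
    for f in "$dir"/*.parquet; do
        # Skip the unexpanded pattern when the directory has no matches.
        [ -f "$f" ] || continue
        printf '%s: %s\n' "$f" "$("$PARQUET_TOOLS" rowcount "$f")"
    done
}
```

Files are visited in the shell's glob order, so the output can be compared between runs with a plain text diff.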
Parameters
- cat: Print file contents.
- head: Print the first rows.
- schema: Show the schema.
- meta: Show file metadata.
- rowcount: Count rows.
- column-index: Show column index information.
- -n _N_: Number of rows to print.
- --json: Print output as JSON.
- --columns _COLS_: Comma-separated list of columns to print.
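The `--json` flag pairs naturally with shell redirection when the data should be saved for another tool. The following is a minimal sketch that wraps `cat --json` and writes the result to a file. The function name `parquet_to_json` and the `PARQUET_TOOLS` variable are hypothetical, and the shape of the JSON output is whatever the installed parquet-tools build produces.

```shell
# Hypothetical helper; PARQUET_TOOLS lets you point at a differently named binary.
PARQUET_TOOLS="${PARQUET_TOOLS:-parquet-tools}"

# Usage: parquet_to_json INPUT.parquet OUTPUT.json
parquet_to_json() {
    if [ "$#" -ne 2 ]; then
        echo "usage: parquet_to_json INPUT OUTPUT" >&2
        return 2
    fi
    # Write to a temporary file first so a failed run leaves no partial output.
    tmp_out="$2.tmp.$$"
    if "$PARQUET_TOOLS" cat --json "$1" > "$tmp_out"; then
        mv "$tmp_out" "$2"
    else
        rm -f "$tmp_out"
        return 1
    fi
}
```

Writing to a temporary file and renaming it on success means downstream tools never see a half-written JSON file.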
FAQ
What is the parquet-tools command used for?
parquet-tools inspects Apache Parquet files, a columnar storage format used in big data systems. It prints a file's schema, metadata, row count, and data, which helps when debugging data pipelines that read or write Parquet, for example with Spark or Hive.
How do I run a basic parquet-tools example?
Run `parquet-tools schema [file.parquet]` in a terminal, replacing `[file.parquet]` with the path to your own file.
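If the command is not found, the tool is probably not installed or not on `PATH`. The sketch below checks for the binary before running the basic example. The function name `show_schema` and the `PARQUET_TOOLS` variable are hypothetical; only the `schema` subcommand comes from this page.

```shell
# Hypothetical helper; PARQUET_TOOLS lets you point at a differently named binary.
PARQUET_TOOLS="${PARQUET_TOOLS:-parquet-tools}"

# Usage: show_schema FILE
show_schema() {
    # command -v succeeds only if the shell can resolve the name.
    if ! command -v "$PARQUET_TOOLS" >/dev/null 2>&1; then
        echo "error: $PARQUET_TOOLS not found on PATH" >&2
        return 127
    fi
    "$PARQUET_TOOLS" schema "$1"
}
```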
What does cat do in parquet-tools?
`cat` prints the contents of the file, that is, its rows of data. Add `--json` to print them as JSON.