Linux command

aws-glue command

Files

After copying a command, replace the file names, directories, or parameters as needed.

Common examples

Create a crawler

aws glue create-crawler --name [my-crawler] --role [arn:aws:iam::account:role/glue-role] --database-name [my-database] --targets 'S3Targets=[{Path=s3://my-bucket/data/}]'

Start a crawler

aws glue start-crawler --name [my-crawler]
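
start-crawler returns as soon as the crawl is requested, not when it finishes. The following is a minimal sketch of waiting for completion, assuming the crawler reports the READY state once the run is over. The function name wait_for_crawler and the 15-second default interval are illustrative choices, not part of the AWS CLI.

```shell
# Poll a crawler until it returns to READY (hypothetical helper, not an AWS CLI command).
# Usage: wait_for_crawler <crawler-name> [poll-seconds]
wait_for_crawler() {
  local name="$1" interval="${2:-15}" state
  while true; do
    # get-crawler reports Crawler.State as READY, RUNNING, or STOPPING
    state=$(aws glue get-crawler --name "$name" \
      --query 'Crawler.State' --output text) || return 1
    echo "crawler $name: $state"
    [ "$state" = "READY" ] && return 0
    sleep "$interval"
  done
}
```

Called as `wait_for_crawler my-crawler` right after start-crawler, it prints one status line per poll and returns once the crawler is idle again. It has no timeout, so a wrapper that limits total wait time may be worth adding.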

List all databases

aws glue get-databases

Get table schema

aws glue get-table --database-name [my-database] --name [my-table]
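
get-table returns the full table definition as JSON, which is often more than needed when only the column list is of interest. A small sketch that narrows the output with the CLI's --query option, assuming the columns live under Table.StorageDescriptor.Columns; the helper name glue_table_columns is made up for this example.

```shell
# Print "name<TAB>type" for each column of a catalog table (hypothetical helper).
# Usage: glue_table_columns <database> <table>
glue_table_columns() {
  aws glue get-table --database-name "$1" --name "$2" \
    --query 'Table.StorageDescriptor.Columns[].[Name,Type]' \
    --output text
}
```

Partition columns are stored separately under Table.PartitionKeys, so this sketch does not list them.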

Create a Glue job

aws glue create-job --name [my-job] --role [arn:aws:iam::account:role/glue-role] --command Name=glueetl,ScriptLocation=s3://my-bucket/scripts/job.py

Start a job run

aws glue start-job-run --job-name [my-job] --arguments '{"--input_path":"s3://bucket/input","--output_path":"s3://bucket/output"}'
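
In scripts it is usually convenient to capture the run ID that start-job-run returns, since later status checks need it. A minimal sketch, assuming the job reads the same --input_path and --output_path arguments as the example above and that the paths contain no quotes or backslashes; the helper name start_glue_job is hypothetical.

```shell
# Start a job run and print its run ID (hypothetical helper).
# Usage: start_glue_job <job-name> <input-path> <output-path>
# Assumes the paths contain no double quotes or backslashes,
# because they are pasted into a JSON string without escaping.
start_glue_job() {
  local job="$1" input="$2" output="$3"
  aws glue start-job-run --job-name "$job" \
    --arguments "{\"--input_path\":\"$input\",\"--output_path\":\"$output\"}" \
    --query 'JobRunId' --output text
}
```

For example, `run_id=$(start_glue_job my-job s3://bucket/input s3://bucket/output)` stores the ID for later use.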

Get job run status

aws glue get-job-run --job-name [my-job] --run-id [jr_abc123]
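
get-job-run reports the state at a single moment, so waiting for a run to finish means polling. The loop below is one possible sketch; the helper name wait_for_job_run and the 30-second default interval are assumptions, and the treatment of every terminal state other than SUCCEEDED as a failure is a choice made for this example.

```shell
# Poll a job run until it reaches a terminal state (hypothetical helper).
# Usage: wait_for_job_run <job-name> <run-id> [poll-seconds]
# Returns 0 on SUCCEEDED, 1 on any other terminal state or on a CLI error.
wait_for_job_run() {
  local job="$1" run_id="$2" interval="${3:-30}" state
  while true; do
    state=$(aws glue get-job-run --job-name "$job" --run-id "$run_id" \
      --query 'JobRun.JobRunState' --output text) || return 1
    echo "$job $run_id: $state"
    case "$state" in
      SUCCEEDED) return 0 ;;
      FAILED|ERROR|TIMEOUT|STOPPED|EXPIRED) return 1 ;;
    esac
    sleep "$interval"   # STARTING, RUNNING, STOPPING, WAITING: keep polling
  done
}
```

Because the exit status reflects the outcome, the helper can gate later steps, for example `wait_for_job_run my-job jr_abc123 && echo "run finished"`.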

Description

aws glue is the AWS CLI interface for AWS Glue, a serverless data integration service for ETL (extract, transform, load) workloads. Glue discovers, prepares, and combines data for analytics, machine learning, and application development. Key components include the Data Catalog (central metadata repository), Crawlers (automatic schema discovery), Jobs (ETL scripts in Python or Scala), and Triggers (job orchestration). Glue integrates with S3, Redshift, RDS, and other data stores.

FAQ

How do I run a basic aws-glue example?

Run `aws glue create-crawler --name [my-crawler] --role [arn:aws:iam::account:role/glue-role] --database-name [my-database] --targets 'S3Targets=[{Path=s3://my-bucket/data/}]'` in a terminal, replacing each bracketed placeholder (crawler name, role ARN, database name) and the S3 path with your own values.

Where can I find more aws-glue examples?

This page includes 7 examples for aws-glue, plus related commands for nearby Linux tasks.