airflow command
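After copying a command, replace the bracketed placeholders (for example [dag_id]) with your own file names, directories, or parameters.
Common examples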
Start the Airflow scheduler
airflow scheduler
Start the web server
airflow webserver --port [8080]
List all DAGs
airflow dags list
Trigger a DAG run
airflow dags trigger [dag_id]
Trigger a DAG run with a JSON configuration
airflow dags trigger [dag_id] --conf '{"key": "value"}'
Test a specific task
airflow tasks test [dag_id] [task_id] [execution_date]
Pause a DAG
airflow dags pause [dag_id]
Unpause a DAG
airflow dags unpause [dag_id]
List all DAG runs
airflow dags list-runs -d [dag_id]
Initialize or upgrade the metadata database
airflow db migrate
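The commands above can be chained into one manual run. The following is a minimal sketch, not an official Airflow script: `AIRFLOW_CMD` and `trigger_and_list` are hypothetical names introduced here, and setting `AIRFLOW_CMD=echo` prints the commands instead of executing them.

```shell
# AIRFLOW_CMD is a hypothetical override, not an Airflow setting.
# Set it to "echo" for a dry run that only prints the commands.
AIRFLOW_CMD="${AIRFLOW_CMD:-airflow}"

# Unpause a DAG, trigger one run with a JSON conf payload, then list its runs.
#   $1 = dag_id
#   $2 = JSON string passed to --conf (defaults to an empty object)
trigger_and_list() {
  dag_id="$1"
  conf="${2:-}"
  [ -n "$conf" ] || conf='{}'
  # Stop at the first failing step so a run is never triggered on a paused DAG.
  $AIRFLOW_CMD dags unpause "$dag_id" &&
    $AIRFLOW_CMD dags trigger "$dag_id" --conf "$conf" &&
    $AIRFLOW_CMD dags list-runs -d "$dag_id"
}
```

Call it as `trigger_and_list example_dag '{"key": "value"}'`, where `example_dag` stands in for a DAG ID that exists in your environment.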
Description
Apache Airflow is a platform for programmatically authoring, scheduling, and monitoring workflows. Workflows are defined in Python code as DAGs (Directed Acyclic Graphs) that describe how tasks are organized and executed. The scheduler triggers tasks according to their schedules and dependencies, while the web interface provides monitoring and manual intervention.
The CLI controls DAGs, tasks, connections, and the Airflow services. It also manages connections to external systems (databases, APIs, cloud services), variables used for configuration, and resource pools that limit task concurrency. The metadata database stores DAG runs, task states, and history.
A common workflow is to initialize the database with `airflow db migrate`, start the scheduler and web server, and use `airflow dags trigger` to start DAG runs manually. Individual tasks can be tested with `airflow tasks test` without affecting production state.
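The startup sequence described above can be sketched as a small shell function. This is an illustration under stated assumptions rather than a recommended deployment: `AIRFLOW_CMD` and `start_airflow` are hypothetical names, and the sketch assumes an Airflow release in which the `webserver` command and the `--daemon` flag are available.

```shell
# AIRFLOW_CMD is a hypothetical override, not an Airflow setting.
# Set it to "echo" for a dry run that only prints the commands.
AIRFLOW_CMD="${AIRFLOW_CMD:-airflow}"

# Migrate the metadata database, then start the scheduler and the web server
# as background daemons.
#   $1 = web server port (defaults to 8080)
start_airflow() {
  port="${1:-8080}"
  # Each step runs only if the previous one succeeded.
  $AIRFLOW_CMD db migrate &&
    $AIRFLOW_CMD scheduler --daemon &&
    $AIRFLOW_CMD webserver --daemon --port "$port"
}
```

Running `AIRFLOW_CMD=echo start_airflow 8080` is a quick way to review the three commands before executing them for real.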
Parameters
- scheduler: Start the Airflow scheduler daemon to trigger DAG runs
- webserver: Start the Airflow web interface server
- triggerer: Start the async trigger service for deferrable operators
- dags: Manage DAGs (list, trigger, pause, unpause, test, delete, backfill)
- tasks: Manage and test individual tasks (run, test, clear, list, render)
- db: Database operations (migrate, reset, clean, check, shell)
- connections: Manage connection configurations (add, delete, list, export, import)
- variables: Manage Airflow variables (get, set, delete, list, export, import)
- pools: Manage resource pools for task concurrency control
- users: Manage Airflow users (create, delete, list)
- config: View and manage configuration settings
- providers: Display information about installed providers
- info: Show system and environment information
- version: Display Airflow version
- -o, --output _format_: Output format: table, json, yaml, plain
- -v, --verbose: Enable verbose logging
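As a hedged illustration of the `--output` option, the sketch below wraps `airflow dags list` and rejects any format other than the four listed above. `AIRFLOW_CMD` and `list_dags` are hypothetical names used only for this example.

```shell
# AIRFLOW_CMD is a hypothetical override, not an Airflow setting.
# Set it to "echo" for a dry run that only prints the command.
AIRFLOW_CMD="${AIRFLOW_CMD:-airflow}"

# List DAGs in the requested output format.
#   $1 = table, json, yaml, or plain (defaults to table)
list_dags() {
  format="${1:-table}"
  case "$format" in
    table|json|yaml|plain)
      $AIRFLOW_CMD dags list --output "$format"
      ;;
    *)
      # Fail early instead of passing an unknown format to the CLI.
      echo "unsupported output format: $format" >&2
      return 2
      ;;
  esac
}
```

For example, `list_dags json` is convenient when the result is consumed by another program, while `list_dags` with no argument keeps the human-readable table.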
FAQ
What is the airflow command used for?
It is the command-line interface to Apache Airflow, a platform for programmatically authoring, scheduling, and monitoring workflows. Use it to manage DAGs, tasks, connections, variables, and pools, to run database operations such as `airflow db migrate`, and to start services such as the scheduler and web server.
How do I run a basic airflow example?
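Run `airflow db migrate` once to initialize the metadata database, then run `airflow scheduler` in a terminal. For the other commands, replace bracketed placeholders such as [dag_id] with values from your own environment.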
What does scheduler do in airflow?
It starts the Airflow scheduler daemon, which triggers DAG runs according to their schedules and dependencies.