Linux command
sdiag command
After copying a command, adjust its options as needed.
Common examples
Show scheduling diagnostic information
sdiag
Show diagnostics sorted by RPC total run time
sdiag -t
Show diagnostics sorted by RPC average run time
sdiag -T
Reset performance counters
sdiag -r
Output diagnostics as JSON
sdiag --json
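Because `sdiag -r` discards the accumulated counters, it can be useful to save a report before resetting. The following is a minimal sketch of that idea, not an official Slurm tool: the function name `snapshot_and_reset` and the output file naming are assumptions made for illustration, and only the `sdiag` and `sdiag -r` invocations listed above are used.

```shell
# snapshot_and_reset is a hypothetical helper name, not part of Slurm.
# Usage: snapshot_and_reset [output_dir]
snapshot_and_reset() {
    out_dir=${1:-.}
    out_file="$out_dir/sdiag-$(date +%Y%m%dT%H%M%S).txt"

    # Save the full report first. If sdiag fails, remove the partial file
    # and stop, so counters are never reset without a saved copy.
    if ! sdiag > "$out_file"; then
        rm -f "$out_file"
        return 1
    fi

    # Resetting needs Slurm operator or administrator privileges.
    sdiag -r > /dev/null || return 1

    # Print the path of the saved report.
    printf '%s\n' "$out_file"
}
```

Calling `snapshot_and_reset /var/tmp` would write the current report to a timestamped file in that directory and then reset the counters; the directory is only an example.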
Description
sdiag displays diagnostic information about slurmctld, the Slurm controller daemon. It shows performance metrics, scheduling statistics, RPC counters, and resource usage data. This is useful for monitoring cluster health, troubleshooting scheduling performance, and identifying bottlenecks in the Slurm controller.
Options
- -a, --all: Get and report information. This is the default mode of operation.
- -h, --help: Print description of options and exit.
- -i, --sort-by-id: Sort RPC data by message type ID and user ID.
- -r, --reset: Reset scheduler and RPC counters to 0. Only supported for Slurm operators and administrators.
- -t, --sort-by-time: Sort RPC data by total run time.
- -T, --sort-by-time2: Sort RPC data by average run time.
- --json: Output information as JSON.
- --yaml: Output information as YAML.
- -V, --version: Print version number and exit.
- --usage: Print list of options and exit.
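The three RPC sort options differ only in the case of a letter (`-t` versus `-T`), which is easy to mistype. A small wrapper can map readable keywords to the flags; this is a sketch under the assumption that such a wrapper is wanted, and `sdiag_sorted` is a hypothetical name rather than anything shipped with Slurm.

```shell
# sdiag_sorted is a hypothetical wrapper name, not part of Slurm.
# Usage: sdiag_sorted id|total|average
sdiag_sorted() {
    case "$1" in
        id)      sdiag -i ;;   # sort by message type ID and user ID
        total)   sdiag -t ;;   # sort by total run time
        average) sdiag -T ;;   # sort by average run time
        *)
            echo "usage: sdiag_sorted id|total|average" >&2
            return 2
            ;;
    esac
}
```

For example, `sdiag_sorted average` runs `sdiag -T`.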
FAQ
What is the sdiag command used for?
sdiag displays diagnostic information about slurmctld, the Slurm controller daemon. It shows performance metrics, scheduling statistics, RPC counters, and resource usage data. This is useful for monitoring cluster health, troubleshooting scheduling performance, and identifying bottlenecks in the Slurm controller.
How do I run a basic sdiag example?
Run `sdiag` with no arguments in a terminal to print the default report, then add options such as `-t`, `-T`, or `--json` as needed.
What does -a, --all do in sdiag?
Get and report information. This is the default mode of operation.