Linux command

nvcc command

After copying a command, replace the bracketed file names, directories, or parameters as needed.

Common examples

Compile CUDA program

nvcc [program.cu] -o [program]

Compile to object file

nvcc -c [kernel.cu] -o [kernel.o]

Compile with specific GPU architecture

nvcc -arch=sm_[75] [program.cu] -o [program]

Generate PTX code

nvcc -ptx [kernel.cu]

Compile with optimization

nvcc -O3 [program.cu] -o [program]

Compile with debug symbols

nvcc -g -G [program.cu] -o [program]

Link with external library

nvcc [program.cu] -o [program] -l[cublas]

Show compilation stages

nvcc --dryrun [program.cu]

Description

nvcc is NVIDIA's CUDA compiler driver. It compiles CUDA C/C++ sources by separating device code (kernels that run on the GPU) from host code (which runs on the CPU). Device code is compiled to PTX, an intermediate representation, or directly to SASS (GPU machine code); host code is handed to a host compiler (gcc, clang, or MSVC).

Architecture flags (-arch) target specific GPU generations. Use `-arch=native` to auto-detect the GPUs visible on the build machine, or `-arch=all` to compile for all supported architectures. Forward compatibility relies on embedded PTX, which the driver JIT-compiles at run time for GPUs newer than those the binary was built for.

Separate compilation allows mixing CUDA with regular C++ in large projects. Debug builds (-g -G) enable debugging with cuda-gdb. The -O level applies to host code; device code is optimized by default unless -G is given. CUDA libraries (cuBLAS, cuDNN, cuFFT) link like regular libraries, and header and library paths may need to be given explicitly (-I, -L) for non-standard installations.
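Targeting several GPU generations in one binary is usually spelled out with repeated `-gencode` options. The following is a minimal sketch, not an official recipe: `gencode_flags` is a hypothetical helper, and the architecture numbers 75 and 86 are illustrative. It only builds and prints the command line, so it runs without a GPU or the CUDA toolkit.

```shell
#!/bin/sh
# Hypothetical helper: emit one -gencode pair (machine code, SASS) per
# architecture, then embed PTX for the last one listed so that newer GPUs
# can JIT-compile it at run time.
gencode_flags() {
    flags=""
    last=""
    for cc in "$@"; do
        flags="$flags -gencode arch=compute_$cc,code=sm_$cc"
        last="$cc"
    done
    # code=compute_XX embeds PTX rather than machine code.
    flags="$flags -gencode arch=compute_$last,code=compute_$last"
    # Strip the leading space before printing.
    echo "${flags# }"
}

# Illustrative architecture list; replace it with the GPUs you actually target.
echo "nvcc $(gencode_flags 75 86) program.cu -o program"
```

List the architectures in ascending order so that the embedded PTX comes from the newest one.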

Options

-o _FILE_
Output file.
-c
Compile only, don't link.
-arch _ARCH_
GPU architecture (sm_50, sm_75, sm_86, etc.).
-code _CODE_
Real (sm_XX) or virtual (compute_XX) architectures to generate and embed code for; used together with -arch.
-gencode _SPEC_
Architecture/code pair (e.g., arch=compute_75,code=sm_75).
-ptx
Generate PTX assembly.
-g
Host debug symbols.
-G
Device debug symbols.
-O _LEVEL_
Optimization level for host code (0-3).
-I _DIR_
Include directory.
-L _DIR_
Library directory.
-l _LIB_
Link library.
--dryrun
Show commands without executing.
-Xcompiler _options_
Pass options directly to the host compiler.
-std _standard_
C++ standard (e.g., c++14, c++17, c++20). Also accepted as `--std`.
-dc
Compile to relocatable device code (enables separate compilation).
-rdc _true|false_
Enable or disable relocatable device code.
-dlink
Link relocatable device code objects.
-ccbin _PATH_
Specify the host compiler binary (e.g., `/usr/bin/g++`).
-Xlinker _options_
Pass options directly to the host linker.
-lineinfo
Generate line-number information for device code (useful for profilers).
-use_fast_math
Enable fast math optimizations (implies `-ftz=true -prec-div=false -prec-sqrt=false -fmad=true`).
-keep
Retain intermediate compilation files.
-t _N_
Run up to N compilation steps in parallel (mainly helps when compiling for several architectures).
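The separate-compilation options (-dc, -dlink) are typically used as a three-step build. The following is a minimal sketch under stated assumptions: `a.cu`, `b.cu`, `device_link.o`, and `app` are hypothetical file names, `separate_build_steps` is a hypothetical helper, and `/usr/local/cuda/lib64` is assumed to be the toolkit's library directory. The script only prints the plan, so it runs without the toolkit installed.

```shell
#!/bin/sh
# Hypothetical helper that prints the build plan for two CUDA sources
# whose device code calls across translation units.
separate_build_steps() {
    # 1. Compile each source to an object holding relocatable device code.
    echo "nvcc -dc a.cu -o a.o"
    echo "nvcc -dc b.cu -o b.o"
    # 2. Device-link the relocatable objects into one extra object.
    echo "nvcc -dlink a.o b.o -o device_link.o"
    # 3. Host-link with the regular C++ compiler; the library path is an
    #    assumption and depends on where the CUDA toolkit is installed.
    echo "g++ a.o b.o device_link.o -L/usr/local/cuda/lib64 -lcudart -o app"
}

# Print the plan; pipe it to `sh -e` where nvcc and the sources exist.
separate_build_steps
```

Use the same architecture flags in the -dc and -dlink steps so the device objects stay compatible.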

FAQ

How do I run a basic nvcc example?

Run `nvcc [program.cu] -o [program]` in a terminal, replacing the bracketed placeholders with your own source and output file names, then start the result with `./[program]`.
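For a complete first run, the following is a minimal sketch: it writes a small CUDA source file and builds it only if nvcc is on the PATH. The file name `hello.cu` and the kernel name `hello` are hypothetical choices for illustration.

```shell
#!/bin/sh
# Write a minimal CUDA source file (hypothetical name: hello.cu).
cat > hello.cu <<'EOF'
#include <cstdio>

// Kernel: each GPU thread prints its own index.
__global__ void hello() {
    printf("hello from thread %d\n", threadIdx.x);
}

int main() {
    hello<<<1, 4>>>();        // launch 1 block of 4 threads
    cudaDeviceSynchronize();  // wait for the kernel and flush device printf
    return 0;
}
EOF

# Compile and run only when the CUDA toolkit is available.
if command -v nvcc >/dev/null 2>&1; then
    nvcc hello.cu -o hello && ./hello
else
    echo "nvcc not found; install the CUDA toolkit to build hello.cu"
fi
```

On a machine without a usable GPU the program still builds, but the kernel does not run and nothing is printed.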

What does -o _FILE_ do in nvcc?

It sets the name of the output file. Without it, nvcc writes a linked executable to `a.out` on Linux.