Linux command
nvcc command
After copying a command, replace the file names, directories, or parameters as needed.
Common examples
Compile CUDA program
nvcc [program.cu] -o [program]
Compile to object file
nvcc -c [kernel.cu] -o [kernel.o]
Compile with specific GPU architecture
nvcc -arch=sm_[75] [program.cu] -o [program]
Generate PTX code
nvcc -ptx [kernel.cu]
Compile with optimization
nvcc -O3 [program.cu] -o [program]
Compile with debug symbols
nvcc -g -G [program.cu] -o [program]
Link with external library
nvcc [program.cu] -o [program] -l[cublas]
Show the compilation commands without running them
nvcc --dryrun [program.cu]
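The debug and optimized examples above differ only in their flags, so a build script can select them by mode. The following is a minimal sketch, not part of nvcc: the function names `nvcc_mode_flags` and `nvcc_command` and the mode names `debug` and `release` are hypothetical, and the functions only print the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of nvcc): map a build mode to nvcc flags.
nvcc_mode_flags() {
  case "$1" in
    debug)   printf '%s\n' "-g -G" ;;          # host and device debug symbols
    release) printf '%s\n' "-O3 -lineinfo" ;;  # optimized, line info kept for profilers
    *)       printf 'unknown mode: %s\n' "$1" >&2; return 1 ;;
  esac
}

# Print the full command; the output name is the source name without ".cu".
nvcc_command() {
  mode=$1
  src=$2
  flags=$(nvcc_mode_flags "$mode") || return 1
  printf 'nvcc %s %s -o %s\n' "$flags" "$src" "${src%.cu}"
}

nvcc_command debug program.cu
nvcc_command release program.cu
```

Printing the command first makes it easy to review before executing it, in the same spirit as `--dryrun`.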
Description
nvcc is NVIDIA's CUDA compiler driver. It splits each CUDA C/C++ source file into device code (kernels that run on the GPU) and host code (which runs on the CPU), compiles the device code itself, and hands the host code to a host compiler (gcc, clang, or MSVC).
Device code is compiled to PTX, an intermediate representation, and for a concrete target on to SASS (GPU machine code). The `-arch` flag selects the GPU generation to target: `-arch=native` builds for the GPUs visible on the build machine, and `-arch=all` builds for all architectures the toolkit supports. PTX embedded in the binary provides forward compatibility, because it can be JIT-compiled at run time for GPUs newer than those the binary was built for.
Separate compilation (relocatable device code) lets device code be split across several files, which helps when mixing CUDA with regular C++ in large projects. Debug builds (`-g -G`) enable debugging with cuda-gdb; `-G` also turns off most device-code optimization. `-O` sets the optimization level for host code, while device code is optimized by default. CUDA libraries (cuBLAS, cuDNN, cuFFT) link like regular libraries with `-l`. For non-standard installations, header and library paths may need to be given with `-I` and `-L`.
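The combination of per-architecture machine code and embedded PTX described above is usually expressed with repeated `-gencode` flags. The sketch below shows one way to build them; the helper name `gencode_flags` is hypothetical, and it assumes the compute capabilities are passed in ascending order so that PTX is embedded for the newest one.

```shell
#!/bin/sh
# Hypothetical helper: turn compute capabilities (e.g. 75 86) into -gencode flags.
gencode_flags() {
  flags=""
  last=""
  for cc in "$@"; do
    # SASS (machine code) for each listed GPU generation
    flags="$flags -gencode arch=compute_${cc},code=sm_${cc}"
    last=$cc
  done
  # PTX for the last listed capability, so newer GPUs can JIT-compile it
  if [ -n "$last" ]; then
    flags="$flags -gencode arch=compute_${last},code=compute_${last}"
  fi
  printf '%s\n' "${flags# }"
}

# Print (not run) a command that targets sm_75 and sm_86 and embeds compute_86 PTX.
echo "nvcc $(gencode_flags 75 86) program.cu -o program"
```

Each extra target increases compile time and binary size, so the list is best limited to the GPUs the program is expected to run on.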
Parameters
- -o _FILE_
- Output file.
- -c
- Compile only, don't link.
- -arch _ARCH_
- GPU architecture (sm_50, sm_75, sm_86, etc.).
- -code _CODE_
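- GPU targets to generate and embed code for: real (`sm_XX`) for machine code, virtual (`compute_XX`) for PTX. Used together with `-arch`.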
- -gencode _SPEC_
- Architecture/code pair (e.g., arch=compute_75,code=sm_75).
- -ptx
- Generate PTX assembly.
- -g
- Host debug symbols.
- -G
- Device debug symbols.
- -O _LEVEL_
- Optimization level for host code (0-3). Device code is optimized by default unless `-G` is given.
- -I _DIR_
- Include directory.
- -L _DIR_
- Library directory.
- -l _LIB_
- Link library.
- --dryrun
- Show commands without executing.
- -Xcompiler _options_
- Pass options directly to the host compiler.
- -std _standard_
- C++ standard (e.g., c++14, c++17, c++20). Also accepted as `--std`.
- -dc
- Compile to relocatable device code (enables separate compilation).
- -rdc _true|false_
- Enable or disable relocatable device code.
- -dlink
- Link relocatable device code objects.
- -ccbin _PATH_
- Specify the host compiler binary (e.g., `/usr/bin/g++`).
- -Xlinker _options_
- Pass options directly to the host linker.
- -lineinfo
- Generate line-number information for device code (useful for profilers).
- -use_fast_math
- Enable fast math optimizations (implies `-ftz=true -prec-div=false -prec-sqrt=false -fmad=true`), trading some floating-point accuracy for speed.
- -keep
- Retain intermediate compilation files.
- -t _N_
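- Run independent compilation steps (for example, builds for several GPU architectures) in parallel on up to N threads; 0 uses the number of CPUs on the machine.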
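The `-dc` option is typically used in a two-stage build: compile every source file to an object with relocatable device code, then link the objects. The sketch below prints that command sequence rather than running it; the helper name `rdc_plan` and the file names `a.cu`, `b.cu`, and `app` are hypothetical, and it assumes the final link is done by nvcc, which then performs the device link as well.

```shell
#!/bin/sh
# Hypothetical helper: print the commands for a relocatable-device-code build.
# It only prints; pipe the output to sh to actually run the build.
rdc_plan() {
  app=$1
  shift
  objs=""
  for src in "$@"; do
    obj="${src%.cu}.o"
    # Compile each file to an object containing relocatable device code.
    echo "nvcc -dc $src -o $obj"
    objs="$objs $obj"
  done
  # Given the objects, nvcc performs the device link and then the host link.
  echo "nvcc${objs} -o $app"
}

rdc_plan app a.cu b.cu
```

An explicit `-dlink` step is generally only needed when the final link is handed to the host linker instead of nvcc. If `-arch` or `-gencode` flags are used, they should match between the compile and link steps.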
FAQ
What is the nvcc command used for?
nvcc is NVIDIA's CUDA compiler driver. It compiles CUDA C/C++ device code that runs on NVIDIA GPUs and passes the host code that runs on the CPU to a host compiler (gcc, clang, or MSVC), producing a single executable or object file.
How do I run a basic nvcc example?
Run `nvcc [program.cu] -o [program]` in a terminal, then adjust the file names, paths, and flags for your system.
What does -o _FILE_ do in nvcc?
It sets the name and location of the output file.