Debugging Numba-CUDA kernels
Debugging CUDA kernels can be challenging, especially for Python users, because familiar debugging techniques such as breakpoints, variable inspection, and step-by-step execution are not directly available inside GPU code. Nevertheless, Numba-CUDA provides several features that enable basic debugging and inspection of GPU kernels, such as print statements, a CUDA simulator, and JIT compilation. The following sections discuss each of these in detail.
The CUDA Toolkit also ships dedicated debugging tools such as CUDA-GDB and Compute Sanitizer, which inspect CUDA applications at runtime. However, we do not cover these tools in this book because our target audience is Python users, who are assumed to have limited experience with the CUDA C workflow.
Using the print statement
Numba supports calling Python's built-in print function inside CUDA kernels. We can use it to display variable values and trace program flow during execution. This can...