Kernel invocations with GPUArray
In the previous recipe, we saw how to invoke a kernel function using the class:
pycuda.compiler.SourceModule(kernel_source, nvcc="nvcc", options=None, other_options)
It creates a module from the CUDA source code passed as kernel_source, invoking the NVIDIA nvcc compiler with the given options to compile that code.
PyCUDA also provides the class pycuda.gpuarray.GPUArray, which offers a higher-level interface for performing calculations with CUDA:
class pycuda.gpuarray.GPUArray(shape, dtype, *, allocator=None, order="C")
GPUArray works much like numpy.ndarray, except that it stores its data and performs its computations on the compute device. The shape and dtype arguments work exactly as in NumPy.
All the arithmetic methods in GPUArray support the broadcasting of scalars. Creating a GPUArray is straightforward. One way is to create a NumPy array and convert it, as shown in the following code:
>>> import pycuda.gpuarray as gpuarray
>>> from numpy.random import...
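As a rough illustration of the conversion and of scalar broadcasting described above, the following is a minimal sketch and not the recipe's own listing. It assumes PyCUDA is installed with a usable GPU. The helper name double_array and the HAVE_CUDA flag are hypothetical, and the NumPy fallback is there only so the snippet still runs on a machine without CUDA.

```python
import numpy as np

try:
    import pycuda.autoinit  # creates a CUDA context on the default device
    import pycuda.gpuarray as gpuarray
    HAVE_CUDA = True
except Exception:
    # Assumption for this sketch: no PyCUDA install or no usable GPU.
    HAVE_CUDA = False


def double_array(host_array):
    """Hypothetical helper: double every element, on the GPU when possible."""
    if HAVE_CUDA:
        a_gpu = gpuarray.to_gpu(host_array)  # copy the NumPy array to the device
        return (2 * a_gpu).get()             # scalar broadcast on the device, then copy back
    return 2 * host_array                    # NumPy fallback with the same semantics


# CUDA devices commonly work in single precision, so cast to float32 first.
a = np.random.randn(4, 4).astype(np.float32)
a_doubled = double_array(a)
print(a_doubled)
```

Here gpuarray.to_gpu() copies the host array to the device and get() copies the result back into a NumPy array, so the doubling itself happens on the GPU without any hand-written kernel.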