Parallel Algorithms with CUDA
We are now better equipped to discuss parallel programs, how to express them in CUDA, and how hardware details and programming concepts shape them. The next step on our learning path is to understand what makes an algorithm parallel, and what limits the degree of parallelization that can be achieved.
First, we will discuss the principles that guide parallel algorithms in more detail than we did in Chapter 1, since we can now refer to the programs we wrote there. After that, we will look at some common algorithms that illustrate different levels of parallelization and show the challenges ahead. The chapter wraps up with a discussion of different implementations, to help you apply what you have learned to real situations.
Over the course of the chapter, we will cover the following topics:
- Design principles of parallel algorithms
- Parallel matrix operations...