Reducing from many
In parallel computing, a reduction takes a collection of values and combines them into a single result. It might sound straightforward (adding up a list of numbers or finding a maximum value, for instance), but in a parallel context reductions are trickier than they first appear. Each thread produces a partial result, and combining those partial results requires synchronization. If we do not handle this carefully, the performance gains from parallelism can quickly vanish, as we saw in the example from the last section.
The need for coordination comes from the fact that several threads are working together to reduce a dataset to a single result. Without controlled access to shared data, and without waiting for every thread to finish before the result is read, threads can get in each other’s way: one thread can overwrite another’s update, and the final answer can be wrong.
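To make the coordination concrete, here is a minimal sketch of a maximum reduction written with Python's standard `threading` module. The function name `parallel_max`, its `num_threads` parameter, and the chunking scheme are illustrative assumptions, not something prescribed by the text. Each thread first computes a private partial result, which needs no coordination, and only the short combine step is protected by a lock.

```python
import threading


def parallel_max(values, num_threads=4):
    """Reduce `values` to its maximum using worker threads (hypothetical helper)."""
    if not values:
        raise ValueError("cannot reduce an empty sequence")

    result = [values[0]]     # shared result, boxed in a list so workers can update it
    lock = threading.Lock()  # guards every read-modify-write of result[0]

    def worker(chunk):
        local_max = max(chunk)  # private partial result: no coordination needed
        with lock:              # combine step: only one thread at a time
            if local_max > result[0]:
                result[0] = local_max

    # Split the input into roughly equal, non-empty chunks (ceiling division).
    chunk_size = -(-len(values) // num_threads)
    threads = [
        threading.Thread(target=worker, args=(values[i:i + chunk_size],))
        for i in range(0, len(values), chunk_size)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every partial result before reading the answer

    return result[0]
```

The lock keeps the compare-and-update on the shared value from interleaving between threads, and the `join` calls ensure the result is not read before every thread has contributed. In CPython the global interpreter lock means this sketch is unlikely to run faster than the built-in `max`; it is meant to show the shape of the coordination, not a performance gain.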
A classic example of a reduction is summing...