Improving performance
We'll look at two performance tweaks for the Power3 class.
First, a better algorithm. Then, a better algorithm combined with memoization, which involves a cache; therefore, the function becomes stateful. This is where callable objects shine.
The first modification is to use a Divide and Conquer design strategy. The previous version chopped
into n steps; the loop carried out n individual multiplication operations. If we can find a way to split the problem into two equal portions, the problem decomposes into
steps. Given pow1(2,1024), the Power1 callable performs the calculation 1024 multiplications by 2. We can optimize this down to 10 multiplications, a significant speedup.
Rather than simply multiplying by a fixed value, we'll use the "fast exponentiation" algorithm. It uses three essential rules for computing
, as follows:
-
If
:
, the result is simply 1.
-
If n is odd and
, the result is
. This involves a recursive computation of
. This still does a multiplication but...