Summary
This chapter looked at how to analyze the way your code behaves and understand why it might underperform relative to expectations. The main tool we used for this was the Linux perf tool, which reads the performance counters exposed through the Linux kernel. These counters give very fine-grained information about how your program is running and help find places where performance is being hindered. We gave a couple of examples illustrating the kinds of issues that might appear in real code. There are many factors to consider, including the way the CPU decodes and schedules individual micro-operations, cache performance and branch prediction, and the global movement of data between the various forms of memory on the system.
Performance is a moving target. Altering code to address a specific issue can expose other issues, so that performance does not improve as expected; for example, unrolling a loop to improve cache performance might cause the instruction cache to overflow...