Reference
[1] Volkov, V. and Demmel, J.W. (2008) Benchmarking GPUs to tune dense linear algebra. SC ’08: Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, Article No. 31, pp. 1-11. (https://dl.acm.org/doi/10.5555/1413370.1413402)