You're reading from  NumPy Essentials

Product type: Book
Published in: Apr 2016
Reading level: Intermediate
ISBN-13: 9781784393670
Edition: 1st Edition
Authors (3):
Leo (Liang-Huan) Chin

Leo (Liang-Huan) Chin is a data engineer with more than 5 years of experience in the field of Python. He works for Gogoro smart scooter, Taiwan, where his job entails discovering new and interesting biking patterns. His previous work experience includes ESRI, California, USA, which focused on spatial-temporal data mining. He loves data, analytics, and the stories behind data and analytics. He received an MA degree in GIS in geography from State University of New York, Buffalo. When Leo isn't glued to a computer screen, he spends time on photography, traveling, and exploring some awesome restaurants across the world. You can reach Leo at http://chinleock.github.io/portfolio/.

Tanmay Dutta

Tanmay Dutta is a seasoned programmer with expertise in programming languages such as Python, Erlang, C++, Haskell, and F#. He has extensive experience in developing numerical libraries and frameworks for investment banking businesses. He was also instrumental in the design and development of a risk framework in Python (pandas, NumPy, and Django) for a wealth fund in Singapore. Tanmay has a master's degree in financial engineering from Nanyang Technological University, Singapore, and a certification in computational finance from Tepper Business School, Carnegie Mellon University.

Shane Holloway

http://shaneholloway.com/resume/

Chapter 8. Speeding Up NumPy with Cython

Python combined with the NumPy library gives the user a tool for writing highly complex functions and analyses. As the size and complexity of the code grow, inefficiencies start to creep into the code base. Once the project is in its completion stages, developers should start focusing on the performance of the code and analyze the bottlenecks. Python provides many tools and libraries for creating optimized, faster-performing code.

In this chapter, we will be looking at one such tool called Cython. Cython is a static compiler for both Python and the Cython language, and it is particularly popular among developers working on scientific libraries and numerical computing. Many famous analytics libraries written in Python make extensive use of Cython (pandas, SciPy, scikit-learn, and so on).

The Cython programming language is a superset of Python and the user still enjoys all the functionalities and higher level constructs provided by Python the...

The first step toward optimizing code


The questions that every developer should have in mind while optimizing their code are as follows:

  • How many function calls is your code making?
  • Are there redundant calls?
  • How much memory is the code using?
  • Are there memory leaks?
  • Where are the bottlenecks?

The first four questions are mostly answered by profiler tools. You are advised to learn at least one profiling tool. Profiling tools will not be covered in this chapter. In most cases, you should first try to optimize function calls and memory usage before diving into lower-level approaches such as Cython or assembly languages (in C-derived languages).

Once the bottlenecks have been identified and all the issues with algorithms and logic have been tackled, a Python developer can dive into the world of Cython to get extra speed out of the application.
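Although profilers are outside the scope of this chapter, a minimal sketch with the standard library's cProfile module shows how the questions about call counts and bottlenecks can be approached. The functions slow_sum and square are hypothetical stand-ins for your own code, not examples from the book:

```python
import cProfile
import io
import pstats


def square(x):
    # Deliberately tiny function so the call count dominates the profile
    return x * x


def slow_sum(n):
    # Calls square() once per iteration; a profiler reveals this call count
    total = 0
    for i in range(n):
        total += square(i)
    return total


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(10000)
profiler.disable()

# Collect the report in a string, sorted by cumulative time per function
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)
report = stream.getvalue()
print(report)
```

The ncalls column of the report answers how many calls are made, and the cumulative-time ordering points at likely bottlenecks.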

Setting up Cython


Cython is a compiler that converts Python code, optionally annotated with type definitions, into C code, which is then compiled into an extension module that still runs in the Python environment. The final output is native machine code, which typically runs much faster than the bytecode produced by Python. The speed-up is most evident in code that relies heavily on loops. In order to compile the generated C code, the first prerequisite is to have a C/C++ compiler such as gcc (Linux) or MinGW (Windows) installed on the computer.

The second step is to install Cython. Cython is distributed like any other Python package, and you can install it using your preferred method (pip, easy_install, and so on). Once these two steps are done, you can test your setup by calling cython from the shell. If you get an error message, then the second step did not succeed, and you need to reinstall Cython or download the TAR archive from the official Cython website (http://cython.org/#download), then run the following command from the...
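The two prerequisites can also be checked from Python itself. The following is only a sketch: the helper name check_cython_setup and the list of compiler executables it searches for (gcc, cc, cl) are assumptions made for illustration, and your platform may use a different compiler name:

```python
import importlib.util
import shutil


def check_cython_setup():
    """Report whether Cython and a C compiler appear to be available."""
    # True if the Cython package can be imported in this environment
    has_module = importlib.util.find_spec("Cython") is not None
    # Path of the cython command-line tool, or None if it is not on PATH
    cython_cli = shutil.which("cython")
    # Assumed compiler names; adjust for your own toolchain
    c_compiler = shutil.which("gcc") or shutil.which("cc") or shutil.which("cl")
    return {
        "cython_module": has_module,
        "cython_cli": cython_cli,
        "c_compiler": c_compiler,
    }


setup_report = check_cython_setup()
for name, value in setup_report.items():
    print(name, "->", value)
```

A value of False or None for any entry suggests that the corresponding installation step needs to be repeated.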

Hello world in Cython


Cython programs look quite similar to Python ones, mostly with added type information. Let's have a look at a simple program that computes the nth Fibonacci number given n:

def compute_fibonacchi(n):
    """
    Computes the nth number of the Fibonacci sequence.
    """
    a = 1
    b = 1
    intermediate = 0
    # range is used instead of xrange so the code also runs on Python 3
    for x in range(n):
        intermediate = a
        a = a + b
        b = intermediate
    return a

Let's study this program to understand what is going on under the hood when you call this function with some numeric input; let's say compute_fibonacchi(3).

As we know, Python is an interpreted and dynamic language, which means you do not need to declare variables before using them. This means that, at the start of a function call, the Python interpreter is agnostic about the type of value that n will hold. When you call the function with some integral value, Python does the type inference automatically...
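To make the dynamic typing concrete, the sketch below runs the function in plain Python (with range rather than xrange so that it works on Python 3) and inspects the values it returns at runtime. The observation about object size assumes the CPython interpreter:

```python
import sys


def compute_fibonacchi(n):
    # No type is declared for n, a, or b; Python decides at runtime
    a = 1
    b = 1
    for _ in range(n):
        intermediate = a
        a = a + b
        b = intermediate
    return a


small = compute_fibonacchi(3)
large = compute_fibonacchi(200)

# Both results are ordinary Python int objects, whatever their magnitude
print(type(small).__name__, small)
print(type(large).__name__, large.bit_length())

# On CPython, even a small int is a full object, larger than a C integer
print(sys.getsizeof(small))
```

Every addition in the loop therefore works on full Python objects rather than raw machine integers, which is the overhead that Cython's type declarations aim to remove.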

Multithreaded code


Chances are that your application uses multithreaded code. Python is not considered suitable for CPU-bound multithreaded code because of the Global Interpreter Lock (GIL), which lets only one thread execute Python bytecode at a time. The good news is that, in Cython, you can explicitly release the GIL and make your code truly multithreaded. This is done by simply putting a with nogil: statement in your code. You can later reacquire the GIL using with gil in your code:

with nogil:
    <the code block here>

cdef <return type> function_name(args) with gil:
    <function body>
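For comparison, here is a plain-Python sketch of the kind of CPU-bound threaded workload that the GIL holds back. The functions count_primes and run_in_threads are hypothetical examples, not code from the book. On a CPython build with the GIL, the threads below produce correct results but take turns executing bytecode instead of running in parallel:

```python
import threading


def count_primes(limit):
    # Pure-Python, CPU-bound work: counts the primes below limit
    count = 0
    for candidate in range(2, limit):
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return count


def run_in_threads(limits):
    # One thread per workload; each thread writes to its own result slot
    results = [None] * len(limits)

    def worker(index, limit):
        results[index] = count_primes(limit)

    threads = [
        threading.Thread(target=worker, args=(i, limit))
        for i, limit in enumerate(limits)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


thread_results = run_in_threads([2000, 2000, 2000, 2000])
print(thread_results)
```

A loop like the one in count_primes is a natural candidate for a Cython function that runs inside a with nogil: block, provided the block does not touch Python objects.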

NumPy and Cython


Cython has built-in support for fast access to NumPy arrays, which makes it an ideal candidate for optimizing NumPy code. In this section, we will study code that uses the Monte Carlo technique to calculate the price of a European option, a financial instrument. Knowledge of finance is not expected; however, we assume you have a basic understanding of Monte Carlo simulations:

import numpy as np

def price_european(strike = 100, S0 = 100, time = 1.0,
                   rate = 0.5, mu = 0.2, steps = 50,
                   N = 10000, option = "call"):

    dt = time / steps
    # One row of standard normal draws per time step, one column per path
    rand = np.random.standard_normal((steps + 1, N))
    S = np.zeros((steps + 1, N))
    S[0] = S0

    # Evolve every simulated path one time step at a time
    for t in range(1, steps + 1):
        S[t] = S[t-1] * np.exp((rate - 0.5 * mu ** 2) * dt
                               + mu * np.sqrt(dt) * rand[t])
    # Discounted average payoff at expiry
    price_call = (np.exp(-rate * time)
                  * np.sum(np.maximum(S[-1] - strike, 0)) / N)
    price_put = (np.exp(-rate * time)
                 * np.sum(np.maximum(strike - S[-1], 0)) / N)
    ...
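As a point of comparison, the same simulation can be sketched without the explicit time loop by summing the log-increments of each path. This is a different formulation from the listing above, not the book's code: the function name simulate_option_prices, the seed parameter, and the returned tuple are all assumptions made for illustration, and it relies on NumPy's Generator API (np.random.default_rng):

```python
import numpy as np


def simulate_option_prices(strike=100.0, S0=100.0, time=1.0, rate=0.5,
                           mu=0.2, steps=50, N=10000, seed=0):
    """Vectorized Monte Carlo sketch; returns (call, put, terminal prices)."""
    rng = np.random.default_rng(seed)  # seeded for reproducible draws
    dt = time / steps
    rand = rng.standard_normal((steps, N))

    # A product of exponentials equals the exponential of the summed exponents
    log_increments = (rate - 0.5 * mu ** 2) * dt + mu * np.sqrt(dt) * rand
    S_T = S0 * np.exp(log_increments.sum(axis=0))

    # Discounted average payoff at expiry for each option type
    discount = np.exp(-rate * time)
    price_call = discount * np.mean(np.maximum(S_T - strike, 0.0))
    price_put = discount * np.mean(np.maximum(strike - S_T, 0.0))
    return price_call, price_put, S_T


call, put, terminal = simulate_option_prices()
print(call, put)
```

Only the terminal prices are kept here, which avoids storing the full (steps + 1, N) array; the explicit loop in the original listing is the part that a Cython version would target.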

Summary


In this chapter, we saw how to convert Python code into Cython. We also looked into some example Python code that involved NumPy arrays. We briefly explained the concept of boxing and unboxing in the Python language and how they affect the performance of code. We also explained how you can explicitly release the notorious GIL. To dig deeper into the Cython world, we recommend Learning Cython Programming by Philip Herron (Packt Publishing). In the next chapter, you will learn about the NumPy C API and how to use it.
