You're reading from Learning Cython Programming (Second Edition) - Second Edition
Other topics we will discuss in this chapter are OpenMP support, Cython's preprocessor, and other related projects. We will consider other implementations of Python, such as PyPy, and how to make Cython work with Python 3. We will also ask which Cython alternatives and related tools are available, looking at Numba and Parakeet, and at NumPy, the flagship use of Cython.
OpenMP is a standard API for shared-memory parallel computing in languages such as C, C++, and Fortran; it's used in several open source projects such as ImageMagick (http://www.imagemagick.org/) to speed up processing on large image manipulations. Cython has some support for this compiler extension, but you must use a compiler that supports OpenMP, such as GCC or MSVC. At the time of writing, Clang/LLVM had no OpenMP support; more recent Clang releases do provide it. This isn't really the place to explain when and why to use OpenMP, since it is a vast subject, but you should check out the following website: http://docs.cython.org/src/userguide/parallelism.html.
C and C++ have the C preprocessor to decide at compile time what gets compiled, mostly through conditionals, defines, or a mixture of both. In Cython, we can replicate some of this behavior using IF, ELIF, ELSE, and DEF. This is demonstrated in the following line of code:
DEF myConstant = "hello cython"
We also have access to the values of os.uname as predefined constants from the Cython compiler:
UNAME_SYSNAME
UNAME_NODENAME
UNAME_RELEASE
UNAME_VERSION
UNAME_MACHINE
We can also run conditional expressions against these as follows:
IF UNAME_SYSNAME == "Windows":
    include "windows.pyx"
ELSE:
    include "unix.pyx"
You also have ELIF to use in conditional expressions. If you compare this against some of the headers in your C programs, you will see how you can replicate basic C-preprocessor behavior in Cython.
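To get a feel for what the UNAME_ constants hold on your machine, here is a minimal pure Python sketch. It assumes platform.uname() as a portable stand-in for os.uname (which is not available on Windows), and the helper names compile_time_names and choose_include are hypothetical, made up for this illustration. It only mirrors the IF/ELSE decision shown earlier; it is not how Cython itself evaluates the condition.

```python
import platform


def compile_time_names():
    """Collect the five values that correspond to Cython's UNAME_* constants."""
    info = platform.uname()
    return {
        "UNAME_SYSNAME": info.system,
        "UNAME_NODENAME": info.node,
        "UNAME_RELEASE": info.release,
        "UNAME_VERSION": info.version,
        "UNAME_MACHINE": info.machine,
    }


def choose_include(sysname):
    """Mimic the IF/ELSE example: pick a source file from the system name."""
    return "windows.pyx" if sysname == "Windows" else "unix.pyx"


names = compile_time_names()
print(names["UNAME_SYSNAME"], "->", choose_include(names["UNAME_SYSNAME"]))
```

Running this on the build machine shows which branch of the conditional include would be taken there.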
Porting to Python 3 can be painful, but reading around the subject shows us that people have had success porting their code to 3.x by simply compiling their module with Cython instead of actually porting their code! With Cython, you can specify the output to conform to the Python 3 API via the following:
$ cython -3 <options>
This makes sure that you output Python 3 code instead of using the default argument of -2, which generates code for the 2.x standard.
PyPy has become a popular alternative to the standard Python implementation. More importantly, it is now being used by many companies (small and large) in their production environments to boost performance and scalability. How does PyPy differ from normal CPython? While the latter is a traditional interpreter, the former is a full-fledged virtual machine. It maintains a just-in-time compiler backend for runtime optimization on most relevant architectures.
Getting Cythonized modules to run on PyPy depends on PyPy's cpyext emulation layer. This isn't quite complete and has many inconsistencies, but if you are brave enough to try it out, it should get better with each release.
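If you want your code to know which interpreter it is running on before relying on a Cythonized module, the standard library can tell you. The following is a minimal sketch; extension_strategy and its return values are hypothetical names for this illustration, and the policy of preferring a pure Python fallback on PyPy is an assumption, not a recommendation from the PyPy or Cython projects.

```python
import platform


def running_on_pypy():
    """Return True when the current interpreter identifies itself as PyPy."""
    return platform.python_implementation() == "PyPy"


def extension_strategy():
    """Hypothetical policy: avoid the cpyext layer by default on PyPy."""
    if running_on_pypy():
        return "pure-python"
    return "cython"


print(platform.python_implementation(), "->", extension_strategy())
```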
When it comes to writing Cython modules, most of your work will consist of getting your pxd declarations correct so that you can manipulate native code correctly. There are several projects attempting to create a compiler that reads C/C++ headers and generates your pxd declarations as output. The main issue is maintaining a fully compliant C and C++ parser. Part of my Google Summer of Code project was to use the Python plugin system in GCC to reuse GCC's code for parsing C/C++ code. The plugin could intercept the declarations, types, and prototypes. It isn't fully ready for use, and there are other similar projects tackling the same problem. More information can be found at https://github.com/cython/cython/wiki/AutoPxd.
Overall, if you consider SWIG (http://swig.org/) as a way to write a native Python module, you could be fooled into thinking that Cython and SWIG are similar. SWIG is mainly used to write wrappers for language bindings. For example, if you have some C code as follows:
int myFunction (int, const char *){ … }
You can write the SWIG interface file as follows:
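/* example.i */
%module example
%{
/* Copied verbatim into the generated wrapper code */
extern int myFunction (int, const char *);
...
%}
/* Declarations SWIG should generate wrappers for */
extern int myFunction (int, const char *);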
Compile this with the following:
$ swig -python example.i
You can compile and link the module as you would do for a Cython output since this generates the necessary C code. This is fine if you want a basic module to simply call into C from Python. But Cython provides users with much more.
Cython is much more developed and optimized, and it truly understands how to work with C types and memory management and how to handle exceptions. With SWIG, you cannot manipulate data; you simply call into functions on the C side from Python. In Cython, we can...
NumPy is a scientific library designed to provide functionality similar to or on par with MATLAB, which is a paid proprietary mathematics package. NumPy has a lot of popularity with Cython users since you can seek out more performance from your highly computational code using C types. In Cython, you can import this library as follows:
import numpy as np
cimport numpy as np
np.import_array()
You can then access NumPy's full C API, as follows:
np.PyArray_ITER_NOTDONE
So, you can integrate with iterators at a very low level of the API. This allows NumPy users to get a lot of speed when working with native types via something as follows:
cdef double val = (<double*>np.PyArray_MultiIter_DATA(it, 0))[0]
We can cast the data from the array to double, and it's a cdef type in Cython to work with now. For more information and NumPy tutorials, visit https://github.com/cython/cython/wiki/tutorials-numpy.
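The idea behind the double cast can be tried out without compiling anything. The following is a minimal sketch by analogy only: it assumes the standard library's array and memoryview types instead of NumPy's C API, and simply shows raw bytes being reinterpreted as C doubles, which is what the pointer cast does on the Cython side.

```python
from array import array

# A contiguous buffer of C doubles, standing in for a NumPy array's data.
values = array("d", [1.5, 2.5, 4.0])

# View the same memory as raw bytes, much like an untyped data pointer.
raw = memoryview(values).cast("B")

# Reinterpret those bytes as doubles again, similar in spirit to <double*>.
doubles = raw.cast("d")

# Indexing with [0] reads the first double from the shared buffer.
first = doubles[0]
print(first, len(raw), doubles.tolist())
```

No data is copied at any point; all three objects look at the same memory, which is why this style of access is fast.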
Numba is another way to get your Python code to run almost natively on your host system, by compiling it seamlessly through LLVM. Numba makes use of decorators such as the following:
@autojit
def myFunction (): ...
Numba also integrates with NumPy. On the whole, it sounds great. Unlike Cython, you only apply decorators to pure Python code, and it does everything for you, but you may find that the optimizations will be fewer and not as powerful.
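As a rough illustration of this decorator-only style, here is a minimal sketch. It uses njit, the decorator that current Numba releases document, rather than the older @autojit spelling. The fallback that runs the function as plain Python when Numba is not installed is an assumption added for this sketch, and sum_of_squares is a hypothetical example function.

```python
try:
    from numba import njit  # Numba's decorator, if Numba is installed
except ImportError:
    def njit(func):
        """Fallback (assumption for this sketch): leave the function as plain Python."""
        return func


@njit
def sum_of_squares(n):
    """Add up i*i for i in range(n) using an explicit loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


print(sum_of_squares(4))
```

The function body stays ordinary Python; the decorator is the only change needed to ask Numba to compile it.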
Numba does not integrate with C/C++ to the extent that Cython does. If you want it to integrate, you need to use Foreign Function Interfaces (FFI) to wrap calls. You also need to define structs and work with C types in Python code in a very abstract sense to a point where you don't really have much control as compared with Cython.
Numba mostly consists of decorators, such as @locals, from Cython. But in the end, all this creates is just-in-time-compiled functions with a proper native function signature. Since you can...
Parakeet is another project that works alongside Numba, adding extremely specific optimizations to Python code that uses lots of nested loops and parallelism. As with OpenMP and Numba, it relies on annotations in your code to do all this for the programmer. The downside is that it won't magically optimize any Python code; the optimizations that Parakeet performs apply only to very specific kinds of code.
Some useful links for referencing are:
If you've read this far, you should now be familiar with Cython to such an extent that you can embed it with C bindings and even make some of your pure Python code more efficient. I've shown you how to apply Cython against an actual open source project and even how to extend native software with a Twisted Web server! As I kept saying throughout the book, it makes C feel as though there are endless possibilities to control logic or that you can extend the system with the plethora of Python modules available. Thanks for reading.