You're reading from  Learning Cython Programming (Second Edition) - Second Edition

Product type: Book
Published in: Feb 2016
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781783551675
Edition: 2nd Edition
Author: Philip Herron

Philip Herron is a developer who focuses his passion toward compilers and virtual machine implementations. When he was first accepted to Google Summer of Code 2010, he used inspiration from Paul Biggar's PhD on the optimization of dynamic languages to develop a proof-of-concept GCC frontend to compile Python. This project sparked his deep interest in how Python works. After completing a consecutive year on the same project in 2011, Philip applied to Cython under the Python Software Foundation to gain a deeper appreciation of the standard Python implementation. Through this, he started leveraging the advantages of Python to control the logic in systems or even add more high-level interfaces, such as embedding Flask web servers in a REST API to a system-level piece of software, without writing any C code. Philip currently works as a software consultant for Instil Software based in Northern Ireland. He develops mobile applications with embedded native code for video streaming. Instil has given him a lot of support in becoming a better engineer. He has written several tutorials for the UK-based Linux Format magazine on Python and loves to share his passion for the Python programming language.

Chapter 5. Advanced Cython

Throughout this book, we have exclusively been mixing C and Python together. In this chapter, we will delve into C++ and Cython. With every release of Cython, the C++ support has improved. This is not to say that it isn't ready for use yet. In this chapter, we will cover the following topics:

  • Making native C++ classes callable from Python

  • Wrapping C++ namespaces and templates

  • How exceptions can be propagated to and from C++ and Python

  • The C++ new and del keywords

  • Operator overloading

  • The Cython gil and nogil keywords

We will wrap up this chapter by embedding a web server into a toy C++ messaging server.

Cython and C++


Of all the binding generators, Cython works with C++ the most seamlessly. C++ brings some complexity when writing bindings for it, such as calling conventions, templates, and classes. I find exception handling to be a shining feature of Cython, and we will look at examples of each.

Namespaces

I am introducing namespaces first because Cython uses namespaces as a way to reference C++ code within your module. Consider this C++ header with the following namespace:

#ifndef __MY_HEADER_H__
#define __MY_HEADER_H__

namespace mynamespace {
….
}

#endif //__MY_HEADER_H__

You will wrap this with the cdef extern declaration:

cdef extern from "header.h" namespace "mynamespace":

You can now address it in Cython as you normally would do for a module:

import cythonfile
cythonfile.mynamespace.attribute

It really feels like a Python module simply by using a namespace.

Classes

I would take a guess that most of your C++ code revolves around using classes. Being an object-oriented language...

C++ new and del keyword


Cython understands the new keyword from C++; so, consider that you have a C++ class:

 class Car {
    int doors;
    int wheels;
  public:
    Car ();
    ~Car ();
    void printCar (void);
    void setWheels (int x) { wheels = x; };
    void setDoors (int x) { doors = x; };
  };

It is defined in Cython as follows:

cdef extern from "cppcode.h" namespace "mynamespace":
    cppclass Car:
        Car ()
        void printCar ()
        void setWheels (int)
        void setDoors (int)

Note that we do not declare the ~Car destructor because we never call it directly. Instead, delete invokes it for objects on the heap, and the compiler ensures that it runs when an object on the stack goes out of scope. To instantiate the raw C++ class on the heap in Cython code, we can simply run the following:

cdef Car * c = new Car ()

You can then delete the object at any time using Python's del keyword:

del c

You will see...

Overloading


Since Cython supports overloading when wrapping C++ code, just list each overloaded member as normal:

cdef foobar (int)
cdef foobar (int, int)
…

Cython understands that we are in C++ mode and can handle all the type conversion as normal. It's interesting that it can also handle an operator overload easily since it is just another hook! For example, let's take the Car class again and add an operator overload such as the following:

namespace mynamespace {
  class Car {
    int doors;
    int wheels;
  public:
    Car ();
    ~Car ();
    Car * operator+(Car *);
    void printCar (void);
    void setWheels (int x) { wheels = x; };
    void setDoors (int x) { doors = x; };
  };
};

Remember to add these operator-overloading class members to your Cython class declaration; otherwise, the Cython compiler will report the following error:

Invalid operand types for '+' (Car *; Car *)

The Cython declaration of the operator overload looks as you would expect:

cdef extern from "cppcode.h" namespace "mynamespace":
    cppclass...

Templates


Templates are supported in Cython, although, for the sake of completeness, some template metaprogramming patterns either do not wrap correctly or fail to compile. This keeps getting better with every release, so take this comment with a pinch of salt.

C++ class templates work very well; we can implement a template called LinkedList as the following class:

cppclass LinkedList[T]:
        LinkedList ()
        void append (T)
        int getLength ()
...

Now, you can access the template type with the declaration called T. You can follow the rest of this code in chapter5/cpptemplates.

Static class member attribute


Sometimes, in classes, it's useful to have a static member such as the following:

namespace mynamespace {
  class myClass {
    public:
      static void myStaticMethod (void);
  };
}

In Cython, there is no support for this via a static keyword, but what you can do is tie this function to a namespace so that it becomes the following:

cdef extern from "header.h" namespace "mynamespace::myClass":
    void myStaticMethod ()

Now, you simply call this method as a global method in Cython.

Calling C++ functions – Caveat


When you write code that calls a C++ function from C, you need to wrap the prototypes in the following:

extern "C" { … }

This gives the wrapped functions C linkage so that C code can call them; C itself won't understand a C++ class. With Cython, if you are telling your C output to call C++ functions, you need to be careful about which compiler you are using, or you need to write a new header that implements the minimal wrapper functions required to make the C++ calls.

Namespaces – Caveat

Cython seems to generally require a namespace to keep things nested, which you are probably already doing in your C++ code. Writing a PXD for non-namespaced code seems to create new declarations, meaning that you will get linking errors due to duplicate symbols. The C++ support for templates looks really good, although more advanced metaprogramming idioms can be difficult to express in Cython. When polymorphism comes into play, it can be difficult to track down compilation errors. I would stress that you should keep your interfaces...

Python distutils


As usual, we can also use Python distutils, but you will need to specify the language so that the auxiliary C++ code required will be compiled by the correct compiler:

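# On Python 3.12+, import setup and Extension from setuptools instead.
from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize

# List the .pyx file and the auxiliary C++ source in one extension and
# mark it as C++ so that the C++ compiler and linker are used.
ext = Extension(
    "mycython",
    sources=["mycython.pyx", "mysource.cc"],
    language="c++",
)

setup(ext_modules=cythonize([ext]))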

Now, you can compile your C++ code to your Python module.

Python threading and GIL


GIL stands for Global Interpreter Lock. When you link your program against libpython.so and use it, you really have the entire Python interpreter in your code, and the GIL is the lock that guards it. The reason it exists is to make concurrent applications really easy. In Python, you can have two threads reading and writing to the same location, and the interpreter serializes the individual accesses for you because everything in Python runs under the GIL; in Java, by contrast, you need to specify the synchronization yourself. There are two things to consider when talking about the GIL and what it does: instruction atomicity and the read/write lock.
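To make the read/write point concrete, here is a minimal pure-Python sketch rather than Cython. It assumes a shared counter that several threads update; the names count_with_lock, num_threads, and increments are made up for illustration. Each update is a read-modify-write that spans more than one interpreter step, so the sketch guards it with an explicit threading.Lock instead of relying on the GIL alone:

```python
import threading


def count_with_lock(num_threads=4, increments=10000):
    """Increment one shared counter from several threads.

    Hypothetical helper for illustration only.
    """
    counter = {"value": 0}
    lock = threading.Lock()

    def worker():
        for _ in range(increments):
            # The lock makes the whole read-modify-write a single unit.
            with lock:
                counter["value"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter["value"]


if __name__ == "__main__":
    print(count_with_lock())
```

With the lock in place, the result is always num_threads multiplied by increments, whatever order the threads are scheduled in.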

Atomic instructions

Remember that Cython generates C code that looks like any other Python module that you can import. So, what's happening under the hood is that it generates all the code needed to acquire the GIL so that it can manipulate Python objects at runtime. Let's consider two types of execution. Firstly, you have the C stack where it executes atomically as...

Cython keywords


Okay, so how does this affect you and, more importantly, your code? It is important to know in what way your code should and will execute concurrently. Without an understanding of this, your debugging will be confusing. There are times when the GIL gets in the way and can cause issues by blocking the execution of your C code from Python or vice versa. Cython allows us to control the GIL with the gil and nogil keywords, which make this much simpler by wrapping the state handling for us:

Cython        Python C API
with gil      PyGILState_Ensure () on entry, PyGILState_Release (state) on exit
with nogil    PyEval_SaveThread () on entry, PyEval_RestoreThread (state) on exit

I find that it's easier to think of multithreading in Python in terms of blocking and nonblocking execution. In the next example, we will examine the steps needed to embed a web server into a toy messaging server.

Messaging server


The messaging server is an example of something that would be highly concurrent; let's say we want to embed a web server into this to show the list of clients that are connected to the server. If you look at Flask, you can see how easily you can have a full web container in about eight lines of code.

The messaging server is asynchronous; therefore, it is callback based in C code. These callbacks can then call into a Python roster object via Cython. Then, we can iterate over the roster dictionary to get the online clients and simply return some JSON as a web service, reusing Python code with no need to write anything in C/C++.

It's important to note that embedded web servers start a lot of threads. Calling the start web server function will block until it exits, meaning that if we start the web server first, we won't have the messaging server running concurrently. Also, due to the web-server function blocking, if we start it on a separate thread, it will...
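The idea of running the blocking web server on its own thread can be sketched with the standard library alone. This is not the chapter's Flask code: it swaps in http.server so that it runs without extra packages, and the roster contents, RosterHandler, and start_web_server are hypothetical names used only for illustration.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical roster: client name -> address, standing in for the
# dictionary that the messaging server callbacks would maintain.
roster = {"alice": "10.0.0.2", "bob": "10.0.0.3"}


class RosterHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Return the names of the online clients as a JSON list.
        body = json.dumps(sorted(roster)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep the sketch quiet; the default handler logs to stderr.
        pass


def start_web_server(host="127.0.0.1", port=0):
    """Start the server on a daemon thread and return it immediately."""
    server = HTTPServer((host, port), RosterHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Because serve_forever blocks, it runs on a daemon thread here, which leaves the calling thread free to start the messaging server's own event loop. Passing port 0 asks the operating system for any free port, which is available afterwards as server.server_address[1].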

Caveat on GIL


There is a caveat to remember when using gil. In our callbacks, we need to acquire the GIL on each callback before we call any Python code; otherwise, we will segfault and get really confused. So, if you look into each of the libevent callbacks when calling the Cython functions, you have the following:

 PyGILState_STATE gilstate_save = PyGILState_Ensure();
 readcb (client, (char *)data);
 PyGILState_Release(gilstate_save);

Notice that this is also done in the other two callbacks, firstly in the discb callback:

  PyGILState_STATE gilstate_save = PyGILState_Ensure();
  discb (client, NULL);
  PyGILState_Release(gilstate_save);

Finally, on the connect callback, we must be a little safer and call it this way:

  PyGILState_STATE gilstate_save = PyGILState_Ensure();
  if (!conncb (NULL, inet_ntoa (client_addr.sin_addr)))
    {
…
    }
  else
    close (client_fd);
  PyGILState_Release(gilstate_save);

We have to do this since we executed this with nogil from Cython. We need to acquire gil...

Unit testing the native code


Another use of Cython is unit testing the core functionality of shared C libraries. If you maintain a .pxd file (this is all you need really), you can write your own wrapper classes and do scalability testing of data structures with the expressiveness of Python. For example, we can write unit tests for something such as std::map and std::vector as follows:

from libcpp.vector cimport vector

PASSED = False

cdef vector[int] vect
cdef int i
for i in range(10):
    vect.push_back(i)
for i in range(10):
    print(vect[i])

PASSED = True

Then, write a test for map as follows:

from libcpp.map cimport map

PASSED = False

cdef map[int,int] mymap
cdef int i
for i in range (10):
    mymap[i] = (i + 1)

for i in range (10):
    print(mymap[i])

PASSED = True

Then, if we compile them into separate modules, we can simply write a test executor:

#!/usr/bin/env python
print("Cython C++ Unit test executor")

print("[TEST] std::map")
import testmap
assert testmap.PASSED
print("[PASS...
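The executor pattern above can also be written as a small loop. The following is a sketch under assumptions, not the book's executor: run_tests and report are hypothetical helpers, and each compiled test module is assumed to expose the PASSED flag used in the examples above.

```python
import importlib


def run_tests(module_names):
    """Import each test module and record whether it set PASSED to True."""
    results = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
            results[name] = bool(getattr(module, "PASSED", False))
        except Exception:
            # A module that fails to import or raises counts as a failure.
            results[name] = False
    return results


def report(results):
    """Print one line per module and return True only if all passed."""
    for name, ok in results.items():
        print("[%s] %s" % ("PASS" if ok else "FAIL", name))
    return all(results.values())
```

Assuming the two modules were compiled as testmap and testvector, calling report(run_tests(["testmap", "testvector"])) would print one line per module.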

Preventing subclassing


If you create an extension type in Cython that you never want to be subclassed, such as a C++ class wrapped in a Python class, you can prevent subclassing as follows:

cimport cython

@cython.final
cdef class A: pass

cdef class B (A): pass

This annotation will give an error when someone tries to subclass:

pycode.pyx:7:5: Base class 'A' of type 'B' is final

Note that these annotations only work on the cdef or cpdef functions and not on normal Python def functions.

Parsing large amounts of data


I want to try and prove how powerful natively compiled C types are to programmers by showing the difference in parsing large amounts of XML. We can take the geographic data from the government as the test data for this experiment (http://www.epa.gov/enviro/geospatial-data-download-service).

Let's look at the size of this XML data:

 ls -liah
total 480184
7849156 drwxr-xr-x   5 redbrain  staff   170B 25 Jul 16:42 ./
5803438 drwxr-xr-x  11 redbrain  staff   374B 25 Jul 16:41 ../
7849208 -rw-r--r--@  1 redbrain  staff   222M  9 Mar 04:27 EPAXMLDownload.xml
7849030 -rw-r--r--@  1 redbrain  staff    12M 25 Jul 16:38 EPAXMLDownload.zip
7849174 -rw-r--r--   1 redbrain  staff    57B 25 Jul 16:42 README

It's huge! Before we write programs, we need to understand a little bit about the structure of this data to see what we want to do with it. It contains facility site locations with addresses. This seems to be the bulk of the data in here, so let's try and parse it all...
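As a point of reference, a file of this size can be walked in pure Python without loading it all into memory. This is a minimal sketch and not the chapter's parser: count_elements is a hypothetical helper, and the FacilitySite tag and the SAMPLE document are invented stand-ins, since the real element names in EPAXMLDownload.xml are not shown here.

```python
import io
import xml.etree.ElementTree as ET


def count_elements(source, tag):
    """Stream through an XML file and count the elements named tag."""
    count = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            count += 1
            # Drop the element's children so memory use stays low.
            elem.clear()
    return count


# Tiny stand-in document; the tag names are assumptions for illustration.
SAMPLE = b"""<Facilities>
  <FacilitySite><Name>A</Name></FacilitySite>
  <FacilitySite><Name>B</Name></FacilitySite>
</Facilities>"""

if __name__ == "__main__":
    print(count_elements(io.BytesIO(SAMPLE), "FacilitySite"))
```

If the real file declares an XML namespace, ElementTree reports tags in the form {uri}name, so the tag argument would need to include that prefix.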

Summary


By now, you have seen the core of what's possible with Cython. In this chapter, we covered calling into C++ classes from Cython. You learned to wrap templates and even looked at a more complex application demonstrating the usage of gil and nogil.

Chapter 6, Further Reading, is the final chapter and will review some final caveats and usages with Cython. I will show how you can use Cython with Python 3. Finally, we will look at related projects and my opinions on their usages.
