NumPy is a general-purpose library, designed to address most of the needs of a developer of scientific applications. However, as the code base and coverage of an application grow, so does the computation, and users sometimes demand more specific operations and optimized code segments. We have shown how NumPy and Python offer tools, such as f2py and Cython, to address these demands. These tools may be an excellent choice for rewriting your functions in natively compiled code to gain extra speed. But there may be some cases (leveraging a C library, such as NAG, to write some analytics) where you may want to do something more radical, such as creating a new data structure specifically for your own library. This requires access to low-level controls in the Python interpreter. In this chapter, we will look at how to do this using the C-API provided by Python and its extension, the NumPy C-API. The C-API itself is...
You're reading from NumPy Essentials
The Python implementation that we are using is CPython, the C-based implementation of the Python interpreter. NumPy is written specifically for this C-based implementation. CPython comes with a C-API, which is the backbone of the interpreter and gives its users low-level control. NumPy augments this further with a rich C-API of its own.
Writing functions in C/C++ gives developers the flexibility to leverage some of the advanced libraries available in these languages. However, the cost is apparent in the amount of boilerplate code needed to parse input and construct return values. Additionally, developers have to take care when incrementing and decrementing object reference counts, since mistakes here can create nasty bugs and memory leaks. There is also the problem of future compatibility of the code as the C-API keeps on evolving; hence, if a developer wants to migrate to a later version of Python, they may be up for a lot...
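The reference counts mentioned above can be observed from Python itself, which may help build intuition before managing them by hand in C. The following is a minimal sketch, assuming CPython; the helper name refcount_delta is hypothetical, and sys.getrefcount is the standard library call used to read the count.

```python
import sys


def refcount_delta():
    """Show how binding and unbinding a name changes an object's reference count."""
    payload = [1.0, 2.0, 3.0]            # a fresh list, so nothing else refers to it
    before = sys.getrefcount(payload)
    alias = payload                       # a second reference; C code would need Py_INCREF
    during = sys.getrefcount(payload)
    del alias                             # the reference is dropped; C code would need Py_DECREF
    after = sys.getrefcount(payload)
    return before, during, after


before, during, after = refcount_delta()
print("extra references while alias exists:", during - before)
print("extra references after del:", after - before)
```

In Python the interpreter does this bookkeeping automatically. In an extension module, a missing decrement keeps the object alive forever (a leak), and a missing increment can free it while it is still in use.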
An extension module written in C will have the following parts:
- A header segment, where you include all your external libraries and Python.h
- An initialization segment, where you define the module name and the functions in your C module
- A method structure array to define all the functions in your module
- An implementation segment, where you define all the functions that you would like to expose
Header snippets are quite standard, just like a normal C module. We need to include the Python.h header file to give our C code access to the internals of the C-API. This file is present in <path_to_python>/include. We will be using an array object in our example code, hence we have included the numpy/arrayobject.h header file as well. We don't need to specify the full path of the header file here as the path resolution is taken care of in setup.py, which we will take a look at later:
/* Header Segment */ #include...
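To find out where <path_to_python>/include actually points on a given machine, the interpreter can be asked directly. This is a small sketch using the standard sysconfig module; the helper name python_include_dir is hypothetical.

```python
import os
import sysconfig


def python_include_dir():
    """Return the directory where this interpreter expects Python.h to live."""
    return sysconfig.get_paths()["include"]


include_dir = python_include_dir()
header = os.path.join(include_dir, "Python.h")

# Python.h is only present when the Python development files are installed,
# so the second value may be False on a minimal installation.
print(include_dir, os.path.exists(header))
```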
The Python function passes a reference to itself as the first argument, followed by the real arguments given to the function. The PyArg_ParseTuple function is used to parse values from the Python function into local variables in the C function. In this function, we convert the value to a C double, and hence we use d as the format string, which is the second argument. You can see a full list of the format strings accepted by this function at https://docs.python.org/2/c-api/arg.html.
The final result of the computation is returned using Py_BuildValue, which takes a similar format string to create a Python value from your answer. We use f here, which stands for float, to demonstrate that double and float are treated similarly:
/* Implementation of the actual C functions */
static PyObject* square_func(PyObject* self, PyObject* args)
{
    double value;
    double answer;
    /* parse the input, from python float to c double */
    ...
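Why a Python float maps naturally to a C double can be checked from Python. The sketch below uses the standard struct module as a stand-in: its "d" and "f" codes also mean C double and C float, but they belong to struct, not to PyArg_ParseTuple or Py_BuildValue, so treat this as an analogy only.

```python
import struct

value = 0.1

# Round-trip through a C double (8 bytes): a Python float is stored as a C double,
# so nothing is lost.
as_double = struct.unpack("d", struct.pack("d", value))[0]

# Round-trip through a C float (4 bytes): precision is reduced.
as_single = struct.unpack("f", struct.pack("f", value))[0]

print(as_double == value)
print(as_single == value)
```

Both format codes hand a Python float back to the caller, which is the sense in which they are treated similarly, but only the double preserves the full precision of the original value.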
In this section, we will create a function to square all the values of a NumPy array. The aim here is to demonstrate how to receive a NumPy array in C and then iterate over it. In a real-world scenario, this can be done more easily using a map or by vectorizing a square function. We are using the same PyArg_ParseTuple function with the O! format string. This format string has a (object) [typeobject, PyObject *] signature and takes the Python type object as the first argument. Users should go through the official API documentation to see which other format strings are permissible and which one suits their needs.
The following code snippet explains how to parse the argument using PyArg_ParseTuple.
// Implementation of square of numpy array
static PyObject* square_nparray_func(PyObject* self, PyObject* args)
{
    // variable declarations...
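For comparison, the easier routes mentioned above (a map, or a vectorized square) can be sketched in pure Python and NumPy. The function names square_with_map and square_vectorized are hypothetical; np.square and np.asarray are standard NumPy calls.

```python
import numpy as np


def square_with_map(values):
    """Square each element with the built-in map, one Python call per item."""
    return list(map(lambda v: v * v, values))


def square_vectorized(values):
    """Square every element in a single vectorized NumPy call."""
    return np.square(np.asarray(values, dtype=np.float64))


data = [1.0, 2.0, 3.0]
print(square_with_map(data))
print(square_vectorized(data))
```

The vectorized version already runs its loop in compiled code inside NumPy, so a hand-written C function mainly pays off when the per-element work comes from an external C library.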
Once we have written the functions successfully, the next thing to do is build the module and use it in our Python modules. The setup.py file looks like the following code snippet:
from distutils.core import setup, Extension
import numpy

# define the extension module
demo_module = Extension('numpy_api_demo',
                        sources=['numpy_api.c'],
                        include_dirs=[numpy.get_include()])

# run the setup
setup(ext_modules=[demo_module])
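As we are using NumPy-specific headers, we need to pass the result of the numpy.get_include function in the include_dirs argument. Note that distutils was removed in Python 3.12; on newer interpreters, setup and Extension can be imported from setuptools instead. To run this setup file, we will use a familiar command: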
python setup.py build_ext --inplace
The preceding command will create a numpy_api_demo.pyd file (on Windows; on Linux and macOS the file has a .so extension) in the directory for us to use in the Python interpreter.
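The exact name of the built file depends on the platform and the interpreter version. As a rough sketch, the standard library can list the file names the current interpreter would accept for a compiled module; the helper name built_filenames is hypothetical.

```python
import importlib.machinery
import sysconfig


def built_filenames(module_name="numpy_api_demo"):
    """List the file names this interpreter would accept for a compiled module."""
    return [module_name + suffix
            for suffix in importlib.machinery.EXTENSION_SUFFIXES]


# The suffix the build step will normally use on this platform
print(sysconfig.get_config_var("EXT_SUFFIX"))
# Every name the import system would recognize for the module
print(built_filenames())
```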
To test our module, we will open a Python interpreter and try to call these functions from the module exactly as we would for a module written in Python:
...
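Before calling anything, it can be useful to confirm that the build actually produced an importable module and to see which names it exposes. The sketch below assumes only the module name numpy_api_demo from setup.py; the helper load_demo_module is hypothetical, and the exposed function names are read from the module rather than guessed.

```python
import importlib


def load_demo_module(name="numpy_api_demo"):
    """Return the compiled module, or None if it has not been built yet."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


demo = load_demo_module()
if demo is None:
    print("numpy_api_demo is not importable; run the build step first")
else:
    # List the public names defined in the module's method table
    print([n for n in dir(demo) if not n.startswith("_")])
```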
In this chapter, we introduced you to yet another way of optimizing or integrating C/C++ code using the C-API provided by Python and NumPy. We explained the basic structure of the code and the additional boilerplate code, which a developer has to write in order to create an extension module. Afterwards, we created two functions that calculated the square of a number and mapped the square function from the math.h library to a NumPy array. The intention here was to familiarize you with how to leverage numerical libraries written in C/C++ to create your own modules with minimal rewriting of code. The scope for writing C code is much wider than what is described here; however, we hope that this chapter has given you the confidence to leverage the C-API if the need arises.