You're reading from NumPy Essentials

Product type: Book
Published in: Apr 2016
Reading level: Intermediate
ISBN-13: 9781784393670
Edition: 1st Edition
Authors (3):
Leo (Liang-Huan) Chin

Leo (Liang-Huan) Chin is a data engineer with more than 5 years of experience in the field of Python. He works for Gogoro smart scooter, Taiwan, where his job entails discovering new and interesting biking patterns. His previous work experience includes ESRI, California, USA, which focused on spatial-temporal data mining. He loves data, analytics, and the stories behind data and analytics. He received an MA degree of GIS in geography from State University of New York, Buffalo. When Leo isn't glued to a computer screen, he spends time on photography, traveling, and exploring some awesome restaurants across the world. You can reach Leo at http://chinleock.github.io/portfolio/.

Tanmay Dutta

Tanmay Dutta is a seasoned programmer with expertise in programming languages such as Python, Erlang, C++, Haskell, and F#. He has extensive experience in developing numerical libraries and frameworks for investment banking businesses. He was also instrumental in the design and development of a risk framework in Python (pandas, NumPy, and Django) for a wealth fund in Singapore. Tanmay has a master's degree in financial engineering from Nanyang Technological University, Singapore, and a certification in computational finance from Tepper Business School, Carnegie Mellon University.

Shane Holloway

http://shaneholloway.com/resume/


Chapter 9. Introduction to the NumPy C-API

NumPy is a general-purpose library designed to address most of the needs of developers of scientific applications. However, as the code base and coverage of an application grow, so does the amount of computation, and users sometimes demand more specialized operations and optimized code segments. We have shown that NumPy and Python offer tools, such as f2py and Cython, to address these demands. These tools can be an excellent choice for rewriting your functions as natively compiled code to gain extra speed. But in some cases (for example, leveraging a C library such as NAG to write some analytics), you may want to do something more radical, such as creating a new data structure specifically for your own library. This requires access to low-level controls in the Python interpreter. In this chapter, we will look at how to do this using the C-API provided by Python and its extension, the NumPy C-API. The C-API itself is...

The Python and NumPy C-API


The Python implementation that we are using is CPython, the C-based implementation of the Python interpreter. NumPy targets this C-based implementation specifically. CPython comes with a C-API, which is the backbone of the interpreter and gives its users low-level control. NumPy augments this further by providing a rich C-API of its own.

Writing functions in C/C++ gives developers the flexibility to leverage some of the advanced libraries available in these languages. However, the cost is apparent: a lot of boilerplate code has to be written to parse input and construct return values. Additionally, developers have to take care when referencing/dereferencing objects (that is, when managing their reference counts), since mistakes here can create nasty bugs and memory leaks. There is also the problem of future compatibility, as the C-API keeps evolving; hence, if a developer wants to migrate to a later version of Python, they may be up for a lot...
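
The reference bookkeeping mentioned above can be observed from the Python side. The following is a minimal sketch, assuming CPython; it uses sys.getrefcount from the standard library, and the helper name refcount_deltas is hypothetical. In C, each of these changes has to be made explicitly with Py_INCREF and Py_DECREF, and a forgotten decrement is the kind of mistake that ends up as a memory leak:

```python
import sys


def refcount_deltas():
    """Return how an object's reference count changes as a second
    reference to it is created and then dropped (CPython only)."""
    obj = []                          # a fresh object that nothing else refers to
    before = sys.getrefcount(obj)
    alias = obj                       # a second reference: the count goes up by one
    during = sys.getrefcount(obj)
    del alias                         # reference dropped: the count goes back down
    after = sys.getrefcount(obj)
    return during - before, after - before


if __name__ == "__main__":
    print(refcount_deltas())
```

The absolute numbers returned by sys.getrefcount vary between interpreter versions, which is why the sketch only reports differences.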

The basic structure of an extension module


An extension module written in C will have the following parts:

  • A header segment, where you include all your external libraries and Python.h
  • An initialization segment, where you define the module name and the functions in your C module
  • A method structure array to define all the functions in your module
  • An implementation segment, where you define all the functions that you would like to expose

The header segment

Header snippets are quite standard, just like a normal C module. We need to include the Python.h header file to give our C code access to the internals of the C-API. This file is present in <path_to_python>/include. We will be using an array object in our example code, hence we have included the numpy/arrayobject.h header file as well. We don't need to specify the full path of the header file here as the path resolution is taken care of in setup.py, which we will take a look at later:

/* 
Header Segment 
*/ 
 
#include...
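
As a rough way to see where these headers live on a given machine, the two include directories can be queried from Python. This is a sketch and not part of the original example: sysconfig.get_paths and numpy.get_include are existing functions, while the helper name header_locations is made up for illustration. Python.h is only present when the Python development headers are installed:

```python
import os
import sysconfig

import numpy


def header_locations():
    """Return the expected paths of Python.h and numpy/arrayobject.h."""
    python_header = os.path.join(sysconfig.get_paths()["include"], "Python.h")
    numpy_header = os.path.join(numpy.get_include(), "numpy", "arrayobject.h")
    return python_header, numpy_header


if __name__ == "__main__":
    for path in header_locations():
        # Report whether each header is actually present on this machine
        print(path, "(found)" if os.path.isfile(path) else "(missing)")
```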

Creating a squared function using Python C-API


The Python function passes a reference to itself as the first argument, followed by the real arguments given to the function. The PyArg_ParseTuple function is used to parse values from the Python function into local variables in the C function. In this function, we convert the value to a double, and hence we use d as the format string, which is the second argument to PyArg_ParseTuple. You can see the full list of format strings accepted by this function at https://docs.python.org/2/c-api/arg.html.

The final result of the computation is returned using Py_BuildValue, which takes a similar kind of format string to create a Python value from your answer. We use f here, which stands for float, to demonstrate that double and float are treated similarly:

/*
Implementation of the actual C functions
*/

static PyObject* square_func(PyObject* self, PyObject* args)
{
    double value;
    double answer;

    /* parse the input, from Python float to C double */...
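
Behind the remark about f and d is the fact that a Python float stores a C double, so a result built with either format character comes back to Python as the same type. The difference between the two C types themselves can be seen from Python with the standard struct module. This is only an illustrative sketch; the helper name roundtrip is hypothetical:

```python
import struct


def roundtrip(value, fmt):
    """Store value in the C type named by fmt ('f' or 'd') and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


answer = 1.1 * 1.1

# A C double holds a Python float exactly, so nothing is lost
as_double = roundtrip(answer, "d")

# A C float has only 4 bytes, so the value is rounded to single precision
as_float = roundtrip(answer, "f")

print(struct.calcsize("f"), struct.calcsize("d"))
print(as_double == answer, as_float == answer)
```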

Creating an array squared function using NumPy C-API


In this section, we will create a function to square all the values of a NumPy array. The aim here is to demonstrate how to get a NumPy array in C and then iterate over it. In a real-world scenario, this can be done more easily using map or by vectorizing a square function. We are using the same PyArg_ParseTuple function, this time with the O! format string. This format string has an (object) [typeobject, PyObject *] signature: it takes a Python type object as its first argument and stores the parsed object through the second. Users should go through the official API documentation to see which other format strings are permissible and which one suits their needs.

Note

If the passed value does not have the same type, then a TypeError is raised.

The following code snippet explains how to parse the argument using PyArg_ParseTuple:

// Implementation of square of numpy array 
 
static PyObject* square_nparray_func(PyObject* self, PyObject* args) 
{ 
 
// variable declarations...
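
For comparison, the easier routes mentioned above (map and vectorization) can be sketched in Python as follows. This is an illustration rather than the chapter's own code: np.square, np.vectorize, and the ** operator are existing NumPy features, and the variable names are arbitrary:

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0])

# Vectorized ufunc: the loop runs in compiled code inside NumPy
squared_ufunc = np.square(values)

# Element-wise power operator, equivalent for this purpose
squared_power = values ** 2

# np.vectorize wraps a scalar Python function; convenient, but it
# still calls the Python function once per element
squared_vectorized = np.vectorize(lambda x: x * x)(values)

# Plain map over the elements, collected back into an array
squared_map = np.array(list(map(lambda x: x * x, values)))

print(squared_ufunc)
```

All four variants give the same values; they differ in where the loop runs, which is the part that the hand-written C function takes over.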

Building and installing the extension module


Once we have written the functions successfully, the next thing to do is build the module and use it in our Python modules. The setup.py file looks like the following code snippet:

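# distutils was removed in Python 3.12; on newer interpreters,
# setuptools provides the same setup and Extension names
from distutils.core import setup, Extension
import numpy

# define the extension module
demo_module = Extension('numpy_api_demo', sources=['numpy_api.c'],
                        include_dirs=[numpy.get_include()])

# run the setup
setup(ext_modules=[demo_module])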

As we are using NumPy-specific headers, we need to pass the result of the numpy.get_include function in the include_dirs argument. To run this setup file, we will use a familiar command:

python setup.py build_ext --inplace

The preceding command will create a numpy_api_demo.pyd file in the directory (on Windows; other platforms use a different extension, such as .so) for us to use in the Python interpreter.
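
To see which file name to expect on a particular machine, the suffixes that the running interpreter accepts for extension modules can be listed with the standard importlib.machinery module. This is a small sketch; the variable name is arbitrary, and the printed names are only candidates, not a check that the build succeeded:

```python
import importlib.machinery

# Suffixes the running interpreter recognizes for compiled extension modules,
# for example ones ending in ".so" or ".pyd" depending on the platform
suffixes = importlib.machinery.EXTENSION_SUFFIXES

for suffix in suffixes:
    print("numpy_api_demo" + suffix)
```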

To test our module, we will open a Python interpreter and try to call these functions from the module, exactly as we would for a module written in Python:

...

Summary


In this chapter, we introduced you to yet another way of optimizing code or integrating C/C++ code, using the C-API provided by Python and NumPy. We explained the basic structure of the code and the additional boilerplate that a developer has to write in order to create an extension module. Afterwards, we created two functions: one that calculated the square of a number, and one that mapped the square function from the math.h library over a NumPy array. The intention here was to familiarize you with how to leverage numerical libraries written in C/C++ to create your own modules with minimal rewriting of code. The scope for writing C code is much wider than what is described here; however, we hope that this chapter has given you the confidence to leverage the C-API if the need arises.
