Advanced Array Concepts

In this chapter, we will explain some more advanced aspects of arrays. First, we will cover the notion of an array view – a concept that a NumPy programmer absolutely must be aware of to avoid hard-to-debug programming errors. Then, Boolean arrays will be introduced along with the ways to compare arrays. Furthermore, we will briefly describe indexing and vectorization, explaining special topics such as broadcasting and sparse matrices.

In this chapter, we will be covering the following topics:

  • Array views and copies
  • Comparing arrays
  • Array indexing
  • Performance and vectorization
  • Broadcasting 
  • Sparse matrices

5.1 Array views and copies

In order to control precisely how memory is used, NumPy offers the concept of a view of an array. A view is an array that gives access to part (or all) of the data of another array without copying it. This works just like a reference to one single object.

5.1.1 Array views

The simplest example of a view is given by a slice of an array:

from numpy import * # the examples in this chapter assume NumPy's names are imported
M = array([[1.,2.],[3.,4.]])
v = M[0,:] # first row of M

The preceding slice is a view of M. It shares the same data as M. Modifying v will modify M as well:

v[-1] = 0.
v # array([1., 0.])
M # array([[1., 0.], [3., 4.]]) # M is modified as well

It is possible to access the object that owns the data using the array attribute base:

v.base # array([[1.,0.],[3.,4.]])
v.base is M # True

If an array owns its data, the attribute base is None:

M.base # None

5.1.2 Slices as views

There are precise rules on which slices will return views and which ones will return copies. Only basic slices (mainly index expressions with :) return views, whereas any advanced selections (such as slicing with a Boolean) will return a copy of the data. For instance, it is possible to create new matrices by indexing with lists (or arrays):

a = arange(4) # array([0, 1, 2, 3])
b = a[[2,3]] # the index is a list [2,3]
b # array([2, 3])
b.base is None # True, the data was copied
c = a[1:3] 
c.base is None # False, this is just a view

In the preceding example, the array b is not a view, whereas the array c, obtained with a simpler slice, is a view.

There is an especially simple slice of an array that returns a view of the whole array:

N = M[:] # this is a view of the whole array M
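
N is a new array object that nevertheless shares the data of M, which can be checked as follows (a quick sanity check, not part of the original text):

N is M      # False: N is a distinct array object
N.base is M # True: N shares the data of M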

5.1.3 Generating views by transposing and reshaping

Some other important operations return views. For instance, transposing an array returns a view:

M = random.random_sample((3,3))
N = M.T
N.base is M # True

The same applies to reshaping operations, whenever the new shape is compatible with the array's memory layout:

v = arange(10)
C = v.reshape(-1,1) # column matrix
C.base is v # True
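
Note that reshaping returns a view only when the new shape can be expressed with the array's existing memory layout; otherwise, NumPy silently makes a copy. A quick check (not part of the original text):

D = M.T.reshape(-1) # flattening the transpose cannot reuse M's memory layout
D.base is M         # False: a copy was made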

5.1.4 Array copies

Sometimes it is necessary to explicitly request that the data be copied. This is simply achieved with the NumPy function called array:

M = array([[1.,2.],[3.,4.]])
N = array(M.T) # copy of M.T

We can verify that the data has indeed been copied:

N.base is None # True

In this section, you saw the concept of array views. NumPy works with views instead of copies of a given array to save memory, which – especially for large arrays – can be crucial. On the other hand, unintentionally using views may cause programming errors that are hard to debug. 

5.2 Comparing arrays

Comparing two arrays is not as simple as it may seem. Consider the following code, which is intended to check whether two matrices are close to each other:

A = array([0.,0.])
B = array([0.,0.])
if abs(B-A) < 1e-10: # an exception is raised here
    print("The two arrays are close enough")

This code raises the following exception when the if statement is executed:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

In this section, we'll explain why this is so and how to remedy this state of affairs.

5.2.1 Boolean arrays

Boolean arrays are useful for advanced array indexing (also see Section 5.3.1: Indexing with Boolean arrays). A Boolean array is simply an array for which the entries have the type bool:

A = array([True,False]) # Boolean array
A.dtype # dtype('bool')

Any comparison operator acting on arrays will create a Boolean array instead of a simple Boolean:

M = array([[2, 3],
           [1, 4]])
M > 2  # array([[False,  True],
       #        [False,  True]])
M == 0 # array([[False, False],
       #        [False, False]])
N = array([[2, 3],
           [0, 0]])
M == N # array([[ True,  True],
       #        [False, False]])

Note that because array comparison creates Boolean arrays, one cannot use array comparison directly in conditional statements, for example, if statements. The solution is to use the methods all and any to create a simple True or False:

A = array([[1,2],[3,4]])
B = array([[1,2],[3,3]])
A == B         # array([[ True,  True],
               #        [ True, False]])
(A == B).all() # False
(A == B).any() # True

5.2.2 Checking for array equality

Checking the equality of two float arrays is not straightforward, because two floats may be very close without being equal. In NumPy, it is possible to check for equality with allclose. This function checks for the equality of two arrays up to a given precision:

data = random.rand(2)*1e-3
small_error = random.rand(2)*1e-16
data == data + small_error # array([False, False])
allclose(data, data + small_error, rtol=1.e-5, atol=1.e-8) # True

The tolerance is given in terms of a relative tolerance bound, rtol, and an absolute error bound, atol. The command allclose is a short form of: 

(abs(A-B) < atol+rtol*abs(B)).all()

Note that allclose can also be applied to scalars:

data = 1e-3
error = 1e-16
data == data + error # False
allclose(data, data + error, rtol=1.e-5, atol=1.e-8)  #True

5.2.3 Boolean operations on arrays

You cannot use and, or, and not on Boolean arrays. Indeed, those operators force the casting from array to Boolean, which is not permitted. Instead, we can use the operators given in Table 5.1 for component-wise logical operations on Boolean arrays:

Logic operator      Replacement for Boolean arrays
A and B             A & B
A or B              A | B
not A               ~A

Table 5.1: Logical operators for component-wise logical array operations

Here is an example usage of logical operators with Boolean arrays:

A = array([True, True, False, False])
B = array([True, False, True, False])
A and B # error!
A & B # array([True, False, False, False])
A | B # array([True, True, True, False])
~A # array([False, False, True, True])

Suppose that we have a sequence of data that is marred with some measurement error. Suppose further that we run a regression and it gives us a deviation for each value. We wish to obtain all...
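
A sketch of the kind of selection being described (the names dev and tol are hypothetical, chosen for illustration):

data = array([1.0, 2.1, 2.9, 4.2])  # measured values
dev = array([0.01, 0.4, 0.05, 0.3]) # deviation of each value (hypothetical)
tol = 0.1
reliable = data[abs(dev) < tol]     # array([1., 2.9]): values with a small deviation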

5.3 Array indexing

We have already seen that we can index arrays with combinations of slices and integers – this is a basic slicing technique. There are, however, many more possibilities that allow for a variety of ways to access and modify array elements.

5.3.1 Indexing with Boolean arrays

It is often useful to access and modify only parts of an array, depending on its value. For instance, you might want to access all the positive elements of an array. This turns out to be possible using Boolean arrays, which act like masks to select only some elements of an array. The result of such indexing is always a vector. For instance, consider the following example:

B = array([[True, False],
           [False, True]])
M = array([[2, 3],
           [1, 4]])
M[B] # array([2,4]), a vector

In fact, the command M[B] is equivalent to M.flatten()[B.flatten()]. You can then assign another vector (or a scalar) to the selected elements. For instance, you can replace all the selected elements with zero:

M[B] = 0
M # [[0, 3], [1, 0]]

Or you can replace all the selected values with others:

M[B] = 10, 20
M # [[10, 3], [1, 20]]

By combining the creation of Boolean arrays (M > 2), smart indexing (indexing with a Boolean array), and broadcasting, you can...
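
For instance, a typical one-liner in this spirit (a sketch, not verbatim from the text):

M[M > 2] = 0 # set all entries of M that are larger than 2 to zero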

5.3.2 Using the command where

The command where gives a useful construct that can take a Boolean array as a condition and either return the indexes of the array elements satisfying the condition or return different values depending on the values in the Boolean array.

The basic structure is:

where(condition, a, b)

This will return values from a when the condition is True and values from b when it is False.

For instance, consider the Heaviside function

$$H(x) = \begin{cases} 0, & x < 0 \\ 1, & x \geq 0 \end{cases}$$

and its implementation with the command where:

def H(x):
    return where(x < 0, 0, 1)
x = linspace(-1,1,11)  # [-1. -0.8 -0.6 -0.4 -0.2 0. 0.2 0.4 0.6 0.8 1. ]
print(H(x))            # [0 0 0 0 0 1 1 1 1 1 1]

The second and third arguments can be either arrays of the same size as the condition (the Boolean array) or scalars. We'll give two more examples to demonstrate how to manipulate elements from an array or a scalar depending on a condition:

x = linspace...
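
Two sketches in that spirit (the original examples are truncated here; values are chosen for illustration):

x = linspace(-4, 4, 5) # array([-4., -2., 0., 2., 4.])
where(x < 0, -x, x)    # absolute value: array([4., 2., 0., 2., 4.])
where(x < 0, 0, x)     # set negative entries to zero: array([0., 0., 0., 2., 4.])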

5.4 Performance and vectorization

When it comes to the performance of your Python code, it often boils down to the difference between interpreted code and compiled code. Python is an interpreted programming language and basic Python code is executed directly without any intermediate compilation to machine code. With a compiled language, the code needs to be translated to machine instructions before execution.

The benefits of an interpreted language are many but interpreted code cannot compete with compiled code for speed. To make your code faster, you can write some parts in a compiled language such as FORTRAN, C, or C++. This is what NumPy and SciPy do.

For this reason, it is best to use functions in NumPy and SciPy over interpreted versions whenever possible. NumPy array operations such as matrix multiplication, matrix-vector multiplication, matrix factorization, scalar products, and so on are much faster than any pure Python equivalent. Consider the...
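
As an illustration, here is a minimal benchmark sketch (not from the original text; timings are machine-dependent) comparing a pure Python scalar product with NumPy's dot:

from numpy import arange, dot
from timeit import timeit

v = arange(100_000, dtype=float)

def python_dot(u, w):
    # scalar product written as an interpreted Python loop
    s = 0.0
    for a, b in zip(u, w):
        s += a * b
    return s

t_py = timeit(lambda: python_dot(v, v), number=10)
t_np = timeit(lambda: dot(v, v), number=10)
print(f"pure Python: {t_py:.4f} s, NumPy dot: {t_np:.4f} s")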

5.4.1 Vectorization

To improve performance, you often have to vectorize the code. Replacing for loops and other slower parts of the code with NumPy slicing, operations, and functions can give significant improvements.

For example, the simple addition of a scalar to a vector by iterating over the elements is very slow:

v = arange(10.)   # some example data
w = zeros_like(v) # preallocated result array
for i in range(len(v)):
    w[i] = v[i] + 5

But using NumPy's addition is much faster:

w = v + 5

Using NumPy slicing can also give significant speed improvements over iterating with for loops. To demonstrate this, let's consider forming the average of neighbors in a two-dimensional array:

def my_avg(A):
    m,n = A.shape
    B = A.copy()
    for i in range(1,m-1):
        for j in range(1,n-1):
            B[i,j] = (A[i-1,j] + A[i+1,j] + A[i,j-1] + A[i,j+1])/4
    return B

def slicing_avg(A):
    A[1:-1,1:-1] = (A[:-2,1:-1] + A[2:,1:-1] +
                    A[1:-1,:-2] + A[1:-1,2:])/4
    return A

These functions both assign each element...
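
A quick way to compare the two functions (a sketch; the array size is chosen for illustration, and slicing_avg modifies its argument in place, hence the copies):

from numpy import random
from timeit import timeit

A = random.rand(500, 500)
t_loop = timeit(lambda: my_avg(A.copy()), number=1)
t_slice = timeit(lambda: slicing_avg(A.copy()), number=1)
print(f"loops: {t_loop:.4f} s, slicing: {t_slice:.4f} s")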

5.5 Broadcasting

Broadcasting in NumPy denotes the ability to guess a common, compatible shape between two arrays. For instance, when adding a vector (one-dimensional array) and a scalar (zero-dimensional array), the scalar is extended to a vector, in order to allow for the addition. The general mechanism is called broadcasting. We will first review that mechanism from a mathematical point of view, and then proceed to give the precise rules for broadcasting in NumPy. The mathematical view might give a mathematically trained reader easier access to broadcasting, while other readers might want to skip the mathematical details and directly continue reading Section 5.5.2: Broadcasting arrays.

5.5.1 Mathematical views

Broadcasting is often performed in mathematics, mainly implicitly. Examples are expressions such as the sum $f + C$ of a function and a constant, or a function of two variables built up from functions of one variable. We will give an explicit description of that technique in this section.

We have in mind the very close relationship between functions and NumPy arrays, as described in Section 4.2.1: Arrays as functions.

Constant functions

One of the most common examples of broadcasting is the addition of a function and a constant; if $C$ is a scalar, we often write:

$$f + C$$

This is an abuse of notation since you should not be able to add functions and constants. Constants are, however, implicitly broadcast to functions. The broadcast version of the constant $C$ is the function $\bar{C}$ defined by:

$$\bar{C}(x) := C \quad \text{for all } x$$

Now it makes sense to add two functions together:

$$f + \bar{C}$$

We are not being pedantic for the sake of it, but because a similar situation may arise for arrays, as in the following code:

vector = arange(4) # array([0, 1, 2, 3])
vector + 1.        # array([1., 2., 3., 4.])

In this example, everything happens as if the scalar 1. had been converted to an array of the same length as vector, that is, array([1.,1.,1.,1.]), and then added to vector.

This example is exceedingly simple, so we'll proceed to show less obvious situations.

Functions of several variables

A more intricate example of broadcasting arises when building functions of several variables. Suppose, for instance, that we were given two functions of one variable, $f$ and $g$, and that we want to construct a new function, $F$, according to the formula:

$$F(x, y) = f(x) + g(y)$$

This is clearly a valid mathematical definition. We would like to express this definition as the sum of two functions in two variables, defined as

$$\bar{f}(x, y) := f(x), \qquad \bar{g}(x, y) := g(y),$$

and now we may simply write:

$$F = \bar{f} + \bar{g}$$
The situation is similar to that arising when adding a column matrix and a row matrix:

C = arange(2).reshape(-1,1) # column
R = arange(2).reshape(1,-1) # row
C + R                       # valid addition: array([[0, 1], [1, 2]])

This is especially useful when sampling functions of two variables, as shown in Section 5.5.3: Typical examples.

General mechanism

We have seen how to add a function and a scalar and how to build a function of two variables from two functions of one variable. Let's now focus on the general mechanism that makes this possible. The general mechanism consists of two steps: reshaping and extending.

First, the function $g$ is reshaped to a function $\tilde{g}$ that takes two arguments. One of these arguments is a dummy argument, which we take to be zero, as a convention:

$$\tilde{g}(0, y) := g(y)$$

Mathematically, the domain of definition of $\tilde{g}$ is now $\{0\} \times \mathbb{R}$. The function $f$ is reshaped in a similar way:

$$\tilde{f}(x, 0) := f(x)$$

Now both $\tilde{f}$ and $\tilde{g}$ take two arguments, although one of them is always zero. We proceed to the next step, extending. It is the same step that converted a constant into a constant function.

The function $\tilde{f}$ is extended to:

$$\bar{f}(x, y) := \tilde{f}(x, 0) = f(x)$$

The function $\tilde{g}$ is extended to:

$$\bar{g}(x, y) := \tilde{g}(0, y) = g(y)$$

Now the function of two variables, $F$, which was sloppily defined by $F(x, y) = f(x) + g(y)$, may be defined without reference to its arguments:

$$F := \bar{f} + \bar{g}$$
For...

Conventions

The last ingredient is the convention for how to add the extra arguments to a function, that is, how the reshaping is automatically performed. By convention, a function is automatically reshaped by adding zeros on the left.

For example, if a function $f$ of two arguments has to be reshaped to a function of three arguments, the new function will be defined by:

$$\tilde{f}(0, x, y) := f(x, y)$$

After having seen a more mathematical motivation for broadcasting, we now show how this applies to NumPy arrays.

5.5.2 Broadcasting arrays

We'll now repeat the observation that arrays are merely functions of several variables (see Section 4.2: Mathematical preliminaries). Array broadcasting thus follows exactly the same procedure as explained above for mathematical functions. Broadcasting is done automatically in NumPy.

In Figure 5.1, we show what happens when adding a matrix of shape (4, 3) to a matrix of shape (1, 3). The resulting matrix has the shape (4, 3):

Figure 5.1: Broadcasting between a matrix and a vector

The broadcasting problem

When NumPy is given two arrays with different shapes and is asked to perform an operation that would require the two shapes to be the same, both arrays are broadcast to a common shape.

Suppose the two arrays have shapes $s_1$ and $s_2$. Broadcasting consists of the two steps:

  1. If the shape $s_1$ is shorter than the shape $s_2$, that is, len(s1) < len(s2), then ones are added on the left of the shape $s_1$ until both shapes have the same length. This is reshaping.
  2. When the shapes have the same length, the first array is extended to match the shape $s_2$ (if possible).

Suppose we want to add a vector of shape (3,) to a matrix of shape (4, 3). The vector needs to be broadcast. The first operation is reshaping; the shape of the vector is converted from (3,) to (1, 3). The second operation is an extension; the shape is converted from (1, 3) to (4, 3).
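
These steps can be checked directly in NumPy (a quick sanity check, not part of the original text):

v = arange(3)    # shape (3,)
M = ones((4, 3)) # shape (4, 3)
(M + v).shape    # (4, 3): v is reshaped to (1, 3), then extended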

For instance, suppose a vector of size n is to be broadcast to the shape (m, n):

  1. v is automatically reshaped to (1, n). This is reshaping.
  2. v is then extended to (m, n).

Shape mismatch

It is not possible to automatically broadcast a vector v of length n to the shape (n,m). This is illustrated in Figure 5.2.

Figure 5.2: Failure of broadcasting due to shape mismatch

The broadcasting fails because the shape (n,) may not be automatically broadcast to the shape (n, m). The solution is to manually reshape v to the shape (n, 1). The broadcasting will now work as usual (by extension only):

M + v.reshape(-1,1)

This is illustrated by the following example.

Define a matrix with:

M = array([[11, 12, 13, 14],
           [21, 22, 23, 24],
           [31, 32, 33, 34]])

and a vector with:

v = array([100, 200, 300])

Now automatic broadcasting will fail because automatic reshaping does not work:

M + v # shape mismatch error

The solution is thus to take care of the reshaping manually. What we want in this case is to add a 1 on the right, that is, to transform the vector into a column matrix. The broadcasting then works by extension, as shown below.
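
With M and v as defined above, the manual reshape gives the intended result:

M + v.reshape(-1,1) # array([[111, 112, 113, 114],
                    #        [221, 222, 223, 224],
                    #        [331, 332, 333, 334]])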

5.5.3 Typical examples

Let's examine some typical examples where broadcasting may come in handy.

Rescale rows

Suppose M is an n × m matrix and we want to multiply each row by a coefficient. The coefficients are stored in a vector coeff with n components, one per row. In that case, automatic reshaping will not work, and we have to execute:

rescaled = M*coeff.reshape(-1,1)
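
For instance (a small, made-up example):

M = array([[1., 2.],
           [3., 4.]])
coeff = array([10., 100.]) # one coefficient per row
M*coeff.reshape(-1,1)      # array([[ 10.,  20.],
                           #        [300., 400.]])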

Rescale columns

The setup is the same here, but we would like to rescale each column with a coefficient stored in a vector coeff of length m. In this case, automatic reshaping will work:

rescaled = M*coeff

Obviously, we may also do the reshaping manually and achieve the same result with:

rescaled = M*coeff.reshape(1,-1)
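
Continuing the small example above, now with the same coeff interpreted as one coefficient per column:

M*coeff # array([[ 10., 200.],
        #        [ 30., 400.]])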

Functions of two variables

Suppose u and v are vectors and we want to form the matrix W with elements W[i,j] = u[i] + v[j]. This would correspond to the function $F(x, y) = x + y$. The matrix W is merely defined by:

W = u.reshape(-1,1) + v

If the vectors u and v are array([1, 2]) and array([1, 2, 3]), respectively, the result is:

array([[2, 3, 4],
       [3, 4, 5]])

More generally, suppose that we want to sample the function $w(x, y) = \cos(x) + \sin(2y)$. Supposing that the vectors x and y are defined, the matrix W of sampled values is obtained with:

W = cos(x).reshape(-1,1) + sin(2*y)

Note that this is very frequently used in combination with ogrid. The vectors obtained from ogrid are already conveniently shaped for broadcasting. This allows for the following elegant sampling of the function $w(x, y) = \cos(x) + \sin(2y)$:

x,y = ogrid[0:1:3j,0:1:3j] 
# x,y are vectors with the contents of linspace(0,1,3)
w = cos(x) + sin(2*y)

The syntax of ogrid needs some explanation: First, ogrid is not a function. It is an instance...

5.6 Sparse matrices

Matrices with a small number of nonzero entries are called sparse matrices. Sparse matrices occur, for example, in scientific computing when describing discrete differential operators in the context of numerically solving partial differential equations.

Sparse matrices often have large dimensions, sometimes so large that the entire matrix (with zero entries) would not even fit in the available memory. This is one motivation for a special datatype for sparse matrices. Another motivation is better performance of operations where zero matrix entries can be avoided.

There are only a very limited number of algorithms for general, unstructured sparse matrices in linear algebra. Most of them are iterative in nature and based on efficient implementations of matrix-vector multiplication for sparse matrices.

Examples of sparse matrices are diagonal or banded matrices. The simple pattern of these matrices allows straightforward storing strategies...

5.6.1 Sparse matrix formats

The module scipy.sparse provides many different storing formats for sparse matrices. Here, we'll describe only the most important ones: CSR, CSC, and LIL. The LIL format should be used for generating and altering sparse matrices; CSR and CSC are efficient formats for matrix-matrix and matrix-vector operations.

Compressed sparse row format (CSR)

The compressed sparse row (CSR) format uses three arrays: data, indptr, and indices:

  • The 1D array data stores all the nonzero values in order. It has as many elements as there are nonzero elements, often denoted by the variable nnz.
  • The 1D array indptr contains integers such that indptr[i] is the index in data of the first nonzero element of row i. If the entire row i is zero, then indptr[i] == indptr[i+1]. If the original matrix has m rows, then len(indptr) == m+1.
  • The 1D array indices contains the column index information in such a way that indices[indptr[i]:indptr[i+1]] is an integer array with the column indexes of the nonzero elements in row i. Obviously, len(indices)==len(data)==nnz.

Let's see an example. Consider the matrix

$$A = \begin{pmatrix} 1 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 \\ 3 & 0 & 0 & 0 \\ 1 & 0 & 0 & 4 \end{pmatrix}$$

(the same matrix is used in the CSC example below). Its CSR format is given by the arrays data, indptr, and indices:
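A sketch in code (the outputs follow from the CSR definitions above):

import scipy.sparse as sp
A = array([[1,0,2,0],[0,0,0,0],[3.,0.,0.,0.],[1.,0.,0.,4.]])
AS = sp.csr_matrix(A)
AS.data    # returns array([1., 2., 3., 1., 4.])
AS.indptr  # returns array([0, 2, 2, 3, 5])
AS.indices # returns array([0, 2, 0, 0, 3])
AS.nnz     # returns 5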

Compressed sparse column format (CSC)

The CSR format has a column-oriented twin – the compressed sparse column (CSC) format. The only difference compared to the CSR format is the definition of the indptr and indices arrays, which are now column-related. The type for the CSC format is csc_matrix and its use corresponds to csr_matrix, explained previously in this subsection.

Continuing the same example in CSC format:

import scipy.sparse as sp
A = array([[1,0,2,0],[0,0,0,0],[3.,0.,0.,0.],[1.,0.,0.,4.]])
AS = sp.csc_matrix(A)
AS.data         # returns array([ 1., 3., 1., 2., 4.]) 
AS.indptr       # returns array([0, 3, 3, 4, 5])
AS.indices      # returns array([0, 2, 3, 0, 3])
AS.nnz          # returns 5

Row-based linked list format (LIL)

The linked list sparse format stores the nonzero matrix entries row-wise in a list data such that data[k] is a list of the nonzero entries in row k. If all entries in that row are 0, it contains an empty list.

A second list, rows, contains at position k a list of column indexes of the nonzero elements in row k. Here is an example in row-based linked list (LIL) format:

import scipy.sparse as sp
A = array([[1,0,2,0],[0,0,0,0], [3.,0.,0.,0.], [1.,0.,0.,4.]]) 
AS = sp.lil_matrix(A)
AS.data     # returns array([[1.0, 2.0], [], [3.0], [1.0, 4.0]], dtype=object)
AS.rows     # returns array([[0, 2], [], [0], [0, 3]], dtype=object)
AS.nnz      # returns 5

Altering and slicing matrices in LIL format

The LIL format is the one best suited for slicing, that is, extracting submatrices in LIL format, and for changing the sparsity pattern by inserting nonzero elements. Slicing is demonstrated by the next example:

BS = AS[1:3,0:2]
BS.data     # returns array([[], [3.0]], dtype=object)
BS.rows     # returns array([[], [0]], dtype=object)

The insertion of a new nonzero element automatically updates the attributes:

AS[0,1] = 17 
AS.data # returns array([[1.0, 17.0, 2.0], [], [3.0], [1.0, 4.0]])
AS.rows              # returns array([[0, 1, 2], [], [0], [0, 3]])
AS.nnz               # returns 6

These operations are discouraged in the other sparse matrix formats as they are extremely inefficient.

5.6.2 Generating sparse matrices

The NumPy commands eye, identity, diag, and rand have their sparse counterparts. They take an additional argument that specifies the sparse matrix format of the resulting matrix.

The following commands generate the identity matrix but in different sparse matrix formats:

import scipy.sparse as sp
sp.eye(20, 20, format='lil')
sp.spdiags(ones((20,)), 0, 20, 20, format='csr')
sp.identity(20, format='csc')

The command sp.rand takes an additional argument describing the density of the generated random matrix. A dense matrix has density 1 while a zero matrix has density 0:

import scipy.sparse as sp 
AS=sp.rand(20,200,density=0.1,format='csr')
AS.nnz # returns 400

There is no direct correspondence to the NumPy command zeros. Matrices completely filled with zeros are generated by instantiating the corresponding type with the shape parameters as constructor parameters, as shown below.
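
An all-zero matrix in a given sparse format is created like this (a minimal sketch):

import scipy.sparse as sp
Z = sp.csr_matrix((20, 20)) # an all-zero 20 x 20 matrix in CSR format
Z.nnz                       # returns 0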

5.6.3 Sparse matrix methods

There are methods to convert one sparse type into another or into an array:

AS.toarray() # converts a sparse format to a dense NumPy array
AS.tocsr()   # converts to CSR format
AS.tocsc()   # converts to CSC format
AS.tolil()   # converts to LIL format

The type of a sparse matrix can be inspected with the functions issparse, isspmatrix_lil, isspmatrix_csr, and isspmatrix_csc from scipy.sparse.
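
For instance (continuing with the matrix AS from the examples above):

import scipy.sparse as sp
B = AS.tocsc()       # convert AS to CSC format
sp.isspmatrix_csc(B) # True
sp.issparse(B)       # True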

Elementwise operations +, *, /, and ** on sparse matrices are defined as for NumPy arrays. Regardless of the sparse matrix format of the operands, the result is always a csr_matrix. Applying elementwise functions to sparse matrices requires first transforming them to either CSR or CSC format and applying the functions to their data attribute, as demonstrated by the next example.

The elementwise sine of a sparse matrix can be defined by an operation on its data attribute:

import scipy.sparse as sp
def sparse_sin(A):
    if not (sp.isspmatrix_csr(A) or sp.isspmatrix_csc(A)):
        A = A.tocsr()
    A.data = sin(A.data) # apply sin to the nonzero entries only
    return A

5.7 Summary

The concept of views is one of the important topics you should have learned from this chapter. Missing this topic will give you a hard time when debugging your code. Boolean arrays occur at various places throughout this book. They are handy and compact tools for avoiding lengthy if constructions and loops when working with arrays. In nearly all large computational projects, sparse matrices become an issue. You have seen how they are handled and which related methods are available.
