Linear Algebra - Arrays

Linear algebra is one of the essential building blocks of computational mathematics. The objects of linear algebra are vectors and matrices. The package NumPy includes all the necessary tools to manipulate those objects.

The first task is to build matrices and vectors or to alter existing ones by slicing. The other main task is the dot operation, which embodies most linear algebra operations (scalar product, matrix-vector product, and matrix-matrix product). Finally, various methods are available to solve linear problems.

The following topics will be covered in this chapter:

Overview of the array type
Mathematical preliminaries
The array type
Accessing array entries
Functions to construct arrays
Accessing and changing the shape
Stacking
Functions acting on arrays
Linear algebra methods...

4.1 Overview of the array type

For the impatient, here is how to use arrays in a nutshell. Be aware though that the behavior of arrays may be surprising at first, so we encourage you to read on after this introductory section.

Note again, the presentation in this chapter assumes like everywhere else in this book that you have the module NumPy imported:

from numpy import *

By importing NumPy, we give access to the datatype ndarray, which we'll describe in the next sections.

4.1.1 Vectors and matrices

Creating vectors is as simple as using the function array to convert a list into an array:

v = array([1.,2.,3.])

The object v is now a vector that behaves much like a vector in linear algebra. We have already emphasized the differences with the list object in Python in Section 3.2: A quick glance at the concept of arrays.

Here are some illustrations of the basic linear algebra operations on vectors:

# two vectors with three components
v1 = array([1., 2., 3.])
v2 = array([2, 0, 1.])

# scalar multiplications/divisions
2*v1 # array([2., 4., 6.])
v1/2 # array([0.5, 1., 1.5])

# linear combinations
3*v1 # array([ 3., 6., 9.])
3*v1 + 2*v2 # array([ 7., 6., 11.])

# norm
from numpy.linalg import norm
norm(v1) # 3.7416573867739413
# scalar product
dot(v1, v2) # 5.0
v1 @ v2 # 5.0 ; alternative formulation

Note that all basic arithmetic operations are performed elementwise:

# elementwise operations:
v1 * v2 # array([2., 0...

4.1.2 Indexing and slices

Indexing and slicing are similar to the corresponding operations for lists. The main difference is that there may be several indexes or slices when the array is a matrix. The subject will be covered in depth in Section 4.4.1: Basic array slicing; here, we just give some illustrative examples of indexing and slicing:

v = array([1., 2., 3])
M = array([[1., 2],[3., 4]])

v[0] # works as for lists
v[1:] # array([2., 3.])

M[0, 0] # 1.
M[1:] # returns the matrix array([[3., 4]])
M[1] # returns the vector array([3., 4.])

# access
v[0] # 1.
v[0] = 10

# slices
v[:2] # array([10., 2.])
v[:2] = [0, 1] # now v == array([0., 1., 3.])
v[:2] = [1, 2, 3] # error!

As arrays are the basic datatype for all tasks in computational linear algebra, we now present in this overview section some examples, the dot product and the solution of linear equation systems.

4.1.3 Linear algebra operations

The essential operator that performs most of the usual operations of linear algebra is the Python function dot. It is used for matrix-vector multiplications (see Section 4.2.4: The dot operations for more details):

dot(M, v) # matrix vector multiplication; returns a vector
M @ v # alternative formulation

It may be used to compute a scalar product between two vectors:

dot(v, w) 
# scalar product; the result is a scalar
v @ w # alternative formulation

Lastly, it is used to compute matrix-matrix products:

dot(M, N) # results in a matrix
M @ N # alternative formulation

Solving a linear system

If is a matrix and is a vector, you can solve the linear equation system

by using the function solve from the linear algebra submodule numpy.linalg:

from numpy.linalg import solve
x = solve(A, b)

For example, to solve

the following Python statements are executed:

from numpy.linalg import solve
A = array([[1., 2.], [3., 4.]])
b = array([1., 4.])
x = solve(A, b)
allclose(dot(A, x), b) # True
allclose(A @ x, b) # alternative formulation

The command allclose is used here to compare two vectors. If they are close enough to each other, this command returns True. Optionally a tolerance value can be set. For more methods related to linear equation systems, see Section 4.9: Linear algebra methods in SciPy.

Now, you have seen the first and essential way of how to use arrays in Python. In the following sections, we'll show you more details and the underlying principles.

4.2 Mathematical preliminaries

In order to understand how arrays work in NumPy, it is useful to understand the mathematical parallel between accessing tensor (matrix and vector) elements by indexes and evaluating mathematical functions by providing arguments. We also cover in this section the generalization of the dot product as a reduction operator.

4.2.1 Arrays as functions

Arrays may be considered from several different points of view. If you want to approach the concept from a mathematical point of view, you might benefit from understanding arrays through an analogy of functions of several variables. This view will later be taken again, when explaining the concept of broadcasting in Section 5.5: Broadcasting.

For instance, selecting a component of a given vector in may just be considered a function from the set of to , where we define the set:

Here the set has n elements. The Python function range generates .

Selecting an element of a given matrix, on the other hand, is a function of two parameters, taking its value in . Picking a particular element of an matrix may thus be considered a function from to .

4.2.2 Operations are elementwise

NumPy arrays are essentially treated as mathematical functions. This is in particular true for operations. Consider two functions, and , defined on the same domain and taking real values. The product of those two functions is defined as the pointwise product, that is,

Note that this construction is possible for any operation between two functions. For an arbitrary operation defined on two scalars, which we denote here by , we could define as follows:

This innocuous remark allows us to understand NumPy's stance on operations; all operations are elementwise in arrays. For instance, the product between two matrices, and , is defined, as with functions, as follows:

4.2.3 Shape and number of dimensions

There is a clear distinction between a:

Scalar: A function with no arguments
Vector: A function with one argument
Matrix: A function with two arguments
Higher-order tensor: A function with more than two arguments

In what follows, the number of dimensions is the number of arguments of a function. The shape corresponds essentially to the domain of a function.

For instance, a vector of size n is a function from the set to . As a result, its domain is . Its shape is defined as the singleton (n,). Similarly, a matrix of size is a function defined on . The corresponding shape is simply the pair (m, n). The shape of an array is obtained by the function numpy.shape, and the number of dimensions by the function numpy.ndim; see also Section 4.6: Accessing and changing the shape.

4.2.4 The dot operations

Treating arrays as functions, although very powerful, completely neglects the linear algebra structures we are familiar with, that is, matrix-vector and matrix-matrix operations. Fortunately, these linear algebra operations may all be written in a similar unified form:

The vector-vector operation:

The matrix-vector operation:

The matrix-matrix operation:

The vector-matrix operation:

The essential mathematical concept is that of reduction. For a matrix-vector operation, the reduction is given by:

In general, a reduction operation defined between two tensors, and , of respective number of dimensions and may be defined as:

Clearly, the shapes of the tensors must be compatible with that operation to make
any sense. This requirement is familiar for matrix-matrix multiplication. The multiplication
of matrices and only makes sense if the number of columns of equals the number of...

4.3 The array type

The objects used to manipulate vectors, matrices, and more general tensors in NumPy are called ndarrays, or just arrays for short. In this section, we examine their essential properties, how to create them, and how to access their information.

4.3.1 Array properties

Arrays are essentially characterized by the three properties, described in Table 4.2:

Name	Description
`shape`	This describes how the data should be interpreted, as a vector, a matrix, or a higher-order tensor, and it gives the corresponding dimension. It is accessed with the attribute `shape`.
`dtype`	This gives the type of the underlying data (float, complex, integer, and so on).
`strides`	This attribute specifies in which order the data should be read. For instance, a matrix could be stored in memory contiguously column by column (FORTRAN convention), or row by row (C convention). The attribute is a tuple with the numbers of bytes that have to be skipped in memory to reach the next row and the number of bytes to be skipped to reach the next column. It even allows for a more flexible interpretation of the data in memory, which is what makes array views possible.

Table 4.2: The three...

4.3.2 Creating arrays from lists

The general way to create an array is by using the function array. The syntax to create a real-valued vector would be:

V = array([1., 2., 1.], dtype=float)

To create a complex vector with the same data, you use:

V = array([1., 2., 1.], dtype=complex)

When no type is specified, the type is guessed. The array function chooses the type that allows storing all the specified values:

V = array([1, 2]) # [1, 2] is a list of integers
V.dtype # int64
V = array([1., 2]) # [1., 2] mix float/integer
V.dtype # float64
V = array([1. + 0j, 2.]) # mix float/complex
V.dtype # complex128

NumPy silently casts floats into integers, which might give unexpected results:

a = array([1, 2, 3])
a[0] = 0.5
a # now: array([0, 2, 3])

The same, often unexpected array type casting happens from complex to float.

Array and Python parentheses

As we noticed in Section 1.2.2: Line joining, Python allows a line break when some opening brace or parenthesis is not closed. This allows a convenient syntax for array creation, which makes it more pleasing to the human eye:

# the identity matrix in 2D
Id = array([[1., 0.], [0., 1.]])
# Python allows this:
Id = array([[1., 0.],
            [0., 1.]])
# which is more readable

So far, you saw a lot of differences in the definition and use between arrays and lists. Accessing array elements, in contrast, seems quite similar to the way list elements are accessed. But especially the use of multiple indexes and the resulting objects from the slicing operations require that we look at these issues in more detail.

4.4 Accessing array entries

Array entries are accessed by indexes. In contrast to vector coefficients, two indexes are needed to access matrix coefficients. These are given in one pair of brackets. This distinguishes the array syntax from a list of lists. There, two pairs of brackets are needed to access elements.

M = array([[1., 2.],[3., 4.]])
M[0, 0] # first row, first column: 1.0
M[-1, 0] # last row, first column: 3.0

Let's look now in more detail at the use of double indexes and slices.

4.4.1 Basic array slicing

Slices are similar to those of lists (see also Section 3.1.1: Slicing) except that they might now be in more than one dimension:

M[i,:] is a vector filled by the row of .
M[:,j] is a vector filled by the column of .
M[2:4,:] is a slice of 2:4 on the rows only.
M[2:4,1:4] is a slice of rows and columns.

The result of matrix slicing is given in the following Figure 4.1:

Figure 4.1: The result of matrix slicing

If you omit an index or a slice, NumPy assumes you are taking rows only. M[3] is a vector that is a view on the third row of M and M[1:3] is a matrix that is a view on the second and third rows of .

Changing the elements of a slice affects the entire array (see also Section 5.1: Array views and copies):

v = array([1., 2., 3.])
v1 = v[:2] # v1 is array([1., 2.])
v1[0] = 0. # if v1 is changed ...
v # ... v is changed too: array([0., 2., 3.])

General slicing rules...

4.4.2 Altering an array using slices

You may alter an array using slices or by direct access. The following changes only one element in a matrix :

M[1, 2] = 2.0 # scalar

Also, we may change one full row of the matrix:

M[2, :] = [1., 2., 3.] # vector

Similarly, we may also replace a full submatrix:

M[1:3, :] = array([[1., 2., 3.],[-1.,-2., -3.]])

There is a distinction between a column matrix and a vector. The following assignment with a column matrix returns no error:

M[1:4, 1:2] = array([[1.],[0.],[-1.0]])

while the assignment with a vector returns a ValueError:

M[1:4, 1:2] = array([1., 0., -1.0]) #  error

The general slicing rules are shown in Table 4.3. The matrices and vectors in the preceding examples must have the right size to fit into matrix . You may also make use of the broadcasting rules (see Section 5.5: Broadcasting) to determine the allowed size of the replacement arrays. If the replacement array does not have...

4.5 Functions to construct arrays

The usual way to set up an array is via a list. But there are also a couple of convenient methods for generating special arrays, which are given in Table 4.5:

Methods	Shape	Generates
`zeros((n,m))`	(n,m)	Matrix filled with zeros
`ones((n,m))`	(n,m)	Matrix filled with ones
`full((n,m),q)`	(n,m)	Matrix filled with
`diag(v,k)`	(n,n)	(Sub-, super-) diagonal matrix from a vector
`random.rand(n,m)`	(n,m)	Matrix filled with uniformly distributed random numbers in `(0,1)`
`arange(n)`	(n,)	First integers
`linspace(a,b,n)`	(n,)	Vector with equispaced points between and

Table 4.5: Commands to create arrays

These commands may take additional arguments. In particular, the commands zeros, ones, full, and arange take dtype as an optional argument. The default type is float, except for arange. There are also methods...

4.6 Accessing and changing the shape

The number of dimensions is what distinguishes a vector from a matrix. The shape is what distinguishes vectors of different sizes, or matrices of different sizes. In this section, we examine how to obtain and change the shape of an array.

4.6.1 The function shape

The shape of a matrix is the tuple of its dimensions. The shape of an matrix is the tuple (n, m). It can be obtained by the function shape:

M = identity(3)
shape(M) # (3, 3)

or, simply by its attribute

M.shape  # (3, 3)

However, the advantage of using shape as a function and not as an attribute is that the function may be used on scalars and lists as well. This may come in handy when code is supposed to work with both scalars and arrays:

shape(1.) # ()
shape([1,2]) # (2,)
shape([[1,2]]) # (1,2)

For a vector, the shape is a singleton containing the length of that vector:

v = array([1., 2., 1., 4.])
shape(v) # (4,) <- singleton (1-tuple)

4.6.2 Number of dimensions

The number of dimensions of an array is obtained with the function ndim or using the array attribute ndim:

ndim(A) # 2
A.ndim # 2

Note that the number of dimensions, given by the function ndim, of a tensor T (a vector, matrix, or higher-order tensor) is always equal to the length of its shape:

T = zeros((2,2,3)) # tensor of shape (2,2,3); three dimensions
ndim(T) # 3
len(shape(T)) # 3

4.6.3 Reshape

The method reshape gives a new view of the array, with a new shape, without copying the data:

v = array([0,1,2,3,4,5])
M = v.reshape(2,3)
shape(M) # returns (2,3)
M[0,0] = 10 # now v[0] is 10

The various effects of reshape on an array defined by arange(6) are given in Figure 4.2:

Figure 4.2: The various effects of reshape on an array

reshape does not create a new array. It rather gives a new view on the existing array. In the preceding example, changing one element of M would automatically result in a change in the corresponding element in v. When this behavior is not acceptable, you need to copy the data, as explained in Section 5.1: Array views and copies.

If you try to reshape an array with a shape that does not multiply to the original shape, an error is raised:

 ValueError: total size of new array must be unchanged.

Sometimes, it is convenient to specify only one shape parameter and let...

Transpose

A special form of reshaping is transposing. It just switches the two shape elements of the matrix. The transpose of a matrix is a matrix such that

which is resolved in the following way:

A = ...
shape(A) # (3,4)

B = A.T  # A transpose
shape(B) # (4,3)

transpose does not copy: transposition is very similar to reshaping. In particular, it does not copy the data either and just returns a view on the same array:

A= array([[ 1., 2.],[ 3., 4.]]) 
B=A.T 
A[1,1]=5. 
B[1,1] # 5.0

Transposing a vector makes no sense since vectors are tensors of one dimension, that is, functions of one variable – the index. NumPy will, however, comply and return exactly the same object:

v = array([1., 2., 3.])
v.T # exactly the same vector!

What you have in mind when you want to transpose a vector is probably to create a row or column matrix. This is done using reshape:

v.reshape(-1, 1) # column matrix containing v
v.reshape(1, -1) # row matrix...

4.7 Stacking

The universal method to build matrices from a couple of (matching) submatrices is concatenate. Its syntax is:

concatenate((a1, a2, ...), axis = 0)

This command stacks the submatrices vertically (on top of each other) when axis=0 is specified. With the argument axis=1, they are stacked horizontally, and this generalizes according to arrays with more dimensions. This function is called by several convenient functions, as follows:

hstack: Used to stack arrays horizontally
vstack: Used to stack arrays vertically
columnstack: Used to stack vectors in columns

4.7.1 Stacking vectors

You may stack vectors row-wise or column-wise using vstack and column_stack, as illustrated in Figure 4.3:

Figure 4.3: Difference between vstack and column_stack

Note that hstack would produce the concatenation of v1 and v2.

Let's consider the symplectic permutation as an example for vector stacking: we have a vector of size . We want to perform a symplectic transformation of a vector with an even number of components, that is, exchange the first half with the second half of the vector with sign change:

This operation is resolved in Python as follows:

# v is supposed to have an even length.
def symp(v):
    n = len(v) // 2 # use the integer division //
    return hstack([v[-n:], -v[:n]])

4.8 Functions acting on arrays

There are different types of functions acting on arrays. Some act elementwise, and they return an array of the same shape. Those are called universal functions. Other array functions return an array of a different shape. In this section, we will meet both types of functions and also learn how to convert scalar functions into universal functions.

4.8.1 Universal functions

Universal functions are functions that act elementwise on arrays. They thus have an output array that has the same shape as the input array. These functions allow us to compute the result of a scalar function on a whole array at once.

Built-in universal functions

A typical example is the cos function (the one provided by NumPy):

cos(pi) # -1
cos(array([[0, pi/2, pi]])) # array([[1, 0, -1]])

Note that universal functions work on arrays in a componentwise manner. This is also true for operators, such as multiplication or exponent:

2 * array([2, 4]) # array([4, 8])
array([1, 2]) * array([1, 8]) # array([1, 16])
array([1, 2])**2 # array([1, 4])
2**array([1, 2]) # array([2, 4])
array([1, 2])**array([1, 2]) # array([1, 4])

Creation of universal functions

Your function will automatically be universal if you use only universal functions in it. If, however, your function uses functions that are not universal, you might get scalar results, or even an error when trying to apply them on an array:

def const(x):
    return 1
const(array([0, 2])) # returns 1 instead of array([1, 1])

Another example is the following:

def heaviside(x):
    if x >= 0:
        return 1.
    else: 
        return 0.
 
heaviside(array([-1, 2])) # error

The expected behavior would be that the heaviside function applied to a vector [a, b] would return [heaviside(a), heaviside(b)]. Alas, this does not work because the function always returns a scalar, no matter the size of the input argument. Besides, using the function with an array input would cause the statement if to raise an exception, as is explained in detail in Section 5.2.1: Boolean arrays.

The NumPy function vectorize...

4.8.2 Array functions

There are a number of functions acting on arrays that do not act componentwise. Examples of such functions are max, min, and sum. These functions may operate on the entire matrix, row-wise, or column-wise. When no argument is provided, they act on the entire matrix.

Suppose:

The function sum acting on that matrix returns a scalar:

sum(A) # 36

The command has an optional parameter, axis. It allows us to choose along which axis to perform the operation. For instance, if the axis is , it means that the sum should be computed along the first axis. The sum along axis of an array of shape will be a vector of length .

Suppose we compute the sum of A along the axis :

sum(A, axis=0) # array([ 6, 8, 10, 12])

This amounts to computing the sum on the columns:

The result is a vector:

Now suppose we compute the sum along axis 1:

A.sum(axis=1) # array([10, 26])

This amounts to computing...

4.9 Linear algebra methods in SciPy

SciPy offers a large range of methods from numerical linear algebra in its module scipy.linalg. Many of these methods are Python wrapping programs from LAPACK, a collection of well-approved FORTRAN subroutines used to solve linear equation systems and eigenvalue problems, see [5]. Linear algebra methods are the core of any method in scientific computing, and the fact that SciPy uses wrappers instead of pure Python code makes these central methods extremely fast. We present in detail here how two linear algebra problems are solved with Scipy to give you a flavor of this module.

You met before some linear algebra functions taken from the module numpy.linalg. Both packages NumPy and SciPy are compatible, but Scipy has its focus on scientific computing methods and is more comprehensive, while NumPy's focus is on the array datatype and it provides only some linear algebra methods for convenience.

4.9.1 Solving several linear equation systems with LU

Let be an matrix and be a sequence of vectors. We consider the problem to find vectors such that:

We assume that the vectors are not known simultaneously. In particular, it is quite a common situation that the ^th problem has to be solved before becomes available, for example in the context of the simplified Newton iteration, see [24].

factorization is a way to organize the classical Gauss elimination method in such a way that the computation is done in two steps:

A factorization step of the matrix to get matrices in triangular form
A relatively cheap backward and forward elimination step that works on the instances of and benefits from the more time-consuming factorization step

The method also uses the fact that if is a permutation matrix such that is the original matrix with its rows permuted, the two systems and have the same...

4.9.2 Solving a least square problem with SVD

A linear equation system , with being an matrix and , is called an overdetermined linear system. In general, it has no classical solution and you seek a vector with the property:

Here, denotes the Euclidean vector norm .

This problem is called a least square problem. A stable method to solve it is based on factorizing , with being an orthogonal matrix, an orthogonal matrix, and an matrix with the property for all . This factorization is called a singular value decomposition (SVD).

We write

with a diagonal matrix . If we assume that has full rank, then is invertible and it can be shown that

holds.

If we split with being an submatrix, then the preceding equation can be simplified to:

SciPy provides a function called svd, which we use to solve this task:

import scipy.linalg...

4.9.3 More methods

In the examples so far, you met a couple of methods for computational tasks in linear algebra, for example, solve. More methods are available after the command import scipy.linalg as sl is executed. The most common of them are listed in Table 4.6:

Methods	Description
`sl.det`	Determinant of a matrix
`sl.eig`	Eigenvalues and eigenvectors of a matrix
`sl.inv`	Matrix inverse
`sl.pinv`	Matrix pseudoinverse
`sl.norm`	Matrix or vector norm
`sl.svd`	Singular value decomposition
`sl.lu`	LU decomposition
`sl.qr`	QR decomposition
`sl.cholesky`	Cholesky decomposition
`sl.solve`	Solution of a general or symmetric linear system: Ax = b
`sl.solve.banded`	The same for banded matrices
`sl.lstsq`	Least squares solution

Table 4.6: Linear algebra functions of the module scipy.linalg

Execute import scipy.linalg as sl first.

4.10 Summary

In this chapter, we worked with the most important objects in linear algebra – vectors and matrices. For this, we learned how to define arrays and we met important array methods. A smaller section demonstrated how to use modules from scipy.linalg to solve central tasks in linear algebra.

In the following chapter, we consider more advanced and special aspects of arrays.

4.11 Exercises

Ex. 1: Consider a matrix:

Construct this matrix in Python using the function array.
Construct the same matrix using the function arange followed by a suitable reshape.
What is the result of the expression M[2,:]? What is the result of the similar expression M[2:]?

Ex. 2: Given a vector x, construct in Python the following matrix:

Here, are the components of the vector (numbered from zero). Given a vector , solve in Python the linear equation system . Let the components of be denoted by . Write a function poly, which has and as input and computes the polynomial:

Plot this polynomial and depict in the same plot the points as small stars. Try your code with the vectors:

Ex. 3: The matrix in Ex. 2 is called a Vandermonde matrix. It can be set up in Python directly with the command vander. Evaluating a polynomial defined by a coefficient vector can be done with the...