# Move Further with NumPy Modules

## Ivan Idris

May 2013


# Linear algebra

Linear algebra is an important branch of mathematics. The numpy.linalg package contains linear algebra functions. With this module, you can invert matrices, calculate eigenvalues, solve linear equations, and determine determinants, among other things.

# Time for action – inverting matrices

The inverse of a matrix A in linear algebra is the matrix A^-1, which, when multiplied by the original matrix, yields the identity matrix I. This can be written as A * A^-1 = I.

The inv function in the numpy.linalg package can do this for us. Let's invert an example matrix. To invert matrices, perform the following steps:

1. We will create the example matrix with the mat function.

```python
import numpy as np

A = np.mat("0 1 2;1 0 3;4 -3 8")
print("A\n", A)
```

The A matrix is printed as follows:

```
A
[[ 0  1  2]
 [ 1  0  3]
 [ 4 -3  8]]
```
2. Now we can see the inv function in action and use it to invert the matrix.

```python
inverse = np.linalg.inv(A)
print("inverse of A\n", inverse)
```

The inverse matrix is shown as follows:

```
inverse of A
[[-4.5  7.  -1.5]
 [-2.   4.  -1. ]
 [ 1.5 -2.   0.5]]
```

If the matrix is singular or not square, a LinAlgError exception is raised. If you want, you can check the result manually. This is left as an exercise for the reader.

3. Let's check what we get when we multiply the original matrix with the result of the inv function:

```python
print("Check\n", A * inverse)
```

The result is the identity matrix, as expected.

```
Check
[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]]
```

## What just happened?

We calculated the inverse of a matrix with the inv function of the numpy.linalg package. We checked, with matrix multiplication, whether this is indeed the inverse matrix.

```python
import numpy as np

A = np.mat("0 1 2;1 0 3;4 -3 8")
print("A\n", A)
inverse = np.linalg.inv(A)
print("inverse of A\n", inverse)
print("Check\n", A * inverse)
```
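The manual check left as an exercise earlier is easy to automate. The following sketch, which assumes nothing beyond NumPy itself, compares the product with the identity matrix using allclose, and also shows the LinAlgError raised for a singular matrix:

```python
import numpy as np

A = np.mat("0 1 2;1 0 3;4 -3 8")
# The product of a matrix and its inverse should be numerically
# indistinguishable from the identity matrix.
print(np.allclose(A * np.linalg.inv(A), np.eye(3)))  # True

# A singular matrix (here, the second row is twice the first)
# has no inverse, so inv raises LinAlgError.
try:
    np.linalg.inv(np.mat("1 2;2 4"))
except np.linalg.LinAlgError as err:
    print("LinAlgError:", err)
```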

# Solving linear systems

A matrix transforms a vector into another vector in a linear way. This transformation mathematically corresponds to a system of linear equations. The numpy.linalg function solve solves systems of linear equations of the form Ax = b; here, A is a matrix, b can be a one-dimensional or two-dimensional array, and x is the unknown variable. We will also see the dot function in action; it returns the dot product of two floating-point arrays.
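As a quick sketch of the idea (the matrix and vector below are just illustrative values), solve computes x directly from a factorization of A, which is both faster and numerically safer than forming the inverse explicitly:

```python
import numpy as np

A = np.array([[1., -2., 1.],
              [0., 2., -8.],
              [-4., 5., 9.]])
b = np.array([0., 8., -9.])

# solve factorizes A internally; it never forms inv(A).
x = np.linalg.solve(A, b)

# Mathematically equivalent, but slower and less accurate in general.
x_via_inverse = np.linalg.inv(A).dot(b)

print(np.allclose(x, x_via_inverse))   # True
print(np.allclose(np.dot(A, x), b))    # True
```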

# Time for action – solving a linear system

Let's solve an example of a linear system. To solve a linear system, perform the following steps:

1. Let's create the matrices A and b.

```python
import numpy as np

A = np.mat("1 -2 1;0 2 -8;-4 5 9")
print("A\n", A)
b = np.array([0, 8, -9])
print("b\n", b)
```

The matrices A and b are printed to the console.

2. Solve this linear system by calling the solve function.

```python
x = np.linalg.solve(A, b)
print("Solution", x)
```

The following is the solution of the linear system:

```
Solution [ 29.  16.   3.]
```
3. Check whether the solution is correct with the dot function.

```python
print("Check\n", np.dot(A, x))
```

The result is as expected:

```
Check
[[ 0.  8. -9.]]
```

## What just happened?

We solved a linear system using the solve function from the NumPy linalg module and checked the solution with the dot function.

```python
import numpy as np

A = np.mat("1 -2 1;0 2 -8;-4 5 9")
print("A\n", A)
b = np.array([0, 8, -9])
print("b\n", b)
x = np.linalg.solve(A, b)
print("Solution", x)
print("Check\n", np.dot(A, x))
```

# Finding eigenvalues and eigenvectors

Eigenvalues are scalar solutions to the equation Ax = ax, where A is a two-dimensional matrix and x is a one-dimensional vector. Eigenvectors are vectors corresponding to eigenvalues. The eigvals function in the numpy.linalg package calculates eigenvalues. The eig function returns a tuple containing eigenvalues and eigenvectors.
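For a 2 x 2 matrix, the eigenvalues can also be found by hand as the roots of the characteristic polynomial det(A - lambda*I) = lambda^2 - trace(A)*lambda + det(A). The sketch below (the matrix values are chosen for illustration and match the example that follows) confirms that eigvals agrees with this:

```python
import numpy as np

A = np.array([[3., -2.],
              [1., 0.]])

# Characteristic polynomial of a 2x2 matrix:
# lambda^2 - trace(A)*lambda + det(A)
coeffs = [1.0, -np.trace(A), np.linalg.det(A)]
roots = np.sort(np.roots(coeffs))

print(roots)                           # roots of the polynomial
print(np.sort(np.linalg.eigvals(A)))   # should match
```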

# Time for action – determining eigenvalues and eigenvectors

Let's calculate the eigenvalues of a matrix. Perform the following steps to do so:

1. Create a matrix as follows:

```python
import numpy as np

A = np.mat("3 -2;1 0")
print("A\n", A)
```

The matrix we created looks like the following:

```
A
[[ 3 -2]
 [ 1  0]]
```
2. Calculate the eigenvalues by calling the eigvals function.

```python
print("Eigenvalues", np.linalg.eigvals(A))
```

The eigenvalues of the matrix are as follows:

```
Eigenvalues [ 2.  1.]
```
3. Determine the eigenvalues and eigenvectors with the eig function. This function returns a tuple, where the first element contains the eigenvalues and the second element contains the corresponding eigenvectors, arranged column-wise.

```python
eigenvalues, eigenvectors = np.linalg.eig(A)
print("First tuple of eig", eigenvalues)
print("Second tuple of eig\n", eigenvectors)
```

The eigenvalues and eigenvectors will be shown as follows:

```
First tuple of eig [ 2.  1.]
Second tuple of eig
[[ 0.89442719  0.70710678]
 [ 0.4472136   0.70710678]]
```
4. Check the result with the dot function by calculating the right- and left-hand sides of the eigenvalues equation Ax = ax.

```python
for i in range(len(eigenvalues)):
    print("Left", np.dot(A, eigenvectors[:, i]))
    print("Right", eigenvalues[i] * eigenvectors[:, i])
    print()
```

The output is as follows:

```
Left [[ 1.78885438]
 [ 0.89442719]]
Right [[ 1.78885438]
 [ 0.89442719]]

Left [[ 0.70710678]
 [ 0.70710678]]
Right [[ 0.70710678]
 [ 0.70710678]]
```

## What just happened?

We found the eigenvalues and eigenvectors of a matrix with the eigvals and eig functions of the numpy.linalg module. We checked the result using the dot function.

```python
import numpy as np

A = np.mat("3 -2;1 0")
print("A\n", A)
print("Eigenvalues", np.linalg.eigvals(A))
eigenvalues, eigenvectors = np.linalg.eig(A)
print("First tuple of eig", eigenvalues)
print("Second tuple of eig\n", eigenvectors)
for i in range(len(eigenvalues)):
    print("Left", np.dot(A, eigenvectors[:, i]))
    print("Right", eigenvalues[i] * eigenvectors[:, i])
    print()
```

# Singular value decomposition

Singular value decomposition is a type of factorization that decomposes a matrix into a product of three matrices. It is a generalization of the eigenvalue decomposition discussed previously. The svd function in the numpy.linalg package can perform this decomposition. This function returns three matrices, U, Sigma, and V, such that A = U * Sigma * V*, where U and V are orthogonal and Sigma contains the singular values of the input matrix.

The asterisk denotes the Hermitian conjugate, or conjugate transpose; for a real matrix, this is simply the transpose. Note that NumPy's svd already returns the conjugate transpose as its third value, so the three returned matrices can be multiplied together directly.
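To make the connection with the previous section concrete: the singular values of A are the square roots of the eigenvalues of A multiplied by its transpose. A minimal sketch, using the example matrix from the steps below:

```python
import numpy as np

A = np.array([[4., 11., 14.],
              [8., 7., -2.]])

# Singular values, in descending order.
singular_values = np.linalg.svd(A, compute_uv=False)

# Square roots of the eigenvalues of A @ A.T; eigvalsh returns the
# eigenvalues of a symmetric matrix in ascending order, so reverse.
from_eigenvalues = np.sqrt(np.linalg.eigvalsh(A @ A.T))[::-1]

print(np.allclose(singular_values, from_eigenvalues))  # True
```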

# Time for action – decomposing a matrix

It's time to decompose a matrix with the singular value decomposition. In order to decompose a matrix, perform the following steps:

1. First, create a matrix as follows:

```python
import numpy as np

A = np.mat("4 11 14;8 7 -2")
print("A\n", A)
```

The matrix we created looks like the following:

```
A
[[ 4 11 14]
 [ 8  7 -2]]
```
2. Decompose the matrix with the svd function.

```python
U, Sigma, V = np.linalg.svd(A, full_matrices=False)
print("U")
print(U)
print("Sigma")
print(Sigma)
print("V")
print(V)
```

The result is a tuple containing the two orthogonal matrices U and V, on the left- and right-hand sides, and the singular values of the middle matrix:

```
U
[[-0.9486833  -0.31622777]
 [-0.31622777  0.9486833 ]]
Sigma
[ 18.97366596   9.48683298]
V
[[-0.33333333 -0.66666667 -0.66666667]
 [ 0.66666667  0.33333333 -0.66666667]]
```
3. We do not actually have the middle matrix; we only have its diagonal values, and the off-diagonal values are all 0. We can form the middle matrix with the diag function and multiply the three matrices, as follows:

```python
print("Product\n", U * np.diag(Sigma) * V)
```

The product of the three matrices looks like the following:

```
Product
[[  4.  11.  14.]
 [  8.   7.  -2.]]
```

## What just happened?

We decomposed a matrix and checked the result by matrix multiplication. We used the svd function from the NumPy linalg module.

```python
import numpy as np

A = np.mat("4 11 14;8 7 -2")
print("A\n", A)
U, Sigma, V = np.linalg.svd(A, full_matrices=False)
print("U")
print(U)
print("Sigma")
print(Sigma)
print("V")
print(V)
print("Product\n", U * np.diag(Sigma) * V)
```

# Pseudoinverse

The Moore-Penrose pseudoinverse of a matrix can be computed with the pinv function of the numpy.linalg module (visit http://en.wikipedia.org/wiki/Moore%E2%80%93Penrose_pseudoinverse). The pseudoinverse is calculated using the singular value decomposition. The inv function only accepts square matrices; the pinv function does not have this restriction.
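A typical use of the pseudoinverse is the least-squares solution of an over-determined system, where inv would fail because the matrix is not square. A minimal sketch (the data points below are made up and happen to lie exactly on a line):

```python
import numpy as np

# Fit a line y = c0 + c1 * t through three points using the
# pseudoinverse; A is 3x2, so inv(A) would raise LinAlgError.
t = np.array([1., 2., 3.])
y = np.array([6., 9., 12.])
A = np.column_stack([np.ones_like(t), t])

coefficients = np.linalg.pinv(A) @ y
print(coefficients)  # intercept and slope of the fitted line
```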

# Time for action – computing the pseudoinverse of a matrix

Let's compute the pseudoinverse of a matrix. Perform the following steps to do so:

1. First, create a matrix as follows:

```python
import numpy as np

A = np.mat("4 11 14;8 7 -2")
print("A\n", A)
```

The matrix we created looks like the following:

```
A
[[ 4 11 14]
 [ 8  7 -2]]
```
2. Calculate the pseudoinverse matrix with the pinv function, as follows:

```python
pseudoinv = np.linalg.pinv(A)
print("Pseudo inverse\n", pseudoinv)
```

The following is the pseudoinverse:

```
Pseudo inverse
[[-0.00555556  0.07222222]
 [ 0.02222222  0.04444444]
 [ 0.05555556 -0.05555556]]
```
3. Multiply the original and pseudoinverse matrices.

```python
print("Check", A * pseudoinv)
```

What we get is not an identity matrix, but it comes close to it, as follows:

```
Check [[  1.00000000e+00   0.00000000e+00]
 [  8.32667268e-17   1.00000000e+00]]
```

## What just happened?

We computed the pseudoinverse of a matrix with the pinv function of the numpy.linalg module. The check by matrix multiplication resulted in a matrix that is approximately an identity matrix.

```python
import numpy as np

A = np.mat("4 11 14;8 7 -2")
print("A\n", A)
pseudoinv = np.linalg.pinv(A)
print("Pseudo inverse\n", pseudoinv)
print("Check", A * pseudoinv)
```

# Determinants

The determinant is a value associated with a square matrix. It is used throughout mathematics; for more details, please visit http://en.wikipedia.org/wiki/Determinant. For an n x n real-valued matrix, the determinant corresponds to the scaling that an n-dimensional volume undergoes when transformed by the matrix. A positive determinant means the volume preserves its orientation (clockwise or anticlockwise), while a negative determinant means the orientation is reversed. The numpy.linalg module has a det function that returns the determinant of a matrix.
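For a 2 x 2 matrix [[a, b], [c, d]], the determinant is simply a*d - b*c, which makes the det function easy to sanity-check by hand:

```python
import numpy as np

A = np.array([[3., 4.],
              [5., 6.]])

# Hand calculation for a 2x2 matrix: a*d - b*c.
by_hand = 3. * 6. - 4. * 5.
by_numpy = np.linalg.det(A)

print(by_hand, by_numpy)  # both approximately -2.0
```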

# Time for action – calculating the determinant of a matrix

To calculate the determinant of a matrix, perform the following steps:

1. Create the matrix as follows:

```python
import numpy as np

A = np.mat("3 4;5 6")
print("A\n", A)
```

The matrix we created is shown as follows:

```
A
[[3 4]
 [5 6]]
```
2. Compute the determinant with the det function.

```python
print("Determinant", np.linalg.det(A))
```

The determinant is shown as follows:

```
Determinant -2.0
```

## What just happened?

We calculated the determinant of a matrix with the det function from the numpy.linalg module.

```python
import numpy as np

A = np.mat("3 4;5 6")
print("A\n", A)
print("Determinant", np.linalg.det(A))
```

# Fast Fourier transform

The fast Fourier transform (FFT) is an efficient algorithm to calculate the discrete Fourier transform (DFT). FFT improves on more naïve algorithms and is of order O(N log N). The DFT has applications in signal processing, image processing, solving partial differential equations, and more. NumPy has a module called fft that offers fast Fourier transform functionality. A lot of the functions in this module are paired; this means that, for many functions, there is a function that does the inverse operation. For instance, the fft and ifft functions form such a pair.
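As a small illustration of what the DFT gives you (the signal and sample count here are arbitrary choices), the index of the largest magnitude in the first half of the spectrum identifies the dominant frequency of a sampled cosine:

```python
import numpy as np

n = 256
t = np.arange(n)
# A cosine completing exactly 5 cycles over the sampling window.
signal = np.cos(2 * np.pi * 5 * t / n)

spectrum = np.abs(np.fft.fft(signal))
# The peak in the first (positive-frequency) half sits at bin 5.
dominant = np.argmax(spectrum[:n // 2])
print(dominant)  # 5
```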

# Time for action – calculating the Fourier transform

First, we will create a signal to transform. In order to calculate the Fourier transform, perform the following steps:

1. Create a cosine wave with 30 points, as follows:

```python
x = np.linspace(0, 2 * np.pi, 30)
wave = np.cos(x)
```
2. Transform the cosine wave with the fft function.

`transformed = np.fft.fft(wave)`
3. Apply the inverse transform with the ifft function. It should approximately return the original signal.

```python
print(np.all(np.abs(np.fft.ifft(transformed) - wave) < 10 ** -9))
```

The result is shown as follows:

`True`
4. Plot the transformed signal with Matplotlib.

```python
plot(transformed)
show()
```

The resulting screenshot shows the fast Fourier transform:

## What just happened?

We applied the fft function to a cosine wave. After applying the ifft function we got our signal back.

```python
import numpy as np
from matplotlib.pyplot import plot, show

x = np.linspace(0, 2 * np.pi, 30)
wave = np.cos(x)
transformed = np.fft.fft(wave)
print(np.all(np.abs(np.fft.ifft(transformed) - wave) < 10 ** -9))
plot(transformed)
show()
```

# Shifting

The fftshift function of the numpy.fft module shifts zero-frequency components to the center of a spectrum. The ifftshift function reverses this operation.
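The effect is easiest to see on the frequency bins themselves; the following sketch uses fftfreq (with an arbitrary window length of 8) to show the reordering:

```python
import numpy as np

# Frequency bins as the FFT produces them: zero first, then the
# positive frequencies, then the negative ones.
freqs = np.fft.fftfreq(8)
print(freqs)   # [ 0.     0.125  0.25   0.375 -0.5   -0.375 -0.25  -0.125]

# After fftshift, the bins run monotonically from negative to
# positive, with zero in the center.
shifted = np.fft.fftshift(freqs)
print(shifted)  # [-0.5   -0.375 -0.25  -0.125  0.     0.125  0.25   0.375]
```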

# Time for action – shifting frequencies

We will create a signal, transform it, and then shift the signal. In order to shift the frequencies, perform the following steps:

1. Create a cosine wave with 30 points.

```python
x = np.linspace(0, 2 * np.pi, 30)
wave = np.cos(x)
```
2. Transform the cosine wave with the fft function.

`transformed = np.fft.fft(wave)`
3. Shift the signal with the fftshift function.

`shifted = np.fft.fftshift(transformed)`
4. Reverse the shift with the ifftshift function. This should undo the shift.

```python
print(np.all(np.abs(np.fft.ifftshift(shifted) - transformed) < 10 ** -9))
```

The result is shown as follows:

`True`
5. Plot the signal and transform it with Matplotlib.

```python
plot(transformed, lw=2)
plot(shifted, lw=3)
show()
```

The following screenshot shows the shift in the fast Fourier transform:

## What just happened?

We applied the fftshift function to a cosine wave. After applying the ifftshift function, we got our signal back.

```python
import numpy as np
from matplotlib.pyplot import plot, show

x = np.linspace(0, 2 * np.pi, 30)
wave = np.cos(x)
transformed = np.fft.fft(wave)
shifted = np.fft.fftshift(transformed)
print(np.all(np.abs(np.fft.ifftshift(shifted) - transformed) < 10 ** -9))
plot(transformed, lw=2)
plot(shifted, lw=3)
show()
```

# Random numbers

Random numbers are used in Monte Carlo methods, stochastic calculus, and more. True random numbers are hard to generate, so in practice we use pseudorandom numbers, which are random enough for most intents and purposes, except in some very special cases. The functions related to random numbers can be found in the NumPy random module. The core random number generator is based on the Mersenne Twister algorithm. Random numbers can be generated from discrete or continuous distributions. The distribution functions have an optional size parameter, which tells NumPy how many numbers to generate; you can specify either an integer or a tuple as size, and the result is an array of random numbers with the corresponding shape. Discrete distributions include the geometric, hypergeometric, and binomial distributions.

# Time for action – gambling with the binomial

The binomial distribution models the number of successes in a fixed number of independent trials of an experiment, where each trial has the same probability of success.
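Before simulating the game below, it helps to know the odds of a single round; by symmetry, getting fewer than five heads in nine fair flips has probability exactly 0.5, which a quick simulation confirms. This sketch uses the modern default_rng interface with an arbitrary seed:

```python
import numpy as np

rng = np.random.default_rng(42)

# 100,000 rounds of nine fair coin flips each.
heads = rng.binomial(9, 0.5, size=100_000)

# Fraction of rounds with fewer than five heads; should be near 0.5.
losing_fraction = (heads < 5).mean()
print(losing_fraction)
```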

Imagine a 17th-century gambling house where you can bet on the flipping of pieces of eight. Nine coins are flipped. If fewer than five are heads, you lose one piece of eight; otherwise, you win one. Let's simulate this, starting with 1,000 coins in our possession. We will use the binomial function from the random module for that purpose.

In order to understand the binomial function, go through the following steps:

1. Initialize an array, which represents the cash balance, to zeros. Call the binomial function with a size of 10000. This represents 10,000 coin flips in our casino.

```python
cash = np.zeros(10000)
cash[0] = 1000
outcome = np.random.binomial(9, 0.5, size=len(cash))
```
2. Go through the outcomes of the coin flips and update the cash array. Print the minimum and maximum of outcome, just to make sure we don't have any strange outliers.

```python
for i in range(1, len(cash)):
    if outcome[i] < 5:
        cash[i] = cash[i - 1] - 1
    elif outcome[i] < 10:
        cash[i] = cash[i - 1] + 1
    else:
        raise AssertionError("Unexpected outcome " + str(outcome))
print(outcome.min(), outcome.max())
```

As expected, the values are between 0 and 9.

`0 9`
3. Plot the cash array with Matplotlib.

```python
plot(np.arange(len(cash)), cash)
show()
```

As you can see in the following screenshot, our cash balance performs a random walk:

## What just happened?

We did a random walk experiment using the binomial function from the NumPy random module.

```python
import numpy as np
from matplotlib.pyplot import plot, show

cash = np.zeros(10000)
cash[0] = 1000
outcome = np.random.binomial(9, 0.5, size=len(cash))

for i in range(1, len(cash)):
    if outcome[i] < 5:
        cash[i] = cash[i - 1] - 1
    elif outcome[i] < 10:
        cash[i] = cash[i - 1] + 1
    else:
        raise AssertionError("Unexpected outcome " + str(outcome))

print(outcome.min(), outcome.max())
plot(np.arange(len(cash)), cash)
show()
```

# Hypergeometric distribution

The hypergeometric distribution models a jar with two types of objects in it. The model tells us how many objects of one type we can get if we take a specified number of items out of the jar without replacing them. The NumPy random module has a hypergeometric function that simulates this situation.
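A quick sanity check of the function, using the setup of the game below (3 balls sampled from a jar with 25 good and 1 bad ball): the bad ball is drawn with probability 3/26, so the mean number of good balls per draw is 3 - 3/26. This sketch uses the modern default_rng interface with an arbitrary seed:

```python
import numpy as np

rng = np.random.default_rng(0)

# Number of good balls in each of 200,000 draws of 3 balls
# from a jar with 25 good and 1 bad ball.
draws = rng.hypergeometric(25, 1, 3, size=200_000)

expected_mean = 3 - 3 / 26
print(abs(draws.mean() - expected_mean) < 0.01)  # True
```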

# Time for action – simulating a game show

Imagine a game show where, every time the contestants answer a question correctly, they get to pull three balls from a jar and then put them back. There is a catch, though: one ball in the jar is bad. Every time it is pulled out, the contestants lose six points. If, however, they manage to pull out three of the 25 normal balls, they get one point. So, what is going to happen if we have 100 questions in total? In order to get a solution, go through the following steps:

1. Initialize the outcome of the game with the hypergeometric function. The first parameter of this function is the number of ways to make a good selection, the second parameter is the number of ways to make a bad selection, and the third parameter is the number of items sampled.

```python
points = np.zeros(100)
outcomes = np.random.hypergeometric(25, 1, 3, size=len(points))
```
2. Set the scores based on the outcomes from the previous step.

```python
for i in range(len(points)):
    if outcomes[i] == 3:
        points[i] = points[i - 1] + 1
    elif outcomes[i] == 2:
        points[i] = points[i - 1] - 6
    else:
        print(outcomes[i])
```
3. Plot the points array with Matplotlib.

```python
plot(np.arange(len(points)), points)
show()
```

The following screenshot shows how the scoring evolved:

## What just happened?

We simulated a game show using the hypergeometric function from the NumPy random module. The game scoring depends on how many good and how many bad balls are pulled out of the jar in each session.

```python
import numpy as np
from matplotlib.pyplot import plot, show

points = np.zeros(100)
outcomes = np.random.hypergeometric(25, 1, 3, size=len(points))

for i in range(len(points)):
    if outcomes[i] == 3:
        points[i] = points[i - 1] + 1
    elif outcomes[i] == 2:
        points[i] = points[i - 1] - 6
    else:
        print(outcomes[i])

plot(np.arange(len(points)), points)
show()
```

# Continuous distributions

Continuous distributions are modeled by probability density functions (pdf). The probability of a value falling in a certain interval is determined by integrating the probability density function over that interval. The NumPy random module has a number of functions that represent continuous distributions: beta, chisquare, exponential, f, gamma, gumbel, laplace, lognormal, logistic, multivariate_normal, noncentral_chisquare, noncentral_f, normal, and others.
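A simple way to check that samples follow the intended continuous distribution is to compare their sample moments with the theoretical ones; for a standard normal, the mean is 0 and the standard deviation is 1. This sketch uses the modern default_rng interface with an arbitrary seed:

```python
import numpy as np

rng = np.random.default_rng(7)
values = rng.normal(size=100_000)

# Sample moments should be close to the theoretical mean 0
# and standard deviation 1.
print(abs(values.mean()) < 0.02)       # True
print(abs(values.std() - 1.0) < 0.02)  # True
```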

# Time for action – drawing a normal distribution

Random numbers can be generated from a normal distribution and their distribution may be visualized with a histogram. To draw a normal distribution, perform the following steps:

1. Generate random numbers for a given sample size using the normal function from the random NumPy module.

```python
N = 10000
normal_values = np.random.normal(size=N)
```
2. Draw the histogram and the theoretical pdf with a center value of 0 and a standard deviation of 1. We will use Matplotlib for this purpose.

```python
dummy, bins, dummy = plt.hist(normal_values, int(np.sqrt(N)), density=True, lw=1)
sigma = 1
mu = 0
plt.plot(bins, 1 / (sigma * np.sqrt(2 * np.pi)) * np.exp(-(bins - mu)**2 / (2 * sigma**2)), lw=2)
plt.show()
```

In the following screenshot, we see the familiar bell curve:

## What just happened?

We visualized the normal distribution using the normal function from the NumPy random module. We did this by drawing the bell curve and a histogram of randomly generated values.

```python
import numpy as np
import matplotlib.pyplot as plt

N = 10000
normal_values = np.random.normal(size=N)
dummy, bins, dummy = plt.hist(normal_values, int(np.sqrt(N)), density=True, lw=1)
sigma = 1
mu = 0
plt.plot(bins, 1 / (sigma * np.sqrt(2 * np.pi)) * np.exp(-(bins - mu)**2 / (2 * sigma**2)), lw=2)
plt.show()
```

# Lognormal distribution

A lognormal distribution is a distribution of a variable whose natural logarithm is normally distributed. The lognormal function of the random NumPy module models this distribution.
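The defining property is easy to verify numerically: taking the natural logarithm of lognormal samples should recover approximately standard normal moments. A sketch using the modern default_rng interface with an arbitrary seed:

```python
import numpy as np

rng = np.random.default_rng(3)
values = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)

# The log of lognormal samples is normally distributed, so its
# sample mean and standard deviation should be close to 0 and 1.
logs = np.log(values)
print(abs(logs.mean()) < 0.02)       # True
print(abs(logs.std() - 1.0) < 0.02)  # True
```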

# Time for action – drawing the lognormal distribution

Let's visualize the lognormal distribution and its probability density function with a histogram. Perform the following steps:

1. Generate random numbers using the lognormal function from the NumPy random module.

```python
N = 10000
lognormal_values = np.random.lognormal(size=N)
```
2. Draw the histogram and the theoretical pdf with a center value of 0 and a standard deviation of 1. We will use Matplotlib for this purpose.

```python
dummy, bins, dummy = plt.hist(lognormal_values, int(np.sqrt(N)), density=True, lw=1)
sigma = 1
mu = 0
x = np.linspace(min(bins), max(bins), len(bins))
pdf = np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2)) / (x * sigma * np.sqrt(2 * np.pi))
plt.plot(x, pdf, lw=3)
plt.show()
```

The fit of the histogram and theoretical pdf is excellent, as you can see in the following screenshot:

## What just happened?

We visualized the lognormal distribution using the lognormal function from the NumPy random module. We did this by drawing the curve of the theoretical probability density function and a histogram of randomly generated values.

```python
import numpy as np
import matplotlib.pyplot as plt

N = 10000
lognormal_values = np.random.lognormal(size=N)
dummy, bins, dummy = plt.hist(lognormal_values, int(np.sqrt(N)), density=True, lw=1)
sigma = 1
mu = 0
x = np.linspace(min(bins), max(bins), len(bins))
pdf = np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2)) / (x * sigma * np.sqrt(2 * np.pi))
plt.plot(x, pdf, lw=3)
plt.show()
```

# Summary

We learned a lot in this article about NumPy modules. We covered linear algebra, the fast Fourier transform, continuous and discrete distributions, and random numbers.
