NumPy Beginner’s Guide  Second Edition — Save 50%
An action packed guide using real world examples of the easy to use, high performance, free open source NumPy mathematical library.
NumPy has a number of modules that have been inherited from its predecessor, Numeric. Some of these packages have a SciPy counterpart, which may have fuller functionality. The numpy.dual package contains functions that are defined both in NumPy and SciPy. The packages discussed in this article are also part of the numpy.dual package.
In this article by Ivan Idris from the book NumPy Beginner’s Guide  Second Edition, we shall cover the following topics:
 The linalg package
 The fft package
 Random numbers
 Continuous and discrete distributions
Linear algebra
Linear algebra is an important branch of mathematics. The numpy.linalg package contains linear algebra functions. With this module, you can invert matrices, calculate eigenvalues, solve linear equations, and determine determinants, among other things.
Time for action – inverting matrices
The inverse of a matrix A in linear algebra is the matrix A^{-1}, which, when multiplied with the original matrix, is equal to the identity matrix I. This can be written as A * A^{-1} = I.
The inv function in the numpy.linalg package can do this for us. Let's invert an example matrix. To invert matrices, perform the following steps:

We will create the example matrix with the mat function.
A = np.mat("0 1 2;1 0 3;4 -3 8")
print "A\n", A
The A matrix is printed as follows:
A
[[ 0 1 2]
[ 1 0 3]
[ 4 -3 8]]

Now, we can see the inv function in action, using which we will invert the matrix.
inverse = np.linalg.inv(A)
print "inverse of A\n", inverse
The inverse matrix is shown as follows:
inverse of A
[[-4.5 7. -1.5]
[-2. 4. -1. ]
[ 1.5 -2. 0.5]]
If the matrix is singular or not square, a LinAlgError exception is raised. If you want, you can check the result manually. This is left as an exercise for the reader.

Let's check what we get when we multiply the original matrix with the result of the inv function:
print "Check\n", A * inverse
The result is the identity matrix, as expected.
Check
[[ 1. 0. 0.]
[ 0. 1. 0.]
[ 0. 0. 1.]]
What just happened?
We calculated the inverse of a matrix with the inv function of the numpy.linalg package. We checked, with matrix multiplication, whether this is indeed the inverse matrix.
import numpy as np
A = np.mat("0 1 2;1 0 3;4 -3 8")
print "A\n", A
inverse = np.linalg.inv(A)
print "inverse of A\n", inverse
print "Check\n", A * inverse
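The same check can be sketched with plain ndarrays, which current NumPy documentation generally recommends over np.mat. This version assumes Python 3 and a 3x3 example matrix; the helper name safe_inverse is made up for illustration and is not part of NumPy.

```python
import numpy as np

# A 3x3 example matrix as a plain 2D array instead of np.mat.
A = np.array([[0, 1, 2],
              [1, 0, 3],
              [4, -3, 8]], dtype=float)

inverse = np.linalg.inv(A)

# For ndarrays, @ is matrix multiplication. Compare against the identity
# with a tolerance, because the product is only exact up to rounding.
is_inverse = np.allclose(A @ inverse, np.eye(3))
print("A @ inverse is the identity:", is_inverse)


def safe_inverse(matrix):
    """Hypothetical helper: return the inverse, or None if inv raises LinAlgError."""
    try:
        return np.linalg.inv(matrix)
    except np.linalg.LinAlgError:
        return None


# Two identical rows make this matrix singular, so inv raises LinAlgError.
singular = np.array([[1.0, 2.0], [1.0, 2.0]])
print("Inverse of singular matrix:", safe_inverse(singular))
```

Using np.allclose rather than an exact comparison is a deliberate choice here: floating-point inversion rarely reproduces the identity matrix digit for digit.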
Solving linear systems
A matrix transforms a vector into another vector in a linear way. This transformation mathematically corresponds to a system of linear equations. The solve function of numpy.linalg solves systems of linear equations of the form Ax = b; here A is a matrix, b can be a 1D or 2D array, and x is the unknown. We will also see the dot function in action. This function returns the dot product of two floating-point arrays.
Time for action – solving a linear system
Let's solve an example of a linear system. To solve a linear system, perform the following steps:

Let's create the matrices A and b.
A = np.mat("1 -2 1;0 2 -8;-4 5 9")
print "A\n", A
b = np.array([0, 8, -9])
print "b\n", b
The matrices A and b are shown as follows:
A
[[ 1 -2 1]
[ 0 2 -8]
[-4 5 9]]
b
[ 0 8 -9]

Solve this linear system by calling the solve function.
x = np.linalg.solve(A, b)
print "Solution", x
The following is the solution of the linear system:
Solution [ 29. 16. 3.]

Check whether the solution is correct with the dot function.
print "Check\n", np.dot(A , x)
The result is as expected:
Check
[[ 0. 8. -9.]]
What just happened?
We solved a linear system using the solve function from the NumPy linalg module and checked the solution with the dot function.
import numpy as np
A = np.mat("1 -2 1;0 2 -8;-4 5 9")
print "A\n", A
b = np.array([0, 8, -9])
print "b\n", b
x = np.linalg.solve(A, b)
print "Solution", x
print "Check\n", np.dot(A , x)
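As a rough sketch of the same idea in Python 3 with plain ndarrays, the solution can be checked through its residual A x - b. The second half, which solves for two right-hand sides at once, is an illustrative addition and uses the made-up names B and X.

```python
import numpy as np

A = np.array([[1, -2, 1],
              [0, 2, -8],
              [-4, 5, 9]], dtype=float)
b = np.array([0, 8, -9], dtype=float)

x = np.linalg.solve(A, b)
print("Solution", x)

# The residual A x - b should vanish up to rounding error.
residual = A @ x - b
print("Largest residual:", np.abs(residual).max())

# solve also accepts a 2D right-hand side: each column is its own system.
B = np.column_stack([b, 2 * b])
X = np.linalg.solve(A, B)
print("Solutions for both columns\n", X)
```

Doubling the right-hand side doubles the solution, which gives a quick sanity check on the 2D case.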
Finding eigenvalues and eigenvectors
Eigenvalues are scalar solutions to the equation Ax = ax, where A is a two-dimensional matrix and x is a one-dimensional vector. Eigenvectors are the vectors x corresponding to the eigenvalues a. The eigvals function in the numpy.linalg package calculates eigenvalues. The eig function returns a tuple containing eigenvalues and eigenvectors.
Time for action – determining eigenvalues and eigenvectors
Let's calculate the eigenvalues of a matrix. Perform the following steps to do so:

Create a matrix as follows:
A = np.mat("3 -2;1 0")
print "A\n", A
The matrix we created looks like the following:
A
[[ 3 -2]
[ 1 0]]
Calculate eigenvalues by calling the eigvals function.
print "Eigenvalues", np.linalg.eigvals(A)
The eigenvalues of the matrix are as follows:
Eigenvalues [ 2. 1.]

Determine eigenvalues and eigenvectors with the eig function. This function returns a tuple, where the first element contains eigenvalues and the second element contains the corresponding eigenvectors, arranged column-wise.
eigenvalues, eigenvectors = np.linalg.eig(A)
print "First tuple of eig", eigenvalues
print "Second tuple of eig\n", eigenvectors
The eigenvalues and eigenvectors will be shown as follows:
First tuple of eig [ 2. 1.]
Second tuple of eig
[[ 0.89442719 0.70710678]
[ 0.4472136 0.70710678]] 
Check the result with the dot function by calculating the right- and left-hand sides of the eigenvalue equation Ax = ax.
for i in range(len(eigenvalues)):
    print "Left", np.dot(A, eigenvectors[:,i])
    print "Right", eigenvalues[i] * eigenvectors[:,i]
    print
The output is as follows:
Left [[ 1.78885438]
[ 0.89442719]]
Right [[ 1.78885438]
[ 0.89442719]]
Left [[ 0.70710678]
[ 0.70710678]]
Right [[ 0.70710678]
[ 0.70710678]]
What just happened?
We found the eigenvalues and eigenvectors of a matrix with the eigvals and eig functions of the numpy.linalg module. We checked the result using the dot function.
import numpy as np
A = np.mat("3 -2;1 0")
print "A\n", A
print "Eigenvalues", np.linalg.eigvals(A)
eigenvalues, eigenvectors = np.linalg.eig(A)
print "First tuple of eig", eigenvalues
print "Second tuple of eig\n", eigenvectors
for i in range(len(eigenvalues)):
    print "Left", np.dot(A, eigenvectors[:,i])
    print "Right", eigenvalues[i] * eigenvectors[:,i]
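The eigenvalue equation can also be checked numerically rather than by eye. The following is a minimal Python 3 sketch with plain ndarrays; the variable all_ok is an illustrative name.

```python
import numpy as np

A = np.array([[3, -2],
              [1, 0]], dtype=float)

eigenvalues, eigenvectors = np.linalg.eig(A)

# Column i of eigenvectors belongs to eigenvalues[i].
for i, value in enumerate(eigenvalues):
    vector = eigenvectors[:, i]
    print(value, np.allclose(A @ vector, value * vector))

# All pairs at once: A V equals V with column j scaled by eigenvalue j.
# Broadcasting multiplies each column of eigenvectors by its eigenvalue.
all_ok = np.allclose(A @ eigenvectors, eigenvectors * eigenvalues)
print("All eigenpairs satisfy Ax = ax:", all_ok)
```

The order in which eig returns the eigenvalues is not guaranteed to be sorted, so the sketch avoids relying on it.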
Singular value decomposition
Singular value decomposition is a type of factorization that decomposes a matrix into a product of three matrices. The singular value decomposition is a generalization of the previously discussed eigenvalue decomposition. The svd function in the numpy.linalg package can perform this decomposition. This function returns three matrices (U, Sigma, and V) such that U and V are orthogonal and Sigma contains the singular values of the input matrix M:
M = U Sigma V^{*}
The asterisk denotes the Hermitian conjugate or the conjugate transpose.
Time for action – decomposing a matrix
It's time to decompose a matrix with the singular value decomposition. In order to decompose a matrix, perform the following steps:

First, create a matrix as follows:
A = np.mat("4 11 14;8 7 -2")
print "A\n", A
The matrix we created looks like the following:
A
[[ 4 11 14]
[ 8 7 -2]]
Decompose the matrix with the svd function.
U, Sigma, V = np.linalg.svd(A, full_matrices=False)
print "U"
print U
print "Sigma"
print Sigma
print "V"
print V
The result is a tuple containing the two orthogonal matrices U and V on the left- and right-hand sides and the singular values of the middle matrix.
U
[[-0.9486833 -0.31622777]
[-0.31622777 0.9486833 ]]
Sigma
[ 18.97366596 9.48683298]
V
[[-0.33333333 -0.66666667 -0.66666667]
[ 0.66666667 0.33333333 -0.66666667]]
We do not actually have the middle matrix; we only have the diagonal values. The other values are all 0. We can form the middle matrix with the diag function and then multiply the three matrices, as follows:
print "Product\n", U * np.diag(Sigma) * V
The product of the three matrices looks like the following:
Product
[[ 4. 11. 14.]
[ 8. 7. -2.]]
What just happened?
We decomposed a matrix and checked the result by matrix multiplication. We used the svd function from the NumPy linalg module.
import numpy as np
A = np.mat("4 11 14;8 7 -2")
print "A\n", A
U, Sigma, V = np.linalg.svd(A, full_matrices=False)
print "U"
print U
print "Sigma"
print Sigma
print "V"
print V
print "Product\n", U * np.diag(Sigma) * V
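A short Python 3 sketch of the same decomposition with plain ndarrays follows. It names the third factor Vt to stress that svd returns it already (conjugate) transposed; the names Vt, u_orthonormal, and v_orthonormal are illustrative choices.

```python
import numpy as np

A = np.array([[4, 11, 14],
              [8, 7, -2]], dtype=float)

# svd returns the right factor already transposed, so no extra .T is needed.
U, sigma, Vt = np.linalg.svd(A, full_matrices=False)

# Rebuild A from the three factors; diag turns the 1D singular values
# into the diagonal middle matrix.
reconstructed = U @ np.diag(sigma) @ Vt
print("Reconstruction matches A:", np.allclose(reconstructed, A))

# U has orthonormal columns and Vt has orthonormal rows.
u_orthonormal = np.allclose(U.T @ U, np.eye(2))
v_orthonormal = np.allclose(Vt @ Vt.T, np.eye(2))
print(u_orthonormal, v_orthonormal)
```

The signs of the columns of U and rows of Vt can differ between platforms, which is why the sketch checks the product and the orthonormality rather than individual entries.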
Pseudoinverse
The Moore-Penrose pseudoinverse of a matrix can be computed with the pinv function of the numpy.linalg module (visit http://en.wikipedia.org/wiki/Moore%E2%80%93Penrose_pseudoinverse). The pseudoinverse is calculated using the singular value decomposition. The inv function only accepts square matrices; the pinv function does not have this restriction.
Time for action – computing the pseudo inverse of a matrix
Let's compute the pseudo inverse of a matrix. Perform the following steps to do so:

First, create a matrix as follows:
A = np.mat("4 11 14;8 7 -2")
print "A\n", A
The matrix we created looks like the following:
A
[[ 4 11 14]
[ 8 7 -2]]
Calculate the pseudoinverse matrix with the pinv function, as follows:
pseudoinv = np.linalg.pinv(A)
print "Pseudo inverse\n", pseudoinv
The following is the pseudoinverse:
Pseudo inverse
[[-0.00555556 0.07222222]
[ 0.02222222 0.04444444]
[ 0.05555556 -0.05555556]]
Multiply the original and pseudoinverse matrices.
print "Check", A * pseudoinv
What we get is not exactly an identity matrix, but it comes close to it, as follows:
Check [[ 1.00000000e+00 0.00000000e+00]
[ 8.32667268e-17 1.00000000e+00]]
What just happened?
We computed the pseudoinverse of a matrix with the pinv function of the numpy.linalg module. The check by matrix multiplication resulted in a matrix that is approximately an identity matrix.
import numpy as np
A = np.mat("4 11 14;8 7 -2")
print "A\n", A
pseudoinv = np.linalg.pinv(A)
print "Pseudo inverse\n", pseudoinv
print "Check", A * pseudoinv
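To make the difference from an ordinary inverse more concrete, here is a small Python 3 sketch that multiplies in both orders and tests two of the Moore-Penrose conditions. The variable names are illustrative.

```python
import numpy as np

A = np.array([[4, 11, 14],
              [8, 7, -2]], dtype=float)
pseudoinv = np.linalg.pinv(A)  # shape (3, 2)

# The rows of A are linearly independent, so pinv acts as a right inverse.
right_ok = np.allclose(A @ pseudoinv, np.eye(2))

# In the other order the product is a 3x3 matrix of rank 2,
# so it cannot be the identity.
left_is_identity = np.allclose(pseudoinv @ A, np.eye(3))

# Two of the defining Moore-Penrose conditions.
cond1 = np.allclose(A @ pseudoinv @ A, A)
cond2 = np.allclose(pseudoinv @ A @ pseudoinv, pseudoinv)

print(right_ok, left_is_identity, cond1, cond2)
```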
Determinants
The determinant is a value associated with a square matrix. It is used throughout mathematics; for more details please visit http://en.wikipedia.org/wiki/Determinant. For an n x n real-valued matrix, the determinant corresponds to the scaling an n-dimensional volume undergoes when transformed by the matrix. A positive sign of the determinant means the volume preserves its orientation ("clockwise" or "anticlockwise"), while a negative sign means reversed orientation. The numpy.linalg module has a det function that returns the determinant of a matrix.
Time for action – calculating the determinant of a matrix
To calculate the determinant of a matrix, perform the following steps:

Create the matrix as follows:
A = np.mat("3 4;5 6")
print "A\n", A
The matrix we created is shown as follows:
A
[[ 3. 4.]
[ 5. 6.]] 
Compute the determinant with the det function.
print "Determinant", np.linalg.det(A)
The determinant is shown as follows:
Determinant -2.0
What just happened?
We calculated the determinant of a matrix with the det function from the numpy.linalg module.
import numpy as np
A = np.mat("3 4;5 6")
print "A\n", A
print "Determinant", np.linalg.det(A)
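For a 2 x 2 matrix the determinant can be cross-checked by hand with the formula ad - bc, and for any square matrix it equals the product of the eigenvalues. The Python 3 sketch below illustrates both checks; by_hand and eig_product are illustrative names.

```python
import numpy as np

A = np.array([[3, 4],
              [5, 6]], dtype=float)

det = np.linalg.det(A)

# 2x2 formula: a*d - b*c = 3*6 - 4*5
by_hand = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]

# The determinant also equals the product of the eigenvalues.
eig_product = np.prod(np.linalg.eigvals(A))

print(det, by_hand, eig_product)
```

The three values agree only up to rounding, so comparisons should use a tolerance such as np.isclose.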
Fast Fourier transform
The fast Fourier transform (FFT) is an efficient algorithm to calculate the discrete Fourier transform (DFT). FFT improves on more naïve algorithms and is of order O(N log N). DFT has applications in signal processing, image processing, solving partial differential equations, and more. NumPy has a module called fft that offers fast Fourier transform functionality. A lot of the functions in this module are paired; this means that, for many functions, there is a function that does the inverse operation. For instance, the fft and ifft functions form such a pair.
Time for action – calculating the Fourier transform
First, we will create a signal to transform. In order to calculate the Fourier transform, perform the following steps:

Create a cosine wave with 30 points, as follows:
x = np.linspace(0, 2 * np.pi, 30)
wave = np.cos(x) 
Transform the cosine wave with the fft function.
transformed = np.fft.fft(wave)

Apply the inverse transform with the ifft function. It should approximately return the original signal.
print np.all(np.abs(np.fft.ifft(transformed) - wave) < 10 ** -9)
The result is shown as follows:
True

Plot the transformed signal with Matplotlib.
plot(transformed)
show()
The resulting screenshot shows the fast Fourier transform:
What just happened?
We applied the fft function to a cosine wave. After applying the ifft function we got our signal back.
import numpy as np
from matplotlib.pyplot import plot, show
x = np.linspace(0, 2 * np.pi, 30)
wave = np.cos(x)
transformed = np.fft.fft(wave)
print np.all(np.abs(np.fft.ifft(transformed) - wave) < 10 ** -9)
plot(transformed)
show()
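To see what the transform contains without a plot, the following Python 3 sketch looks at which frequency bins carry energy. It departs from the listing above in two labeled ways: it samples exactly three full periods with the endpoint excluded (so the signal is periodic in the sampling window), and it uses 32 points; both are assumptions made for a clean result.

```python
import numpy as np

n = 32
t = np.arange(n) / n                 # one window, endpoint excluded
wave = np.cos(2 * np.pi * 3 * t)     # three full cycles in the window

transformed = np.fft.fft(wave)

# ifft should give the original signal back, up to rounding.
recovered = np.fft.ifft(transformed)
round_trip_ok = np.allclose(recovered, wave)

# A real cosine with k cycles shows up in bins k and n - k.
magnitudes = np.abs(transformed)
peak_bins = np.flatnonzero(magnitudes > 1e-6)

print(round_trip_ok, peak_bins)
```

With the endpoint included, as in np.linspace(0, 2 * np.pi, 30), the sampled wave is not exactly periodic and some energy leaks into neighboring bins.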
Shifting
The fftshift function of the numpy.fft module shifts zero-frequency components to the center of a spectrum. The ifftshift function reverses this operation.
Time for action – shifting frequencies
We will create a signal, transform it, and then shift the signal. In order to shift the frequencies, perform the following steps:

Create a cosine wave with 30 points.
x = np.linspace(0, 2 * np.pi, 30)
wave = np.cos(x) 
Transform the cosine wave with the fft function.
transformed = np.fft.fft(wave)

Shift the signal with the fftshift function.
shifted = np.fft.fftshift(transformed)

Reverse the shift with the ifftshift function. This should undo the shift.
print np.all(np.abs(np.fft.ifftshift(shifted) - transformed) < 10 ** -9)
The result is shown as follows:
True

Plot the transformed and the shifted signal with Matplotlib.
plot(transformed, lw=2)
plot(shifted, lw=3)
show()
The following screenshot shows the shift in the fast Fourier transform:
What just happened?
We applied the fftshift function to a cosine wave. After applying the ifftshift function, we got our signal back.
import numpy as np
from matplotlib.pyplot import plot, show
x = np.linspace(0, 2 * np.pi, 30)
wave = np.cos(x)
transformed = np.fft.fft(wave)
shifted = np.fft.fftshift(transformed)
print np.all(np.abs(np.fft.ifftshift(shifted) - transformed) < 10 ** -9)
plot(transformed, lw=2)
plot(shifted, lw=3)
show()
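The effect of the shift is easiest to see on the frequency axis itself. This small Python 3 sketch applies fftshift to the output of np.fft.fftfreq for eight samples; the choice of eight points is arbitrary.

```python
import numpy as np

# Frequencies in FFT order: zero first, then positive, then negative.
freqs = np.fft.fftfreq(8)
print(freqs)

# After fftshift the frequencies are ascending, with zero in the middle.
shifted = np.fft.fftshift(freqs)
print(shifted)

# ifftshift restores the original FFT ordering.
restored = np.fft.ifftshift(shifted)
print(np.array_equal(restored, freqs))
```

The same reordering is what happens to the spectrum values when fftshift is applied to the result of fft.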
Random numbers
Random numbers are used in Monte Carlo methods, stochastic calculus, and more. Real random numbers are hard to generate, so in practice we use pseudo random numbers. Pseudo random numbers are random enough for most intents and purposes, except for some very special cases. The functions related to random numbers can be found in the NumPy random module. The core random number generator is based on the Mersenne Twister algorithm. Random numbers can be generated from discrete or continuous distributions. The distribution functions have an optional size parameter, which tells NumPy how many numbers to generate. You can specify either an integer or a tuple as size. This will result in an array filled with random numbers of appropriate shape. Discrete distributions include the geometric, hypergeometric, and binomial distributions.
Time for action – gambling with the binomial
The binomial distribution models the number of successes in an integer number of independent trials of an experiment, where the probability of success in each experiment is a fixed number.
Imagine a 17th-century gambling house where you can bet on the flipping of pieces of eight. Nine coins are flipped. If fewer than five are heads, then you lose one piece of eight; otherwise you win one. Let's simulate this, starting with 1000 coins in our possession. We will use the binomial function from the random module for that purpose.
In order to understand the binomial function, go through the following steps:

Initialize an array, which represents the cash balance, to zeros. Call the binomial function with a size of 10000. This represents 10,000 coin flips in our casino.
cash = np.zeros(10000)
cash[0] = 1000
outcome = np.random.binomial(9, 0.5, size=len(cash)) 
Go through the outcomes of the coin flips and update the cash array. Print the minimum and maximum of outcome, just to make sure we don't have any strange outliers.
for i in range(1, len(cash)):
    if outcome[i] < 5:
        cash[i] = cash[i - 1] - 1
    elif outcome[i] < 10:
        cash[i] = cash[i - 1] + 1
    else:
        raise AssertionError("Unexpected outcome " + str(outcome[i]))
print outcome.min(), outcome.max()
As expected, the values are between 0 and 9.
0 9

Plot the cash array with Matplotlib.
plot(np.arange(len(cash)), cash)
show()
As you can see in the following screenshot, our cash balance performs a random walk:
What just happened?
We did a random walk experiment using the binomial function from the NumPy random module.
import numpy as np
from matplotlib.pyplot import plot, show
cash = np.zeros(10000)
cash[0] = 1000
outcome = np.random.binomial(9, 0.5, size=len(cash))
for i in range(1, len(cash)):
    if outcome[i] < 5:
        cash[i] = cash[i - 1] - 1
    elif outcome[i] < 10:
        cash[i] = cash[i - 1] + 1
    else:
        raise AssertionError("Unexpected outcome " + str(outcome[i]))
print outcome.min(), outcome.max()
plot(np.arange(len(cash)), cash)
show()
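The loop can also be expressed without explicit iteration. The Python 3 sketch below is one possible vectorized variant; it assumes the newer np.random.default_rng generator API (NumPy 1.17 or later) with an arbitrary seed, which is not what the listing above uses.

```python
import numpy as np

# Assumption: Generator API with an arbitrary seed for reproducibility.
rng = np.random.default_rng(42)

n_rounds = 10000
heads = rng.binomial(9, 0.5, size=n_rounds)

# Win one coin with five or more heads, lose one otherwise.
steps = np.where(heads < 5, -1, 1)

# The first entry is the starting balance, so it gets no step.
steps[0] = 0
cash = 1000 + np.cumsum(steps)

print(heads.min(), heads.max())
print("Final balance:", cash[-1])
```

A cumulative sum of the individual wins and losses is exactly the running balance, which is why cumsum can replace the loop.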
Hypergeometric distribution
The hypergeometric distribution models a jar with two types of objects in it. The model tells us how many objects of one type we can get if we take a specified number of items out of the jar without replacing them. The NumPy random module has a hypergeometric function that simulates this situation.
Time for action – simulating a game show
Imagine a game show where every time the contestants answer a question correctly, they get to pull three balls from a jar and then put them back. Now there is a catch: one ball in the jar is bad. Every time it is pulled out, the contestants lose six points. If, however, they manage to get out three of the 25 normal balls, they get one point. So, what is going to happen if we have 100 questions in total? In order to get a solution for this, go through the following steps:

Initialize the outcome of the game with the hypergeometric function. The first parameter of this function is the number of ways to make a good selection, the second parameter is the number of ways to make a bad selection, and the third parameter is the number of items sampled.
points = np.zeros(100)
outcomes = np.random.hypergeometric(25, 1, 3, size=len(points)) 
Set the scores based on the outcomes from the previous step.
for i in range(len(points)):
    if outcomes[i] == 3:
        points[i] = points[i - 1] + 1
    elif outcomes[i] == 2:
        points[i] = points[i - 1] - 6
    else:
        print outcomes[i]
Plot the points array with Matplotlib.
plot(np.arange(len(points)), points)
show()
The following screenshot shows how the scoring evolved:
What just happened?
We simulated a game show using the hypergeometric function from the NumPy random module. The game scoring depends on how many good and how many bad balls are pulled out of a jar in each session.
import numpy as np
from matplotlib.pyplot import plot, show
points = np.zeros(100)
outcomes = np.random.hypergeometric(25, 1, 3, size=len(points))
for i in range(len(points)):
    if outcomes[i] == 3:
        points[i] = points[i - 1] + 1
    elif outcomes[i] == 2:
        points[i] = points[i - 1] - 6
    else:
        print outcomes[i]
plot(np.arange(len(points)), points)
show()
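The question of what happens over 100 rounds can also be approached by computing the expected score change per round. The Python 3 sketch below does that and runs a vectorized simulation; it assumes the np.random.default_rng generator API with an arbitrary seed, and the names p_bad and expected_change are illustrative.

```python
import numpy as np

# Assumption: Generator API with an arbitrary seed for reproducibility.
rng = np.random.default_rng(0)

n_questions = 100
# 25 good balls, 1 bad ball, 3 drawn without replacement.
# The result is the number of good balls drawn, which can only be 2 or 3.
good_drawn = rng.hypergeometric(25, 1, 3, size=n_questions)

# +1 point for three good balls, -6 points when the bad ball is among them.
changes = np.where(good_drawn == 3, 1, -6)
points = np.cumsum(changes)

# The bad ball is among the 3 drawn out of 26 with probability 3/26.
p_bad = 3 / 26
expected_change = (1 - p_bad) * 1 + p_bad * (-6)

print("Final score:", points[-1])
print("Expected change per round:", expected_change)
```

Under these rules the expected change per round works out to 5/26, so on average the score drifts slowly upward even though single losses are large.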
Continuous distributions
Continuous distributions are modeled by probability density functions (pdf). The probability for a certain interval is determined by integration of the probability density function. The NumPy random module has a number of functions that represent continuous distributions: beta, chisquare, exponential, f, gamma, gumbel, laplace, lognormal, logistic, multivariate_normal, noncentral_chisquare, noncentral_f, normal, and others.
Time for action – drawing a normal distribution
Random numbers can be generated from a normal distribution and their distribution may be visualized with a histogram. To draw a normal distribution, perform the following steps:

Generate random numbers for a given sample size using the normal function from the random NumPy module.
N=10000
normal_values = np.random.normal(size=N) 
Draw the histogram and theoretical pdf with a center value of 0 and standard deviation of 1. We will use Matplotlib for this purpose.
dummy, bins, dummy = plt.hist(normal_values,
    int(np.sqrt(N)), density=True, lw=1)
sigma = 1
mu = 0
plt.plot(bins, 1/(sigma * np.sqrt(2 * np.pi))
    * np.exp( - (bins - mu)**2 / (2 * sigma**2) ),lw=2)
plt.show()
In the following screenshot, we see the familiar bell curve:
What just happened?
We visualized the normal distribution using the normal function from the random NumPy module. We did this by drawing the bell curve and a histogram of randomly generated values.
import numpy as np
import matplotlib.pyplot as plt
N=10000
normal_values = np.random.normal(size=N)
dummy, bins, dummy = plt.hist(normal_values, int(np.sqrt(N)), density=True,
    lw=1)
sigma = 1
mu = 0
plt.plot(bins, 1/(sigma * np.sqrt(2 * np.pi)) * np.exp( - (bins -
    mu)**2 / (2 * sigma**2) ),lw=2)
plt.show()
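The comparison between histogram and pdf can also be done numerically, without a plot. The following Python 3 sketch uses np.histogram with density=True and checks that the bar areas sum to 1; it assumes the np.random.default_rng generator API with an arbitrary seed.

```python
import numpy as np

# Assumption: Generator API with an arbitrary seed for reproducibility.
rng = np.random.default_rng(1)

N = 10000
values = rng.normal(size=N)

# density=True scales the bar heights so that the total bar area is 1,
# which makes the histogram comparable to a probability density.
heights, edges = np.histogram(values, bins=int(np.sqrt(N)), density=True)
area = np.sum(heights * np.diff(edges))

# Theoretical pdf evaluated at the bin centers.
mu, sigma = 0.0, 1.0
centers = (edges[:-1] + edges[1:]) / 2
pdf = np.exp(-(centers - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

print("Histogram area:", area)
print("Sample mean and standard deviation:", values.mean(), values.std())
```

The number of bins has to be an integer, which is why the square root of N is wrapped in int.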
Lognormal distribution
A lognormal distribution is a distribution of a variable whose natural logarithm is normally distributed. The lognormal function of the random NumPy module models this distribution.
Time for action – drawing the lognormal distribution
Let's visualize the lognormal distribution and its probability density function with a histogram. Perform the following steps:

Generate random numbers using the lognormal function from the random NumPy module.
N=10000
lognormal_values = np.random.lognormal(size=N) 
Draw the histogram and theoretical pdf, using a mean of 0 and a standard deviation of 1 for the underlying normal distribution. We will use Matplotlib for this purpose.
dummy, bins, dummy = plt.hist(lognormal_values,
    int(np.sqrt(N)), density=True, lw=1)
sigma = 1
mu = 0
x = np.linspace(min(bins), max(bins), len(bins))
pdf = np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2))/ (x *
    sigma * np.sqrt(2 * np.pi))
plt.plot(x, pdf,lw=3)
plt.show()
The fit of the histogram and theoretical pdf is excellent, as you can see in the following screenshot:
What just happened?
We visualized the lognormal distribution using the lognormal function from the random NumPy module. We did this by drawing the curve of the theoretical probability density function and a histogram of randomly generated values.
import numpy as np
import matplotlib.pyplot as plt
N=10000
lognormal_values = np.random.lognormal(size=N)
dummy, bins, dummy = plt.hist(lognormal_values, int(np.sqrt(N)),
    density=True, lw=1)
sigma = 1
mu = 0
x = np.linspace(min(bins), max(bins), len(bins))
pdf = np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2))/ (x * sigma *
    np.sqrt(2 * np.pi))
plt.plot(x, pdf,lw=3)
plt.show()
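The defining property of the lognormal distribution, that the logarithm of the values is normally distributed, can be checked directly. The Python 3 sketch below does so and also integrates the theoretical pdf numerically; it assumes the np.random.default_rng generator API with an arbitrary seed, and lognormal_pdf is a helper written for this illustration, not a NumPy function.

```python
import numpy as np

# Assumption: Generator API with an arbitrary seed for reproducibility.
rng = np.random.default_rng(2)

N = 10000
mu, sigma = 0.0, 1.0
values = rng.lognormal(mean=mu, sigma=sigma, size=N)

# The logarithm of lognormal samples should look normal with mean mu
# and standard deviation sigma.
logs = np.log(values)
print("Mean and standard deviation of the logs:", logs.mean(), logs.std())


def lognormal_pdf(x, mu, sigma):
    """Illustrative helper: lognormal probability density for x > 0."""
    return (np.exp(-(np.log(x) - mu) ** 2 / (2 * sigma ** 2))
            / (x * sigma * np.sqrt(2 * np.pi)))


# Trapezoid-rule integral of the pdf over a wide range; it should be near 1.
x = np.linspace(1e-6, 200, 400001)
y = lognormal_pdf(x, mu, sigma)
area = np.sum((y[:-1] + y[1:]) / 2 * np.diff(x))
print("Area under the pdf:", area)
```

Note the minus sign in the exponent and the extra factor x in the denominator; these are the two places where the lognormal pdf differs visibly from the normal one.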
Summary
We learned a lot in this article about NumPy modules. We covered linear algebra, the fast Fourier transform, continuous and discrete distributions, and random numbers.
About the Author :
Ivan Idris
Ivan Idris was born in Bulgaria to Indonesian parents. He moved to the Netherlands and graduated from university with a degree in Experimental Physics.
His graduation thesis had a strong emphasis on Applied Computer Science. After graduating, he worked for several companies as a Java Developer, Data Warehouse Developer, and QA Analyst.
His main professional interests are Business Intelligence, big data, and cloud computing. He enjoys writing clean, testable code and interesting technical articles. He is the author of NumPy Beginner’s Guide, NumPy Cookbook, and Learning NumPy.
