NumPy is famous for its efficient arrays. This fame is partly due to the ease of indexing. We will demonstrate advanced indexing tricks using images. Before diving into indexing, we will install the necessary software: SciPy and PIL.
The code for the recipes in this chapter can be found on the book website at http://www.packtpub.com. You can also visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
Some of the examples in this chapter involve manipulating images. To do that, we require the Python Imaging Library (PIL). Don't worry; instructions and pointers to help you install PIL and other necessary Python software are given throughout the chapter where needed.
SciPy is the scientific Python library and is closely related to NumPy. In fact, SciPy and NumPy used to be one and the same project many years ago. In this recipe, we will install SciPy.
In Chapter 1, Winding Along with IPython, we discussed how to install setuptools and pip. Reread the recipe if necessary.
In this recipe, we will go through the steps for installing SciPy.
Installing from source: If you have Git installed, you can clone the SciPy repository using the following command:
git clone https://github.com/scipy/scipy.git
cd scipy
python setup.py build
python setup.py install --user
This installs to your home directory and requires Python 2.6 or higher.
Before building, you will also need to install the following packages on which SciPy depends:
BLAS and LAPACK libraries
C and Fortran compilers
There is a chance that you have already installed this software as a part of the NumPy installation.
Installing SciPy on Linux: Most Linux distributions...
PIL, the Python Imaging Library, is a prerequisite for the image processing recipes in this chapter.
Let's see how to install PIL.
Installing PIL on Windows: Install using the Windows executable from the PIL website http://www.pythonware.com/products/pil/.
Installing on Debian or Ubuntu: On Debian or Ubuntu, install PIL using the following command:
sudo apt-get install python-imaging
Installing with easy_install or pip: At the time of writing this book, it appeared that the package managers of Red Hat, Fedora, and CentOS did not have direct support for PIL. Therefore, please follow this step if you are using one of these Linux distributions.
Install with either of the following commands:
easy_install PIL
sudo pip install PIL
In this recipe, we will load a sample image of Lena, which is available in the SciPy distribution, into an array. This chapter is not about image manipulation, by the way; we will just use the image data as an input.
Lena Soderberg appeared in a 1972 Playboy magazine. For historical reasons, one of those images is often used in the field of image processing. Don't worry; the picture in question is completely safe for work.
We will resize the image using the repeat function. This function repeats an array, which in practice means resizing the image by a certain factor.
A prerequisite for this recipe is to have SciPy, Matplotlib, and PIL installed. Have a look at the corresponding recipes in this chapter and the previous chapter.
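The resizing idea described above can be sketched with NumPy alone. This is a minimal illustration, not the recipe's own code: a tiny synthetic array stands in for the Lena image, and the helper name resize is hypothetical.

```python
import numpy

# A small synthetic "image" stands in for the Lena array (assumption for illustration).
image = numpy.arange(6).reshape(2, 3)


def resize(arr, yfactor, xfactor):
    """Enlarge a 2D array by repeating each row yfactor times and each column xfactor times.

    Hypothetical helper; it only wraps ndarray.repeat along both axes.
    """
    return arr.repeat(yfactor, axis=0).repeat(xfactor, axis=1)


resized = resize(image, 2, 3)
print(image.shape, "->", resized.shape)
print(resized)
```

Because repeat only duplicates existing values, this kind of resizing works for whole-number factors and produces a blocky enlargement rather than an interpolated one.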
It is important to know when we are dealing with a shared array view, and when we have a copy of the array data. A slice, for instance, will create a view. This means that if you assign the slice to a variable and then change the underlying array, the value of this variable will change. We will create an array from the famous Lena image, copy the array, create a view, and, at the end, modify the view.
Let's create a copy and views of the Lena array:
Create a copy of the Lena array:
acopy = lena.copy()
Create a view of the array:
aview = lena.view()
Set all the values of the view to 0 with a flat iterator:
aview.flat = 0
The end result is that only one of the images shows the Playboy model. The other ones get censored completely:
The following is the code of this tutorial showing the behavior of array views and copies:
import scipy.misc
import matplotlib.pyplot
lena = scipy.misc.lena()
acopy...
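The difference between a copy and a view can also be checked without plotting. The following is a minimal sketch under the assumption that a small synthetic array can stand in for the Lena image; the variable name original is illustrative.

```python
import numpy

# Synthetic stand-in for the Lena array (assumption for illustration).
original = numpy.arange(12).reshape(3, 4)

acopy = original.copy()   # independent data buffer
aview = original.view()   # new array object sharing the same data buffer

# Zero out everything through the view, as in the recipe.
aview.flat = 0

print("original sum:", original.sum())  # changed, because the view shares its data
print("copy sum:", acopy.sum())         # unchanged, because the copy owns its data
print("view shares memory:", numpy.shares_memory(original, aview))
print("copy shares memory:", numpy.shares_memory(original, acopy))
```

Writing through the view changes the original array, while the copy keeps the values it had when it was made.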
We will be flipping the SciPy Lena image—all in the name of science, of course, or at least as a demo. In addition to flipping the image, we will slice it and apply a mask to it.
The steps to follow are listed below:
Plot the flipped image.
Flip the Lena array around the vertical axis using the following code:
matplotlib.pyplot.imshow(lena[:,::-1])
Plot a slice of the image.
Take a slice out of the image and plot it. In this step, we will have a look at the shape of the Lena array. The shape is a tuple representing the dimensions of the array. The following code effectively selects the upper-left quadrant of the Playboy picture. Integer division (//) keeps the slice bounds whole numbers.
matplotlib.pyplot.imshow(lena[:lena.shape[0]//2, :lena.shape[1]//2])
Apply a mask to the image.
Apply a mask to the image by finding all the values in the Lena array that are even (this is just arbitrary for demo purposes). Copy the array and change the even values to 0. This has the effect of putting lots of blue dots (dark spots if you are looking...
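The three operations in this recipe (flipping, slicing, and masking) can be tried on a small array without any plotting. This is a sketch under the assumption that a 4 by 4 synthetic array stands in for the Lena image; the names flipped, quadrant, and masked are illustrative.

```python
import numpy

# Synthetic stand-in for the Lena array (assumption for illustration).
image = numpy.arange(16).reshape(4, 4)

# Reverse the column order: a mirror image around the vertical axis.
flipped = image[:, ::-1]

# Upper-left quadrant; integer division keeps the bounds whole numbers.
quadrant = image[:image.shape[0] // 2, :image.shape[1] // 2]

# Boolean mask of even values, applied to a copy so the input stays intact.
masked = image.copy()
masked[image % 2 == 0] = 0

print(flipped)
print(quadrant)
print(masked)
```

The flip and the quadrant are views of the input, whereas the masked array is an explicit copy, so only the copy is modified by the assignment.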
In this tutorial, we will apply fancy indexing to set the diagonal values of the Lena image to 0. This will draw black lines along the diagonals, crossing it through, not because there is something wrong with the image, but just as an exercise.
Fancy indexing is indexing with sequences or arrays of indices, as opposed to normal indexing, which uses single integers or slices.
We will start with the first diagonal:
Set the values of the first diagonal to 0.
To set the diagonal values to 0, we need to define two different ranges for the x and y values:
lena[range(xmax), range(ymax)] = 0
Set the values of the other diagonal to 0.
To set the values of the other diagonal, we require a different set of ranges, but the principles stay the same:
lena[range(xmax-1,-1,-1), range(ymax)] = 0
At the end, we get this image with the diagonals crossed off, as shown in the following screenshot:
The following is the complete code for this recipe:
import scipy.misc
import matplotlib.pyplot
# This script demonstrates fancy...
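The diagonal trick above can be verified on a small square array. This is a minimal sketch, assuming a 5 by 5 array of ones as a stand-in for the image; it uses numpy.arange in place of range, which behaves the same way as an index sequence here.

```python
import numpy

# Synthetic square stand-in for the Lena array (assumption for illustration).
image = numpy.ones((5, 5), dtype=int)
n = image.shape[0]

# Pairs (0, 0), (1, 1), ... select the main diagonal.
image[numpy.arange(n), numpy.arange(n)] = 0

# Pairs (n-1, 0), (n-2, 1), ... select the other diagonal.
image[numpy.arange(n - 1, -1, -1), numpy.arange(n)] = 0

print(image)
```

The two index sequences are paired element by element, so they must have the same length; that is why the recipe's version works directly only for a square image.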
Let's use the ix_ function to shuffle the Lena image. This function creates a mesh from multiple sequences.
We will start by randomly shuffling the array indices:
Shuffle array indices.
Create a random indices array with the shuffle function of the numpy.random module:
def shuffle_indices(size):
    arr = numpy.arange(size)
    numpy.random.shuffle(arr)
    return arr
Plot the image with the shuffled indices:
matplotlib.pyplot.imshow(lena[numpy.ix_(xindices, yindices)])
What we get is a completely scrambled Lena image, as shown in the following screenshot:
The following is the complete code for the recipe:
import scipy.misc
import matplotlib.pyplot
import numpy.random
import numpy.testing

# Load the Lena array
lena = scipy.misc.lena()
xmax = lena.shape[0]
ymax = lena.shape[1]

def shuffle_indices(size):
    arr = numpy.arange(size)
    numpy.random.shuffle(arr)
    return arr

xindices = shuffle_indices(xmax)
numpy.testing.assert_equal(len(xindices), xmax)
yindices...
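To see what the mesh produced by ix_ looks like, the same steps can be run on a small non-square array. This is a sketch under the assumption that a 3 by 4 synthetic array stands in for the image; the names rows, cols, mesh, and shuffled are illustrative.

```python
import numpy


def shuffle_indices(size):
    """Return the indices 0..size-1 in random order (same helper as in the recipe)."""
    arr = numpy.arange(size)
    numpy.random.shuffle(arr)
    return arr


# Synthetic stand-in for the Lena array (assumption for illustration).
image = numpy.arange(12).reshape(3, 4)

rows = shuffle_indices(image.shape[0])
cols = shuffle_indices(image.shape[1])

# ix_ reshapes the two sequences to (3, 1) and (1, 4) so that they
# broadcast into every row/column combination.
mesh = numpy.ix_(rows, cols)
shuffled = image[mesh]

print([m.shape for m in mesh])
print(shuffled)
```

Unlike the paired indices used for the diagonals, the mesh selects the full cross product of the row and column indices, so the result keeps the shape of the input and contains every original value exactly once.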
Boolean indexing is indexing based on a Boolean array and falls into the category of fancy indexing.
We will apply this indexing technique to an image:
Image with dots on the diagonal.
This is in some ways similar to the Fancy indexing recipe in this chapter. This time, we select the points on the diagonal of the image whose index is a multiple of 4:
def get_indices(size):
    arr = numpy.arange(size)
    return arr % 4 == 0
Then we just apply this selection and plot the points:
lena1 = lena.copy()
xindices = get_indices(lena.shape[0])
yindices = get_indices(lena.shape[1])
lena1[xindices, yindices] = 0
matplotlib.pyplot.subplot(211)
matplotlib.pyplot.imshow(lena1)
Set to 0 based on value.
Select array values between a quarter and three-quarters of the maximum value and set them to 0:
lena2[(lena > lena.max()/4) & (lena < 3 * lena.max()/4)] = 0
The plot with the two new images will look like the following screenshot:
The following is the complete code for this recipe:
import scipy.misc...
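Both Boolean selections from this recipe can be checked numerically on a small array. The following is a minimal sketch, assuming an 8 by 8 synthetic array with values 1 to 64 as a stand-in for the image; the names dotted and banded are illustrative.

```python
import numpy


def get_indices(size):
    """Boolean array that is True where the index is a multiple of 4."""
    arr = numpy.arange(size)
    return arr % 4 == 0


# Synthetic stand-in for the Lena array, values 1..64 (assumption for illustration).
image = numpy.arange(1, 65).reshape(8, 8)

# One Boolean array per axis: each acts like the positions of its True
# entries, and the positions are paired up, giving (0, 0) and (4, 4).
dotted = image.copy()
dotted[get_indices(image.shape[0]), get_indices(image.shape[1])] = 0

# A single Boolean array with the shape of the image selects by value.
banded = image.copy()
banded[(image > image.max() / 4) & (image < 3 * image.max() / 4)] = 0

print(dotted)
print(banded)
```

Note the parentheses around each comparison in the value-based mask: & binds more tightly than > and <, so leaving them out would change the meaning of the expression.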
The ndarray class has a strides field, which is a tuple indicating the number of bytes to step in each dimension when going through an array. Let's apply some stride tricks to the problem of splitting a Sudoku puzzle into the 3 by 3 squares of which it is composed.
Explaining the Sudoku rules is outside the scope of this book. For more information see http://en.wikipedia.org/wiki/Sudoku.
Define the Sudoku puzzle array
Let's define the Sudoku puzzle array. This one is filled with the contents of an actual, solved Sudoku puzzle:
sudoku = numpy.array([
    [2, 8, 7, 1, 6, 5, 9, 4, 3],
    [9, 5, 4, 7, 3, 2, 1, 6, 8],
    [6, 1, 3, 8, 4, 9, 7, 5, 2],
    [8, 7, 9, 6, 5, 1, 2, 3, 4],
    [4, 2, 1, 3, 9, 8, 6, 7, 5],
    [3, 6, 5, 4, 2, 7, 8, 9, 1],
    [1, 9, 8, 5, 7, 3, 4, 2, 6],
    [5, 4, 2, 9, 1, 6, 3, 8, 7],
    [7, 3, 6, 2, 8, 4, 5, 1, 9]
    ])
Calculate the strides. The itemsize field of ndarray gives us the number of bytes per array element. Using the...
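One way to carry out the split is with as_strided from numpy.lib.stride_tricks. This is a sketch of the technique under stated assumptions, not necessarily the exact code the recipe goes on to use: it assumes the array is C-contiguous (which numpy.array produces here), and the names shape, strides, and squares are illustrative.

```python
import numpy
from numpy.lib.stride_tricks import as_strided

sudoku = numpy.array([
    [2, 8, 7, 1, 6, 5, 9, 4, 3],
    [9, 5, 4, 7, 3, 2, 1, 6, 8],
    [6, 1, 3, 8, 4, 9, 7, 5, 2],
    [8, 7, 9, 6, 5, 1, 2, 3, 4],
    [4, 2, 1, 3, 9, 8, 6, 7, 5],
    [3, 6, 5, 4, 2, 7, 8, 9, 1],
    [1, 9, 8, 5, 7, 3, 4, 2, 6],
    [5, 4, 2, 9, 1, 6, 3, 8, 7],
    [7, 3, 6, 2, 8, 4, 5, 1, 9]
    ])

# Target layout: a 3 by 3 grid of squares, each square 3 by 3.
shape = (3, 3, 3, 3)

# Steps measured in elements, then converted to bytes with itemsize:
# 27 elements to the next row of squares, 3 to the next square in a row,
# 9 to the next row inside a square, 1 to the next element.
strides = tuple(sudoku.itemsize * step for step in (27, 3, 9, 1))

squares = as_strided(sudoku, shape=shape, strides=strides)

print(squares[0, 0])  # upper-left square
print(squares[2, 2])  # lower-right square
```

The result is a view on the original data, so no values are copied. as_strided does no bounds checking, so a wrong shape or stride can read outside the array; it is worth comparing a few squares against ordinary slices such as sudoku[0:3, 0:3].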
You might have broadcast arrays without knowing it. In a nutshell, NumPy tries to perform an operation even though the operands do not have the same shape. In this recipe, we will multiply an array and a scalar. The scalar is "extended" to the shape of the array operand and then the multiplication is performed. We will download an audio file and make a new version that is quieter.
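The scalar case can be illustrated without an audio file. This is a minimal sketch, assuming a handful of synthetic 16-bit samples in place of real WAV data; the factor 0.5 and the names quieter and expanded are illustrative choices.

```python
import numpy

# Synthetic 16-bit samples stand in for the WAV data (assumption for illustration).
data = numpy.array([1000, -2000, 3000, -4000], dtype=numpy.int16)

# The scalar is broadcast across the whole array; the product is floating
# point, so it is converted back to the original sample type.
quieter = (data * 0.5).astype(data.dtype)

# Equivalent explicit form: an array of the same shape filled with the factor.
expanded = numpy.full(data.shape, 0.5)
same = numpy.array_equal(data * 0.5, data * expanded)

print(quieter)
print("scalar and expanded array agree:", same)
```

Broadcasting gives the same result as the explicit array of factors without actually allocating it.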
Let's start by reading a WAV file:
Reading a WAV file.
We will use standard Python code to download an audio file of Austin Powers called "Smashing, baby". SciPy has a wavfile module, which allows you to load sound data or generate WAV files. If SciPy is installed, then we should have this module already. The read function returns the sample rate and a data array. In this example, we only care about the data:
sample_rate, data = scipy.io.wavfile.read(WAV_FILE)
Plot the original WAV data.
Plot the original WAV data with Matplotlib. Give the subplot the title Original.
matplotlib.pyplot...