You're reading from Scientific Computing with Python 3

Product type Book

Published in Dec 2016

Publisher Packt

ISBN-13 9781786463517

Pages 332 pages

Edition 1st Edition

Languages

Python

Concepts

Data Analysis

Authors (3):

Claus Führer

Jan Erik Solem

Olivier Verdier

Chapter 12. Input and Output

In this chapter, we will cover some options for handling data files. Depending on the data and the desired format, there are several options for reading and writing. We will show some of the most useful alternatives.

File handling

File I/O (input and output) is essential in a number of scenarios. For example:

Working with measured or scanned data. Measurements are stored in files that need to be read to be analyzed.
Interacting with other programs. Save results to files so that they can be imported into other applications, and vice versa.
Storing information for future reference or comparisons.
Sharing data and results with others, possibly on other platforms using other software.

In this section, we will cover how to handle file I/O in Python.

Interacting with files

In Python, a file object represents the contents of a physical file stored on disk. A new file object may be created using the following syntax:

myfile = open('measurement.dat','r') # creating a new file object from an existing file

The contents of the file may be accessed, for instance, with this:

print(myfile.read())

Usage of file objects requires some care. The problem is that a file has to be closed before it can be reread or used by other...
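
A minimal sketch of one common way to make sure a file is closed: the with statement closes the file automatically when its block ends, even if an error occurs inside it. The file name measurement.dat follows the example above; the temporary directory and the sample values are assumptions made only to keep the sketch self-contained.

```python
import os
import tempfile

# Work in a temporary directory so the sketch does not touch real data
# (an assumption made for illustration only).
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'measurement.dat')

# Write three sample values; the file is closed when the block ends.
with open(path, 'w') as myfile:
    myfile.write('1.0\n2.5\n4.0\n')

# Reopen the same file for reading.
with open(path, 'r') as myfile:
    contents = myfile.read()

print(contents)
print(myfile.closed)   # the file object is closed after the with block
```

Because the file is closed as soon as each block ends, it can be reopened immediately, here or by another program.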

NumPy methods

NumPy has built-in methods for reading and writing NumPy array data to text files. These are numpy.loadtxt and numpy.savetxt.

savetxt

Writing an array to a text file is simple:

savetxt(filename,data)

There are two useful parameters given as strings, fmt and delimiter, which control the format and the delimiter between columns. The defaults are space for the delimiter and %.18e for the format, which corresponds to the exponential format with all digits. The formatting parameters are used as follows:

from numpy import savetxt
x = range(100) # 100 integers
savetxt('test.txt',x,delimiter=',')   # use comma instead of space
savetxt('test.txt',x,fmt='%d') # integer format instead of float with e
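
The effect of the two parameters is easiest to see on a two-dimensional array, where the delimiter actually separates columns. The following is a small sketch, not taken from the book: the array contents and the temporary file location are assumptions chosen for illustration.

```python
import os
import tempfile
import numpy as np

# Temporary location for the output file (illustrative assumption).
path = os.path.join(tempfile.mkdtemp(), 'test.txt')

# A small 2 x 3 integer array: rows become lines, columns are delimited.
data = np.arange(6).reshape(2, 3)
np.savetxt(path, data, fmt='%d', delimiter=',')

# Read the raw text back to inspect what was written.
with open(path) as f:
    lines = f.read().splitlines()

print(lines)
```

Each row of the array becomes one line of the file, with the values in that row separated by commas.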

loadtxt

Reading to an array from a text file is done with the help of the following syntax:

from numpy import loadtxt
filename = 'test.txt'
data = loadtxt(filename)

Due to the fact that each row in an array must have the same length, each row in the text file must have the same number of elements. Similar to savetxt, the default values...
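
As a hedged illustration of how the two functions fit together, the sketch below writes an array with a comma delimiter and reads it back with the matching delimiter argument. The sample values and the temporary file are assumptions introduced only for this example.

```python
import os
import tempfile
import numpy as np

# Temporary location for the data file (illustrative assumption).
path = os.path.join(tempfile.mkdtemp(), 'test.txt')

# Every row has the same number of elements, as loadtxt requires.
original = np.array([[1.5, 2.0, 3.25],
                     [4.0, 5.5, 6.75]])

# Write with a comma delimiter and the default format.
np.savetxt(path, original, delimiter=',')

# The delimiter used for reading must match the one used for writing.
restored = np.loadtxt(path, delimiter=',')

print(restored.shape, restored.dtype)
```

If the delimiter argument were omitted when reading, loadtxt would split on whitespace and fail to parse the comma-separated values.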

Pickling

The read and write methods you just saw convert data to strings before writing. Complex types (such as objects and classes) cannot be written this way. With Python’s pickle module, you can save almost any Python object, and also multiple objects, to a file.

Data can be saved in plaintext (ASCII) format or using a slightly more efficient binary format. There are two main methods: dump, which saves a pickled representation of a Python object to a file, and load, which retrieves a pickled object from the file. The basic usage is like this:

import pickle
from numpy import random
with open('file.dat','wb') as myfile:
    a = random.rand(20,20)
    b = 'hello world'
    pickle.dump(a,myfile)    # first call: first object
    pickle.dump(b,myfile)    # second call: second object


import pickle
with open('file.dat','rb') as myfile:
    numbers = pickle.load(myfile) # restores the array
    text = pickle.load(myfile)    # restores the string

Note the order in which the two objects...
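
One way to avoid depending on the order of the dump and load calls is to collect the objects in a single dictionary and pickle that once. This is a sketch of an alternative pattern, not the approach shown above; the dictionary keys, the sample values, and the temporary file are assumptions for illustration.

```python
import os
import pickle
import tempfile

# Temporary location for the pickle file (illustrative assumption).
path = os.path.join(tempfile.mkdtemp(), 'file.dat')

# Collect everything in one dictionary so that the objects are
# retrieved by name rather than by position in the file.
results = {'numbers': [1.0, 2.5, 4.0], 'text': 'hello world'}

with open(path, 'wb') as myfile:
    pickle.dump(results, myfile)       # a single call stores all objects

with open(path, 'rb') as myfile:
    restored = pickle.load(myfile)     # a single call restores them

print(restored['text'])
```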

Shelves

Objects in dictionaries can be accessed by keys. There is a similar way to access particular data in a file by first assigning it a key. This is possible by using the module shelve:

from contextlib import closing
import shelve as sv
from numpy import array
# opens a data file (creates it before if necessary)
with closing(sv.open('datafile')) as data:
    A = array([[1,2,3],[4,5,6]])
    data['my_matrix'] = A  # here we created a key

In the section File handling, we saw that the built-in open command generates a context manager, and we saw why this is important for handling external resources, such as files. In Python versions before 3.4, sv.open does not create a context manager by itself, and the closing command from the contextlib module is needed to transform it into an appropriate context manager. From Python 3.4 on, shelf objects support the with statement directly, but wrapping them in closing still works. Consider the following example of restoring the file:

from contextlib import closing
import shelve as sv
with closing(sv.open('datafile')) as data: # opens a data file
...
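
A complete round trip might look like the following sketch. It wraps the shelf in closing, as in the text, and stores a plain nested list instead of a NumPy array so that it runs with the standard library alone. The temporary directory is an assumption made for illustration.

```python
import os
import shelve
import tempfile
from contextlib import closing

# Temporary location for the shelf (illustrative assumption).
path = os.path.join(tempfile.mkdtemp(), 'datafile')

# Store an object under a key; the shelf is closed when the block ends.
with closing(shelve.open(path)) as data:
    data['my_matrix'] = [[1, 2, 3], [4, 5, 6]]

# Reopen the shelf and retrieve the object by its key.
with closing(shelve.open(path)) as data:
    keys = sorted(data.keys())
    A = data['my_matrix']

print(keys)
print(A)
```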

Reading and writing Matlab data files

SciPy has the ability to read and write data in Matlab’s .mat file format using the module scipy.io. The commands are loadmat and savemat. To load data, use the following syntax:

import scipy.io
data = scipy.io.loadmat('datafile.mat')

The variable data now contains a dictionary, with keys corresponding to the variable names saved in the .mat file. The variables are in NumPy array format. Saving to .mat files involves creating a dictionary with all the variables you want to save (variable name and value). The command is then savemat:

data = {}
data['x'] = x
data['y'] = y
scipy.io.savemat('datafile.mat',data)

This saves the NumPy arrays x and y so that they appear under the same names when the file is read into Matlab.
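
The two commands can be combined into a round trip, sketched below under the assumption that SciPy is installed. The sample arrays and the temporary file are introduced only for illustration. Two details are worth knowing: the dictionary returned by loadmat also contains metadata keys that start with a double underscore, and one-dimensional arrays come back as two-dimensional row arrays.

```python
import os
import tempfile
import numpy as np
import scipy.io

# Temporary location for the .mat file (illustrative assumption).
path = os.path.join(tempfile.mkdtemp(), 'datafile.mat')

x = np.array([1.0, 2.0, 3.0])             # one-dimensional
y = np.array([[1.0, 2.0], [3.0, 4.0]])    # two-dimensional

scipy.io.savemat(path, {'x': x, 'y': y})
data = scipy.io.loadmat(path)

# Skip metadata entries such as '__header__' and '__version__'.
names = sorted(k for k in data if not k.startswith('__'))

print(names)
print(data['x'].shape)   # x is returned as a row array with two dimensions
```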

Reading and writing images

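SciPy comes with some basic functions for handling images. The function scipy.misc.imread will read images to NumPy arrays. The function scipy.misc.imsave will save an array as an image. Note that imread, imresize, and imsave were deprecated in SciPy 1.0 and removed in later releases, so the code below requires an older SciPy installation. The following will read a JPEG image to an array, print the shape and type, then create a new array with a resized image, and write the new image to file: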

import scipy.misc as sm

# read image to array
im = sm.imread("test.jpg") 
print(im.shape)   # (128, 128, 3)
print(im.dtype)   # uint8

# resize image
im_small = sm.imresize(im, (64,64))
print(im_small.shape)   # (64, 64, 3)

# write result to new image file
sm.imsave("test_small.jpg", im_small)

Note the data type. Images are almost always stored with pixel values in the range 0...255 as 8-bit unsigned integers. The third shape value shows how many color channels the image has. In this case, 3 means it is a color image with values stored in this order: red im[:,:,0], green im[:,:,1], blue im[:,:,2]. A gray scale...
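
The channel layout can be checked without any image file. The sketch below builds a synthetic image array of the same shape and type as the one above. The pure red content is an assumption chosen for illustration.

```python
import numpy as np

# A synthetic 128 x 128 color image with 8-bit unsigned pixel values.
im = np.zeros((128, 128, 3), dtype=np.uint8)
im[:, :, 0] = 255          # fill the red channel, leave green and blue at 0

# Channels are selected along the third axis.
red = im[:, :, 0]
green = im[:, :, 1]
blue = im[:, :, 2]

print(red.shape)           # one value per pixel
print(im[0].shape)         # im[0] is the first pixel row, not a channel
```

Indexing with a single integer, as in im[0], selects a row of pixels with all three channels, which is why the channels have to be addressed through the last axis.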

Summary

File handling is unavoidable when dealing with measurements and other sources of large amounts of data. Communication with other programs and tools is also done via files.

You learned to treat a file as a Python object like any other, with important methods such as readlines and write. We showed how files can be protected by special attributes, which may allow only read or only write access.

The way you write to a file often influences the speed of the process. We saw how data is stored by pickling or by using the shelve module.