You're reading from  Scientific Computing with Python - Second Edition

Product type: Book
Published in: Jul 2021
Reading Level: Intermediate
Publisher: Packt
ISBN-13: 9781838822323
Edition: 2nd Edition
Authors (3):
Claus Führer

Claus Führer is a professor of scientific computations at Lund University, Sweden. He has an extensive teaching record that includes intensive programming courses in numerical analysis and engineering mathematics across various levels in many different countries and teaching environments. Claus also develops numerical software in research collaboration with industry and received Lund University's Faculty of Engineering Best Teacher Award in 2016.
Input and Output

In this chapter, we will cover some options for handling data files. Depending on the data and the desired format, there are several options for reading and writing. We will show some of the most useful alternatives.

The following topics will be covered in this chapter:

  • File handling
  • NumPy methods
  • Pickling
  • Shelves
  • Reading and writing Matlab data files
  • Reading and writing images

14.1 File handling

File input and output (I/O) is essential in a number of scenarios, for example:

  • Working with measured or scanned data. Measurements are stored in files that need to be read in order to be analyzed.
  • Interacting with other programs. Save results to files so that they can be imported into other applications, and vice-versa.
  • Storing information for future reference or comparisons.
  • Sharing data and results with others, possibly on other platforms using other software.

In this section, we will cover how to handle file I/O in Python.

14.1.1 Interacting with files

In Python, a file object represents the contents of a physical file stored on a disk. A new file object may be created using the following syntax:

# creating a new file object from an existing file
myfile = open('measurement.dat','r')

The contents of the file may be accessed, for instance, with this command:

print(myfile.read())

Usage of file objects requires some care. The problem is that a file has to be closed before it can be re-read or used by other applications, which is done using the following syntax:

myfile.close() # closes the file object

It is not that simple because an exception might be triggered before the call to close is executed, which will skip the closing code (consider the following example). A simple way to make sure that a file will be properly closed is to use context managers. This construction, using the keyword with, is explained in more detail in Section 12.1.3:...
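
The closing problem described above can be sketched as follows. This is only a minimal illustration: the file name measurement.dat and its contents are assumptions, and the file is created in a temporary directory so that the snippet runs on its own. The with block closes the file even when an exception is raised inside it, which is roughly what an explicit try/finally construction achieves by hand.

```python
import os
import tempfile

# Assumption: a small data file created only for this illustration
path = os.path.join(tempfile.mkdtemp(), 'measurement.dat')
with open(path, 'w') as myfile:
    myfile.write('0.0;21.5\n1.0;21.7\n')

# The context manager closes the file when the block is left,
# even if an exception is raised inside the block
try:
    with open(path, 'r') as myfile:
        content = myfile.read()
        raise ValueError('simulated failure while processing')
except ValueError:
    pass

print(myfile.closed)   # True: the file was closed despite the exception

# Roughly equivalent manual version using try/finally
manual = open(path, 'r')
try:
    same_content = manual.read()
finally:
    manual.close()     # always executed
```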

14.1.2 Files are iterables

A file is, in particular, iterable (see Section 9.3: Iterable objects). Files iterate their lines:

with open(name,'r') as myfile:
    for line in myfile:
        data = line.split(';')
        print(f'time {data[0]} sec temperature {data[1]} C')

The lines of the file are returned as strings. The string method split is a possible tool to convert the string to a list of strings; for example:

data = 'aa;bb;cc;dd;ee;ff;gg'
data.split(';') # ['aa', 'bb', 'cc', 'dd', 'ee', 'ff', 'gg']

data = 'aa bb cc dd ee ff gg'
data.split(' ') # ['aa', 'bb', 'cc', 'dd', 'ee', 'ff', 'gg']

Since the object myfile is iterable, we can also do a direct extraction into a list, as follows:

data = list(myfile)
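
As a small sketch of how these pieces fit together, the following snippet reads all lines into a list and converts them into numbers. The file name temperatures.dat and its contents are assumptions made for this illustration. Note that each line returned by the file still carries its line ending, which is removed here with the string method strip before splitting.

```python
import os
import tempfile

# Assumption: a semicolon-separated file created only for this illustration
name = os.path.join(tempfile.mkdtemp(), 'temperatures.dat')
with open(name, 'w') as myfile:
    myfile.write('0.0;21.5\n1.0;21.7\n2.0;22.1\n')

with open(name, 'r') as myfile:
    lines = list(myfile)          # each entry still ends with '\n'

# Strip the line ending, split at the separator, and convert to floats
table = [[float(entry) for entry in line.strip().split(';')]
         for line in lines]

print(repr(lines[0]))   # '0.0;21.5\n'
print(table)            # [[0.0, 21.5], [1.0, 21.7], [2.0, 22.1]]
```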

14.1.3 File modes

As you can see in these examples of file handling, the function open is usually called with two arguments. The first is the filename, and the second is a string describing the way in which the file will be used; if it is omitted, the file is opened in read-only text mode ('r'). There are several such modes for opening files. The basic ones are as follows:

with open('file1.dat','r') as ...  # read only
with open('file2.dat','r+') as ...  # read/write
with open('file3.dat','rb') as ...  # read in byte mode  
with open('file4.dat','a') as ...  # append (write to the end of the file)
with open('file5.dat','w') as ... # (over-)write the file
with open('file6.dat','wb') as ... # (over-)write the file in byte mode

The modes 'r' and 'r+' require that the file exists, whereas 'w' and 'a' will create a new file if no file with that name...
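
How the modes behave when a file is missing or already present can be checked with a short experiment. The following is a sketch under stated assumptions: the file names no_such_file.dat and log.dat are hypothetical, and everything happens in a temporary directory.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
missing = os.path.join(folder, 'no_such_file.dat')  # hypothetical name

# 'r' and 'r+' fail if the file does not exist
for mode in ('r', 'r+'):
    try:
        with open(missing, mode) as f:
            pass
    except FileNotFoundError:
        print(f'mode {mode!r}: file not found')

# 'a' creates the file if necessary and appends otherwise
log = os.path.join(folder, 'log.dat')
with open(log, 'a') as f:
    f.write('first\n')
with open(log, 'a') as f:
    f.write('second\n')
with open(log, 'r') as f:
    before = f.read()

# 'w' discards the existing contents before writing
with open(log, 'w') as f:
    f.write('third\n')
with open(log, 'r') as f:
    after = f.read()
```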

14.2 NumPy methods

NumPy has built-in methods for reading and writing NumPy array data to text files. These are numpy.loadtxt and numpy.savetxt.

14.2.1 savetxt

Writing an array to a text file is simple:

savetxt(filename,data)

There are two useful parameters given as strings, fmt and delimiter, which control the format and the delimiter between columns. The defaults are a space for the delimiter and %.18e for the format, which corresponds to the exponential format with 18 digits after the decimal point. The formatting parameters are used as follows:

x = range(100) # 100 integers
savetxt('test.txt',x,delimiter=',') # use comma instead of space
savetxt('test.txt',x,fmt='%d') # integer format instead of float with e

14.2.2 loadtxt

Reading to an array from a text file is done with the help of the following syntax:

filename = 'test.txt'
data = loadtxt(filename)

Because each row in an array must have the same length, each row in the text file must have the same number of elements. Similar to savetxt, there are defaults: the data type is float and columns are separated by whitespace. These can be changed using the parameters dtype and delimiter. Another useful parameter is comments, which specifies the symbol that marks comments in the data file. An example of using the formatting parameters is as follows:

data = loadtxt('test.txt',delimiter=';')    # data separated by semicolons

# read to integer type, comments in file begin with a hash character
data = loadtxt('test.txt',dtype=int,comments='#')
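
A round trip through both functions might look like the following sketch. The table contents and the header text are assumptions made for this illustration, and the functions are accessed through the namespace np rather than imported directly. The parameter header of savetxt writes a first line that is prefixed with '# ' by default, so that loadtxt treats it as a comment.

```python
import os
import tempfile
import numpy as np

filename = os.path.join(tempfile.mkdtemp(), 'test.txt')

# Assumption: a small 3 x 2 table of integers, made up for this illustration
table = np.array([[0, 10], [1, 20], [2, 30]])

# Write with semicolons, integer format, and a header line
np.savetxt(filename, table, fmt='%d', delimiter=';', header='time;value')

# Read back: the header line is skipped because it starts with '#'
restored = np.loadtxt(filename, dtype=int, delimiter=';', comments='#')

print(restored.shape)   # (3, 2)
```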

14.3 Pickling

The read and write methods you just saw convert data to strings before writing. Complex types (such as objects and classes) cannot be written this way. With Python's module pickle, you can save almost any Python object, and also multiple objects, to a file.

Data can be saved in plain-text (ASCII) format or using a slightly more efficient binary format. There are two main methods: dump, which saves a pickled representation of a Python object to a file, and load, which retrieves a pickled object from the file. The basic usage is like this:

import pickle
from numpy import random   # provides random.rand
with open('file.dat','wb') as myfile:
    a = random.rand(20,20)
    b = 'hello world'
    pickle.dump(a,myfile)    # first call: first object
    pickle.dump(b,myfile)    # second call: second object

import pickle
with open('file.dat','rb') as myfile:
    numbers = pickle.load(myfile) # restores the array
    text = pickle.load(myfile)    # restores the string
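
When the number of pickled objects in a file is not known in advance, one possible approach is to call load repeatedly until the end of the file is reached. The following sketch uses only the standard library; the two stored objects are arbitrary assumptions made for this illustration.

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'file.dat')

# Assumption: two arbitrary objects chosen for this illustration
settings = {'tolerance': 1e-8, 'method': 'newton'}
samples = [0.5, 1.5, 2.5]

with open(path, 'wb') as myfile:
    pickle.dump(settings, myfile)
    pickle.dump(samples, myfile)

# Objects come back in the order in which they were dumped;
# reading past the last one raises EOFError
restored = []
with open(path, 'rb') as myfile:
    while True:
        try:
            restored.append(pickle.load(myfile))
        except EOFError:
            break
```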

Note...

14.4 Shelves

Objects in dictionaries can be accessed by keys. There is a similar way to access particular data in a file by first assigning it a key. This is possible by using the module shelve:

from contextlib import closing
import shelve as sv
from numpy import array
# opens a data file (creates it first if necessary)
with closing(sv.open('datafile')) as data:
    A = array([[1,2,3],[4,5,6]])
    data['my_matrix'] = A  # here we created a key

In Section 14.1.1: Interacting with files, we saw that the built-in command open generates a context manager, and we saw why this is important for handling external resources, such as files. In the code above, the command closing from the module contextlib wraps the object returned by sv.open in a context manager, which guarantees that the shelf is closed when the block is left. Since Python 3.4, the object returned by sv.open can also be used directly in a with statement, so closing is strictly needed only for older Python versions.

Consider the following example of restoring the file:

from contextlib import closing
import shelve...
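
A complete write and read cycle with shelve can be sketched as follows. This is an illustration under assumptions, not the book's own listing: the shelf is placed in a temporary directory, a nested list stands in for the NumPy array, and the key 'label' is made up.

```python
import os
import shelve
import tempfile
from contextlib import closing

# Hypothetical shelf name inside a temporary directory
shelf_name = os.path.join(tempfile.mkdtemp(), 'datafile')

with closing(shelve.open(shelf_name)) as data:
    data['my_matrix'] = [[1, 2, 3], [4, 5, 6]]   # nested list instead of an array
    data['label'] = 'run 1'

# Reopen the shelf and access single entries by key
with closing(shelve.open(shelf_name)) as data:
    keys = sorted(data.keys())
    matrix = data['my_matrix']

print(keys)     # ['label', 'my_matrix']
print(matrix)   # [[1, 2, 3], [4, 5, 6]]
```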

14.5 Reading and writing Matlab data files

SciPy has the ability to read and write data in Matlab's .mat file format using the module scipy.io. The commands are loadmat and savemat.

To load data, use the following syntax:

import scipy.io
data = scipy.io.loadmat('datafile.mat')

The variable data now contains a dictionary, with keys corresponding to the variable names saved in the .mat file. The variables are in NumPy array format. Saving to .mat files involves creating a dictionary with all the variables you want to save (variable name and value). The command is then savemat:

data = {}
data['x'] = x
data['y'] = y
scipy.io.savemat('datafile.mat',data)

This saves the NumPy arrays x and y in Matlab's internal file format, thereby preserving variable names.
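
Saving and loading can be combined into a short round trip, sketched here under the assumption that SciPy is installed. The arrays are made up for this illustration and the file is written to a temporary directory. Besides the saved variables, the dictionary returned by loadmat contains a few metadata entries whose keys start with a double underscore.

```python
import os
import tempfile
import numpy as np
import scipy.io

path = os.path.join(tempfile.mkdtemp(), 'datafile.mat')

# Assumption: two small 2-D arrays made up for this illustration
x = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([[5, 6, 7]])

scipy.io.savemat(path, {'x': x, 'y': y})

data = scipy.io.loadmat(path)

# Filter out the metadata entries to get the saved variable names
names = sorted(key for key in data if not key.startswith('__'))
print(names)   # ['x', 'y']
```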

14.6 Reading and writing images

The module PIL.Image comes with some functions for handling images. The following will read a JPEG image, print the shape and type, and then create a resized image, and write the new image to a file:

import PIL.Image as pil   # imports the Pillow module
from numpy import array

# read image to array
im = pil.open("test.jpg")
print(im.size)   # (275, 183)
# Number of pixels in horizontal and vertical directions

# resize image
im_big = im.resize((550, 366))
im_big_gray = im_big.convert("L") # Convert to grayscale

im_array = array(im)
print(im_array.shape)
print(im_array.dtype)   # uint8

# write result to new image file
im_big_gray.save("newimage.jpg")

 

PIL creates an image object that can easily be converted to a NumPy array. As an array object, images are stored with pixel values in the range 0...255 as 8-bit unsigned integers (uint8). The third shape value shows how many color channels the image...

14.7 Summary

File handling is inevitable when dealing with measurements and other sources of a larger amount of data. Also, communication with other programs and tools is done via file handling.

You learned to see a file as a Python object, like others, with important methods such as readlines and write. We showed how the mode in which a file is opened controls access to it, for example by allowing only reading or only writing.

The way you write to a file often influences the speed of the process. We saw how data is stored by pickling or by using the module shelve.
