Modular Programming with Python

By Erik Westra
About this book
Python has evolved over the years and has become the primary choice of developers in various fields. The purpose of this book is to help readers develop readable, reliable, and maintainable programs in Python. Starting with an introduction to the concept of modules and packages, this book shows how you can use these building blocks to organize a complex program into logical parts and make sure those parts are working correctly together. Using clearly written, real-world examples, this book demonstrates how you can use modular techniques to build better programs. A number of common modular programming patterns are covered, including divide-and-conquer, abstraction, encapsulation, wrappers and extensibility. You will also learn how to test your modules and packages, how to prepare your code for sharing with other people, and how to publish your modules and packages on GitHub and the Python Package Index so that other people can use them. Finally, you will learn how to use modular design techniques to be a more effective programmer.
Publication date: May 2016
Publisher: Packt
Pages: 246
ISBN: 9781785884481

 

Chapter 1. Introducing Modular Programming

Modular programming is an essential tool for the modern developer. Gone are the days when you could just throw something together and hope that it works. To build robust systems that last, you need to understand how to organize your programs so that they can grow and evolve over time. Spaghetti coding is not an option. Modular programming techniques, and in particular the use of Python modules and packages, will give you the tools you need to succeed as a professional in the fast-changing programming landscape.

In this chapter, we will:

  • Look at the fundamental aspects of modular programming

  • See how Python modules and packages can be used to organize your code

  • Discover what happens when modular programming techniques are not used

  • Learn how modular programming helps you stay on top of the development process

  • Take a look at the Python standard library as an example of modular programming

  • Create a simple program, built using modular techniques, to see how it works in practice

Let's get started by learning about modules and how they work.

 

Introducing Python modules


For most beginner programmers, their first Python program is some version of the famous Hello World program. This program would look something like this:

print("Hello World!")

This one-line program would be saved in a file on disk, typically named something like hello.py, and it would be executed by typing the following command into a terminal or command-line window:

python hello.py

The Python interpreter would then dutifully print out the message you have asked it to:

Hello World!

This hello.py file is called a Python source file. When you are first starting out, putting all your program code into a single source file is a great way of organizing your program. You can define functions and classes, and put instructions at the bottom which start your program when you run it using the Python interpreter. Storing your program code inside a Python source file saves you from having to retype it each time you want to tell the Python interpreter what to do.

As your programs get more complicated, however, you'll find that it becomes harder and harder to keep track of all the various functions and classes that you define. You'll forget where you put a particular piece of code and find it increasingly difficult to remember how all the various pieces fit together.

Modular programming is a way of organizing programs as they become more complicated. You can create a Python module, a source file that contains Python source code to do something useful, and then import this module into your program so that you can use it. For example, your program might need to keep track of various statistics about events that take place while the program is running. At the end, you might want to know how many events of each type have occurred. To achieve this, you might create a Python source file named stats.py which contains the following Python code:

def init():
    global _stats
    _stats = {}

def event_occurred(event):
    global _stats
    try:
        _stats[event] = _stats[event] + 1
    except KeyError:
        _stats[event] = 1

def get_stats():
    global _stats
    return sorted(_stats.items())

The stats.py Python source file defines a module named stats—as you can see, the name of the module is simply the name of the source file without the .py suffix. Your main program can make use of this module by importing it and then calling the various functions that you have defined as they are needed. The following frivolous example shows how you might use the stats module to collect and display statistics about events:

import stats

stats.init()
stats.event_occurred("meal_eaten")
stats.event_occurred("snack_eaten")
stats.event_occurred("meal_eaten")
stats.event_occurred("snack_eaten")
stats.event_occurred("meal_eaten")
stats.event_occurred("diet_started")
stats.event_occurred("meal_eaten")
stats.event_occurred("meal_eaten")
stats.event_occurred("meal_eaten")
stats.event_occurred("diet_abandoned")
stats.event_occurred("snack_eaten")

for event,num_times in stats.get_stats():
    print("{} occurred {} times".format(event, num_times))

We're not interested in recording meals and snacks, of course—this is just an example—but the important thing to notice here is how the stats module gets imported, and then how the various functions you defined within the stats.py file get used. For example, consider the following line of code:

stats.event_occurred("snack_eaten")

Because the event_occurred() function is defined within the stats module, you need to include the name of the module whenever you refer to this function.

Note

There are ways in which you can import modules so you don't need to include the name of the module each time. We'll take a look at this in Chapter 3, Using Modules and Packages, when we look at namespaces and how the import command works in more detail.

As you can see, the import statement is used to load a module, and any time you see the module name followed by a period, you can tell that the program is referring to something (for example, a function or class) that is defined within that module.
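
The link between a source file and a module name can be checked with a small self-contained sketch. It writes a throwaway source file to a temporary directory and imports it; the module name demo_stats and the helper load_demo_module() are made up for this illustration and are not part of the chapter's code.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source code for a throwaway module (hypothetical, for illustration only).
MODULE_SOURCE = '''
_counts = {}

def event_occurred(event):
    _counts[event] = _counts.get(event, 0) + 1

def get_stats():
    return sorted(_counts.items())
'''

def load_demo_module(name="demo_stats"):
    """Write MODULE_SOURCE to <name>.py in a temporary directory and import it."""
    directory = tempfile.mkdtemp()
    Path(directory, name + ".py").write_text(MODULE_SOURCE)
    sys.path.insert(0, directory)       # let the import system find the new file
    importlib.invalidate_caches()
    return importlib.import_module(name)

demo_stats = load_demo_module()
demo_stats.event_occurred("meal_eaten")
demo_stats.event_occurred("snack_eaten")
demo_stats.event_occurred("meal_eaten")

print(demo_stats.__name__)      # the file name without the .py suffix
print(demo_stats.get_stats())   # functions are reached through the module name
```

In an ordinary program you would simply save the file next to your main script and write import demo_stats; the temporary directory is only there so that the sketch can run on its own.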

 

Introducing Python packages


In the same way that Python modules allow you to organize your functions and classes into separate Python source files, Python packages allow you to group multiple modules together.

A Python package is a directory with certain characteristics. For example, consider the following directory of Python source files:

animals/
    __init__.py
    cat.py
    cow.py
    dog.py
    horse.py
    sheep.py

This Python package, called animals, contains five Python modules: cat, cow, dog, horse, and sheep. There is also a special file with the rather unusual name __init__.py. This file is called a package initialization file; the presence of this file tells the Python system that this directory contains a package. The package initialization file can also be used to initialize the package (hence the name) and to make importing the package easier.

Note

Starting with Python version 3.3, packages don't always need to include an initialization file. However, packages without an initialization file (called namespace packages) are still quite uncommon and are only used in very specific circumstances. To keep things simple, we will be using regular packages (with the __init__.py file) throughout this book.

Just like we used the module name when calling a function within a module, we use the package name when referring to a module within a package. For example, consider the following code:

import animals.cow
animals.cow.speak()

In this example, the speak() function is defined within the cow.py module, which itself is part of the animals package.
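
To try this out without creating files by hand, the following sketch builds a minimal animals package in a temporary directory and then imports it. Only the cow module is created, and the body of speak() (returning the string 'Moo') is an assumption made for this illustration; the helper build_animals_package() is likewise hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

def build_animals_package(root):
    """Create a minimal animals/ package inside the directory `root`."""
    package_dir = Path(root) / "animals"
    package_dir.mkdir()
    # An empty __init__.py is enough to mark the directory as a regular package.
    (package_dir / "__init__.py").write_text("")
    (package_dir / "cow.py").write_text(
        "def speak():\n"
        "    return 'Moo'\n"
    )
    return package_dir

root = tempfile.mkdtemp()
build_animals_package(root)
sys.path.insert(0, root)            # make the new package importable
importlib.invalidate_caches()

import animals.cow

print(animals.cow.speak())
print(animals.cow.__name__)         # the dotted name: package, then module
```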

Packages are a great way of organizing more complicated Python programs. You can use them to group related modules together, and you can even define packages inside packages (called nested packages) to keep your program super-organized.

Note that the import statement (and the related from...import statement) can be used in a variety of ways to load packages and modules into your program. We have only scratched the surface here, showing you what modules and packages look like in Python so that you can recognize them when you see them in a program. We will be looking at the way modules and packages can be defined and imported in much more depth in Chapter 3, Using Modules and Packages.

Tip

Downloading the example code

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Modular-Programming-with-Python. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

 

Using modules and packages to organize a program


Modules and packages aren't just there to spread your Python code across multiple source files and directories—they allow you to organize your code to reflect the logical structure of what your program is trying to do. For example, imagine that you have been asked to create a web application to store and report on university examination results. Thinking about the business requirements that you have been given, you come up with the following overall structure for your application:

[Figure: the application is divided into a web interface (user authentication, exam results, reports, and API) and a backend (database, report generator, and emailer).]

The program is broken into two main parts: a web interface, which interacts with the user (and with other computer programs via an API), and a backend, which handles the internal logic of storing information in a database, generating reports, and e-mailing results to students. As you can see, the web interface itself has been broken down into four parts:

  • A user authentication section, which handles user sign-up, sign-in, and sign-out

  • A web interface to view and enter exam results

  • A web interface to generate reports

  • An API, which allows other systems to retrieve exam results on request

As you consider each logical component of your application (that is, each of the boxes in the preceding illustration), you are also starting to think about the functionality that each component will provide. As you do this, you are already thinking in modular terms. Indeed, each of the logical components of your application can be directly implemented as a Python module or package. For example, you might choose to break your program into two main packages named web and backend, where:

  • The web package has modules named authentication, results, reports, and api

  • The backend package has modules named database, reportgenerator, and emailer

As you can see, each shaded box in the preceding illustration becomes a Python module, and each of the groupings of boxes becomes a Python package.

Once you have decided on the collection of packages and modules that you want to define, you can start to implement each component by writing the appropriate set of functions within each module. For example, the backend.database module might have a function named get_students_results(), which returns a single student's exam results for a given subject and year.
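
As a rough idea of what such a function could look like, here is a minimal sketch of the backend.database module. Everything in it apart from the function name is an assumption: the parameters, the student IDs, the scores, and the in-memory dictionary that stands in for a real database.

```python
# Hypothetical contents of backend/database.py. An in-memory dictionary
# stands in for a real database so that the sketch can run on its own.
_RESULTS = {
    # (student_id, subject, year): exam score
    ("s1001", "physics", 2016): 87,
    ("s1001", "chemistry", 2016): 74,
    ("s1002", "physics", 2016): 91,
}

def get_students_results(student_id, subject, year):
    """Return the student's score for the subject and year, or None if absent."""
    return _RESULTS.get((student_id, subject, year))

print(get_students_results("s1001", "physics", 2016))
```

Other parts of the program, such as the web.results module, would call this function rather than touching the stored data directly, which keeps the storage details in one place.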

Note

In a real web application, your modular structure may actually be somewhat different. This is because you typically create a web application using a web application framework such as Django, which imposes its own structure on your program. However, in this example we are keeping the modular structure as simple as possible to show how business functionality translates directly into packages and modules.

Obviously, this example is fictitious, but it shows how you can think about a complex program in modular terms, breaking it down into individual components and then using Python modules and packages to implement each of these components in turn.

 

Why use modular programming techniques?


One of the great things about using modular design techniques, as opposed to just leaping in and writing code, is that they force you to think about the way your program should be structured and let you define a structure that will grow as your program evolves. Your program will be robust, easy to understand, easy to restructure as the scope of the program expands, and easy for others to work with too.

Woodworkers have a motto that equally applies to modular programming: there's a place for everything, and everything should be in its place. This is one of the hallmarks of high quality code, just as it's a hallmark of a well-organized woodworker's workshop.

To see why modular programming is such an important skill, imagine what would happen if you didn't apply modular techniques when writing a program. If you put all your Python code into a single source file, didn't try to logically arrange your functions and classes, and just randomly added new code to the end of the file, you would end up with a terrible mess of incomprehensible code. The following is an example of a program written without any sort of modular organization:

import configparser

def load_config():
    config = configparser.ConfigParser()
    config.read("config.ini")
    return config['config']

def get_data_from_user():
    config = load_config()
    data = []
    for n in range(config.getint('num_data_points')):
        value = input("Data point {}: ".format(n+1))
        data.append(value)
    return data

def print_results(results):
    for value,num_times in results:
        print("{} = {}".format(value, num_times))

def analyze_data():
    data = get_data_from_user()
    results = {}
    config = load_config()
    for value in data:
        if config.getboolean('allow_duplicates'):
            try:
                results[value] = results[value] + 1
            except KeyError:
                results[value] = 1
        else:
            results[value] = 1
    return results

def sort_results(results):
    sorted_results = []
    for value in results.keys():
        sorted_results.append((value, results[value]))
    sorted_results.sort()
    return sorted_results

if __name__ == "__main__":
    results = analyze_data()
    sorted_results = sort_results(results)
    print_results(sorted_results)

This program is intended to prompt the user for a number of data points and count how often each data point occurs. It does work, and the function and variable names do help to explain what each part of the program does, but it is still a mess. Just looking at the source code, it is hard to figure out what this program does. Functions were just added to the end of the file as the author decided to implement them, and even for a relatively small program, it is difficult to keep track of the various pieces. Imagine trying to debug or maintain a program like this if it were 10,000 lines long!

This program is an example of spaghetti coding—programming where everything is jumbled together and there is no overall organization to the source code. Unfortunately, spaghetti coding is often combined with other programming habits that make a program even harder to understand. Some of the more common problems include:

  • Poorly chosen variable and function names that don't hint at what each variable or function is for. A typical example of this is a program that uses variable names such as a, b, c, and d.

  • A complete lack of any documentation explaining what the code is supposed to do.

  • Functions that have unexpected side effects. For example, imagine if the print_results() function in our example program modified the results list as it was being printed. If you wanted to print the results twice or use the results after they had been printed, your program would fail in a most mysterious way.
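
The side-effect problem in the last point can be made concrete with a short sketch. The function print_results_destructive() is a hypothetical variant written for this illustration; it is not part of the example program.

```python
def print_results_destructive(results):
    """Print each (value, count) pair, emptying the caller's list as a side effect."""
    while results:
        value, num_times = results.pop(0)
        print("{} = {}".format(value, num_times))

def print_results(results):
    """Print each (value, count) pair without modifying the list."""
    for value, num_times in results:
        print("{} = {}".format(value, num_times))

safe = [("a", 2), ("b", 1)]
print_results(safe)                  # safe still holds both entries afterwards

damaged = [("a", 2), ("b", 1)]
print_results_destructive(damaged)   # damaged is now empty
print_results_destructive(damaged)   # prints nothing the second time
```

Both functions print the same output on the first call, so the problem only shows up later, when some other part of the program tries to use the list again.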

While modular programming won't cure all these ills, the fact that it forces you to think about the logical organization of your program will help you to avoid them. Organizing your code into logical pieces will help you structure your program so that you know where each part belongs. Thinking about the packages and modules, and what each module contains, will encourage you to choose clear and appropriate names for the various parts of your program. Using modules and packages also makes it natural to include docstrings to explain the functionality of each part of your program as you go along. Finally, using a logical structure encourages each part of your program to perform one particular task, reducing the likelihood of side effects creeping into your code.

Of course, like any programming technique, modular programming can be abused, but if it is used well it will vastly improve the quality of the programs you write.

 

Programming as a process


Imagine that you are writing a program to calculate the price of overseas purchases. Your company is based in England, and you need to calculate the local price of something purchased in US dollars. Someone else has already written a Python module which downloads the exchange rate, so your program starts out looking something like the following:

def calc_local_price(us_dollar_amount):
    # get_exchange_rate() is provided by the existing exchange-rate module.
    exchange_rate = get_exchange_rate("USD", "GBP")
    local_amount = us_dollar_amount * exchange_rate
    return local_amount

So far so good. Your program is included in your company's online ordering system and the code goes into production. However, two months later, your company starts ordering products not just from the US, but from China, Germany, and Australia as well. You scramble to update your program to support these alternative currencies, and write something like the following:

def calc_local_price(foreign_amount, from_country):
    # The target currency is always "GBP" because the company is based in England.
    if from_country == "United States":
        exchange_rate = get_exchange_rate("USD", "GBP")
    elif from_country == "China":
        exchange_rate = get_exchange_rate("CNY", "GBP")
    elif from_country == "Germany":
        exchange_rate = get_exchange_rate("EUR", "GBP")
    elif from_country == "Australia":
        exchange_rate = get_exchange_rate("AUD", "GBP")
    else:
        raise RuntimeError("Unsupported country: " + from_country)
    local_amount = foreign_amount * exchange_rate
    return local_amount

Once again, this program goes into production. Six months later, another 14 countries are added, and the project manager also decides to add a new feature, where the user can see how the price of a product has changed over time. As the programmer responsible for this code, you now have to add support for those 14 countries, and also add support for historical exchange rates going back in time.

This is a contrived example, of course, but it does show how programs typically evolve. Program code isn't something you write once and then leave forever. Your program is constantly changing and evolving in response to new requirements, newly discovered bugs, and unexpected consequences. Sometimes, a change that seems simple can be anything but. For example, consider the poor programmer who wrote the get_exchange_rate() function in our previous example. This function now has to support not only the current exchange rate for any given pair of currencies, it also has to return historical exchange rates going back to any desired point in time. If this function is obtaining its information from a source that doesn't support historical exchange rates, then the whole function may need to be rewritten from scratch to support an alternative data source.
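
One way to keep a growing list of countries manageable is to move the country-to-currency mapping into a dictionary, so that adding a country means adding one entry rather than another elif branch. The sketch below is only an illustration under stated assumptions: the local currency is taken to be pounds sterling (GBP) because the company is based in England, the currency codes are ISO 4217 codes, and get_exchange_rate() is replaced by a stand-in that returns made-up rates.

```python
# Currency used by each supplier country (ISO 4217 codes).
CURRENCY_FOR_COUNTRY = {
    "United States": "USD",
    "China": "CNY",
    "Germany": "EUR",
    "Australia": "AUD",
}

LOCAL_CURRENCY = "GBP"   # assumption: prices are converted to pounds sterling

def get_exchange_rate(from_currency, to_currency):
    """Stand-in for the real lookup; the rates below are made up."""
    made_up_rates = {"USD": 0.80, "CNY": 0.11, "EUR": 0.85, "AUD": 0.52}
    return made_up_rates[from_currency]

def calc_local_price(foreign_amount, from_country):
    """Convert an amount in the supplier's currency to the local currency."""
    try:
        currency = CURRENCY_FOR_COUNTRY[from_country]
    except KeyError:
        raise RuntimeError("Unsupported country: " + from_country)
    return foreign_amount * get_exchange_rate(currency, LOCAL_CURRENCY)

print(calc_local_price(100, "United States"))
```

Supporting historical rates would still require changes to get_exchange_rate() itself, which is the harder problem described above.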

Sometimes, programmers and IT managers try to suppress change, for example by writing detailed specifications and then implementing one part of the program at a time (the so-called waterfall method of programming). But change is an integral part of programming, and trying to suppress it is like trying to stop the wind from blowing—it's much better to just accept that your program will change, and learn how to manage the process as well as you can.

Modular techniques are an excellent way of managing change in your programs. For example, as your program grows and evolves, you may find that a particular change requires the addition of a new module to your program.

You can then import and use that module in the other parts of your program that need to use this new functionality.

Alternatively, you might find that a new feature only requires you to change the contents of a module.

This is one of the major benefits of modular programming: since the details of how a particular feature is implemented are hidden inside a module, you can often change the internals of a module without affecting any other parts of your program. The rest of your program continues to import and use the module as it did before; only the internal implementation of the module has changed.
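
The stats module from earlier in this chapter can serve as an illustration. The following sketch shows one possible alternative implementation that uses collections.Counter from the standard library; it is an assumption about how the internals might be changed, not code from the book. The three function names and the value returned by get_stats() stay the same, so a program that imports the module would not need to change.

```python
from collections import Counter

# Alternative internals for the stats module: a Counter replaces the
# hand-written try/except counting. The public functions keep their names.
_stats = Counter()

def init():
    """Forget all previously recorded events."""
    _stats.clear()

def event_occurred(event):
    """Record one occurrence of the given event."""
    _stats[event] += 1

def get_stats():
    """Return a sorted list of (event, count) pairs, as before."""
    return sorted(_stats.items())

init()
for event in ["meal_eaten", "snack_eaten", "meal_eaten"]:
    event_occurred(event)
print(get_stats())
```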

Finally, you might find that you need to refactor your program. This is where you have to change the modular organization of your code to improve the way the program works.

Refactoring may involve moving code between modules as well as creating new modules, removing old ones, and changing the way modules work. In essence, refactoring is the process of rethinking the program so that it works better.

In all of these changes, the use of modules and packages helps you to manage the changes you make. Because the various modules and packages each perform a well-defined task, you know exactly which parts of your program need to be changed, and you can limit the effects of your changes to only the affected modules and the parts of the system that use them.

Modular programming won't make change go away, but it will help you to deal with change—and the ongoing process of programming—in the best possible way.

 

The Python Standard Library


One of the buzzwords used to describe Python is that it is a batteries-included language; that is, it comes with a rich collection of built-in modules and packages called the Python Standard Library. If you've written any non-trivial Python program, you've almost certainly used modules from the Python Standard Library to do so. To get an idea of how vast the Python Standard Library is, here are a few example modules from this library:

Module      Description
datetime    Defines classes to store and perform calculations using date and time values
tempfile    Defines a range of functions to work with temporary files and directories
csv         Supports reading and writing of CSV format files
hashlib     Implements cryptographically secure hashes
logging     Allows you to write log messages and manage log files
threading   Supports multi-threaded programming
html        A collection of modules (that is, a package) used to parse and generate HTML documents
unittest    A framework for creating and running unit tests
urllib      A collection of modules to read data from URLs

These are just a few of the over 300 modules available in the Python Standard Library. As you can see, there is a vast range of functionality provided, and all of this is built in to every Python distribution.

Because of the huge range of functionality provided, the Python Standard Library is an excellent example of modular programming. For example, the math standard library module provides a range of mathematical functions that make it easier to work with integer and floating-point numbers. If you look through the documentation for this module (https://docs.python.org/3/library/math.html), you will find a large collection of functions and constants, all defined within the math module, that cover most common mathematical operations. In this example, the various functions and constants are all defined within a single module, making it easy to refer to them when you need to.

In contrast, the xmlrpc package allows you to make and respond to remote procedure calls that use XML to encode the data being sent and received. The xmlrpc package is made up of two modules: xmlrpc.server and xmlrpc.client, where the server module allows you to create an XML-RPC server, and the client module includes code to access and use an XML-RPC server. This is an example of where a hierarchy of modules is used to logically group related functionality together (in this case, within the xmlrpc package), while using sub-modules to separate out the particular parts of the package.
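
The difference between the two layouts is visible from Python itself. This short sketch imports both and uses the __path__ attribute, which packages have and plain modules do not, to tell them apart.

```python
import math
import xmlrpc.client
import xmlrpc.server

# math is a single module: functions and constants sit directly inside it.
print(math.sqrt(16), math.pi)

# A package has a __path__ attribute describing where its sub-modules live;
# a plain module such as math does not.
print(hasattr(math, "__path__"))
print(hasattr(xmlrpc, "__path__"))

# The two halves of xmlrpc are reached through the package name.
print(xmlrpc.client.ServerProxy.__name__)
print(xmlrpc.server.SimpleXMLRPCServer.__name__)
```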

If you haven't already done so, it is worth spending some time reviewing the documentation for the Python Standard Library, which can be found at https://docs.python.org/3/library/. Studying this documentation shows how Python has organized such a vast collection of features into modules and packages.

The Python Standard Library is not perfect, but it has been improved over time, and the library as it is today makes a great example of modular programming techniques applied to a comprehensive library, covering a wide range of features and functions.

 

Creating your first module


Now that we've seen what modules are and how they can be used, let's implement our first real Python module. While this module is simple, you may find it a useful addition to the programs you write.

Caching

In computer programming, a cache is a way of storing previously calculated results so that they can be retrieved more quickly. For example, imagine that your program had to calculate shipping costs based on three parameters:

  • The weight of the ordered item

  • The dimensions of the ordered item

  • The customer's location

Calculating the shipping cost based on the customer's location might be quite involved. For example, you may have a fixed charge for deliveries within your city but charge a premium for out-of-town orders based on how far away the customer is. You may even need to send a query to a freight company's API to see how much it will charge to ship the given item.

Since the process of calculating the shipping cost can be quite complex and time consuming, it makes sense to use a cache to store the previously calculated results. This allows you to use the previously calculated results rather than having to recalculate the shipping cost each time. To do this, you would need to structure your calc_shipping_cost() function to look something like the following:

def calc_shipping_cost(params):
    if params in cache:
        shipping_cost = cache[params]
    else:
        ...calculate the shipping cost.
        cache[params] = shipping_cost
    return shipping_cost

As you can see, we take the supplied parameters (in this case, the weight, dimensions, and the customer's location) and check whether there is already an entry in the cache for those parameters. If so, we retrieve the previously-calculated shipping cost from the cache. Otherwise, we go through the possibly time-consuming process of calculating the shipping cost, storing this in the cache using the supplied parameters, and then returning the shipping cost back to the caller.

Notice how the cache variable in the preceding pseudo code looks very much like a Python dictionary—you can store entries in the dictionary based on a given key and then retrieve the entry using this key. There is, however, a crucial difference between a dictionary and a cache: a cache typically has a limit on the number of entries that it can contain, while the dictionary has no such limit. This means that a dictionary will continue to grow forever, possibly taking up all the computer's memory if the program runs for a long time, while a cache will never take too much memory, as the number of entries is limited.

Once the cache reaches its maximum size, an existing entry has to be removed each time a new entry is added so that the cache doesn't continue to grow.

While there are various ways of choosing the entry to remove, the most common way is to remove the least recently used entry, that is, the entry that hasn't been used for the longest period of time.
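The Python Standard Library includes a ready-made cache of this kind, functools.lru_cache. The sketch below is not the approach used in this chapter. It simply shows least-recently-used eviction in action, using a hypothetical square() function and a deliberately tiny limit of two entries:

```python
import functools

@functools.lru_cache(maxsize=2)
def square(n):
    return n * n

square(1)   # miss: calculated and stored
square(2)   # miss: calculated and stored
square(1)   # hit: 1 becomes the most recently used entry
square(3)   # miss: cache is full, so the least recently used entry (2) is dropped

info = square.cache_info()
print(info.hits, info.misses, info.currsize)   # 1 3 2
```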

Caches are very commonly used in computer programs. In fact, even if you haven't yet used a cache in the programs you write, you've almost certainly encountered them before. Has someone ever suggested that you clear your browser's cache to solve a problem with your web browser? Yes, web browsers use a cache to hold previously downloaded images and web pages so that they don't have to be retrieved again, and clearing the contents of the browser cache is a common way of fixing a misbehaving web browser.

Writing a cache module

Let's now write our own Python module to implement a cache. Before we write it, let's think about the functionality that our cache module will require:

  • We're going to limit the size of our cache to 100 entries.

  • We will need an init() function to initialize the cache.

  • We will have a set(key, value) function to store an entry in the cache.

  • A get(key) function will retrieve an entry from the cache. If there is no entry for that key, this function should return None.

  • We'll also need a contains(key) function to check whether a given entry is in the cache.

  • Finally, we'll implement a size() function which returns the number of entries in the cache.

Note

We are deliberately keeping the implementation of this module quite simple. A real cache would make use of a Cache class to allow you to use multiple caches at once. It would also allow the size of the cache to be configured as necessary. To keep things simple, however, we will implement these functions directly within a module, as we want to concentrate on modular programming rather than combining it with object-oriented programming and other techniques.

Go ahead and create a new Python source file named cache.py. This file will hold the Python source code for our new module. At the top of this module, enter the following Python code:

import datetime

MAX_CACHE_SIZE = 100

We will be using the datetime Standard Library module to calculate the least recently used entry in the cache. The second statement, defining MAX_CACHE_SIZE, sets the maximum size for our cache.

Tip

Note that we are following the standard Python convention of defining constants using uppercase letters. This makes them easier to see in your source code.

We now want to implement the init() function for our cache. To do this, add the following to the end of your module:

def init():
    global _cache
    _cache = {} # Maps key to [datetime, value] list.

As you can see, we have created a new function named init(). The first statement in this function, global _cache, declares that the name _cache refers to a module-level global variable rather than a local one; the variable itself is created by the assignment that follows. Being a module-level global, this variable can be shared by all parts of the cache.py module.

Notice the underscore character at the start of the variable name. In Python, a leading underscore is a convention indicating that a name is private. In other words, the _cache global is intended to be used as an internal part of the cache.py module—the underscore tells you that you shouldn't need to use this variable outside of the cache.py module itself.
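The leading underscore is mostly a signal to human readers, but Python does honor it in one place: a wildcard import skips such names when the module defines no __all__ list. The following sketch demonstrates this with a throwaway module built in memory. The names demo_module, _hidden, and visible are hypothetical:

```python
import sys
import types

# Build a throwaway module in memory so the example is self-contained.
demo = types.ModuleType("demo_module")
exec("_hidden = 1\nvisible = 2", demo.__dict__)
sys.modules["demo_module"] = demo

# A wildcard import copies only the names without a leading underscore.
namespace = {}
exec("from demo_module import *", namespace)

imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)       # ['visible']
print(demo._hidden)   # 1 (still reachable: the underscore is only a convention)
```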

The second statement in the init() function sets the _cache global to an empty dictionary. Notice that we've added a comment explaining how the dictionary will be used; it's good practice to add notes like this to your code so others (and you, when you look at this code after a long time working on something else) can easily see what this variable is used for.

In summary, calling the init() function has the effect of creating a private _cache variable within the module and setting it to an empty dictionary. Let's now write the set() function, which will use this variable to store an entry in the cache.
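The same pattern can be seen in miniature in the following sketch, which uses a hypothetical _counter variable and increment() function in place of the cache. It shows that the global variable does not exist until init() has actually been called:

```python
def init():
    global _counter
    _counter = 0   # This assignment is what creates the module-level variable.

def increment():
    global _counter   # Needed here because the function rebinds the name.
    _counter += 1
    return _counter

exists_before = "_counter" in globals()
init()
exists_after = "_counter" in globals()

increment()
total = increment()
print(exists_before, exists_after, total)   # False True 2
```

One consequence of this design is that the module's other functions would fail with a NameError if they were called before init().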

Add the following to the end of your module:

def set(key, value):
    global _cache
    if key not in _cache and len(_cache) >= MAX_CACHE_SIZE:
        _remove_oldest_entry()
    _cache[key] = [datetime.datetime.now(), value]

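Once again, the set() function starts with a global _cache statement, which tells Python that _cache refers to the module-level global variable. Strictly speaking, the statement is only required in functions that assign to the name itself, as init() does; set() only modifies the dictionary's contents, so here the statement mainly documents that the function relies on the module-level variable.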

The if statement checks to see whether the cache is going to exceed the maximum allowed size. If so, we call a new function, named _remove_oldest_entry(), to remove the oldest entry from the cache. Notice how this function name also starts with an underscore—once again, this indicates that this function is private and should only be used by code within the module itself.

Finally, we store the entry in the _cache dictionary. Notice that we store the current date and time as well as the value in the cache; this will let us know when the cache entry was last used, which is important when we have to remove the oldest entry.

Let's now implement the get() function. Add the following to the end of your module:

def get(key):
    global _cache
    if key in _cache:
        _cache[key][0] = datetime.datetime.now()
        return _cache[key][1]
    else:
        return None

You should be able to figure out what this code does. The only interesting part to note is that we update the date and time for the cache entry before returning the associated value. This lets us know when the cache entry was last used.

With these functions implemented, the remaining two functions should also be easy to understand. Add the following to the end of your module:

def contains(key):
    global _cache
    return key in _cache

def size():
    global _cache
    return len(_cache)

There shouldn't be any surprises here.

There's only one more function left to implement: our private _remove_oldest_entry() function. Add the following to the end of your module:

def _remove_oldest_entry():
    global _cache
    oldest = None
    for key in _cache:
        if oldest is None:
            oldest = key
        elif _cache[key][0] < _cache[oldest][0]:
            oldest = key
    if oldest is not None:
        del _cache[oldest]

This completes the implementation of our cache.py module itself, with the five main functions we described earlier, as well as one private function and one private global variable which are used internally to help implement our public functions.

Using the cache

Let's now write a simple test program to use this cache module and verify that it's working properly. Create a new Python source file, which we'll call test_cache.py, and add the following to this file:

import random
import string
import cache

def random_string(length):
    s = ''
    for i in range(length):
        s = s + random.choice(string.ascii_letters)
    return s

cache.init()

for n in range(1000):
    while True:
        key = random_string(20)
        if cache.contains(key):
            continue
        else:
            break
    value = random_string(20)
    cache.set(key, value)
    print("After {} iterations, cache has {} entries".format(n+1, cache.size()))

This program starts by importing three modules: two from the Python Standard Library, and the cache module we have just written. We then define a utility function named random_string(), which generates a string of random letters of a given length. After this, we initialize the cache by calling cache.init() and then generate 1,000 random entries to add to the cache. After adding each cache entry, we print out the number of entries we have added as well as the current cache size.
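As an aside, the random_string() helper could be written more compactly. The version below is an alternative sketch, not the version used in the test program. It relies on random.choices(), which is available in Python 3.6 and later:

```python
import random
import string

def random_string(length):
    # random.choices() draws `length` letters, with replacement, in one call.
    return ''.join(random.choices(string.ascii_letters, k=length))

sample = random_string(20)
print(len(sample))   # 20
```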

If you run this program, you can see that it's working as expected:

$ python test_cache.py
After 1 iterations, cache has 1 entries
After 2 iterations, cache has 2 entries
After 3 iterations, cache has 3 entries
...
After 98 iterations, cache has 98 entries
After 99 iterations, cache has 99 entries
After 100 iterations, cache has 100 entries
After 101 iterations, cache has 100 entries
After 102 iterations, cache has 100 entries
...
After 998 iterations, cache has 100 entries
After 999 iterations, cache has 100 entries
After 1000 iterations, cache has 100 entries

The cache continues to grow until it reaches 100 entries, at which point the oldest entry is removed to make room for a new one. This ensures that the cache stays the same size, no matter how many new entries are added.

While there is a lot more we could do with our cache.py module, this is enough to demonstrate how to create a useful Python module and then use it within another program. Of course, you aren't just limited to importing modules within a main program—modules can import other modules as well.

 

Summary


In this chapter, we introduced the concept of Python modules and saw how Python modules are simply Python source files, which are imported and used by another source file. We then took a look at Python packages and saw that these are collections of modules identified by a package initialization file named __init__.py.

We explored how modules and packages can be used to organize your program's source code and why the use of these modular techniques is so important for the development of large systems. We also explored what spaghetti code looks like and discovered some of the other pitfalls that can occur if you don't modularize your programs.

Next, we looked at programming as a process of constant change and evolution and how modular programming can help deal with a changing codebase in the best possible way. We then learned that the Python Standard Library is an excellent example of a large collection of modules and packages, and finished by creating our own simple Python module that demonstrates effective modular programming techniques. In implementing this module, we learned how a module can use leading underscores in variable and function names to mark them as private to the module, while making the remaining functions and other definitions available for other parts of the system to use.

In the next chapter, we will apply modular techniques to the development of a more sophisticated program consisting of several modules working together to solve a more complex programming problem.

About the Author
  • Erik Westra

    Erik Westra has been a professional software developer for over 25 years, and has worked almost exclusively in Python for the past decade. Erik's early interest in Graphical User Interface design led to the development of one of the most advanced urgent courier dispatch systems used by messenger and courier companies worldwide. In recent years, Erik has been involved in the design and implementation of systems matching seekers with providers of goods and services across a range of geographical areas, as well as real-time messaging, payment, and identity systems. This work has included the creation of real-time geocoders and map-based views of constantly changing data. Erik is based in New Zealand, and works for companies worldwide. Erik is the author of the following Packt books: Python Geospatial Development (third edition), Python Geospatial Analysis, Building Mapping Applications with QGIS, and Modular Programming with Python. Erik has also authored the video course entitled Introduction to QGIS Python Programming.
