Python 3 Object Oriented Programming: Managing objects

Exclusive offer: get 50% off this eBook here
Python 3 Object Oriented Programming

Python 3 Object Oriented Programming — Save 50%

Harness the power of Python 3 objects

£18.99    £9.50
by Dusty Phillips | August 2010 | Open Source

In the previous article on Python 3: When to Use Object-oriented Programming, the focus was on objects and their attributes and methods.

In this article by Dusty Phillips, author of Python 3 Object Oriented Programming, we'll take a look at designing higher-level objects; the kind of objects that manage other objects. The objects that tie everything together.

(For more resources on Python 3, see here.)

Managing objects

The difference between these objects and most of the examples we've seen so far is that our examples tend to represent concrete ideas. Management objects are more like office managers; they don't do the actual "visible" work out on the floor, but without them, there would be no communication between departments and nobody would know what they are supposed to do. Analogously, the attributes on a management class tend to refer to other objects that do the "visible" work; the behaviors on such a class delegate to those other classes at the right time, and pass messages between them.

As an example, we'll write a program that does a find and replace action for text files stored in a compressed ZIP file. We'll need objects to represent the ZIP file and each individual text file (luckily, we don't have to write these classes, they're available in the Python Standard Library). The manager object will be responsible for ensuring three steps occur in order:

  1. Unzipping the compressed file.
  2. Performing the find and replace action.
  3. Zipping up the new files.

The class is initialized with the .zip filename and search and replace strings. We create a temporary directory to store the unzipped files in, so that the folder stays clean. We also add a useful helper method for internal use that helps identify an individual filename inside that directory:

import sys
import os
import shutil
import zipfile

class ZipReplace:
def __init__(self, filename, search_string,
replace_string):
self.filename = filename
self.search_string = search_string
self.replace_string = replace_string
self.temp_directory = "unzipped-{}".format(
filename)


def _full_filename(self, filename):
return os.path.join(self.temp_directory, filename)

Then we create an overall "manager" method for each of the three steps. This method delegates responsibility to other methods. Obviously, we could do all three steps in one method, or indeed, in one script without ever creating an object. There are several advantages to separating the three steps:

  1. Readability: The code for each step is in a self-contained unit that is easy to read and understand. The method names describe what the method does, and no additional documentation is required to understand what is going on.
  2. Extensibility: If a subclass wanted to use compressed TAR files instead of ZIP files, it could override the zip and unzip methods without having to duplicate the find_replace method.
  3. Partitioning: An external class could create an instance of this class and call the find and replace method directly on some folder without having to zip the content.

The delegation method is the first in the code below; the rest of the methods are included for completeness:

def zip_find_replace(self):
self.unzip_files()
self.find_replace()
self.zip_files()


def unzip_files(self):
os.mkdir(self.temp_directory)
zip = zipfile.ZipFile(self.filename)
try:
zip.extractall(self.temp_directory)
finally:
zip.close()

def find_replace(self):
for filename in os.listdir(self.temp_directory):
with open(self._full_filename(filename)) as file:
contents = file.read()
contents = contents.replace(
self.search_string, self.replace_string)
with open(
self._full_filename(filename), "w") as file:
file.write(contents)

def zip_files(self):
file = zipfile.ZipFile(self.filename, 'w')
for filename in os.listdir(self.temp_directory):
file.write(
self._full_filename(filename), filename)
shutil.rmtree(self.temp_directory)

if __name__ == "__main__":
ZipReplace(*sys.argv[1:4]).zip_find_replace()

For brevity, the code for zipping and unzipping files is sparsely documented. Our current focus is on object-oriented design; if you are interested in the inner details of the zipfile module, refer to the documentation in the standard library, either online at http://docs.python.org/library/zipfile.html or by typing import zipfile ; help(zipfile) into your interactive interpreter. Note that this example only searches the top-level files in a ZIP file; if there are any folders in the unzipped content, they will not be scanned, nor will any files inside those folders.

The last two lines in the code allow us to run the example from the command line by passing the zip filename, search string, and replace string as arguments:

python zipsearch.py hello.zip hello hi

Of course, this object does not have to be created from the command line; it could be imported from another module (to perform batch ZIP file processing) or accessed as part of a GUI interface or even a higher-level management object that knows what to do with ZIP files (for example to retrieve them from an FTP server or back them up to an external disk).

As programs become more and more complex, the objects being modeled become less and less like physical objects. Properties are other abstract objects and methods are actions that change the state of those abstract objects. But at the heart of every object, no matter how complex, is a set of concrete properties and well-defined behaviors.

Removing duplicate code

Often the code in management style classes such as ZipReplace is quite generic and can be applied in many different ways. It is possible to use either composition or inheritance to help keep this code in one place, thus eliminating duplicate code. Before we look at any examples of this, let's discuss a tiny bit of theory. Specifically: why is duplicate code a bad thing?

There are several reasons, but they all boil down to readability and maintainability. When we're writing a new piece of code that is similar to an earlier piece, the easiest thing to do is copy the old code and change whatever needs to change (variable names, logic, comments) to make it work in the new location. Alternatively, if we're writing new code that seems similar, but not identical to code elsewhere in the project, the easiest thing to do is write fresh code with similar behavior, rather than figure out how to extract the overlapping functionality.

But as soon as someone has to read and understand the code and they come across duplicate blocks, they are faced with a dilemma. Code that might have made sense suddenly has to be understood. How is one section different from the other? How are they the same? Under what conditions is one section called? When do we call the other? You might argue that you're the only one reading your code, but if you don't touch that code for eight months it will be as incomprehensible to you as to a fresh coder. When we're trying to read two similar pieces of code, we have to understand why they're different, as well as how they're different. This wastes the reader's time; code should always be written to be readable first.

I once had to try to understand someone's code that had three identical copies of the same three hundred lines of very poorly written code. I had been working with the code for a month before I realized that the three "identical" versions were actually performing slightly different tax calculations. Some of the subtle differences were intentional, but there were also obvious areas where someone had updated a calculation in one function without updating the other two. The number of subtle, incomprehensible bugs in the code could not be counted.

Reading such duplicate code can be tiresome, but code maintenance is an even greater torment. As the preceding story suggests, keeping two similar pieces of code up to date can be a nightmare. We have to remember to update both sections whenever we update one of them, and we have to remember how the multiple sections differ so we can modify our changes when we are editing each of them. If we forget to update both sections, we will end up with extremely annoying bugs that usually manifest themselves as, "but I fixed that already, why is it still happening?"

The result is that people who are reading or maintaining our code have to spend astronomical amounts of time understanding and testing it compared to if we had written the code in a non-repetitive manner in the first place. It's even more frustrating when we are the ones doing the maintenance. The time we save by copy-pasting existing code is lost the very first time we have to maintain it. Code is both read and maintained many more times and much more often than it is written. Comprehensible code should always be paramount.

This is why programmers, especially Python programmers (who tend to value elegant code more than average), follow what is known as the Don't Repeat Yourself, or DRY principle. DRY code is maintainable code. My advice to beginning programmers is to never use the copy and paste feature of their editor. To intermediate programmers, I suggest they think thrice before they hit Ctrl + C.

But what should we do instead of code duplication? The simplest solution is often to move the code into a function that accepts parameters to account for whatever sections are different. This isn't a terribly object-oriented solution, but it is frequently sufficient. For example, if we have two pieces of code that unzip a ZIP file into two different directories, we can easily write a function that accepts a parameter for the directory to which it should be unzipped instead. This may make the function itself slightly more difficult to read, but a good function name and docstring can easily make up for that, and any code that invokes the function will be easier to read.

That's certainly enough theory! The moral of the story is: always make the effort to refactor your code to be easier to read instead of writing bad code that is only easier to write.

Python 3 Object Oriented Programming Harness the power of Python 3 objects
Published: July 2010
eBook Price: £18.99
Book Price: £30.99
See more
Select your format and quantity:

(For more resources on Python 3, see here.)

In practice

Let's explore two ways we can reuse existing code. After writing our code to replace strings in a ZIP file full of text files, we are later contracted to scale all the images in a ZIP file to 640x480. Looks like we could use a very similar paradigm to what we used in ZipReplace. The first impulse, obviously, would be to save a copy of that file and change the find_replace method to scale_image or something similar. But, that's just not cool. What if someday we want to change the unzip and zip methods to also open TAR files? Or maybe we want to use a guaranteed unique directory name for temporary files. In either case, we'd have to change it in two different places!

We'll start by demonstrating an inheritance-based solution to this problem. First we'll modify our original ZipReplace class into a superclass for processing generic ZIP files:

import os
import shutil
import zipfile

class ZipProcessor:
def __init__(self, zipname):
self.zipname = zipname
self.temp_directory = "unzipped-{}".format(
zipname[:-4])

def _full_filename(self, filename):
return os.path.join(self.temp_directory, filename)

def process_zip(self):
self.unzip_files()
self.process_files()
self.zip_files()

def unzip_files(self):
os.mkdir(self.temp_directory)
zip = zipfile.ZipFile(self.zipname)
try:
zip.extractall(self.temp_directory)
finally:
zip.close()

def zip_files(self):
file = zipfile.ZipFile(self.zipname, 'w')
for filename in os.listdir(self.temp_directory):
file.write(self._full_filename(
filename), filename)
shutil.rmtree(self.temp_directory)

We changed the filename property to zipfile to avoid confusion with the filename local variables inside the various methods. This helps make the code more readable even though it isn't actually a change in design. We also dropped the two parameters to __init__ (search_string and replace_string) that were specific to ZipReplace. Then we renamed the zip_find_replace method to process_zip and made it call an (as yet undefined) process_files method instead of find_replace; these name changes help demonstrate the more generalized nature of our new class. Notice that we have removed the find_replace method altogether; that code is specific to ZipReplace and has no business here.

This new ZipProcessor class doesn't actually define a process_files method; so if we ran it directly, it would raise an exception. Since it actually isn't meant to be run directly, we also removed the main call at the bottom of the original script.

Now, before we move on to our image processing app, let's fix up our original zipsearch to make use of this parent class:

from zip_processor import ZipProcessor
import sys
import os

class ZipReplace(ZipProcessor):
def __init__(self, filename, search_string,
replace_string):
super().__init__(filename)
self.search_string = search_string
self.replace_string = replace_string

def process_files(self):
'''perform a search and replace on all files
in the temporary directory'''

for filename in os.listdir(self.temp_directory):
with open(self._full_filename(filename)) as file:
contents = file.read()
contents = contents.replace(
self.search_string, self.replace_string)
with open(
self._full_filename(filename), "w") as file:
file.write(contents)

if __name__ == "__main__":
ZipReplace(*sys.argv[1:4]).process_zip()

This code is a bit shorter than the original version, since it inherits its ZIP processing abilities from the parent class. We first import the base class we just wrote and make ZipReplace extend that class. Then we use super() to initialize the parent class. The find_replace method is still here, but we renamed it to process_files so the parent class can call it. Because this name isn't as descriptive as the old one, we added a docstring to describe what it is doing.

Now, that was quite a bit of work, considering that all we have now is a program that is functionally no different from the one we started with! But having done that work, it is now much easier for us to write other classes that operate on files in a ZIP archive, such as our photo scaler. Further, if we ever want to improve the zip functionality, we can do it for all classes by changing only the one ZipProcessor base class. Maintenance will be much more effective.

See how simple it is, now to create a photo scaling class that takes advantage of the ZipProcessor functionality. (Note: this class requires the third-party pygame library to be installed. You can download it from http://www.pygame.org/.)

from zip_processor import ZipProcessor
import os
import sys
from pygame import image
from pygame.transform import scale


class ScaleZip(ZipProcessor):

def process_files(self):
'''Scale each image in the directory to 640x480'''
for filename in os.listdir(self.temp_directory):
im = image.load(self._full_filename(filename))
scaled = scale(im, (640,480))
image.save(scaled, self._full_filename(filename))

if __name__ == "__main__":
ScaleZip(*sys.argv[1:4]).process_zip()

All that work we did earlier paid off! Look how simple this class is! All we do is open each file (assuming that it is an image; it will unceremoniously crash if the file cannot be opened), scale it, and save it back. The ZipProcessor takes care of the zipping and unzipping without any extra work on our part.

Or we can use composition

Now, let's try solving the same problem using a composition-based solution. Even though we're completely changing paradigms, from inheritance to composition, we only have to make a minor modification to our ZipProcessor class:

import os
import shutil
import zipfile

class ZipProcessor:
def __init__(self, zipname, processor):
self.zipname = zipname
self.temp_directory = "unzipped-{}".format(
zipname[:-4])
self.processor = processor

def _full_filename(self, filename):
return os.path.join(self.temp_directory, filename)

def process_zip(self):
self.unzip_files()
self.processor.process(self)
self.zip_files()

def unzip_files(self):
os.mkdir(self.temp_directory)
zip = zipfile.ZipFile(self.zipname)
try:
zip.extractall(self.temp_directory)
finally:
zip.close()

def zip_files(self):
file = zipfile.ZipFile(self.zipname, 'w')
for filename in os.listdir(self.temp_directory):
file.write(self._full_filename(filename), filename)
shutil.rmtree(self.temp_directory)

All we did was change the initializer to accept a processor object. The process_zip function now calls a method on that processor object; the method called accepts a reference to the ZipProcessor itself. Now we can change our ZipReplace class to be a suitable processor object that no longer uses inheritance:

from zip_processor import ZipProcessor
import sys
import os

class ZipReplace:
def __init__(self, search_string,
replace_string):

self.search_string = search_string
self.replace_string = replace_string

def process(self, zipprocessor):
'''perform a search and replace on all files in the
temporary directory'''
for filename in os.listdir(
zipprocessor.temp_directory):

with open(
zipprocessor._full_filename(filename)) as file:
contents = file.read()
contents = contents.replace(
self.search_string, self.replace_string)
with open(zipprocessor._full_filename(
filename), "w") as file:

file.write(contents)

if __name__ == "__main__":
zipreplace = ZipReplace(*sys.argv[2:4])
ZipProcessor(sys.argv[1], zipreplace).process_zip()

We didn't actually change much here; the class no longer inherits from ZipProcessor, and when we process the files, we accept a zipprocessor object that gives us the function to calculate __full_filename. In the bottom two lines, when we run from the command line, we first construct a ZipReplace object. This is then passed into the ZipProcessor constructor so the two objects can communicate.

This design is a terrific separation of interests. Now we have a ZipProcessor that can accept any object that has a process method to do the actual processing. Further, we have a ZipReplace that can be passed to any method, function, or object that wants to call its process function; it is no longer tied to the zip processing code through an inheritance relationship; it could now be applied with equal ease to a local or network filesystem, for example, or to a different kind of compressed file such as a RAR archive.

Any inheritance relationship can be modeled as a composition relationship (change the "is a" to a "has a parent") instead, but that does not mean it always should be. And the reverse is not true, most composition relationships cannot be (properly) modeled as inheritance.

Case study

For this case study, we'll try to delve further into the question, "when should I choose an object versus a built-in type?" We'll be modeling a Document class that might be used in a text editor or word processor. What objects, functions, or properties should it have?

We might start with a str for the Document contents, but strings aren't mutable. A mutable object is one that can be changed; but a str is immutable, we can't insert a character into it or remove one without creating a brand new string object. That's leaving a lot of str objects for Python's garbage collector to clean up behind us. So, instead of a string, we'll use a list of characters, which we can modify at will. In addition, a Document would need to know the current cursor position within the list, and should also store a filename for the document.

Now, what methods should it have? There are a lot of things we might want to do to a text document, including inserting and deleting characters, cut, copy, paste, and saving or closing the document. It looks like there are copious amounts of both data and behavior, so it makes sense to put all this stuff into its own Document class.

The question is, should this class be composed of a bunch of basic Python objects such as str filenames, int cursor positions, and a list of characters? Or should some or all of those things be specially defined objects in their own right? What about individual lines and characters, do they need to have classes of their own?

We'll answer these questions as we go, but let's just design the simplest possible Document class first and see what it can do:

class Document:
def __init__(self):
self.characters = []
self.cursor = 0
self.filename = ''

def insert(self, character):
self.characters.insert(self.cursor, character)
self.cursor += 1

def delete(self):
del self.characters[self.cursor]

def save(self):
f = open(self.filename, 'w')
f.write(''.join(self.characters))
f.close()

def forward(self):
self.cursor += 1

def back(self):
self.cursor -= 1

This simple class allows us full control over editing a basic document. Have a look at it in action:

>>> doc = Document()
>>> doc.filename = "test_document"
>>> doc.insert('h')
>>> doc.insert('e')
>>> doc.insert('l')
>>> doc.insert('l')
>>> doc.insert('o')
>>> "".join(doc.characters)
'hello'
>>> doc.back()
>>> doc.delete()
>>> doc.insert('p')
>>> "".join(doc.characters)
'hellp'

Looks like it's working. We could connect a keyboard's letter and arrow keys to these methods and the document would track everything just fine.

But what if we want to connect more than just arrow keys. What if we want to connect the Home and End keys as well? We could add more methods to the Document class that search forward or backwards for newline characters (in Python, a newline character, or \n represents the end of one line and the beginning of a new one) in the string and jump to them, but if we did that for every possible movement action (move by words, move by sentences, Page Up, Page Down, end of line, beginning of whitespace, and more), the class would be huge. Maybe it would be better to put those methods on a separate object. What we can do is turn the cursor attribute into an object that is aware of its position and can manipulate that position. We can move the forward and back methods to that class, and add a couple more for the Home and End keys:

class Cursor:
def __init__(self, document):
self.document = document
self.position = 0

def forward(self):
self.position += 1

def back(self):
self.position -= 1

def home(self):
while self.document.characters[
self.position-1] != '\n':

self.position -= 1
if self.position == 0:
# Got to beginning of file before newline
break

def end(self):
while self.position < len(self.document.characters
) and self.document.characters[
self.position] != '\n':

self.position += 1

This class takes the document as an initialization parameter so the methods have access to the contents of the document's character list. It then provides simple methods for moving backwards and forwards, as before, and for moving to the home and end positions.

This code is not very safe. You can very easily move past the ending position, and if you try to go home on an empty file it will crash. These examples are kept short to make them readable, that doesn't mean they are defensive! You can improve the error checking of this code as an exercise; it might be a great opportunity to expand your exception handling skills.

The Document class itself is hardly changed, except for removing the two methods that were moved to the Cursor class:

class Document:
def __init__(self):
self.characters = []
self.cursor = Cursor(self)
self.filename = ''

def insert(self, character):
self.characters.insert(self.cursor.position,
character)

self.cursor.forward()

def delete(self):
del self.characters[self.cursor.position]

def save(self):
f = open(self.filename, 'w')
f.write(''.join(self.characters))
f.close()

We simply updated anything that accessed the old cursor integer to use the new object instead. We can test that the home method is really moving to the newline character.

>>> d = Document()
>>> d.insert('h')
>>> d.insert('e')
>>> d.insert('l')
>>> d.insert('l')
>>> d.insert('o')
>>> d.insert('\n')
>>> d.insert('w')
>>> d.insert('o')
>>> d.insert('r')
>>> d.insert('l')
>>> d.insert('d')
>>> d.cursor.home()
>>> d.insert("*")
>>> print("".join(d.characters))
hello
*world

Now, since we've been using that string join function a lot (to concatenate the characters so we can see the actual document contents), we can add a property to the Document class to give us the complete string:

@property
def string(self):
return "".join(self.characters)

This makes our testing a little simpler:

>>> print(d.string)
hello
world

This framework is easy enough to extend to create a complete text editor document. Now, let's make it work for rich text; text that can have bold, underlined, or italic characters. There are two ways we could process this; the first is to insert "fake" characters into our character list that act like instructions such as "bold characters until you find a stop bold character". The second is to add information to each character indicating what formatting it should have. While the former method is probably more common, we'll implement the latter solution. To do that, we're obviously going to need a class for characters. This class will have an attribute representing the character, as well as three boolean attributes representing whether it is bold, italic, or underlined.

Hmm, Wait! Is this character class going to have any methods? If not, maybe we should use one of the many Python data structures instead; a tuple or named tuple would probably be sufficient. Are there any actions that we would want to do to, or invoke on a character?

Well, clearly, we might want to do things with characters, such as delete or copy them, but those are things that need to be handled at the Document level, since they are really modifying the list of characters. Are there things that need to be done to individual characters?

Actually, now that we're thinking about what a Character actually is... what is it? Would it be safe to say that a Character is a string? Maybe we should use an inheritance relationship here? Then we can take advantage of the numerous methods that str instances come with.

What sorts of methods are we talking about? There's startswith, strip, find, lower, and many more. Most of these methods expect to be working on strings that contain more than one character. In contrast, if Character were to subclass str, we'd probably be wise to override __init__ to raise an exception if a multi-character string were supplied. Since all those methods we'd get for free wouldn't really apply to our Character class, it turns out we shouldn't use inheritance, after all.

This leaves us at our first question; should Character even be a class? There is a very important special method on the object class that we can take advantage of to represent our characters. This method, called __str__ (two underscores, like __init__), is used in string manipulation functions like print and the str constructor to convert any class to a string. The default implementation does some boring stuff like printing the name of the module and class and its address in memory. But if we override it, we can make it print whatever we like. For our implementation, we can make it prefix characters with special characters to represent whether they are bold, italic, or underlined. So we will create a class to represent a character, and here it is:

class Character:
def __init__(self, character,
bold=False, italic=False, underline=False):
assert len(character) == 1
self.character = character
self.bold = bold
self.italic = italic
self.underline = underline

def __str__(self):
bold = "*" if self.bold else ''
italic = "/" if self.italic else ''
underline = "_" if self.underline else ''
return bold + italic + underline + self.character

This class allows us to create characters and prefix them with a special character when the str() function is applied to them. Nothing too exciting there. We only have to make a few minor modifications to the Document and Cursor classes to work with this class. In the Document class, we add these two lines at the beginning of the insert method:

def insert(self, character):
if not hasattr(character, 'character'):
character = Character(character)

This is a rather strange bit of code. Its basic purpose is to check if the character being passed in is a Character or a str. If it is a string, it is wrapped in a Character class so all objects in the list are Character objects. However, it is entirely possible that someone using our code would want to use a class that is neither Character nor string, using duck typing. If the object has a character attribute, we assume it is a "Character-like" object. But if it does not, we assume it is a "str-like" object and wrap it in a Character. This helps the program take advantage of duck typing as well as polymorphism; as long as an object has a character attribute, it can be used in the Document. This could be very useful, for example, if we wanted to make a programmer's editor with syntax highlighting: we'd need extra data on the character, such as what type of token the character belongs to.

In addition, we need to modify the string property on Document to accept the new Character values. All we need to do is call str() on each character before we join it:

@property
def string(self):
return "".join((str(c) for c in self.characters))

This code uses a generator expression. It's simply a shortcut to perform a specific action on all the objects in a sequence.

Finally we also need to check Character.character, instead of just the string character we were storing before, in the home and end functions when we're looking to see if it matches a newline.

def home(self):
while self.document.characters[
self.position-1].character != '\n':

self.position -= 1
if self.position == 0:
# Got to beginning of file before newline
break

def end(self):
while self.position < len(
self.document.characters) and \
self.document.characters[
self.position
].character != '\n':

self.position += 1

This completes the formatting of characters. We can test it to see that it works:

>>> d = Document()
>>> d.insert('h')
>>> d.insert('e')
>>> d.insert(Character('l', bold=True))
>>> d.insert(Character('l', bold=True))
>>> d.insert('o')
>>> d.insert('\n')
>>> d.insert(Character('w', italic=True))
>>> d.insert(Character('o', italic=True))
>>> d.insert(Character('r', underline=True))
>>> d.insert('l')
>>> d.insert('d')
>>> print(d.string)
he*l*lo
/w/o_rld
>>> d.cursor.home()
>>> d.delete()
>>> d.insert('W')
>>> print(d.string)
he*l*lo
W/o_rld
>>> d.characters[0].underline = True
>>> print(d.string)
_he*l*lo
W/o_rld
>>>

As expected, whenever we print the string, each bold character is preceded by a *, each italic character by a /, and each underlined character by a _. All our functions seem to work, and we can modify characters in the list after the fact. We have a working rich text document object that could be plugged into a user interface and hooked up with a keyboard for input and a screen for output. Naturally, we'd want to display real bold, italic, and underlined characters on the screen, instead of using our __str__ method, but it was sufficient for the basic testing we demanded of it.

Exercises

We've looked at various ways that objects, data, and methods can interact with each other in an object-oriented Python program. As usual, your first thoughts should be how you can apply these principles to your own work. Do you have any messy scripts lying around that could be rewritten using an object-oriented manager? Look through some of your old code and look for methods that are not actions. If the name isn't a verb, try rewriting it as a property.

Think about code you've written in any language. Does it break the DRY principle? Is there any duplicate code? Did you copy and paste code? Did you write two versions of similar pieces of code because you didn't feel like understanding the original code? Go back over some of your recent code now and see if you can refactor the duplicate code using inheritance or composition. Try to pick a project you're still interested in maintaining; not code so old that you never want to touch it again. It helps keep your interest up when you do the improvements!

Now, look back over some of the examples we saw in this article. Start with the cached webpage example which uses a property to cache the retrieved data. An obvious problem with this example is that the cache is never refreshed. Add a timeout to the getter for the property, and only return the cached page if the page has been requested before the timeout has expired. You can use the time module (time.time() - an_old_time returns the number of seconds that have elapsed since an_old_time) to determine whether the cache has expired.

Now look at the composition and inheritance based versions of ZipProcessor. We wrote an inheritance-based ScaleZipper, but didn't port it to the composite ZipProcessor. Try writing the composite ScaleZipper and compare the two pieces of code. Which version do you find easier to use? Which is more elegant? What is easier to read? These are subjective questions; the answer varies for each of us. Knowing the answer, however, is important; if you find you prefer inheritance over composition, you have to pay attention that you don't overuse inheritance in your daily coding. If you prefer composition, make sure you don't miss opportunities to create an elegant inheritance-based solution.

Finally, add some error handlers to the various classes we created in the case study. They should ensure single characters are entered, that you don't try to move the cursor past the end or beginning of the file, that you don't delete a character that doesn't exist, and that you don't save a file without a filename. Try to think of as many edge cases as you can, and account for them. Consider different ways to handle them; should you raise an exception when the user tries to move past the end of the file, or just stay on the last character?

Pay attention, in your daily coding, to the copy and paste commands. Every time you use them in your editor, consider whether it would be a good idea to improve your program's organization so that you only have one version of the code you are about to copy.

Summary

In this article, we focused on identifying objects, especially objects that are not immediately apparent; objects that manage and control. In particular, we covered:

  1. The DRY principle and the follies of duplicate code
  2. Inheritance and composition for reducing code duplication

Further resources on this subject:


Python 3 Object Oriented Programming Harness the power of Python 3 objects
Published: July 2010
eBook Price: £18.99
Book Price: £30.99
See more
Select your format and quantity:

About the Author :


Dusty Phillips

Dusty Phillips is a Canadian freelance software developer, teacher, martial artist, and open source aficionado. He is closely affiliated with the Arch Linux community and other open source projects. He maintains the Arch Linux storefronts and has compiled the Arch Linux Handbook. Dusty holds a master's degree in computer science, with specialization in Human Computer Interaction. He currently has six different Python interpreters installed on his computer.

Books From Packt


Matplotlib for Python Developers
Matplotlib for Python Developers

Python Testing: Beginner's Guide
Python Testing: Beginner's Guide

MySQL for Python
MySQL for Python

Expert Python Programming
Expert Python Programming

Microsoft Dynamics NAV 2009 Application Design
Grok 1.0 Web Development

Blender 2.49 Scripting
Blender 2.49 Scripting

Agile Web Application Development with Yii1.1 and PHP5
Agile Web Application Development with Yii1.1 and PHP5

jQuery 1.4 Reference Guide
jQuery 1.4 Reference Guide


Code Download and Errata
Packt Anytime, Anywhere
Register Books
Print Upgrades
eBook Downloads
Video Support
Contact Us
Awards Voting Nominations Previous Winners
Judges Open Source CMS Hall Of Fame CMS Most Promising Open Source Project Open Source E-Commerce Applications Open Source JavaScript Library Open Source Graphics Software
Resources
Open Source CMS Hall Of Fame CMS Most Promising Open Source Project Open Source E-Commerce Applications Open Source JavaScript Library Open Source Graphics Software