Python 3: When to Use Object-oriented Programming

(For more resources on Python 3, see here.)

Treat objects as objects

This may seem obvious, but you should generally give separate objects in your problem domain a special class in your code. The process is generally to identify objects in the problem and then model their data and behaviors.

Identifying objects is a very important task in object-oriented analysis and programming. But it isn't always as easy as counting the nouns in a short paragraph, as we've been doing. Remember, objects are things that have both data and behavior. If we are working with only data, we are often better off storing it in a list, set, dictionary, or some other Python data structure. On the other hand, if we are working with only behavior, with no stored data, a simple function is more suitable.

An object, however, has both data and behavior. Most Python programmers use built-in data structures unless (or until) there is an obvious need to define a class. This is a good thing; there is no reason to add an extra level of abstraction if it doesn't help organize our code. Sometimes, though, the "obvious" need is not so obvious.

A Python programmer often starts by storing data in a few variables. As our program expands, we will later find that we are passing the same set of related variables to different functions. This is the time to think about grouping both variables and functions into a class. If we are designing a program to model polygons in two-dimensional space, we might start with each polygon being represented as a list of points. The points would be modeled as two-tuples (x,y) describing where that point is located. This is all data, stored in two nested data structures (specifically, a list of tuples):

square = [(1,1), (1,2), (2,2), (2,1)]

Now, if we want to calculate the distance around the perimeter of the polygon, we simply need to sum the distances between the two points, but to do that, we need a function to calculate the distance between two points. Here are two such functions:

import math

def distance(p1, p2):
return math.sqrt((p1[0]-p2[0])**2 + (p1[1]-p2[1])**2)
def perimeter(polygon):
perimeter = 0
points = polygon + [polygon[0]]
for i in range(len(polygon)):
perimeter += distance(points[i], points[i+1])
return perimeter

Now, as object-oriented programmers, we clearly recognize that a polygon class could encapsulate the list of points (data) and the perimeter function (behavior). Further, a point class, might encapsulate the x and y coordinates and the distance method. But should we do this?

For the previous code, maybe, maybe not. We've been studying object-oriented principles long enough that we can now write the object-oriented version in record time:

import math
class Point:
def __init__(self, x, y):
self.x = x
self.y = y

def distance(self, p2):
return math.sqrt((self.x-p2.x)**2 + (self.y-p2.y)**2)

class Polygon:
def __init__(self):
self.vertices = []
def add_point(self, point):

def perimeter(self):
perimeter = 0
points = self.vertices + [self.vertices[0]]
for i in range(len(self.vertices)):
perimeter += points[i].distance(points[i+1])
return perimeter

Now, to understand the difference a little better, let's compare the two APIs in use. Here's how to calculate the perimeter of a square using the object-oriented code:

>>> square = Polygon()
>>> square.add_point(Point(1,1))
>>> square.add_point(Point(1,2))
>>> square.add_point(Point(2,2))
>>> square.add_point(Point(2,1))
>>> square.perimeter()

That's fairly succinct and easy to read, you might think, but let's compare it to the function-based code:

>>> square = [(1,1), (1,2), (2,2), (2,1)]
>>> perimeter(square)

Hmm, maybe the object-oriented API isn't so compact! On the other hand, I'd argue that it was easier to read than the function example: How do we know what the list of tuples is supposed to represent in the second version? How do we remember what kind of object (a list of two-tuples? That's not intuitive!) we're supposed to pass into the perimeter function? We would need a lot of external documentation to explain how these functions should be used.

In contrast, the object-oriented code is relatively self documenting, we just have to look at the list of methods and their parameters to know what the object does and how to use it. By the time we wrote all the documentation for the functional version, it would probably be longer than the object-oriented code.

Besides, code length is a horrible indicator of code complexity. Some programmers (thankfully, not many of them are Python coders) get hung up on complicated, "one liners", that do incredible amounts of work in one line of code. One line of code that even the original author isn't able to read the next day, that is. Always focus on making your code easier to read and easier to use, not shorter.

As a quick exercise, can you think of any ways to make the object-oriented Polygon as easy to use as the functional implementation? Pause a moment and think about it.

Really, all we have to do is alter our Polygon API so that it can be constructed with multiple points. Let's give it an initializer that accepts a list of Point objects. In fact, let's allow it to accept tuples too, and we can construct the Point objects ourselves, if needed:

def __init__(self, points = []):
self.vertices = []
for point in points:
if isinstance(point, tuple):
point = Point(*point)

This example simply goes through the list and ensures that any tuples are converted to points. If the object is not a tuple, we leave it as is, assuming that it is either a Point already, or an unknown duck typed object that can act like a Point.

As we can see, it's not always easy to identify when an object should really be represented as a self-defined class. If we have new functions that accept a polygon argument, such as area(polygon) or point_in_polygon(polygon, x, y), the benefits of the object-oriented code become increasingly obvious. Likewise, if we add other attributes to the polygon, such as color or texture, it makes more and more sense to encapsulate that data into a class.

The distinction is a design decision, but in general, the more complicated a set of data is, the more likely it is to have functions specific to that data, and the more useful it is to use a class with attributes and methods instead.

When making this decision, it also pays to consider how the class will be used. If we're only trying to calculate the perimeter of one polygon in the context of a much greater problem, using a function will probably be quickest to code and easiest to use "one time only". On the other hand, if our program needs to manipulate numerous polygons in a wide variety of ways (calculate perimeter, area, intersection with other polygons, and more), we have most certainly identified an object; one that needs to be extremely versatile.

Pay additional attention to the interaction between objects. Look for inheritance relationships; inheritance is impossible to model elegantly without classes, so make sure to use them. Composition can, technically, be modeled using only data structures; for example, we can have a list of dictionaries holding tuple values, but it is often less complicated to create an object, especially if there is behavior associated with the data.

Don't rush to use an object just because you can use an object, but never neglect to create a class when you need to use a class.

Using properties to add behavior to class data

Python is very good at blurring distinctions; it doesn't exactly help us to "think outside the box". Rather, it teaches us that the box is in our own head; "there is no box".

Before we get into the details, let's discuss some bad object-oriented theory. Many object-oriented languages (Java is the most guilty) teach us to never access attributes directly. They teach us to write attribute access like this:

class Color:
def __init__(self, rgb_value, name):
self._rgb_value = rgb_value
self._name = name
def set_name(self, name):
self._name = name
def get_name(self):
return self._name

The variables are prefixed with an underscore to suggest that they are private (in other languages it would actually force them to be private). Then the get and set methods provide access to each variable. This class would be used in practice as follows:

>>> c = Color("#ff0000", "bright red")
>>> c.get_name()
'bright red'
>>> c.set_name("red")
>>> c.get_name()

This is not nearly as readable as the direct access version that Python favors:

class Color:
def __init__(self, rgb_value, name):
self.rgb_value = rgb_value = name

c = Color("#ff0000", "bright red")
print( = "red"

So why would anyone recommend the method-based syntax? Their reasoning is that someday we may want to add extra code when a value is set or retrieved. For example, we could decide to cache a value and return the cached value, or we might want to validate that the value is a suitable input. In code, we could decide to change the set_name() method as follows:

def set_name(self, name):
if not name:
raise Exception("Invalid Name")
self._name = name

Now, in Java and similar languages, if we had written our original code to do direct attribute access, and then later changed it to a method like the above, we'd have a problem: Anyone who had written code that accessed the attribute directly would now have to access the method; if they don't change the access style, their code will be broken. The mantra in these languages is that we should never make public members private. This doesn't make much sense in Python since there isn't any concept of private members!

Indeed, the situation in Python is much better. We can use the Python property keyword to make methods look like a class attribute. If we originally wrote our code to use direct member access, we can later add methods to get and set the name without changing the interface. Let's see how it looks:

class Color:
def __init__(self, rgb_value, name):
self.rgb_value = rgb_value
self._name = name

def _set_name(self, name):
if not name:
raise Exception("Invalid Name")
self._name = name

def _get_name(self):
return self._name

name = property(_get_name, _set_name)

If we had started with the earlier non-method-based class, which set the name attribute directly, we could later change the code to look like the above. We first change the name attribute into a (semi-) private _name attribute. Then we add two more (semi-) private methods to get and set that variable, doing our validation when we set it.

Finally, we have the property declaration at the bottom. This is the magic. It creates a new attribute on the Color class called name, which now replaces the previous name attribute. It sets this attribute to be a property, which calls the two methods we just created whenever the property is accessed or changed. This new version of the Color class can be used exactly the same way as the previous version, yet it now does validation when we set the name:

>>> c = Color("#0000ff", "bright red")
>>> print(
bright red
>>> = "red"
>>> print(
>>> = ""
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "", line 8, in _set_name
raise Exception("Invalid Name")
Exception: Invalid Name

So if we'd previously written code to access the name attribute, and then changed it to use our property object, the previous code would still work, unless it was sending an empty property value, which is the behavior we wanted to forbid in the first place. Success!

Bear in mind that even with the name property, the previous code is not 100% safe. People can still access the _name attribute directly and set it to an empty string if they wanted to. But if they access a variable we've explicitly marked with an underscore to suggest it is private, they're the ones that have to deal with the consequences, not us.

(For more resources on Python 3, see here.)

How it works

So, what exactly is that property object doing? Think of the property function as returning an object that proxies any requests to set or access the attribute value through the methods we have specified. The property keyword is like a constructor for such an object.

This property constructor can actually accept two additional arguments, a deletion function and a docstring for the property. The delete function is rarely supplied in practice, but it can be useful for logging that a value has been deleted, or possibly to veto deleting if we have reason to do so.The docstring is just a string describing what the property does.If we do not supply this parameter, the docstring will instead be copied from the docstring for the first argument: the getter method.

Here is a silly example that simply states whenever any of the methods are called:

class Silly:
def _get_silly(self):
print("You are getting silly")
return self._silly
def _set_silly(self, value):
print("You are making silly {}".format(value))
self._silly = value
def _del_silly(self):
print("Whoah, you killed silly!")
del self._silly

silly = property(_get_silly, _set_silly,
_del_silly, "This is a silly property")

If we actually use this class, it does indeed print out the correct strings when we ask it to:

>>> s = Silly()
>>> s.silly = "funny"
You are making silly funny
>>> s.silly
You are getting silly
>>> del s.silly
Whoah, you killed silly!

Further, if we look at the help file for the Silly class (by issuing help(silly) at the interpreter prompt), it shows us the custom docstring for our silly attribute:

Help on class Silly in module __main__:

class Silly(builtins.object)
| Data descriptors defined here:
| __dict__
| dictionary for instance variables (if defined)
| __weakref__
| list of weak references to the object (if defined)
| silly
| This is a silly property

Once again, everything is working as we planned. In practice, properties are normally only defined with the first two parameters; the getter and setter functions. The docstring is defined as a normal docstring on the getter and copied into the property, while the deletion function is left empty because object attributes are rarely deleted. If a coder does try to delete one that doesn't have a deletion function specified, however, it will raise an exception, so if there is any chance of a legitimate reason to delete our property, we should supply that function.

Decorators: another way to create properties

Decorators were introduced in Python 2.4 as a way to modify functions dynamically by passing them as arguments to other functions, which eventually return a new function. We won't be covering decorators in-depth at this time, but the basic syntax is easy to grasp. If you've never used them before, you can still follow along.

Applying a decorator can be as simple as prefixing the function name with an @ symbol, and placing the result just before the definition of the function that is being decorated. The property function itself can be used with decorator syntax to turn a get function into a property:

class Foo:
def foo(self):
return "bar"

This applies property as a decorator, and is equivalent to applying it as foo = property(foo). The main difference, from a readability perspective, is that we get to mark the foo function as a property at the top of the method, instead of after it is defined, where it can be easily overlooked.

Going one step further, we can specify a setter function for the new property as follows:

class Foo:
def foo(self):
return self._foo

def foo(self, value):
self._foo = value

This syntax looks a little odd. First we decorate the foo method as a getter. Then we decorate a new method with exactly the same name with the setter attribute of the original decorated foo method! Remember, the property function returns an object; this object is automatically set up to have a setter attribute, and this attribute can be applied as a decorator to other functions. Using the same name for the get and set methods is not required, but it does help group the multiple methods that create one property together.

We can, of course, also specify a deletion function with @foo.deleter. We cannot specify a docstring using property decorators, so we need to rely on the property copying the docstring from the initial getter method.

Here's our previous Silly class rewritten to use property as a decorator:

class Silly:
def silly(self):
"This is a silly property"
print("You are getting silly")
return self._silly

def silly(self, value):
print("You are making silly {}".format(value))
self._silly = value

def silly(self):
print("Whoah, you killed silly!")
del self._silly

When should we use properties?

With the property keyword smearing the division between behavior and data, it can be confusing to know which one to choose. The example use case we saw earlier is one of the most common uses of properties: we have some data on a class that we later want to add behavior to. There are also other factors to take into account when deciding to use a property.

Technically, in Python, data, properties, and methods are all attributes on a class. The fact that a method is callable does not distinguish it from other types of attributes; indeed, it is possible to create normal objects that are callable, and also that functions and methods are themselves normal objects.

The fact that methods are just callable attributes, and properties are just customizable attributes can help us in our decision. Methods should only represent actions; things that can be done to or performed by the object. When you call a method, even with only one argument, it should do something. Methods are generally verbs.

That leaves us to decide between standard data attributes and properties. In general, always use a standard attribute until you need to control access to that property in some way. In either case, your attribute should be a noun. The only difference between an attribute and a property is that we can invoke custom actions automatically when a property is retrieved, set, or deleted.

Let's try a more realistic example. A common need for custom behavior is caching a value that is difficult to calculate or expensive to look up (requiring, for example, a network request or database query). The goal is to store the value locally to avoid repeated calls to the expensive calculation.

We can do this with a custom getter on the property. The first time the value is retrieved, we perform the lookup or calculation. Then we could locally cache the value as a private attribute on our object (or in dedicated caching software), and the next time the value is requested, we return the stored data. Here's how we might cache a webpage:

from urllib.request import urlopen

class WebPage:
def __init__(self, url):
self.url = url
self._content = None

def content(self):
if not self._content:
print("Retrieving New Page...")
self._content = urlopen(self.url).read()
return self._content

We can test this code to see that the page is only retrieved once:

>>> import time
>>> webpage = WebPage("")
>>> now = time.time()
>>> content1 = webpage.content
Retrieving New Page...
>>> time.time() - now
>>> now = time.time()
>>> content2 = webpage.content
>>> time.time() - now
>>> content2 == content1

On my awful satellite connection it takes twenty seconds the first time I load the content, but the second time, I get the result in two seconds (which is really just the amount of time it took to type the lines into the interpreter).

Custom getters are also useful for attributes that need to be calculated on the fly, based on other object attributes. For example, we might want to calculate the average for a list of integers:

class AverageList(list):
def average(self):
return sum(self) / len(self)

This very simple class inherits from list, so we get list-like behavior for free. We just add a property to the class, and presto, our list can have an average:

>>> a = AverageList([1,2,3,4])
>>> a.average

Of course, we could have made this a method instead, but then we should call it calculate_average(), since methods represent actions. But a property called average is more suitable; both easier to type, and easier to read.

Custom setters are useful for validation, as we've already seen, but they can also be used to proxy a value to another location. For example, we could add a content setter to the WebPage class that automatically logs into our web server and uploads a new page whenever the value is set.


In this article, we focused on identifying objects, especially objects that are not immediately apparent; objects that manage and control. In particular, we covered:

  1. Why objects should have both data and behavior
  2. How properties blur the distinction between data and behavior

Further resources on this subject:

You've been reading an excerpt of:

Python 3 Object Oriented Programming

Explore Title