Packt+ | Advance your knowledge in tech

You're reading from Python 3 Object-Oriented Programming - Second Edition

Product typeBook

Published inAug 2015

Reading LevelIntermediate

PublisherPackt

ISBN-139781784398781

Edition1st Edition

Languages

Python

Concepts

Object Oriented Programming

Author (1)

Dusty Phillips

Chapter 6. Python Data Structures

In our examples so far, we've already seen many of the built-in Python data structures in action. You've probably also covered many of them in introductory books or tutorials. In this chapter, we'll be discussing the object-oriented features of these data structures, when they should be used instead of a regular class, and when they should not be used. In particular, we'll be covering:

Tuples and named tuples
Dictionaries
Lists and sets
How and why to extend built-in objects
Three types of queues

Empty objects

Let's start with the most basic Python built-in, one that we've seen many times already, the one that we've extended in every class we have created: the object. Technically, we can instantiate an object without writing a subclass:

>>> o = object()
>>> o.x = 5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'object' object has no attribute 'x'

Unfortunately, as you can see, it's not possible to set any attributes on an object that was instantiated directly. This isn't because the Python developers wanted to force us to write our own classes, or anything so sinister. They did this to save memory; a lot of memory. When Python allows an object to have arbitrary attributes, it takes a certain amount of system memory to keep track of what attributes each object has, for storing both the attribute name and its value. Even if no attributes are stored, memory is allocated for potential new attributes. Given the dozens...

Tuples and named tuples

Tuples are objects that can store a specific number of other objects in order. They are immutable, so we can't add, remove, or replace objects on the fly. This may seem like a massive restriction, but the truth is, if you need to modify a tuple, you're using the wrong data type (usually a list would be more suitable). The primary benefit of tuples' immutability is that we can use them as keys in dictionaries, and in other locations where an object requires a hash value.

Tuples are used to store data; behavior cannot be stored in a tuple. If we require behavior to manipulate a tuple, we have to pass the tuple into a function (or method on another object) that performs the action.

Tuples should generally store values that are somehow different from each other. For example, we would not put three stock symbols in a tuple, but we might create a tuple of stock symbol, current price, high, and low for the day. The primary purpose of a tuple is to aggregate different pieces...

Dictionaries

Dictionaries are incredibly useful containers that allow us to map objects directly to other objects. An empty object with attributes to it is a sort of dictionary; the names of the properties map to the property values. This is actually closer to the truth than it sounds; internally, objects normally represent attributes as a dictionary, where the values are properties or methods on the objects (see the __dict__ attribute if you don't believe me). Even the attributes on a module are stored, internally, in a dictionary.

Dictionaries are extremely efficient at looking up a value, given a specific key object that maps to that value. They should always be used when you want to find one object based on some other object. The object that is being stored is called the value; the object that is being used as an index is called the key. We've already seen dictionary syntax in some of our previous examples.

Dictionaries can be created either using the dict() constructor or using the {...

Lists

Lists are the least object-oriented of Python's data structures. While lists are, themselves, objects, there is a lot of syntax in Python to make using them as painless as possible. Unlike many other object-oriented languages, lists in Python are simply available. We don't need to import them and rarely need to call methods on them. We can loop over a list without explicitly requesting an iterator object, and we can construct a list (as with a dictionary) with custom syntax. Further, list comprehensions and generator expressions turn them into a veritable Swiss-army knife of computing functionality.

We won't go into too much detail of the syntax; you've seen it in introductory tutorials across the Web and in previous examples in this book. You can't code Python very long without learning how to use lists! Instead, we'll be covering when lists should be used, and their nature as objects. If you don't know how to create or append to a list, how to retrieve items from a list, or what ...

Sets

Lists are extremely versatile tools that suit most container object applications. But they are not useful when we want to ensure objects in the list are unique. For example, a song library may contain many songs by the same artist. If we want to sort through the library and create a list of all the artists, we would have to check the list to see if we've added the artist already, before we add them again.

This is where sets come in. Sets come from mathematics, where they represent an unordered group of (usually) unique numbers. We can add a number to a set five times, but it will show up in the set only once.

In Python, sets can hold any hashable object, not just numbers. Hashable objects are the same objects that can be used as keys in dictionaries; so again, lists and dictionaries are out. Like mathematical sets, they can store only one copy of each object. So if we're trying to create a list of song artists, we can create a set of string names and simply add them to the set. This example...

Extending built-ins

We discussed briefly in Chapter 3, When Objects Are Alike, how built-in data types can be extended using inheritance. Now, we'll go into more detail as to when we would want to do that.

When we have a built-in container object that we want to add functionality to, we have two options. We can either create a new object, which holds that container as an attribute (composition), or we can subclass the built-in object and add or adapt methods on it to do what we want (inheritance).

Composition is usually the best alternative if all we want to do is use the container to store some objects using that container's features. That way, it's easy to pass that data structure into other methods and they will know how to interact with it. But we need to use inheritance if we want to change the way the container actually works. For example, if we want to ensure every item in a list is a string with exactly five characters, we need to extend list and override the append() method to raise...

Queues

Queues are peculiar data structures because, like sets, their functionality can be handled entirely using lists. However, while lists are extremely versatile general-purpose tools, they are occasionally not the most efficient data structure for container operations. If your program is using a small dataset (up to hundreds or even thousands of elements on today's processors), then lists will probably cover all your use cases. However, if you need to scale your data into the millions, you may need a more efficient container for your particular use case. Python therefore provides three types of queue data structures, depending on what kind of access you are looking for. All three utilize the same API, but differ in both behavior and data structure.

Before we start our queues, however, consider the trusty list data structure. Python lists are the most advantageous data structure for many use cases:

They support efficient random access to any element in the list
They have strict ordering...

Case study

To tie everything together, we'll be writing a simple link collector, which will visit a website and collect every link on every page it finds in that site. Before we start, though, we'll need some test data to work with. Simply write some HTML files to work with that contain links to each other and to other sites on the Internet, something like this:

<html>
    <body>
        <a href="contact.html">Contact us</a>
        <a href="blog.html">Blog</a>
        <a href="esme.html">My Dog</a>
        <a href="/hobbies.html">Some hobbies</a>
        <a href="/contact.html">Contact AGAIN</a>
        <a href="http://www.archlinux.org/">Favorite OS</a>
    </body>
</html>

Name one of the files index.html so it shows up first when pages are served. Make sure the other files exist, and keep things complicated so there is lots of linking between them. The examples for this chapter include a directory...

Exercises

The best way to learn how to choose the correct data structure is to do it wrong a few times. Take some code you've recently written, or write some new code that uses a list. Try rewriting it using some different data structures. Which ones make more sense? Which ones don't? Which have the most elegant code?

Try this with a few different pairs of data structures. You can look at examples you've done for previous chapter exercises. Are there objects with methods where you could have used namedtuple or dict instead? Attempt both and see. Are there dictionaries that could have been sets because you don't really access the values? Do you have lists that check for duplicates? Would a set suffice? Or maybe several sets? Would one of the queue implementations be more efficient? Is it useful to restrict the API to the top of a stack rather than allowing random access to the list?

If you want some specific examples to work with, try adapting the link collector to also save the title used...

Summary

We've covered several built-in data structures and attempted to understand how to choose one for specific applications. Sometimes, the best thing we can do is create a new class of objects, but often, one of the built-ins provides exactly what we need. When it doesn't, we can always use inheritance or composition to adapt them to our use cases. We can even override special methods to completely change the behavior of built-in syntaxes.

In the next chapter, we'll discuss how to integrate the object-oriented and not-so-object-oriented aspects of Python. Along the way, we'll discover that it's more object-oriented than it looks at first sight!

The rest of the chapter is locked

You have been reading a chapter from

Python 3 Object-Oriented Programming - Second Edition

Published in: Aug 2015Publisher: PacktISBN-13: 9781784398781

A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.

undefined

Unlock this book and the full library FREE for 7 days

Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of

Start free trial

Renews at $15.99/month. Cancel anytime

Author (1)

Dusty Phillips

Dusty Phillips is a Canadian software developer and an author currently living in New Brunswick. He has been active in the open-source community for 2 decades and has been programming in Python for nearly as long. He holds a master's degree in computer science and has worked for Facebook, the United Nations, and several startups.
Read more about Dusty Phillips

Personalised recommendations for you

Based on your interests and search pattern

C++ Programming for Linux Systems

This book covers the essential system programming tools and helps you explore the features of C++20. It emphasizes important details to maintain code quality and tackle everyday challenges of developing software for high performance, optimization, and more.

BookSep 2023288 pages

Expert C++

Discover advanced programming techniques, the latest features of C++17 and C++20, and best practices for memory management, debugging, testing, and large-scale application design with Expert C++. Ideal for experienced developers advancing to proficient programmers and building professional-grade C++ applications.

BookAug 2023604 pages

iOS 17 Programming for Beginners

iOS 17 Programming for Beginners, Eighth Edition is your comprehensive guide to learning the art of iOS app development. Whether you dream of creating the next chart-topping app or simply want to enhance your programming skills, this book is your trusted companion on this exciting journey.

BookOct 2023604 pages4

Developer Career Masterplan

Written by industry experts that have spent the last 20+ years helping developers grow their career path towards senior developer positions and beyond. This book provides a comprehensive guide, sharing examples and stories from their global careers. By the end, you’ll have the knowledge to create a clear career progression plan as a technical professional.

BookSep 2023310 pages

Refactoring with C#

In Refactoring with C#, you’ll explore the process of safely refactoring modern .NET code using Visual Studio features, advanced unit tests, AI assistance, and custom Roslyn analyzers.

BookNov 2023434 pages

Python Real-World Projects

Amplify your developer journey by curating a dynamic project portfolio that outshines traditional resumes. Delve into the Python realm through immersive projects, mastering core concepts while constructing comprehensive modules and applications. From data acquisition prowess to impactful data visualization, Python Real-World Projects arms you with essential skills to beat the competition.

BookSep 2023478 pages5

The MVVM Pattern in .NET MAUI

The MVVM Pattern in .NET MAUI enables developers to master MVVM principles and effectively apply them to .NET MAUI. This book uses real-life examples and covers complex problems to help you successfully apply MVVM with .NET MAUI to confidently develop robust and high-performing cross-platform apps.

BookNov 2023386 pages

Extending Microsoft Business Central with Power Platform

Extending Business Central with the Power Platform is a step-by-step guide for Business Central professionals to create solutions that automate business processes, explain complex workflow approvals, and integrate with hundreds of other systems, without traditional development. It’ll guide you in customizing Business Central with Power Platform.

BookAug 2023458 pages5

Extending Microsoft Business Central with Power Platform

Extending Business Central with the Power Platform is a step-by-step guide for Business Central professionals to create solutions that automate business processes, explain complex workflow approvals, and integrate with hundreds of other systems, without traditional development. It’ll guide you in customizing Business Central with Power Platform.

BookAug 2023458 pages5

Quantum Computing Algorithms

The book emphasizes intuitive ideas behind quantum algorithms in ways that other books don’t cover, striking a careful balance between no math and too much math. To get the most from this book, you should be comfortable with basic algebra and writing simple computer code. No prior understanding of quantum physics is needed to get started.

BookSep 2023342 pages

Python – Complete Python, Django, Data Science and ML Guide

Unlock Python's full potential with this 50+ hour course! From programming to web and game development, data manipulation, and machine learning, gain the skills required to succeed in various Python-related careers. With practical tasks, hands-on experience, and a strong foundation in Python, you'll be ready to tackle real-world challenges and take advantage of the many opportunities this versatile language offers.

VideoNov 202350 hours 30 minutes5

Python – Complete Python, Django, Data Science and ML Guide

Unlock Python's full potential with this 50+ hour course! From programming to web and game development, data manipulation, and machine learning, gain the skills required to succeed in various Python-related careers. With practical tasks, hands-on experience, and a strong foundation in Python, you'll be ready to tackle real-world challenges and take advantage of the many opportunities this versatile language offers.

VideoNov 202350 hours 30 minutes5