Data structures and algorithms are two of the core elements of a large and complex software project. They are a systematic way of storing and organizing data in software so that it can be used efficiently. Python has efficient high-level data structures and an effective object-oriented programming language. Python is the language of choice for many advanced data tasks, for a very good reason. It is one of the easiest advanced programming languages to learn. Intuitive structures and semantics mean that for people who are not computer scientists, but maybe biologists, statisticians, or the directors of a start-up, Python is a straightforward way to perform a wide variety of data tasks. It is not just a scripting language, but a full-featured, object-oriented programming language.
In Python, there are many useful data structures and algorithms built into the language. Also, because Python is an object-based language, it is relatively easy to create custom data objects. In this book, we will examine Python's internal libraries and some of the external libraries, and we'll learn how to build your own data objects from first principles.
In this chapter, we will look at the following topics:
- Obtaining a general working knowledge of data structures and algorithms
- Understanding core data types and their functions
- Exploring the object-oriented aspects of the Python programming language
The data structures and algorithms are presented using the Python programming language (version 3.7) in this book. This book does assume that you know Python. However, if you are a bit rusty, coming from another language, or do not know Python at all, don't worry—this first chapter should get you quickly up to speed.
The following is the GitHub link: https://github.com/PacktPublishing/Hands-On-Data-Structures-and-Algorithms-with-Python-Second-Edition/tree/master/Chapter01.
To install Python, we use the following method.
Python is an interpreted language, and statements are executed line by line. A programmer can typically write down the series of commands in a source code file. For Python, the source code is stored in a file with a .py file extension.
Python is fully integrated and usually already installed on most of the Linux and Mac operating systems. Generally, the pre-installed Python version is 2.7. You can check the version installed on the system using the following commands:
>>> import sys
3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:06:47) [MSC v.1914 32 bit (Intel)]
You can also install a different version of Python using the following commands on Linux:
- Open the Terminal
- sudo apt-get update
- sudo apt-get install -y python3-pip
- pip3 install <package_name>
Python has to be installed on systems with Windows operating systems, as it is not pre-installed, unlike Linux/macOS. Any version of Python can be downloaded from this link: https://www.python.org/downloads/. You can download the software installer and run it—select Install for all users and then click on Next. You need to specify the location where you want to install the package, then click Next. After that, select the option Add Python to environment variables in the Customize Python dialog box, then just click Next again for final installation. When the installation is finished, you can confirm the installation by opening up Command Prompt and typing the following command:
The latest stable Python version is Python 3.7.0. The Python program can be executed by typing the following in the command line:
Algorithms and data structures are the most fundamental concepts in computing. They are the main building blocks from which complex software is built. Having an understanding of these foundation concepts is extremely important in software design and this involves the following three characteristics:
- How algorithms manipulate information contained within data structures
- How data is arranged in memory
- What the performance characteristics of particular data structures are
In this book, we will examine the topic from several perspectives. Firstly, we will look at the fundamentals of the Python programming language from the perspective of data structures and algorithms. Secondly, it is important that we have the correct mathematical tools. We need to understand the fundamental concepts of computer science and for this we need mathematics. By taking a heuristic approach, developing some guiding principles means that, in general, we do not need any more than high school mathematics to understand the principles of these key ideas.
Another important aspect is an evaluation. Measuring the performance of algorithms requires an understanding of how the increase in data size affects operations on that data. When we are working on large datasets or real-time applications, it is essential that our algorithms and structures are as efficient as they can be.
Finally, we need a strong experimental design strategy. Being able to conceptually translate a real-world problem into the algorithms and data structures of a programming language involves being able to understand the important elements of a problem and a methodology for mapping these elements to programming structures.
To better understand the importance of algorithmic thinking, let's consider a real-world example. Imagine we are at an unfamiliar market and we are given the task of purchasing a list of items. We assume that the market is laid out randomly, each vendor sells a random subset of items, and some of these items may be on our list. Our aim is to minimize the price for each item we buy, as well as minimize the time spent at the market. One way to approach this problem is to write an algorithm like the following:
1. Does the vendor have items that are on our list and the cost is less than a predicted cost for that item?
2. If yes, buy and remove from list; if no, move on to the next vendor.
3. If no more vendors, end.
This is a simple iterator, with a decision and an action. If we have to implement this using programming language, we would need data structures to define and store in memory both the list of items we want to buy and the list of items the vendor is selling. We would need to determine the best way of matching items in each list and we need some sort of logic to decide whether to purchase or not.
There are several observations that we can make regarding this algorithm. Firstly, since the cost calculation is based on a prediction, we don't know what the real cost is. As such, we do not purchase an item because we underpredicted the cost of the item, and we reach the end of the market with items remaining on our list. To handle this situation, we need an effective way of storing the data so that we can efficiently backtrack to the vendor with the lowest cost.
Also, we need to understand the time taken to compare items on our shopping list with the items being sold by each vendor. It is important because as the number of items on our shopping list, or the number of items sold by each vendor, increases, searching for an item takes a lot more time. The order in which we search through items and the shape of the data structures can make a big difference to the time it takes to do a search. Clearly, we would like to arrange our list as well as the order we visit each vendor in such a way that we minimize the search time.
Also, consider what happens when we change the buy condition to purchase at the cheapest price, not just the below-average predicted price. This changes the problem entirely. Instead of sequentially going from one vendor to the next, we need to traverse the market once and, with this knowledge, we can order our shopping list with regards to the vendors we want to visit.
Obviously, there are many more subtleties involved in translating a real-world problem into an abstract construct such as a programming language. For example, as we progress through the market, our knowledge of the cost of a product improves, so our predicted average-price variable becomes more accurate until, by the last stall, our knowledge of the market is perfect. Assuming any kind of backtracking algorithm incurs a cost, we can see cause to review our entire strategy. Conditions such as high price variability, the size and shape of our data structures, and the cost of backtracking all determine the most appropriate solution. The whole discussion clearly demonstrates the importance of data structures and algorithms in building a complex solution.
Python has several built-in data structures, including lists, dictionaries, and sets, which we use to build customized objects. In addition, there are a number of internal libraries, such as collections and math object, which allow us to create more advanced structures as well as perform calculations on those structures. Finally, there are the external libraries such as those found in the SciPy packages. These allow us to perform a range of advanced data tasks such as logistic and linear regression, visualization, and mathematical calculations, such as operations on matrices and vectors. External libraries can be very useful for an out-of-the-box solution. However, we must also be aware that there is often a performance penalty compared to building customized objects from the ground up. By learning how to code these objects ourselves, we can target them to specific tasks, making them more efficient. This is not to exclude the role of external libraries and we will look at this in Chapter 12, Design Techniques and Strategies.
To begin, we will take an overview of some of the key language features that make Python such a great choice for data programming.
Python is one of the most popular and extensively used programming languages all over the world due to its readability and flexibility. A feature of the Python environment is its interactive console, allowing you to both use Python as a desktop-programmable calculator and also as an environment to write and test snippets of code.
The read...evaluate...print loop of the console is a very convenient way to interact with a larger code base, such as to run functions and methods or to create instances of classes. This is one of the major advantages of Python over compiled languages such as C/C++ or Java, where the write...compile...test...recompile cycle can increase development time considerably compared to Python's read...evaluate...print loop. Being able to type in expressions and get an immediate response can greatly speed up data science tasks.
There are some excellent distributions of Python apart from the official CPython version. Two of the most popular are available at: Anaconda (https://www.continuum.io/downloads) and Canopy (https://www.enthought.com/products/canopy/). Most distributions come with their own developer environments. Both Canopy and Anaconda include libraries for scientific, machine learning, and other data applications. Most distributions come with an editor.
There are also a number of implementations of the Python console, apart from the CPython version. Most notable among these is the IPython/Jupyter platform which is based on a web-based computational environment.
To solve a real-world problem through algorithm implementation, we first have to select the variables and then apply the operations on these variables. Variables are labels that are attached to the objects. Variables are not objects nor containers for objects; they only act as a pointer or a reference to the object. For example, consider the following code:
Here, we have created a variable, a, that points to a list object. We create another variable, b, that points to this same list object. When we append an element to this list object, this change is reflected in both a and b.
In Python, variable names are attached to different data types during the program execution; it is not required to first declare the datatype for the variables. Each value is of a type (for example, a string or integer); however, the variable name that points to this value does not have a specific type. More specifically, variables point to an object that can change their type depending on the kind of values assigned to them. Consider the following example:
In the preceding code example, the type of a is changed from int to float, depending upon the value stored in the variable.
Scoping rules of variables inside functions are important. Whenever a function executes, a local environment (namespace) is created. This local namespace contains all the variables and parameter names that are assigned by the functions. Whenever a function is called, Python Interpreter first looks into the local namespace that is the function itself—if no match is found, then it looks at the global namespace. If the name is still not found, then it searches in the built-in namespace. If it is not found, then the interpreter would raise a NameError exception. Consider the following code:
print(a) #prints 11
print(b) #prints 25
In the preceding code, we define two global variables. We need to tell the interpreter, using the keyword global, that inside the function we are referring to a global variable. When we change this variable to 11, these changes are reflected in the global scope. However, the b variable we set to 21 is local to the function, and any changes made to it inside the function are not reflected in the global scope. When we run the function and print b, we see that it retains its global value.
In addition, let's consider another interesting example:
>>> a = 10
>>> def my_function():
>>> my_function ()
The code works, and gives an output of 10, but see the following code:
>>> a = 10
>>> def my_function():
... a= a+1
UnboundLocalError: local variable 'a' referenced before assignment
The preceding code gives an error because assignment to a variable in a scope makes that variable a local variable to that scope. In the preceding example, in the my_function() assignment to the a variable, the compiler assumes a as a local variable, and that is why the earlier print() function tries to print a local variable a, which is not initialized as a local variable; thus, it gives an error. It can be resolved by accessing the outer scope variable by declaring it as global:
>>> a = 10
>>> def my_function():
... global a
... a = a+1
So, in Python, the variables that are referenced inside a function are global implicitly, and if the a variable is assigned a value anywhere inside the function's body, it is assumed to be a local variable unless explicitly declared as global.
Python programs consist of a sequence of statements. The interpreter executes each statement in order until there are no more statements. This is true if files run as the main program, as well as if they are loaded via import. All statements, including variable assignment, function definitions, class definitions, and module imports, have equal status. There are no special statements that have higher priority than any other, and every statement can be placed anywhere in a program. All the instructions/statements in the program are executed in sequence in general. However, there are two main ways of controlling the flow of program execution—conditional statements and loops.
The if...else and elif statements control the conditional execution of statements. The general format is a series of if and elif statements followed by a final else statement:
else: print('Something else')
Note the use of the == operator to compare the two values. This returns True if both the values are equal; it returns False otherwise. Note also that setting x to a string will return Something else rather than generate a type error as may happen in languages that are not dynamically typed. Dynamically typed languages, such as Python, allow flexible assignment of objects with different types.
The other way of controlling program flow is with loops. Python offers two ways of constructing looping, such as the while and for loop statements. A while loop repeats executing statements until a Boolean condition is true. A for loop provides a way of repeating the execution into the loop through a series of elements. Here is an example:
In this example, the while loop executes the statements until the condition x < 3 is true. Let's consider another example that uses a for loop:
>>>words = ['cat', 'dog', 'elephant']
>>> for w in words:
In this example, the for loop executes iterating for all the items over the list.
Python contains various built-in data types. These include four numeric types (int, float, complex, bool), four sequence types (str, list, tuple, range), one mapping type (dict), and two set types. It is also possible to create user-defined objects, such as functions or classes. We will look at the string and the list data types in this chapter and the remaining built-in types in the next chapter.
All data types in Python are objects. In fact, pretty much everything is an object in Python, including modules, classes, and functions, as well as literals such as strings and integers. Each object in Python has a type, a value, and an identity. When we write greet= "helloworld", we are creating an instance of a string object with the value "hello world" and the identity of greet. The identity of an object acts as a pointer to the object's location in memory. The type of an object, also known as the object's class, describes the object's internal representation, as well as the methods and operations it supports. Once an instance of an object is created, its identity and type cannot be changed.
We can get the identity of an object by using the built-in function id(). This returns an identifying integer and on most systems, this refers to its memory location, although you should not rely on this in any of your code.
Also, there are a number of ways to compare objects; for example, see the following:
if a==b: # a and b have the same value
if a is b: # if a and b are the same object
if type(a) is type(b): #a and b are the same type
An important distinction needs to be made between mutable and immutable objects. Mutable objects such as lists can have their values changed. They have methods, such as insert() or append(), that change an object's value. Immutable objects such as strings cannot have their values changed, so when we run their methods, they simply return a value rather than change the value of an underlying object. We can, of course, use this value by assigning it to a variable or using it as an argument in a function. For example, the int class is immutable—once an instance of it is created, its value cannot be changed, however, an identifier referencing this object can be reassigned another value.
Strings are immutable sequence objects, with each character representing an element in the sequence. As with all objects, we use methods to perform operations. Strings, being immutable, do not change the instance; each method simply returns a value. This value can be stored as another variable or given as an argument to a function or method.
The following table is a list of some of the most commonly used string methods and their descriptions:
|s.capitalize||Returns a string with only the first character capitalized, the rest remaining lowercase.|
|s.count(substring,[start,end])||Counts occurrences of a substring.|
|s.expandtabs([tabsize])||Replaces tabs with spaces.|
|s.endswith(substring,[start, end]||Returns True if a string ends with a specified substring.|
|s.find(substring,[start,end])||Returns index of first presence of a substring.|
|s.isalnum()||Returns True if all chars are alphanumeric of string s.|
|s.isalpha()||Returns True if all chars are alphabetic of string s.|
|s.isdigit()||Returns True if all chars are digits in the string.|
|s.split([separator],[maxsplit])||Splits a string separated by whitespace or an optional separator. Returns a list.|
|s.join(t)||Joins the strings in sequence t.|
|s.lower()||Converts the string to all lowercase.|
|s.replace(old, new[maxreplace])||Replaces old substring with a new substring.|
|s.startswith(substring, [start, end]])||Returns True if the string starts with a specified substring.|
|s.swapcase()||Returns a copy of the string with swapped case in the string.|
|s.strip([characters])||Removes whitespace or optional characters.|
|s.lstrip([characters])||Returns a copy of the string with leading characters removed.|
Strings, like all sequence types, support indexing and slicing. We can retrieve any character from a string by using its index s[i]. We can retrieve a slice of a string by using s[i:j], where i and j are the start and end points of the slice. We can return an extended slice by using a stride, as in the following—s[i:j:stride]. The following code should make this clear:
The first two examples are pretty straightforward, returning the character located at index 1 and the first seven characters of the string, respectively. Notice that indexing begins at 0. In the third example, we are using a stride of 2. This results in every second character being returned. In the final example, we omit the end index and the slice returns every second character in the entire string.
You can use any expression, variable, or operator as an index as long as the value is an integer:
Another common operation is traversing through a string with a loop:
Given that strings are immutable, a common question that arises is how we perform operations such as inserting values. Rather than changing a string, we need to think of ways to build new string objects for the results we need. For example, if we wanted to insert a word into our greeting, we could assign a variable to the following:
As this code shows, we use the slice operator to split the string at index position 5 and use + to concatenate. Python never interprets the contents of a string as a number. If we need to perform mathematical operations on a string, we need to first convert them to a numeric type:
List is one of the most commonly used built-in data structures, as they can store any number of different data types. They are simple representations of objects and are indexed by integers starting from zero, as we saw in the case of strings.
The following table contains the most commonly used list methods and their descriptions:
Returns a list of sequence s.
Appends element x at the end of list s.
Appends list x at the end of list s.
Returns the count of the occurrence of x in list s.
Returns the smallest index i, where s[i]==x. We can include an optional start and stop index for the lookup.
Inserts x at index i.
Returns the element i and removes it from the list s.
Removes element x from the list s.
Reverses the order of list s.
Sorts list s with optional key and reverses it.
In Python, lists implementation is different when compared to other languages. Python does not create multiple copies of a variable. For example, when we assign a value of one variable in another variable, both variables point to the same memory address where the value is stored. A copy would only be allocated if the variables change their values. This feature makes Python memory efficient, in the sense that it only creates multiple copies when it is required.
This has important consequences for mutable compound objects such as lists. Consider the following code:
In the preceding code, both the list1 and list2 variables are pointing to the same memory location. However, when we change the y through list2 to 4, we are actually changing the same y variable that list1 is pointing to as well.
An important feature of list is that it can contain nested structures; that is, list can contain other lists. For example, in the following code, list items contains three other lists:
We can access the values of the list using the bracket operators and, since lists are mutable, they are copied in place. The following example demonstrates how we can use this to update elements; for example, here we are raising the price of flour by 20 percent:
We can create a list from expressions using a very common and intuitive method; that is, list comprehensions. It allows us to create a list through an expression directly into the list. Consider the following example, where a list l is created using this expression:
List comprehensions can be quite flexible; for example, consider the following code. It essentially shows two different ways to performs a function composition, where we apply one function (x*4) to another (x*2). The following code prints out two lists representing the function composition of f1 and f2, calculated first using a for loop and then using a list comprehension:
def f1(x): return x*2
def f2(x): return x*4
for i in range(16):
print([f1(x) for x in range(64) if x in [f2(j) for j in range(16)]])
The first line of output is from the for loop construct. The second is from the list comprehension expression:
List comprehensions can also be used to replicate the action of nested loops in a more compact form. For example, we multiply each of the elements contained within list1 with each other:
We can also use list comprehensions with other objects such as strings, to build more complex structures. For example, the following code creates a list of words and their letter count:
As we will see, lists form the foundation of many of the data structures we will look at. Their versatility, ease of creation, and use enable them to build more specialized and complex data structures.
In Python, it is not only data types that are treated as objects. Both functions and classes are what are known as first class objects, allowing them to be manipulated in the same ways as built-in data types. By definition, first class objects are the following:
- Created at runtime
- Assigned as a variable or in a data structure
- Passed as an argument to a function
- Returned as the result of a function
In Python, the term first class object is a bit of a misnomer, since it implies some sort of hierarchy, whereas all Python objects are essentially first class.
To have a look at how this works, let's define a simple function:
return 'hello world'
if language =='fr'
return 'Bonjour le monde'
else: return 'language not supported'
Since user-defined functions are objects, we can do things such as include them in other objects, such as lists:
Functions can also be used as arguments for other functions. For example, we can define the following function:
Here, callf() takes a function as an argument, sets a language variable to 'eng', and then calls the function with the language variable as its argument. We could see how this would be useful if, for example, we wanted to produce a program that returns specific sentences in a variety of languages, perhaps for some sort of natural language application. Here, we have a central place to set the language. As well as our greeting function, we could create similar functions that return different sentences. By having one point where we set the language, the rest of the program logic does not have to worry about this. If we want to change the language, we simply change the language variable and we can keep everything else the same.
Functions that take other functions as arguments, or that return functions, are called higher order functions. Python 3 contains two built-in higher order functions—filter() and map(). Note that in earlier versions of Python, these functions returned lists; in Python 3, they return an iterator, making them much more efficient. The map() function provides an easy way to transform each item into an iterable object. For example, here is an efficient, compact way to perform an operation on a sequence. Note the use of the lambda anonymous function:
Similarly, we can use the filter built-in function to filter items in a list:
Note that both map and filter perform the same function similar to what can be achieved by list comprehensions. There does not seem to be a great deal of difference in the performance characteristics, apart from a slight performance advantage when using the in-built functions map and filter without the lambda operator, compared to list comprehensions. Despite this, most style guides recommend the use of list comprehensions over built-in functions, possibly because they tend to be easier to read.
Creating our own higher order functions is one of the hallmarks of functional programming style. A practical example of how higher order functions can be useful is demonstrated by the following. Here, we are passing the len function as the key to the sort function. This way, we can sort a list of words by length:
Here is another example for case-insensitive sorting:
Note the difference between the list.sort() method and the sorted built-in function. The list.sort() method, a method of the list object, sorts the existing instance of a list without copying it. This method changes the target object and returns None. It is an important convention in Python that functions or methods that change the object return None, to make it clear that no new object was created and that the object itself was changed.
On the other hand, the sorted built-in function returns a new list. It actually accepts any iterable object as an argument, but it will always return a list. Both list sort and sorted take two optional keyword arguments as key.
A simple way to sort more complex structures is to use the index of the element to sort, using the lambda operator, for example:
Here we have sorted the items by price.
Recursion is one of the most fundamental concepts of computer science. It is called recursion when a function takes one or more calls to itself during execution. Loop iterations and recursion are different in the sense that loops execute statements repeatedly through a Boolean condition or through a series of elements, whereas recursion repeatedly calls a function. In Python, we can implement a recursive function simply by calling it within its own function body. To stop a recursive function turning into an infinite loop, we need at least one argument that tests for a terminating case to end the recursion. This is sometimes called the base case. It should be pointed out that recursion is different from iteration. Although both involve repetition, iteration loops through a sequence of operations, whereas recursion repeatedly calls a function. Technically, recursion is a special case of iteration known as tail iteration, and it is usually always possible to convert an iterative function to a recursive function and vice versa. The interesting thing about recursive functions is that they are able to describe an infinite object within a finite statement.
The following code should demonstrate the difference between recursion and iteration. Both these functions simply print out numbers between low and high, the first one using iteration and the second using recursion:
Notice that for iterTest, the iteration example, we use a while statement to test for the condition, then call the print method, and finally increment the low value. The recursive example tests for the condition, prints, then calls itself, incrementing the low variable in its argument. In general, iteration is more efficient; however, recursive functions are often easier to understand and write. Recursive functions are also useful for manipulating recursive data structures such as linked lists and trees, as we will see.
We can create functions that do not just return one result but rather an entire sequence of results, by using the yield statement. These functions are called generators. Python contains generator functions, which are an easy way to create iterators and are especially useful as a replacement for unworkably long lists. A generator yields items rather than builds lists. For example, the following code shows why we might choose to use a generator, as opposed to creating a list:
#compares the running time of a list compared to a generator
#generator function creates an iterator of odd numbers between n and m
#builds a list of odd numbers between n and m
#the time it takes to perform sum on an iterator
print("Time to sum an iterator: %f" % (time.time() - t1))
#the time it takes to build and sum a list
print("Time to build and sum a list: %f" % (time.time() - t1))
This prints out the following:
As we can see, building a list to do this calculation takes significantly longer. The performance improvement as a result of using generators is because the values are generated on demand, rather than saved as a list in memory. A calculation can begin before all the elements have been generated and elements are generated only when they are needed.
In the preceding example, the sum method loads each number into memory when it is needed for the calculation. This is achieved by the generator object repeatedly calling the __next__ () special method. Generators never return a value other than None.
Typically, generator objects are used in for loops. For example, we can make use of the oddLst generator function created in the preceding code to print out odd integers between 1 and 10:
for i in oddLst (1,10):print(i)
We can also create a generator expression, which, apart from replacing square brackets with parentheses, uses the same syntax and carries out the same operation as list comprehensions. Generator expressions, however, do not create a list; they create a generator object. This object does not create the data, but rather creates that data on demand. This means that generator objects do not support sequence methods such as append() and insert().
You can, however, change a generator into a list using the list() function:
Classes are a way to create new kinds of objects and they are central to object-oriented programming. A class defines a set of attributes that are shared across instances of that class. Typically, classes are sets of functions, variables, and properties.
The object-oriented paradigm is compelling because it gives us a concrete way to think about and represent the core functionality of our programs. By organizing our programs around objects and data rather than actions and logic, we have a robust and flexible way to build complex applications. The actions and logic are still present, of course, but by embodying them in objects, we have a way to encapsulate functionality, allowing objects to change in very specific ways. This makes our code less error-prone, easier to extend and maintain, and able to model real-world objects.
Classes are created in Python using the class statement. This defines a set of shared attributes associated with a collection of class instances. A class usually consists of a number of methods, class variables, and computed properties. It is important to understand that defining a class does not, by itself, create any instances of that class. To create an instance, a variable must be assigned to a class. The class body consists of a series of statements that execute during the class definition. The functions defined inside a class are called instance methods. They apply some operations to the class instance by passing an instance of that class as the first argument. This argument is called self by convention, but it can be any legal identifier. Here is a simple example:
def init (self,name,rate):
Employee.numEmployee += 1
def del (self):
self.owed += numHours*self.rate
return ("%.2f hours worked" % numHours)
return("payed %s " % self.name)
Class variables, such as numEmployee, share values among all the instances of the class. In this example, numEmployee is used to count the number of employee instances. Note that the Employee class implements the __init__ and __del__ special methods, which we will discuss in the next section.
We can create instances of the Employee objects, run methods, and return class and instance variables by doing the following:
We can use the dir(object) function to get a list of attributes of a particular object. The methods that begin and end with two underscores are called special methods. Apart from the following exception, special methods are generally called by the Python interpreter rather than the programmer; for example, when we use the + operator, we are actually invoking a to _add_ () call. For example, rather than using my_object._len_ (), we can use len(my_object); using len() on a string object is actually much faster, because it returns the value representing the object's size in memory, rather than making a call to the object's _len_ method.
The only special method we actually call in our programs, as common practice, is the _init_ method, to invoke the initializer of the superclass in our own class definitions. It is strongly advised not to use the double underscore syntax for your own objects because of potential current or future conflicts with Python's own special methods.
We may, however, want to implement special methods in custom objects, to give them some of the behavior of built-in types. In the following code, we create a class that implements the _repr_ method. This method creates a string representation of our object that is useful for inspection purposes:
return 'a custom object (%r) ' % (self.greet)
When we create an instance of this object and inspect it, we can see we get our customized string representation. Notice the use of the %r format placeholder to return the standard representation of the object. This is useful and best practice because, in this case, it shows us that the greet object is a string indicated by the quotation marks:
Inheritance is one of the most powerful features of object-oriented programming languages. It allows us to inherit the functionality from other classes. It is possible to create a new class that modifies the behavior of an existing class through inheritance. Inheritance means that if an object of one class is created by inheriting another class, then the object would have all the functionality, methods, and variables of both the classes; that is, the parent class and new class. The existing class from which we inherit the functionalities is called the parent/base class, and the new class is called the derived/child class.
Inheritance can be explained with a very simple example—we create an employee class with attributes such as name of employee and rate at which he is going to be paid hourly. We can now create a new specialEmployee class inheriting all the attributes from the employee class.
Inheritance in Python is done by passing the inherited class as an argument in the class definition. It is often used to modify the behavior of existing methods.
An instance of the specialEmployee class is identical to an Employee instance, except for the changed hours() method. For example, in the following code we create a new specialEmployee class that inherits all the functionalities of the Employee class, and also change the hours() method:
self.owed += numHours*self.rate*2
return("%.2f hours worked" % numHours)
For a subclass to define new class variables, it needs to define an __init__() method, as follows:
Employee.__init__(self,name,rate) #calls the base classes
self.owed += numHours*self.rate+self.bonus
return("%.2f hours worked" % numHours)
Notice that the methods of the base class are not automatically invoked and it is necessary for the derived class to call them. We can test for the class membership using the built-in isinstance(obj1,obj2) function. This returns True if obj1 belongs to the class of obj2 or any class derived from obj2. Let's consider the following example to understand this, where obj1 and obj2 are the objects of the Employee and specialEmployee classes respectively:
#Example issubclass() to check whether a class is a subclass of another class
#Example isinstance() to check if an object belongs to a class or not
d = specialEmployee("packt", 20, 100)
b = Employee("packt", 20)
# the output prints
Generally, all the methods operate on the instance of a class defined within a class. However, it is not a requirement. There are two types of methods—static methods and class methods. A static method is quite similar to a class method, which is mainly bound to the class, and not bound with the object of the class. It is defined within a class and does not require an instance of a class to execute. It does not perform any operations on the instance and it is defined using the @staticmethod class decorator. Static methods cannot access the attributes of an instance, so their most common usage is as a convenience to group utility functions together.
A class method operates on the class itself and does not work with the instances. A class method works in the same way that class variables are associated with the classes rather than instances of that class. Class methods are defined using the @classmethod decorator and are distinguished from instance methods in the class. It is passed as the first argument, and this is named cls by convention. The exponentialB class inherits from the exponentialA class and changes the base class variable to 4. We can also run the parent class's exp() method as follows:
def addition(x, y):
a = exponentialA()
print("the value: 3 to the power 3 is", b)
print('The sum is:', exponentialA.addition(15, 10))
#prints the following output
the value: 3 to the power 3 is 27
The sum is: 25
The difference between a static method and a class method is that a static method doesn't know anything about the class, it only deals with the parameters, whereas the class method works only with the class, and its parameter is always the class itself.
There are several reasons why class methods may be useful. For example, because a subclass inherits all the same features of its parent, there is the potential for it to break inherited methods. Using class methods is a way to define exactly what methods are run.
Unless otherwise specified, all attributes and methods are accessible without restriction. This also means that everything defined in a base class is accessible from a derived class. This may cause problems when we are building object-oriented applications where we may want to hide the internal implementation of an object. This can lead to namespace conflicts between objects defined in derived classes with the base class. To prevent this, the methods we define private attributes with have a double underscore, such as __privateMethod(). These method names are automatically changed to __Classname_privateMethod() to prevent name conflicts with methods defined in base classes. Be aware that this does not strictly hide private attributes, rather it just provides a mechanism for preventing name conflicts.
It is recommended to use private attributes when using a class property to define mutable attributes. A property is a kind of attribute that rather than returning a stored value computes its value when called. For example, we could redefine the exp() property with the following:
This chapter has given us a basic fundamental and an introduction to the Python programming. We described various data structures and algorithms provided by the python. We covered the use of variables, lists, a couple of control structures, and learned how to use the conditional statement. We also discussed how functions are used in python. The various kinds of objects were discussed, together with some materials on the object-oriented aspects of the Python language. We created our own objects and inherited from them.
There is still more that Python offers. As we prepare to examine the later chapters on some implementations of algorithms, the next chapter will focus on numbers, sequences, maps, and sets. These are also data types in Python that prove useful when organizing data for a series of operations.
- Learning Python by Fabrizio Romano: https://www.packtpub.com/application-development/learning-python.