The main differences between Python 3 and Python 2
It has already been said that Python 3 breaks backwards compatibility with Python 2. Still, it is not a complete redesign. Also, it does not mean that every Python module written for a 2.x release will stop working under Python 3. It is possible to write completely cross-compatible code that will run on both major releases without additional tools or techniques, but usually it is possible only for simple applications.
Why should I care?
Despite my personal opinion on Python 2 compatibility, exposed earlier in this chapter, it is impossible to simply forget about it right at this time. There are still some useful packages (such as fabric, mentioned in Chapter 6, Deploying the Code) that are really worth using but are not likely to be ported in the very near future.
Also, sometimes we may be constrained by the organization we work in. The existing legacy code may be so complex that porting it is not economically feasible. So, even if we decide to move on and live only in the Python 3 world from now on, it will be impossible to completely live without Python 2 for some time.
Nowadays, it is very hard to name oneself a professional developer without giving something back to the community, so helping the open source developers in adding Python 3 compatibility to the existing packages is a good way to pay off the "moral debt" incurred by using them. This, of course, cannot be done without knowing the differences between Python 2 and Python 3. By the way, this is also a great exercise for those new in Python 3.
The main syntax differences and common pitfalls
The Python documentation is the best reference for differences between every release. Anyway, for readers' convenience, this section summarizes the most important ones. This does not change the fact that the documentation is mandatory reading for those not familiar with Python 3 yet (see https://docs.python.org/3.0/whatsnew/3.0.html).
The breaking changes introduced by Python 3 can generally be divided into a few groups:
- Syntax changes, wherein some syntax elements were removed/changed and other elements were added
- Changes in the standard library
- Changes in datatypes and collections
Syntax changes
Syntax changes that make it difficult for the existing code to run are the easiest to spot—they will cause the code to not run at all. The Python 3 code that features new syntax elements will fail to run on Python 2 and vice versa. The elements that are removed will make Python 2 code visibly incompatible with Python 3. The running code that has such issues will immediately cause the interpreter to fail raising a SyntaxError exception. Here is an example of the broken script that has exactly two statements, of which none will be executed due to the syntax error:
print("hello world")
print "goodbye python2"Its actual result when run on Python 3 is as follows:
$ python3 script.py File "script.py", line 2 print "goodbye python2" ^ SyntaxError: Missing parentheses in call to 'print'
The list of such differences is a bit long and, from time to time, any new Python 3.x release may add new elements of syntax that will raise such errors on earlier releases of Python (even on the same 3.x branch). The most important of them are covered in Chapter 2, Syntax Best Practices – below the Class Level, and Chapter 3, Syntax Best Practices – above the Class Level, so there is no need to list all of them here.
The list of things dropped or changed from Python 2.7 is shorter, so here are the most important ones:
- printis no longer a statement but a function instead, so the parenthesis is now obligatory.
- Catching exceptions changed from except exc, vartoexcept exc as var.
- The <>comparison operator has been removed in favor of!=.
- from module import *(https://docs.python.org/3.0/reference/simple_stmts.html#import) is now allowed only on a module level, no longer inside the functions.
- from .[module] import nameis now the only accepted syntax for relative imports. All imports not starting with the dot character are interpreted as absolute imports.
- The sort()function and the list'ssorted()method no longer accept thecmpargument. Thekeyargument should be used instead.
- Division expressions on integers such as 1/2 return floats. The truncating behavior is achieved through the //operator like1//2. The good thing is that this can be used with floats too, so5.0//2.0 == 2.0.
Changes in the standard library
Breaking changes in the standard library are the second easiest to catch after syntax changes. Each subsequent version of Python adds, deprecates, improves, or completely removes standard library modules. Such a process was regular also in the older versions of Python (1.x and 2.x), so it does not come as a shock in Python 3. In most cases, depending on the module that was removed or reorganized (like urlparse being moved to urllib.parse), it will raise exceptions on the import time just after it was interpreted. This makes such issues so easy to catch. Anyway, in order to be sure that all such issues are covered, the full test code coverage is essential. In some cases (for example, when using lazily loaded modules), the issues that are usually noticed on import time will not appear before some modules are used in code as function calls. This is why, it is so important to make sure that every line of code is actually executed during tests suite.
Tip
Lazily loaded modules
A lazily loaded module is a module that is not loaded on import time. In Python, import statements can be included inside of functions so import will happen on a function call and not on import time. In some cases, such loading of modules may be a reasonable choice but in most cases, it is a workaround for poorly designed module structures (for example, to avoid circular imports) and should be generally avoided. For sure, there is no justifiable reason to lazily load standard library modules.
Changes in datatypes and collections
Changes in how Python represents datatypes and collections require the most effort when the developer tries to maintain compatibility or simply port existing code to Python 3. While incompatible syntax or standard library changes are easily noticeable and the most easy to fix, changes in collections and types are either nonobvious or require a lot of repetitive work. A list of such changes is long and, again, official documentation is the best reference.
Still, this section must cover the change in how string literals are treated in Python 3 because it seems to be the most controversial and discussed change in Python 3, despite being a very good thing that now makes things more explicit.
All string literals are now Unicode and bytes literals require a b or B prefix. For Python 3.0 and 3.1 using u prefix (like u"foo") was dropped and will raise a syntax error. Dropping that prefix was the main reason for all controversies. It made really hard to create code that was compatible in different branches of Python—version 2.x relied on this prefix in order to create Unicode literals. This prefix was brought back in Python 3.3 to ease the integration process, although without any syntactic meaning.
The popular tools and techniques used for maintaining cross-version compatibility
Maintaining compatibility between versions of Python is a challenge. It may add a lot of additional work depending on the size of the project but is definitely doable and worth doing. For packages that are meant to be reused in many environments, it is an absolute must have. Open source packages without well-defined and tested compatibility bounds are very unlikely to become popular, but also, closed third-party code that never leaves the company network can greatly benefit from being tested in different environments.
It should be noted here that while this part focuses mainly on compatibility between various versions of Python, these approaches apply for maintaining compatibility with external dependencies like different package versions, binary libraries, systems, or external services.
The whole process can be divided into three main areas, ordered by importance:
- Defining and documenting target compatibility bounds and how they will be managed
- Testing in every environment and with every dependency version declared as compatible
- Implementing actual compatibility code
Declaration of what is considered compatible is the most important part of the whole process because it gives the users of the code (developers) the ability to have expectations and make assumptions on how it works and how it can change in the future. Our code can be used as a dependency in different projects that may also strive to manage compatibility, so the ability to reason how it behaves is crucial.
While this book tries to always give a few choices rather than to give an absolute recommendation on specific options, here is one of the few exceptions. The best way so far to define how compatibility may change in the future is by the proper approach to versioning numbers using Semantic Versioning (http://semver.org/), or shortly, semver. It describes a broadly accepted standard for marking the scope of change in code by the version specifier consisting only of three numbers. It also gives some advice on how to handle deprecation policies. Here is an excerpt from its summary:
Given a version number MAJOR.MINOR.PATCH, increment:
- A MAJORversion when you make incompatible API changes
- A MINORversion when you add functionality in a backwards-compatible manner
- A PATCHversion when you make backwards-compatible bug fixes
Additional labels for pre-release and build metadata are available as extensions to the MAJOR.MINOR.PATCH format.
When it comes to testing, the sad truth is that to be sure that code is compatible with every declared dependency version and in every environment (here, the Python version), it must be tested in every combination of these. This, of course, may not be possible when the project has a lot of dependencies because the number of combinations grows rapidly with every new dependency in a version. So, typically some trade off needs to be made so that running full compatibility tests does not take ages. A selection of tools that help testing in so-called matrixes is presented in Chapter 10, Test-Driven Development, that discusses testing in general.
Note
The benefit of using projects that follow semver is that usually what needs to be tested are only major releases because minor and patch releases are guaranteed not to include backwards incompatible changes. This is only true if such projects can be trusted not to break such a contract. Unfortunately, mistakes happen to everyone and backward incompatible changes happen in a lot of projects, even on patch versions. Still, since semver declares strict compatibility on minor and patch version changes, breaking it is considered a bug, so it may be fixed in patch release.
Implementation of the compatibility layer is last and also least important if bounds of that compatibility are well-defined and rigorously tested. Still there are some tools and techniques that every programmer interested in such a topic should know.
The most basic is Python's __future__ module. It ports back some features from newer Python releases back into the older ones and takes the form of import statement:
from __future__ import <feature>
Features provided by future statements are syntax-related elements that cannot be easily handled by different means. This statement affects only the module where it was used. Here is an example of Python 2.7 interactive session that brings Unicode literals from Python 3.0:
Python 2.7.10 (default, May 23 2015, 09:40:32) [MSC v.1500 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> type("foo") # old literals <type 'str'> >>> from __future__ import unicode_literals >>> type("foo") # now is unicode <type 'unicode'>
Here is a list of all the available __future__ statement options that developers concerned with 2/3 compatibility should know:
- division: This adds a Python 3 division operator (PEP 238)
- absolute_import: This makes every form of- importstatement not starting with a dot character interpreted as an absolute import (PEP 328)
- print_function: This changes a- printstatement into a function call, so parentheses around- printbecomes mandatory (PEP 3112)
- unicode_literals: This makes every string literal interpreted as Unicode literals (PEP 3112)
A list of the __future__ statement options is very short and it covers only a few syntax features. The other things that have changed like the metaclass syntax (which is an advanced feature covered in Chapter 3, Syntax Best Practices – above the Class Level), are a lot harder to maintain. Reliably handling of multiple standard library reorganizations also cannot be solved by future statements. Happily, there are some tools that aim to provide a consistent layer of ready-to-use compatibility. The most commonly known is Six (https://pypi.python.org/pypi/six/) that provides whole common 2/3 compatibility boilerplate as a single module. The other promising but slightly less popular tool is the future module (http://python-future.org/).
In some situations, developers may not want to include additional dependencies in some small packages. A common practice is the additional module that gathers all the compatibility code, usually named compat.py. Here is an example of such a compat module taken from the python-gmaps project (https://github.com/swistakm/python-gmaps):
# -*- coding: utf-8 -*-
import sys
if sys.version_info < (3, 0, 0):
    import urlparse  # noqa
    def is_string(s):
        return isinstance(s, basestring)
else:
    from urllib import parse as urlparse  # noqa
    def is_string(s):
        return isinstance(s, str)Such a compat.py module is popular even in projects that depends on Six for 2/3 compatibility because it is a very convenient way to store code that handles compatibility with different versions of packages used as dependencies.
Tip
Downloading the example code
You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
You can download the code files by following these steps:
- Log in or register to our website using your e-mail address and password.
- Hover the mouse pointer on the SUPPORT tab at the top.
- Click on Code Downloads & Errata.
- Enter the name of the book in the Search box.
- Select the book for which you're looking to download the code files.
- Choose from the drop-down menu where you purchased this book from.
- Click on Code Download.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
- WinRAR / 7-Zip for Windows
- Zipeg / iZip / UnRarX for Mac
- 7-Zip / PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Expert-Python-Programming_Second-Edition. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
 
                                             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
     
         
                 
                 
                 
                 
                 
                 
                 
                 
                