Python provides a variety of specialized data types, such as dates and times, container types, and enumerations. There is a whole section in the Python standard library titled Data Types, which deserves to be explored; it is filled with interesting and useful tools for each and every programmer's needs. You can find it here:
https://docs.python.org/3/library/datatypes.html
In this section, we are briefly going to take a look at dates and times, collections, and enumerations.
Dates and times
The Python standard library provides several data types for dealing with dates and times. This realm may seem innocuous at first glance, but it is actually quite tricky: time zones, daylight saving time, a huge number of ways to format date and time information, calendar quirks, parsing, and localization are just a few of the difficulties we face. That is probably why professional Python programmers commonly rely on third-party libraries that provide some much-needed extra power.
The standard library
We will start with the standard library, and finish the section with a brief overview of the third-party libraries you can use.
From the standard library, the main modules used to handle dates and times are datetime, calendar, zoneinfo, and time. Let's start with the imports you'll need for this whole section:
>>> from datetime import date, datetime, timedelta, timezone
>>> import time
>>> import calendar as cal
>>> from zoneinfo import ZoneInfo
The first example deals with dates. Let's see how they look:
>>> today = date.today()
>>> today
datetime.date(2021, 3, 28)
>>> today.ctime()
'Sun Mar 28 00:00:00 2021'
>>> today.isoformat()
'2021-03-28'
>>> today.weekday()
6
>>> cal.day_name[today.weekday()]
'Sunday'
>>> today.day, today.month, today.year
(28, 3, 2021)
>>> today.timetuple()
time.struct_time(
tm_year=2021, tm_mon=3, tm_mday=28,
tm_hour=0, tm_min=0, tm_sec=0,
tm_wday=6, tm_yday=87, tm_isdst=-1
)
We start by fetching the date for today. We can see that it's an instance of the datetime.date class. Then we get two different representations for it, following the C and the ISO 8601 format standards, respectively. After that, we ask what day of the week it is, and we get the number 6. Days are numbered 0 to 6 (representing Monday to Sunday), so we grab the element at index 6 (the seventh and last one) in calendar.day_name (notice in the code that we have abbreviated calendar to cal for brevity).
The last two instructions show how to get detailed information out of a date object. We can inspect its day, month, and year attributes, or call the timetuple() method and get a whole wealth of information. Since we're dealing with a date object, notice that all the information about time has been set to 0 (and tm_isdst is -1, meaning unknown).
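As a small complement to the example above, here is a sketch that uses a fixed date (instead of date.today(), so that the results are reproducible) to compare weekday() with isoweekday() and to produce a custom representation with strftime(). The variable names are ours and purely illustrative.

```python
import calendar
from datetime import date

# A fixed date, so that the results do not depend on when this runs.
d = date(2021, 3, 28)

# weekday() counts from Monday (0) to Sunday (6);
# isoweekday() counts from Monday (1) to Sunday (7).
weekday_index = d.weekday()
iso_weekday = d.isoweekday()

# calendar.day_name is indexed the same way as weekday().
weekday_name = calendar.day_name[weekday_index]

# strftime() lets us choose the printed representation ourselves.
formatted = d.strftime("%d/%m/%Y")

print(weekday_index, iso_weekday, weekday_name, formatted)
```

Note that calendar.day_name follows the current locale, so the day name may differ if your program changes the locale settings.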
Let's now play with time:
>>> time.ctime()
'Sun Mar 28 15:23:17 2021'
>>> time.daylight
1
>>> time.gmtime()
time.struct_time(
tm_year=2021, tm_mon=3, tm_mday=28,
tm_hour=14, tm_min=23, tm_sec=34,
tm_wday=6, tm_yday=87, tm_isdst=0
)
>>> time.gmtime(0)
time.struct_time(
tm_year=1970, tm_mon=1, tm_mday=1,
tm_hour=0, tm_min=0, tm_sec=0,
tm_wday=3, tm_yday=1, tm_isdst=0
)
>>> time.localtime()
time.struct_time(
tm_year=2021, tm_mon=3, tm_mday=28,
tm_hour=15, tm_min=23, tm_sec=50,
tm_wday=6, tm_yday=87, tm_isdst=1
)
>>> time.time()
1616941458.149149
This example is quite similar to the one before, only here we are dealing with time. We can see how to get a printed representation of the time according to the C format standard, and then how to check whether the local time zone defines daylight saving time: time.daylight is nonzero if a DST time zone is defined, whereas the tm_isdst field of the struct_time returned by localtime() tells us whether DST is actually in effect. The function gmtime converts a given number of seconds from the epoch to a struct_time object in UTC. If we don't feed it any number, it will use the current time.
The epoch is a date and time from which a computer system measures system time. You can see that on the machine used to run this code, the epoch is January 1st, 1970. This is the point in time used by both Unix and POSIX.
Coordinated Universal Time or UTC is the primary time standard by which the world regulates clocks and time.
We finish the example by getting the struct_time object for the current local time and the number of seconds from the epoch expressed as a float number (time.time()).
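The following sketch revisits the time module functions discussed above. It inspects the epoch, shows the difference between time.daylight and the tm_isdst field, and converts seconds to a struct_time and back. The use of calendar.timegm() for the reverse conversion is our own addition, and the variable names are illustrative.

```python
import calendar
import time

# The epoch: zero seconds, expressed as a struct_time in UTC.
epoch = time.gmtime(0)

# time.daylight only tells us whether a DST time zone is defined locally.
dst_defined = bool(time.daylight)

# tm_isdst tells us whether DST is in effect at this specific moment
# (1 = yes, 0 = no, -1 = unknown).
local_now = time.localtime()
dst_active_now = local_now.tm_isdst > 0

# Round trip: seconds since the epoch -> struct_time (UTC) -> seconds.
seconds = 86400  # exactly one day after the epoch
round_trip = calendar.timegm(time.gmtime(seconds))

print(epoch.tm_year, dst_defined, dst_active_now, round_trip)
```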
Let's now see an example using datetime objects, which bring together dates and times.
>>> now = datetime.now()
>>> utcnow = datetime.utcnow()
>>> now
datetime.datetime(2021, 3, 28, 15, 25, 16, 258274)
>>> utcnow
datetime.datetime(2021, 3, 28, 14, 25, 22, 918195)
>>> now.date()
datetime.date(2021, 3, 28)
>>> now.day, now.month, now.year
(28, 3, 2021)
>>> now.date() == date.today()
True
>>> now.time()
datetime.time(15, 25, 16, 258274)
>>> now.hour, now.minute, now.second, now.microsecond
(15, 25, 16, 258274)
>>> now.ctime()
'Sun Mar 28 15:25:16 2021'
>>> now.isoformat()
'2021-03-28T15:25:16.258274'
>>> now.timetuple()
time.struct_time(
tm_year=2021, tm_mon=3, tm_mday=28,
tm_hour=15, tm_min=25, tm_sec=16,
tm_wday=6, tm_yday=87, tm_isdst=-1
)
>>> now.tzinfo
>>> utcnow.tzinfo
>>> now.weekday()
6
The preceding example is rather self-explanatory. We start by setting up two instances that represent the current time. One is related to UTC (utcnow), and the other one is a local representation (now). It just so happens that we ran this code on the first day of daylight saving time in the UK in 2021, so now represents the current time in BST. BST is one hour ahead of UTC, as can be seen from the code. Note that datetime.utcnow() is deprecated as of Python 3.12 in favor of datetime.now(timezone.utc), which returns an aware object.
You can get date, time, and specific attributes from a datetime object in a similar way to what we have already seen. It is also worth noting how both now and utcnow present the value None for the tzinfo attribute. This happens because those objects are naive.
Date and time objects may be categorized as aware if they include time zone information, or naive if they don't.
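To make the distinction between naive and aware objects concrete, here is a minimal sketch. The is_aware() helper is a hypothetical function of ours (it is not part of the standard library); it follows the rule given in the datetime documentation for deciding whether an object is aware.

```python
from datetime import datetime, timezone


def is_aware(dt):
    """Return True if dt carries usable time zone information."""
    return dt.tzinfo is not None and dt.tzinfo.utcoffset(dt) is not None


# Naive: no time zone information attached.
naive_now = datetime.now()

# Aware: the current time in UTC, with tzinfo set.
aware_utc_now = datetime.now(timezone.utc)

# Aware: the naive value is assumed to be in the system's local time zone.
aware_local_now = naive_now.astimezone()

print(is_aware(naive_now), is_aware(aware_utc_now), is_aware(aware_local_now))
```

Mixing the two kinds is a common source of bugs: subtracting or comparing a naive object with an aware one raises a TypeError.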
Let's now see how a duration is represented in this context:
>>> f_bday = datetime(
1975, 12, 29, 12, 50, tzinfo=ZoneInfo('Europe/Rome')
)
>>> h_bday = datetime(
1981, 10, 7, 15, 30, 50, tzinfo=timezone(timedelta(hours=2))
)
>>> diff = h_bday - f_bday
>>> type(diff)
<class 'datetime.timedelta'>
>>> diff.days
2109
>>> diff.total_seconds()
182223650.0
>>> today + timedelta(days=49)
datetime.date(2021, 5, 16)
>>> now + timedelta(weeks=7)
datetime.datetime(2021, 5, 16, 15, 25, 16, 258274)
Two objects have been created that represent Fabrizio and Heinrich's birthdays. This time, in order to show you the alternative, we have created aware objects.
There are several ways to include time zone information when creating a datetime object, and in this example, we are showing you two of them. One uses the ZoneInfo object from the zoneinfo module, introduced in Python 3.9. The second one uses a timezone object built from a simple timedelta, an object that represents a duration, which here serves as a fixed offset from UTC.
We then create the diff object by subtracting one birthday from the other. The result of that operation is an instance of timedelta. You can see how we can interrogate the diff object to tell us how many days apart Fabrizio and Heinrich's birthdays are, and even the number of seconds that represent that whole duration. Notice that we need to use total_seconds(), which expresses the whole duration in seconds. The seconds attribute, by contrast, holds only the seconds component that is left over once whole days have been accounted for. So, a timedelta(days=1) will have seconds equal to 0, and total_seconds() equal to 86,400 (which is the number of seconds in a day).
Combining a datetime with a duration adds or subtracts that duration from the original date and time information. In the last few lines of the example, we can see how adding a duration to a date object produces a date as a result, whereas adding it to a datetime produces a datetime, as it is fair to expect.
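The difference between seconds and total_seconds() is easy to get wrong, so here is a short sketch that shows both side by side, along with a couple of arithmetic operations that timedelta supports. The variable names are illustrative.

```python
from datetime import timedelta

one_day = timedelta(days=1)
mixed = timedelta(days=1, hours=1, seconds=30)

# days and seconds are components; total_seconds() is the whole duration.
print(one_day.days, one_day.seconds, one_day.total_seconds())  # 1 0 86400.0
print(mixed.days, mixed.seconds, mixed.total_seconds())  # 1 3630 90030.0

# Durations are normalized, so equivalent ones compare equal.
same_duration = timedelta(weeks=7) == timedelta(days=49)

# Dividing one duration by another yields a plain float.
ratio = mixed / one_day

print(same_duration, ratio)
```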
One of the more difficult undertakings to carry out using dates and times is parsing. Let's see a short example:
>>> datetime.fromisoformat('1977-11-24T19:30:13+01:00')
datetime.datetime(
1977, 11, 24, 19, 30, 13,
tzinfo=datetime.timezone(datetime.timedelta(seconds=3600))
)
>>> datetime.fromtimestamp(time.time())
datetime.datetime(2021, 3, 28, 15, 42, 2, 142696)
We can easily create datetime objects from ISO-formatted strings, as well as from timestamps. However, in general, parsing a date from unknown formats can prove to be a difficult task.
Third-party libraries
To finish off this subsection, we would like to mention a few third-party libraries that you will very likely come across the moment you have to deal with dates and times in your code:
These three are some of the most common, and they are worth investigating.
Let's take a look at one final example, this time using the Arrow third-party library:
>>> import arrow
>>> arrow.utcnow()
<Arrow [2021-03-28T14:43:20.017213+00:00]>
>>> arrow.now()
<Arrow [2021-03-28T15:43:39.370099+01:00]>
>>> local = arrow.now('Europe/Rome')
>>> local
<Arrow [2021-03-28T16:59:14.093960+02:00]>
>>> local.to('utc')
<Arrow [2021-03-28T14:59:14.093960+00:00]>
>>> local.to('Europe/Moscow')
<Arrow [2021-03-28T17:59:14.093960+03:00]>
>>> local.to('Asia/Tokyo')
<Arrow [2021-03-28T23:59:14.093960+09:00]>
>>> local.datetime
datetime.datetime(
2021, 3, 28, 16, 59, 14, 93960,
tzinfo=tzfile('/usr/share/zoneinfo/Europe/Rome')
)
>>> local.isoformat()
'2021-03-28T16:59:14.093960+02:00'
Arrow provides a wrapper around the data structures of the standard library, plus a whole set of methods and helpers that simplify the task of dealing with dates and times. You can see from this example how easy it is to get the local date and time in the Italian time zone (Europe/Rome), as well as to convert it to UTC, or to the Russian or Japanese time zones. The last two instructions show how you can get the underlying datetime object from an Arrow one, and the very useful ISO-formatted representation of a date and time.
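For comparison, similar conversions can be sketched with the standard library alone, using zoneinfo and astimezone(). This is not how Arrow works internally; it is just an equivalent computation on a fixed moment in time. It assumes that the IANA time zone database is available on your system (or that the tzdata package is installed).

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# A fixed, aware moment in the Italian time zone.
local = datetime(2021, 3, 28, 16, 59, 14, tzinfo=ZoneInfo("Europe/Rome"))

# astimezone() converts the same instant to another time zone.
utc = local.astimezone(timezone.utc)
moscow = local.astimezone(ZoneInfo("Europe/Moscow"))
tokyo = local.astimezone(ZoneInfo("Asia/Tokyo"))

print(local.isoformat())   # 2021-03-28T16:59:14+02:00
print(utc.isoformat())     # 2021-03-28T14:59:14+00:00
print(moscow.isoformat())  # 2021-03-28T17:59:14+03:00
print(tokyo.isoformat())   # 2021-03-28T23:59:14+09:00
```

All four objects represent the same instant, so they compare equal to one another.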
The collections module
When Python's general-purpose built-in containers (tuple, list, set, and dict) aren't enough, we can find specialized container data types in the collections module. They are described in Table 2.1.
| Data type | Description |
| --- | --- |
| namedtuple() | Factory function for creating tuple subclasses with named fields |
| deque | List-like container with fast appends and pops on either end |
| ChainMap | Dictionary-like class for creating a single view of multiple mappings |
| Counter | Dictionary subclass for counting hashable objects |
| OrderedDict | Dictionary subclass with methods that allow for re-ordering entries |
| defaultdict | Dictionary subclass that calls a factory function to supply missing values |
| UserDict | Wrapper around dictionary objects for easier dictionary subclassing |
| UserList | Wrapper around list objects for easier list subclassing |
| UserString | Wrapper around string objects for easier string subclassing |

Table 2.1: Collections module data types
There isn't enough space here to cover them all, but you can find plenty of examples in the official documentation; here, we will just give a small example to show you namedtuple, defaultdict, and ChainMap.
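Although they are not covered in detail in this section, two other entries from Table 2.1 are easy to try out. The following is a minimal sketch of Counter and deque; the sample data is ours.

```python
from collections import Counter, deque

# Counter: count hashable objects, here the letters of a word.
letters = Counter("mississippi")
top_two = letters.most_common(2)
print(top_two)  # [('i', 4), ('s', 4)]

# deque: fast appends and pops on both ends. With maxlen set,
# the oldest items are discarded automatically when it is full.
recent = deque(maxlen=3)
for n in range(5):
    recent.append(n)
print(list(recent))  # [2, 3, 4]

oldest = recent.popleft()
print(oldest, list(recent))  # 2 [3, 4]
```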
namedtuple
A namedtuple is a tuple-like object that has fields accessible by attribute lookup, as well as being indexable and iterable (it's actually a subclass of tuple). This is sort of a compromise between a fully-fledged object and a tuple, and it can be useful in those cases where you don't need the full power of a custom object, but only want your code to be more readable by avoiding weird indexing. Another use case is when there is a chance that items in the tuple need to change their position after refactoring, forcing the coder to also refactor all the logic involved, which can be very tricky.
For example, say we are handling data about the left and right eyes of a patient. We save one value for the left eye (position 0) and one for the right eye (position 1) in a regular tuple. Here's how that may look:
>>> vision = (9.5, 8.8)
>>> vision
(9.5, 8.8)
>>> vision[0]
9.5
>>> vision[1]
8.8
Now let's pretend we handle vision objects all of the time, and, at some point, the designer decides to enhance them by adding information for the combined vision, so that a vision object stores data in this format (left eye, combined, right eye).
Do you see the trouble we're in now? We may have a lot of code that depends on vision[0] being the left eye information (which it still is) and vision[1] being the right eye information (which is no longer the case). We have to refactor our code wherever we handle these objects, changing vision[1] to vision[2], and that can be painful. We could have probably approached this a bit better from the beginning, by using a namedtuple. Let us show you what we mean:
>>> from collections import namedtuple
>>> Vision = namedtuple('Vision', ['left', 'right'])
>>> vision = Vision(9.5, 8.8)
>>> vision[0]
9.5
>>> vision.left
9.5
>>> vision.right
8.8
If, within our code, we refer to the left and right eyes using vision.left and vision.right, all we need to do to fix the new design issue is change our factory and the way we create instances; the rest of the code won't need to change:
>>> Vision = namedtuple('Vision', ['left', 'combined', 'right'])
>>> vision = Vision(9.5, 9.2, 8.8)
>>> vision.left
9.5
>>> vision.right
8.8
>>> vision.combined
9.2
You can see how convenient it is to refer to those values by name rather than by position. After all, as a wise man once wrote, "Explicit is better than implicit" (can you recall where? Think Zen if you can't...). This example may be a little extreme; of course, it's not likely that our code designer will go for a change like this, but you'd be amazed to see how frequently issues similar to this one occur in a professional environment, and how painful it is to refactor in such cases.
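Named tuples also come with a few helper methods and attributes worth knowing about. The following sketch reuses the three-field Vision type from above; the variable names are ours.

```python
from collections import namedtuple

Vision = namedtuple("Vision", ["left", "combined", "right"])
vision = Vision(9.5, 9.2, 8.8)

# A namedtuple is still a tuple, so it can be unpacked.
left, combined, right = vision

# _asdict() returns the fields and their values as a dictionary.
as_dict = vision._asdict()

# Tuples are immutable; _replace() returns a modified copy instead.
improved = vision._replace(right=9.0)

# _fields lists the field names, in order.
fields = Vision._fields

print(as_dict, improved, fields)
```

The leading underscore in these names does not mean they are private; it is there to avoid clashes with field names you might choose.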
defaultdict
The defaultdict data type is one of our favorites. It allows you to avoid checking whether a key is in a dictionary by simply inserting it for you on your first access attempt, with a default value whose type you pass on creation. In some cases, this tool can be very handy and shorten your code a little. Let's see a quick example. Say we are updating the value of age, by adding one year. If age is not there, we assume it was 0 and we update it to 1:
>>> d = {}
>>> d['age'] = d.get('age', 0) + 1
>>> d
{'age': 1}
>>> d = {'age': 39}
>>> d['age'] = d.get('age', 0) + 1
>>> d
{'age': 40}
Now let's see how it would work with a defaultdict data type. The second line of the previous example is actually the short version of an if clause that runs to a length of four lines, and that we would have to write if dictionaries didn't have the get() method (we'll see all about if clauses in Chapter 3, Conditionals and Iteration):
>>> from collections import defaultdict
>>> dd = defaultdict(int)
>>> dd['age'] += 1
>>> dd
defaultdict(<class 'int'>, {'age': 1}) # 1, as expected
Notice how we just need to instruct the defaultdict factory that we want an int number to be used if the key is missing (we'll get 0, which is the default for the int type). Also notice that even though in this example there is no gain on the number of lines, there is definitely a gain in readability, which is very important. You can also use a different technique to instantiate a defaultdict data type, which involves creating a factory object. To dig deeper, please refer to the official documentation.
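Two other common patterns are sketched below: grouping values with defaultdict(list), and passing a custom factory (here a lambda) to choose a different default. The sample data and the default of 18 are arbitrary choices of ours.

```python
from collections import defaultdict

# Group words by their first letter; missing keys start as empty lists.
words = ["apple", "avocado", "banana", "blueberry", "cherry"]
by_initial = defaultdict(list)
for word in words:
    by_initial[word[0]].append(word)

# Any zero-argument callable works as a factory, for example a lambda.
ages = defaultdict(lambda: 18)
ages["unknown_person"] += 1  # starts from 18, so it becomes 19

# Membership tests do not insert missing keys; indexing does.
has_z_before = "z" in by_initial
_ = by_initial["z"]
has_z_after = "z" in by_initial

print(dict(by_initial), dict(ages), has_z_before, has_z_after)
```

The last few lines highlight a subtlety: merely reading a missing key with square brackets adds it to the dictionary.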
ChainMap
ChainMap is an extremely useful data type which was introduced in Python 3.3. It behaves like a normal dictionary but, according to the Python documentation, is provided for quickly linking a number of mappings so they can be treated as a single unit. This is usually much faster than creating one dictionary and running multiple update calls on it. ChainMap can be used to simulate nested scopes and is useful in templating. The underlying mappings are stored in a list. That list is public and can be accessed or updated using the maps attribute. Lookups search the underlying mappings successively until a key is found. By contrast, writes, updates, and deletions only operate on the first mapping.
A very common use case is providing defaults, so let's see an example:
>>> from collections import ChainMap
>>> default_connection = {'host': 'localhost', 'port': 4567}
>>> connection = {'port': 5678}
>>> conn = ChainMap(connection, default_connection)
>>> conn['port']
5678
>>> conn['host']
'localhost'
>>> conn.maps
[{'port': 5678}, {'host': 'localhost', 'port': 4567}]
>>> conn['host'] = 'packtpub.com'
>>> conn.maps
[{'port': 5678, 'host': 'packtpub.com'},
{'host': 'localhost', 'port': 4567}]
>>> del conn['port']
>>> conn.maps
[{'host': 'packtpub.com'}, {'host': 'localhost', 'port': 4567}]
>>> conn['port']
4567
>>> dict(conn)
{'host': 'packtpub.com', 'port': 4567}
Isn't it just lovely that Python makes your life so easy? You work on a ChainMap object, configure the first mapping as you want, and when you need a complete dictionary with all the defaults as well as the customized items, you can just feed the ChainMap object to a dict constructor. If you have ever coded in other languages, such as Java or C++, you probably will be able to appreciate how precious this is, and how well Python simplifies some tasks.
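We mentioned that ChainMap can be used to simulate nested scopes. A minimal sketch of that idea uses the new_child() method and the parents property, both part of the ChainMap API. The settings and values below are made up for illustration.

```python
from collections import ChainMap

defaults = {"host": "localhost", "port": 4567}
conn = ChainMap({}, defaults)

# new_child() puts a fresh mapping in front, like entering an inner scope.
debug_conn = conn.new_child({"port": 9999})

# Writes only affect the first mapping, so outer scopes are untouched.
debug_conn["timeout"] = 5

inner_port = debug_conn["port"]          # found in the inner scope
outer_port = debug_conn.parents["port"]  # skips the inner scope
host = debug_conn["host"]                # falls through to the defaults

print(inner_port, outer_port, host, "timeout" in conn)
```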
Enums
Enumerations are technically not a built-in data type, as you have to import them from the enum module, but they are definitely worth mentioning. They were introduced in Python 3.4, and though it is not that common to see them in professional code, we thought it would be a good idea to give you an example anyway for the sake of completeness.
The official definition of an enumeration is that it is a set of symbolic names (members) bound to unique, constant values. Within an enumeration, the members can be compared by identity, and the enumeration itself can be iterated over.
Say you need to represent traffic lights; in your code, you might resort to the following:
>>> GREEN = 1
>>> YELLOW = 2
>>> RED = 4
>>> TRAFFIC_LIGHTS = (GREEN, YELLOW, RED)
>>>
>>> traffic_lights = {'GREEN': 1, 'YELLOW': 2, 'RED': 4}
There's nothing special about this code. It's something, in fact, that is very common to find. But, consider doing this instead:
>>> from enum import Enum
>>> class TrafficLight(Enum):
...     GREEN = 1
...     YELLOW = 2
...     RED = 4
...
>>> TrafficLight.GREEN
<TrafficLight.GREEN: 1>
>>> TrafficLight.GREEN.name
'GREEN'
>>> TrafficLight.GREEN.value
1
>>> TrafficLight(1)
<TrafficLight.GREEN: 1>
>>> TrafficLight(4)
<TrafficLight.RED: 4>
Ignoring for a moment the (relative) complexity of a class definition, you can appreciate how this approach may be advantageous. The data structure is much cleaner, and the API it provides is much more powerful. We encourage you to check out the official documentation to explore all the great features you can find in the enum module. We think it's worth exploring, at least once.
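To illustrate the properties mentioned in the definition above (iteration and comparison by identity), here is a short sketch that reuses the TrafficLight enumeration. The variable names are ours.

```python
from enum import Enum


class TrafficLight(Enum):
    GREEN = 1
    YELLOW = 2
    RED = 4


# An enumeration can be iterated over, in definition order.
names = [light.name for light in TrafficLight]

# Members can be looked up by name as well as by value.
by_name = TrafficLight["RED"]

# Members are singletons, so they can be compared by identity.
is_same = TrafficLight(4) is TrafficLight.RED

# A plain Enum member is not equal to its underlying value.
equal_to_int = TrafficLight.GREEN == 1

# Looking up a value that no member has raises ValueError.
invalid_lookup_failed = False
try:
    TrafficLight(3)
except ValueError:
    invalid_lookup_failed = True

print(names, by_name, is_same, equal_to_int, invalid_lookup_failed)
```

Compared with bare integer constants, this prevents accidental mix-ups between a traffic light and any other value that happens to be 1, 2, or 4.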