About this book

Irrespective of one’s level of expertise with Esri software, a good command of Python is necessary to drive a geospatial environment. Python proficiency makes even an experienced user of Esri technology/software 5–10-times as valuable as a non-coding GIS analyst. Python for ArcGIS Pro explains how to incorporate scripting at each step from mapping to data science, databases, and data services.

The book leads the reader through the major uses of Python programming for ArcGIS Pro - map production, online and offline data management, data analyses, and data visualization. It shows various Python programming options for ArcGIS Pro, and how to integrate them together into a smarter workflow. You’ll learn how to use popular Python packages such as Jupyter Notebooks and pandas to explore and analyze geospatial data, and how to write data engineering scripts to manage ongoing data processing and data transfers. The book concludes with 3 real-world case studies where you’ll apply the concepts you studied earlier.

By the end of this book, you will be able to use Python to perform all the major tasks involved with ArcGIS Pro: automating the production of maps for print, managing data between ArcGIS Pro and ArcGIS Online, creating custom script tools for sharing, and then running data analysis on top of the ArcGIS geospatial library, all using Python.

Publication date:
December 2021
Publisher
Packt
Pages
489
ISBN
9781803241661

 

Python: The Beginning

Programming with computers is one of the most rewarding and frustrating of human endeavors.

Those rewards can be in the form of money, as we can see with today’s high-tech salaries. I would argue, however, that the most rewarding part of mastering programming is to make yourself into a computer power user who can execute both simple and complex applications and analyses, written in reusable code, with ease.

The frustrations will come and go, and it is a good thing: you, like me and millions before you, will learn from each mistake (it helps to be a pedant, perhaps, but not being one myself I can’t be sure). You will grow and learn with each exercise in this book, and by asking the right questions and paying close attention you can avoid some of these issues.

Whether you are an ArcGIS expert or a novice seeking to expand your skillset, congratulations: you are in the right place. In this book you will learn how to take your existing GIS expertise (or interest) and multiply its potential using a deceptively simple programming language called Python.

Computer programming is its own vast field that cannot be captured in one chapter, of course. In this chapter I will explain the basic knowledge necessary to read, write and run Python scripts. We’ll leave the ArcGIS tools for later chapters and focus on Python: its beginnings, its current state, how to use it, and importantly, what Python is and what it is not.

We will cover the following topics:

  • Basics of Python
  • Basics of computer programming
  • Installing and importing modules
  • Writing and executing scripts
 

Python: Built Different

Guido Van Rossum, the creator of the Python programming language, was frustrated with the state of computer programming in the late 1980s. Programming languages were too complex, and at the same time, too loose with their formatting requirements. This led to large codebases with complex scripts poorly written and rarely documented.

Merely running a simple program could take a long time, as the code would need to be type-checked (variables declared correctly and assigned to the correct data type) and compiled (converted from high-level code written in text files into the assembly language or machine code understood by the CPU).

As the Dutch programmer completed professional work on the ABC programming language, where he had learned much about language design, he decided he wanted to turn his gripes about the limits of ABC and other languages into a hobby.

With a master’s degree in mathematics and computer science from the University of Amsterdam, his hobbies tended towards the computer, but he did have a love for Monty Python, the British comedy series. So, he combined his passions and created Python, which is now used for all kinds of programmatic solutions. Today Python is everywhere: on the internet, in appliances, in cars, and so much more. Because of its ubiquity and its simplicity, it has been adopted by the GIS software ecosystem as a standard programming tool.

 

Why Python is different

Because of Van Rossum’s extensive experience with the state of computer languages in the 1980s, he was well positioned to create a language that solved many of their deficiencies. He added features that he admired from many other languages and added a few of his own. Here is an incomplete list of Python features built to improve on other languages:

Issue | Improvement | Python Feature
Memory overrun | Built-in memory management | Garbage collection and memory management
Slow compiler times | One line testing, dynamic typing | Python Interpreter
Unclear error messages | Messages indicating the offending line and affected code | Error Traceback
Spaghetti code | Clean importation and modularization | Importation
Unclear code formatting and spacing making code unreadable | Indentation rules and reduced brackets | Forced whitespace
Too many ways to do something | There should be only one way: the Pythonic way | The Zen of Python

Python Versions

The original version of Python was released by Van Rossum in 1991 (version 0.9.0), with Python 1.0 following in 1994. Python 1.x was eventually superseded by the widely popular Python 2.x, first released in 2000. Care was taken to ensure that version 2.0 and beyond were backwards-compatible with Python 1.x. However, for Python 3.0 (released in 2008) and beyond, backwards compatibility with Python 1 and Python 2 was broken.

This break has caused a divergence in the Python ecosystem. Some companies chose to stick with Python 2.x, which has meant that the “sunset” date or retirement date for the older version was extended from 2015 until January 1, 2020, with the final release (2.7.18) following in April 2020. Now that the sunset date has passed, there is no active work by the Python Software Foundation (PSF) on Python 2.x. Python 3.x development continues and will continue into the future, overseen by the PSF.

Van Rossum served as Python’s Benevolent Dictator for Life (BDFL) until he resigned the position in 2018.

Check out more about the history of Python: https://docs.python.org/3/faq/general.html

Figure 1: Divergence of Python 3 from Python 2

ArcGIS Python Versions

Since ArcMap version 9.x, Python has been integrated into the ArcGIS software suite. However, ArcGIS Desktop and ArcGIS Pro now depend on different versions of Python.

ArcGIS Desktop: Python 2.x

ArcGIS Desktop (or ArcMap) version 9.0 and above ships with Python 2.x included. The installer for ArcGIS would automatically install Python 2.x and would add the arcpy module (originally arcgisscripting) to the Python path variable, making it available for scripting.

ArcMap, ArcCatalog, ArcGIS Engine, and ArcGIS Server all depend on arcpy and the Python 2.x version included when the ArcGIS Desktop or Enterprise software is installed.

ArcGIS Pro: Python 3.x

ArcGIS Pro, which was designed after the decision to sunset Python 2.x was announced, was divorced from the Python 2.x ecosystem and instead shipped with Python 3.x.

ArcGIS Pro ships with a Python 3 version of arcpy, and it also includes the ArcGIS API for Python.

Managing both versions

The sunsetting of ArcGIS Desktop has been extended to March 2025, meaning that Python 2.7 will be included by Esri until that time despite it being officially retired by the Python Software Foundation.

Because of this, we will learn to use virtual environments to manage the versions, and you will learn about the PATH and PYTHONPATH environment variables, which control which version of Python is used to execute a script.

IMAGE CREDIT: https://media.geeksforgeeks.org/wp-content/uploads/20190502023317/TIMELINE.jpg

What is Python?

In short, Python is an application: python.exe. This application is also an executable file, meaning it can be run by itself to interpret code, or it can be called from other applications to run custom scripts. This standard interoperability is part of why it is included in applications such as ArcGIS Pro. When ArcGIS is installed, Python is also installed on your computer, along with a series of supporting files and folders.

Python includes a large standard library of tools or “modules”. These include support for internet requests, advanced math, CSV reading and writing, JSON serialization, and many more modules included in the Python core. While these tools are powerful, Python was also built to be extensible, meaning that third-party modules can be easily added to a Python installation. The two ArcGIS Python modules (arcpy and the ArcGIS API for Python) are both good examples of extending the capabilities of Python. There are hundreds of thousands of others, covering almost any type of programming need, of varying quality.
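To give a feel for the standard library, here is a minimal sketch that uses the csv, json, and math modules together. The city names and population figures are made-up sample data, and the CSV text is held in memory (using io.StringIO) so that no file on disk is needed:

```python
import csv
import io
import json
import math

# CSV text held in memory so the example needs no file on disk (made-up data).
csv_text = "city,population\nEureka,5200\nPeoria,113000\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Convert the population strings to integers and total them.
total_population = sum(int(row["population"]) for row in rows)

# Serialize the parsed rows to a JSON string and read them back.
json_text = json.dumps(rows)
round_trip = json.loads(json_text)

# Use the math module for a square root.
hypotenuse = math.sqrt(3 ** 2 + 4 ** 2)

print(total_population)
print(round_trip[0]["city"])
print(hypotenuse)
```

None of these imports require an installation step, because the modules ship with Python itself.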

The standard implementation of Python (known as CPython) is written in the programming language C. There are variants of Python written in other languages for a variety of technical reasons, but most installations of Python are built on top of C. This means that Python is often expanded through modules built on top of C code, usually for speed improvement reasons. A Python code “layer” or “wrapper” is put on top of C code to make it work with normal Python packages, gaining the simplicity of Python and the processing speed boosts of precompiled C code. NumPy and SciPy are examples of this type of module, and are included with the ArcGIS installation of Python.

Python is free and open software, which is another reason it is packaged with so many other software applications for automation purposes. Python can also be installed separately, using a free installer from the Python Software Foundation.

Check out the Python Software Foundation on the internet: https://www.python.org/psf

Download Python versions directly from the PSF: https://www.python.org/downloads/

Where is it installed

On Windows machines, Python is not included by default – it must be installed along with ArcGIS or separately using an installer from the Python Software Foundation. Once the ArcGIS Installer is run, you will see a folder inside the C:\ drive. You can set a custom location or use the default.

Python Interpreter

When you start python.exe by double-clicking on it (see below for multiple other ways to run the executable), it starts what is known as the Python Interpreter.

This is a useful interface, allowing you to enter, one line at a time, bits of code for testing and confirmation. Once the line is entered, push Enter/Return and the code will be executed. This tool helps you both learn coding and test code in the same environment.

Starting the Interpreter

Double-clicking on python.exe from the folder, or starting Python (command line) from the Start Menu, will start the interpreter, which allows for one-line commands to be executed:

Python 3 is very similar:

What is a Python script?

The python.exe executable file, along with being a program where code can be run, will also execute Python scripts. These scripts are simple text files that can be edited by any text editing software. Python scripts are saved with the .py extension.

When a Python script is “run”, it is passed as the first command line argument to the Python executable (python.exe). This program will read and then execute the code from the top to the bottom as long as it is valid Python and it contains no errors. If there is an error encountered, the script will stop and return an error message. If there is no error, nothing will be returned unless you have added “print” statements to return messages from the main loop to the Python window as the script is running.

In this example the script is executed by “passing” the script as an argument to the executable (python.exe), which is explicitly called with the full folder path to the python.exe file to avoid path issues:

C:\Projects>C:\PythonArcGIS\ArcGIS10.5\python.exe chapter1.py

In this example the script is executed by “passing” the script as an argument to the executable, along with optional parameters that are accepted by the script itself before being run:

C:\Projects>C:\PythonArcGIS\ArcGIS10.5\python.exe chapter1.py arg1 arg2

Versions included

Python comes with two versions of the python.exe file. These are the same version of Python, to be clear, but each file has a different role. Python.exe is the main file, and the other version is pythonw.exe. This second file will not open an interpreter if double-clicked, as the normal python.exe will. No interpreter is available from pythonw.exe, which is the point: it is used to execute scripts more “silently” than python.exe. Use python.exe to start the interpreter.

How to call the executable

The Python “executable” (python.exe) is accessed to run the Python Interpreter or to run a custom Python script. There are many different ways to “call” or start the Python executable:

  • Double-click on python.exe
    • Starts the Python Interpreter
  • Open IDLE, the included integrated development environment (IDE)
    • Should be accessible in your Start menu on Windows in the ArcGIS folder
  • Open a CMD terminal and type “python”
    • Only works if the Python executable is in the PATH environment variable
  • Using a third-party IDE such as PyCharm
    • Each PyCharm project can have its own virtual environment, and therefore its own executable, or it can use the one installed by Esri when ArcGIS is installed.
    • There are a lot of IDEs but PyCharm is the one I recommend for a variety of reasons.
  • Using a Jupyter Notebook, which we will discuss extensively in this book
    • This requires the installation of Jupyter, which is not included in the standard Python installation.
  • Inside ArcGIS Desktop or ArcGIS Pro
    • There are menu buttons that allow you to start a Python interpreter window inside ArcMap or ArcCatalog or ArcGIS Pro.
    • Run code one line at a time or by using the load script command in the right-click menu.

IDLE development environment

The included IDE called IDLE is a useful environment that comes standard with every Python instance. IDLE is useful for the Python Interpreter, but also because you can create and execute scripts in this environment easily by opening a new script from the File menu, and then using the script’s Run menu to execute the script.

The Path environment variable

On Windows there is a system environment variable known as the Windows Path environment variable. This variable is available to all applications installed on the machine. Windows uses it to find executables that are called by name instead of by full file location, so for Python it determines which python.exe is started when you type “python” in a CMD window.

This is important to understand because you may end up with multiple versions of Python on your computer one day, or after just one install of ArcGIS Desktop or Pro.

If a script is run in a CMD window using the command “python script.py” (passing the script to Python as an argument), and it contains import statements, then there are three things that have to happen.

First, Windows will look for an executable called python.exe in the folders listed in the Path. If it is there, Python will then confirm that the script is valid. If it is, then Python will run the script, and for each import statement it will search the locations in the Python path (the sys.path list, which includes the script’s folder, any folders listed in the PYTHONPATH environment variable, and the installation’s library folders) for the module you are trying to import.

So the Python executable cannot be run by name (instead of file location) until the folder containing python.exe is in the Path. Here is how you edit the Path variable:

Open up the Advanced System Settings in the Control Panel:

Locate and double-click on the Path variable (or press edit when selected):

Add a new line to the Path environment variable in the interface. If you have multiple versions of Python and you are not using virtual environments, be sure to order the folders in the Path so that the correct version of Python is called when you type “python” into a CMD line window:

If you are not allowed to edit the Path variable, you can still run Python in the command line by referring to it using the whole path to the executable: C:\ArcGIS10.8\Python\python.exe script.py
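If you are unsure which Python a command window would start, a short script can report it. This is a minimal sketch using only the standard library; the printed values will differ from machine to machine, so no specific output is shown:

```python
import os
import shutil
import sys

# Full path of the Python executable running this code.
current_python = sys.executable

# First "python" executable that the operating system finds in the Path,
# or None if no folder in the Path contains one.
python_on_path = shutil.which("python")

# The Path variable is one long string; os.pathsep (";" on Windows)
# separates the individual folders.
path_folders = os.environ.get("PATH", "").split(os.pathsep)

print("Running:", current_python)
print("Found on Path:", python_on_path)
print("Folders in Path:", len(path_folders))
```

If the two executables differ, the script was started with one Python while typing “python” in a CMD window would start another, which is a common source of confusion when several versions are installed.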

The operating system and Python system modules

Two modules (code libraries) built into Python need to be mentioned first. The os and sys modules, also called the operating system module (os) and the Python system module (sys) are used to control Windows system operations and Python system operations respectively.

The OS module

The os module is used for many things, including folder path operations such as creating folders, removing folders, checking if a folder or file exists, or executing a file using the operating system-associated application used to run that file extension. Getting the current directory, renaming files, and more is possible with this module (copying files is handled by the related shutil module).

In this example, a string is passed to the os.path.exists function, which returns a Boolean. If it returns False, the folder does not exist, and is then created using the os.mkdir function:

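import os
# A raw string (r"...") keeps the backslash from being read as an escape character
folderpath = r"C:\Test_folder"
# Create the folder only if it does not already exist
if not os.path.exists(folderpath):
    os.mkdir(folderpath)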
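A slightly more general pattern is sketched below. It assumes you want nested folders created in one step, and it works inside a temporary folder (from the standard tempfile module) so that nothing is left behind on your machine. The folder names “project” and “data” are placeholders:

```python
import os
import tempfile

# Work inside a temporary folder so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as base_folder:
    # os.path.join builds the path with the correct separator for the OS.
    nested_folder = os.path.join(base_folder, "project", "data")

    # os.makedirs creates intermediate folders; exist_ok=True avoids an
    # error if the folder is already there.
    os.makedirs(nested_folder, exist_ok=True)
    created = os.path.isdir(nested_folder)

    # os.listdir returns the names of the entries inside a folder.
    contents = os.listdir(os.path.join(base_folder, "project"))

print(created)
print(contents)
```

Using os.path.join instead of typing backslashes by hand also sidesteps the escape character problem in Windows paths.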

Read about the os module here: https://www.geeksforgeeks.org/os-module-python-examples/

The sys module accepts arguments

The sys module allows you to accept arguments to a script at runtime, meaning when it is executed. This is done by using sys.argv, which is a list containing the script name (at index 0) followed by all arguments passed to the script when it is executed.

Here is what a script looks like when it uses the sys module to accept a parameter and assign it to a name variable:
import sys
name = sys.argv[1]
print(name)
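
The short script above fails with an IndexError if no argument is supplied. One possible way to handle that, sketched here with a hypothetical helper function get_name and a made-up default value, is to check the length of the list first:

```python
import sys


def get_name(argv, default="World"):
    """Return the first script argument, or a default if none was given."""
    # argv[0] is the script name, so the first real argument is argv[1].
    if len(argv) > 1:
        return argv[1]
    return default


if __name__ == "__main__":
    print(get_name(sys.argv))
```

Passing the list into a function, rather than reading sys.argv directly inside it, makes the logic easy to try from the interpreter with any list of strings.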

The System path

The sys module contains the Python path or system path (system in this case means Python). This is the list of folders that Python searches, in order, for importable modules. If you can’t edit the Windows environment variables as explained above (due to permissions usually), you can alter the Python path at runtime by editing this list.

The sys.path list is a part of the sys module built into Python:
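
As a rough sketch, the list can be printed and extended at runtime. The folder C:\Projects\modules used here is a hypothetical location; replace it with a folder that actually holds your modules:

```python
import sys

# Hypothetical folder holding custom modules; replace with a real location.
custom_folder = r"C:\Projects\modules"

# sys.path is an ordinary list of folder strings, searched in order on import.
if custom_folder not in sys.path:
    sys.path.append(custom_folder)

for folder in sys.path:
    print(folder)
```

The change only lasts for the current Python session; the next time Python starts, sys.path is rebuilt from its defaults.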

Read more about the sys module here: https://www.geeksforgeeks.org/python-sys-module/

 

Basics of programming

Computer programming varies from language to language in terms of implementation, but there are remarkable similarities among these languages in how their internal logic works. These programming basics are applicable for all programming languages with specific code implementations shown in Python.

Key Concepts

Variables: Names assigned to Python objects of any data type. Variables must start with a letter or an underscore. Underscores are encouraged to separate words.
x = 0
y = 1
xy = x + y
xy_str = str(xy)

Iteration: For loops are used to iterate through an iterable data object (e.g. a list). While loops are used to loop until a condition has been met.
datalist = [1, 2, 3]
for item in datalist:
    print(item)
x = 0
while x < 1:
    x += 1

Conditionals: If/Elif/Else statements that evaluate whether an object meets a condition.
list_var = [1, '1', 1.0]
for item in list_var:
    if type(item) == type(0):
        print('Integer')
    elif type(item) == type('a'):
        print('String')
    else:
        print('Float')

Zero-based indexing: Data containers are accessed using indexes that start with 0. The indexes are passed to the list or tuple using square brackets []. String characters can be accessed using the same pattern.
list_var = ['s', 'm', 't']
m_var = list_var[1]
name_var = "logan"
g_var = name_var[2]

Data Types: Strings are for text. Integers are for whole numbers. Floats are for floating point numbers. Data containers such as lists and tuples and dictionaries are used extensively to organize data.
str_var = "string"
int_var = 4
float_var = 5.7
list_var = [45, 43, 24]
tuple_var = (87, 'a', 34)
dict_var = {'key': 'value'}

Code Comments: Comments in code are encouraged. They help explain your thinking to both other readers and yourself. Comments are created by using the “#” symbol. Comments can be on a line by themselves or can be added to the end of a statement, as anything after the # symbol will be ignored.
# This is a comment
x = 0  # also a comment

Errors: Error messages of many types are built into Python. The error traceback shows the affected lines of code and the type of error. It’s not perfect.
>>> str_var = 'red"
  File "<stdin>", line 1
    str_var = 'red"
                  ^
SyntaxError: EOL while scanning string literal

Counters/Enumerators: Using a variable to keep track of the number of loops performed by a for loop or while loop is a good idea. Some languages (including Python) have some built-in enumeration functionality. Counters are reassigned to themselves after being increased. In Python the shortcut “x += y” is the same as “x = x + y”.
counter = 0
list_var = [34, 54, 23, 54]
for item in list_var:
    print(item, counter)
    counter += 1

Variables

Variables are used to assign objects to labels or identifiers. They are used to keep track of pieces of data, to organize the flow of the data through the script, and to help programmers read the script.

variable = 1 # a variable assignment

It is recommended (by me) to use descriptive variables that are neither too long nor too short. When variables are too short, they can become confusing to read. When they are too long, they can be confusing to write. Using underscores to separate words in variables is a common practice.

Assigned to vs is equal to (value comparison)

In Python, variables are assigned to an object using the equals sign “=”. This means that a different operator is needed to check if a value is equal to another value: the double equals sign “==”.

variable = 1 # a variable assignment
variable == 1 # a comparison (that is True)

Variable formatting rules

Variables must start with a letter or an underscore. They cannot start with a number or other symbol, otherwise a SyntaxError will occur. However, numbers and underscores can be used in the rest of the variable name:

>>> 2var = 34
  File "<stdin>", line 1
    2var = 34
     ^
SyntaxError: invalid syntax
>>> two_var = 34
>>> two_var
34

Read more about variables here: https://realpython.com/python-variables/

Iteration

The core of computer programming is iteration: repeatedly performing the same action or analysis or function call or whatever your script is built to process. Computers excel at this type of task: they can quickly iterate through a dataset to perform whatever action you deem necessary, on each data item in the set.

For loops

A “for loop” is an iteration implementation that, when presented with a data list, will perform an operation on each member of the list.

In this example, a list of integers is assigned to the variable name data_list. The list is then used to construct a for loop using the format “for {var} in {iterable}” where {var} is a variable name that is assigned to each object in the list, one at a time as the loop progresses. One convention is to use “item” but it can be any valid variable name:

>>> data_list = [45,56,34,12,2]
>>> for item in data_list:
...     print(item * 2)
...
90
112
68
24
4

While loops

A “while loop” is an iteration implementation that will loop until a specific threshold is met. While loops can be dangerous as they can cause an infinite loop in a script if the threshold is never met.

In this example, the while loop will run, doing nothing but adding 1 to x, until x reaches 100, at which point the condition is no longer met and the while loop will end:

x = 0
while x < 100:
    x = x + 1   #same as x += 1
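
One common way to guard against an accidental infinite loop is to count the loops and stop once a limit is reached. The sketch below is only an illustration: the variable names loops and max_loops and the limit of 1000 are arbitrary choices, not a Python requirement:

```python
x = 0
loops = 0
max_loops = 1000  # hypothetical safety limit; choose one that suits your data

while x < 100:
    x += 7
    loops += 1
    if loops >= max_loops:
        # Stop instead of looping forever if the condition is never met.
        break

print(x, loops)
```

Here the loop ends normally after 15 passes with x equal to 105, so the safety limit is never reached; it only matters if a mistake in the loop body keeps the condition from ever becoming False.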

Read more about loops here: https://www.geeksforgeeks.org/loops-in-python/

Counter and enumerators

Iteration in for loops or while loops often requires the use of counters (also called enumerators) to track loops in an iteration.

For loops have the option to use the enumerate function by passing the iterable to the function and placing a count variable (can be any valid variable name but count is logical) in front of the item variable. The count variable will keep track of the loops, starting at index zero:

>>> data_list = ['a','b','c','d','e']
>>> for count,item in enumerate(data_list):
...     print(count, item)
... 
0 a
1 b
2 c
3 d
4 e

In Python the shortcut “x += y” is used to increase the value of x while keeping the same variable name, which is the same as “x = x + y”:

>>> x = 0
>>> while x < 100:
...     x = x + 1
...
>>> x
100
>>> x = 0
>>> while x < 100:
...     x += 1
...
>>> x
100
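
The enumerate function also accepts an optional start argument, which is handy when you want the count to begin at 1 (for numbering rows in a report, for example). A minimal sketch, with numbered as an illustrative variable name:

```python
data_list = ['a', 'b', 'c']

numbered = []
for count, item in enumerate(data_list, start=1):
    # count begins at 1 here because of the start argument.
    numbered.append((count, item))

print(numbered)
```

The items themselves are unchanged; only the counter is shifted, so data_list[0] is still 'a'.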

Conditionals

If statements, Elif statements (short for else if), and Else statements are used to create conditions that will be used to evaluate data objects. An if statement can be used by itself (elif and else are optional) and is written by declaring the keyword if and then the condition the data must meet:

list_var = [1, '1', 1.0]
for item in list_var:
    if type(item) == type(0):
        print('Integer')
    elif type(item) == type('a'):
        print('String')
    else:
        print('Float')

Read more about conditionals here: https://realpython.com/python-conditional-statements/

If vs Else

If statements are usually specific to one condition, while else statements are used as catch-alls to ensure that any data that goes through the if statement will have some way of being dealt with, even if it doesn’t meet the condition of the if statement. Elif statements, which are dependent on the if statement existing and are also condition specific, are not catch-all statements.
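
To illustrate the catch-all role of else, here is a sketch that uses the built-in isinstance function instead of comparing type() results. The function name describe and the labels it returns are made up for this example:

```python
def describe(item):
    """Return a label for the data type of item."""
    # bool is checked first because True and False also count as integers.
    if isinstance(item, bool):
        return 'Boolean'
    elif isinstance(item, int):
        return 'Integer'
    elif isinstance(item, str):
        return 'String'
    elif isinstance(item, float):
        return 'Float'
    else:
        # Catch-all for anything the conditions above did not match.
        return 'Other'


labels = []
for item in [1, '1', 1.0, True, None]:
    labels.append(describe(item))

print(labels)
```

Without the final else, a value such as None would pass through every condition and the function would silently return nothing useful.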

List Position (or why programmers count from 0)

Iteration occurs over lists that contain data. Within the list, these data are differentiated by list order or position. Items in a list are retrieved by item index, the (current) position of the data in the list.

Zero-based indexing

In Python, like most computer programming languages, the first item in a list is at index 0, not index 1.

This is a bit confusing to beginners but is a programming standard. It is slightly more computationally efficient to retrieve an item in a list that starts with 0 than a list that starts with 1, and this became the standard in C and its precursors, which meant that Python (written in C) uses zero-based indexing.

Data extraction using index position

This is the basic format of data retrieval from a list. This list of strings has an order, and the string “Bill” is the second item, meaning it is at index 1. To assign this string to a variable, we pass the index into square brackets:

names = ["Silas", "Bill", "Dara"]
name_bill = names[1]

Data extraction using reverse index position

This is the second format of data retrieval from a list. List order can be used in reverse, meaning that the indexing starts from the last member of the list and counts backwards. Negative numbers are used, starting at -1, which is the index of the last member of the list, and -2 is the second-to-last member of the list and so on. This means that the “Bill” string is at index -2 when using reverse index position, and so -2 must be passed to the list in square brackets:

names = ["Silas", "Bill", "Dara"]
name_bill = names[-2]
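
The two indexing styles can be compared side by side. This short sketch also shows what happens when an index is outside the list; the variable names are illustrative only:

```python
names = ["Silas", "Bill", "Dara"]

# len(names) is 3, so index -2 refers to the same item as index 3 - 2 = 1.
from_front = names[1]
from_back = names[-2]
computed = names[len(names) - 2]

print(from_front, from_back, computed)

# Asking for an index past the end of the list raises an IndexError.
try:
    names[3]
except IndexError as error:
    message = str(error)
    print(message)
```

Because the list has three items, the valid indexes are 0 through 2 (or -3 through -1), and index 3 is out of range.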

Read more about indexing here: https://realpython.com/lessons/indexing-and-slicing/

Data Types

The data type of a variable determines its behavior. For instance, the character 5 could be an integer type (5) or a float (5.0) or a string (“5”). Each version of 5 will have different available tools, such as the replace method for strings which can replace characters in the string with other characters.

Key Data Types

Data Type | Python Data Type Object
Text data is stored as a String data type | str
Numeric data is stored as an Integer or Float or Complex type | int, float, complex
Sequence data (lists or arrays) can be stored as a list or tuple. Range is a special generator | list, tuple, range
Mapping or key/value pair data types are also known as dictionaries in Python | dict
A Set is a data type that contains distinct, immutable objects | set, frozenset
Boolean is either True or False, 1 or 0 | bool
Binary data types are used to access data files in binary mode. | bytes, bytearray, memoryview

Checking the data type

To check the data type of a Python variable, use the type() function:

>>> x = 0
>>> type(x)
<class 'int'>
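
The same check works for any object. As a small sketch, this loop collects the type name of several different versions of “5” and other containers, then shows that the string version has the replace method mentioned earlier:

```python
values = [5, 5.0, "5", [5], (5,), {"five": 5}, True]

type_names = []
for value in values:
    # type() returns the class; __name__ is the class name as a string.
    type_names.append(type(value).__name__)

print(type_names)

# Only the string version of 5 has a replace method.
replaced = "5".replace("5", "five")
print(replaced)
```

Calling replace on the integer 5 would raise an AttributeError, which is one practical reason to check the data type before working with a value.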

Strings

All text data is represented as the String data type in Python. These are known as strings. Common data stored as strings includes names, addresses, or even whole blog posts.

Strings can also be templated in code to allow for “fill-in-the-blank” strings that are not set until the script is run. Strings are technically immutable but can be manipulated using built-in Python string tools and the separate string module, which return new strings rather than changing the original.

Key Concepts

Quotation Marks: Single or double quotation marks can be used to designate a string, as long as it is the same at the beginning and end. Triple quotation marks are used for strings with multiple lines. Quotes within a string can be indicated using the opposite mark as the one opening and closing the string.
String addition: Strings can be “added” together to form a larger string. Strings can also be “multiplied” by an integer to repeat the string X times.
String formatting: String templates or placeholders can be used in code and filled in at run-time with the data required.
String manipulation: Strings can be manipulated using built-in functionality. Characters can be replaced or located. Strings can be split or joined.

Quotation marks

Strings must be surrounded by quotation marks. In Python, these can be either single or double quotes, but they must be consistent. If a single quote is used to start the string, a single quote must be used to stop it:

>>> string_var = 'the red fox"
  File "<stdin>", line 1
    string_var = 'the red fox"
                             ^
SyntaxError: EOL while scanning string literal
>>> string_var = 'the red fox'
>>> string_var
'the red fox'

Multiple line strings

Multiple line strings are created by placing three single quotes or three double quotes at the beginning of the string, and three at the end.

In this example the variable string_var is a multiple line string (“\n” is a Python character representing a new line):

>>> string_var = """the red fox chased the
... dog across the yard"""
>>> string_var
'the red fox chased the\ndog across the yard'

String addition (and more)

Strings can be “added” together to create a new string. This process allows you to build strings from smaller strings, which can be useful for populating new fields composed of other fields in a data file and other tasks.

In this example the string “forest” is assigned to string_var. Another string is then added to string_var to create a longer string.

>>> string_var = "forest"
>>> string_var += " path" #same as string_var = string_var + " path"
>>> string_var
'forest path'
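
String “multiplication” and the need to convert numbers before adding them to a string can be sketched the same way. The variable names separator, title, and label are illustrative:

```python
separator = "-" * 10             # repeat a string 10 times
title = "forest" + " " + "path"  # join strings with +

# Numbers must be converted with str() before they can be added to a string.
label = title + " " + str(3)

print(separator)
print(label)
```

Leaving out the str() call would raise a TypeError, because Python will not guess whether you meant to add numbers or join text.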

String formatting

Strings in code often make use of “placeholders” for data that will be filled in later. This is known as string formatting, and there are multiple ways to perform string formatting using Python.

Key Concepts

Format function: All strings have a built-in method called format that allows the string to have arguments passed. It will accept all data types and format the string from a template.
String literals: For Python 3.6+, there is a newer tool called formatted string literals (often called f-strings), which allow you to insert variables into strings directly. An “f” is placed in front of the string.
Data type string operators: An older but still useful tool is the set of string operators, which are used in strings as placeholders for specific data types (either strings or floats or integers).

String format function

This method of formatting is the preferred form for Python 3 (it is also available in Python 2.7). It allows you to pass the variables to the format function (which is built into all strings) and to have them fill up placeholders within the string. Any data type can be passed to the format function.

In this example, the string template is filled with details contained in other variables using the format string function. The placeholders are filled in the order that the variables are listed, so they must be in correct order. The curly brackets are the placeholders, and the format function will accept arguments and fill in the string:

>>> year = 1980
>>> day = "Monday"
>>> month = "Feb"
>>> template = "It was a cold {} in {} {}"
>>> template.format(day, month, year)
'It was a cold Monday in Feb 1980'

In this example, the placeholders are named and are filled by keyword arguments passed to the format function. Because the arguments are named, they do not need to be in order:

>>> template = 'It was a cold {day} in {month} {year}'
>>> template.format(month=month,year=year,day=day)
'It was a cold Monday in Feb 1980'

In this example, the placeholders are numbered, which makes it much easier to repeat a string:

>>> template = "{0},{0} oh no,{1} gotta go"
>>> template.format("Louie", "Me")
'Louie,Louie oh no,Me gotta go'

Formatted string literals

There is a newer (as of Python 3.6) method of formatting strings known as formatted string literals, or f-strings. By adding an “f” before a string, its placeholders are populated directly from variables, without using the format function.

In this example, the variables are formatted directly into the string, which has an “f” before the opening quotation mark to indicate that it is a formatted string literal:

>>> year = 1980
>>> day = "Monday"
>>> month = "Feb"
>>> str_lit = f"It was a cold {day} in {month} {year}"
>>> str_lit
'It was a cold Monday in Feb 1980'

Read more about string formatting here: https://realpython.com/python-string-formatting/

String manipulation

String manipulation is common and lots of tools are built into the String data type. These allow you to replace characters in a string or find their index location in the string.

The find and index methods are similar, but find is easier to use in conditional statements. If the character is not found in the string, find will return -1, while index will raise a ValueError.

The join method is used to join together a list of strings. The split method is the opposite: it splits a string into a list based on a supplied character, or on whitespace by default.

Method    Example

join
    string_list = ['101 N Main St', 'Eureka', 'Illinois 60133']
    address = ', '.join(string_list)

replace
    address = '101 N Main St'.replace("St", "Street")

find, rfind
    str_var = 'rare'
    str_index = str_var.find('a')  # index 1
    str_index = str_var.find('r')  # index 0
    str_index = str_var.rfind('r')  # index 2

upper, lower, title
    name = "Laura"
    name_upper = name.upper()
    name_lower = name.lower()
    name_title = name_lower.title()

index, rindex
    str_var = 'rare'
    str_index = str_var.index('a')  # index 1
    str_index = str_var.index('r')  # index 0
    str_index = str_var.rindex('r')  # index 2
    str_var.index('t')  # this will cause an error

split
    latitude, longitude = "45.123,-95.321".split(",")
    address_split = '101 N Main St'.split()
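
The difference between find and index matters most inside conditional logic. The following is a minimal sketch, using a made-up address string, of how each method might be handled when the text being searched for is missing.

```python
address = "101 N Main St"

# find returns -1 when the substring is missing, so it fits in an if statement.
if address.find("Main") != -1:
    has_main = True
else:
    has_main = False

# index raises a ValueError instead, so it needs a try/except block.
try:
    position = address.index("Elm")
except ValueError:
    position = None

# split and join are opposites: split makes a list, join rebuilds a string.
parts = address.split()
rebuilt = " ".join(parts)

print(has_main, position, parts, rebuilt)
```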

String indexing

String indexing is similar to list indexing, as explained above. Individual characters, or groups of characters, can be selected from a string by passing the index of the character needed to the string in square brackets.

In this example, the “d” from “readiness” is accessed by passing the index 3 in square brackets next to the string:

>>> str_var = "readiness"
>>> d_var = str_var[3]
>>> d_var 
'd'

Groups of characters are selected by passing a start and end index, where the end index is the index of the first character you do not want to include:

>>> str_var = "readiness"
>>> din_var = str_var[3:6]  # index 6 is e
>>> din_var
'din'
>>> dine_var = str_var[3:7]  # index 7 is s
>>> dine_var
'dine'

Integers

The Integer data type represents whole numbers. It can be used to perform addition, subtraction, multiplication, and division (with one caveat as noted below).

>>> int_var = 50
>>> int_var * 5
250
>>> int_var / 5
10.0
>>> int_var ** 2
2500

Convert a string to an integer

To convert a string (or a float) to an integer, use the int function:

>>> x = '0'
>>> y = int(x)
>>> y
0
>>> type(y)
<class 'int'>
>>> type(x)
<class 'str'>

Integer math issue in Python 2

A well-known (and well-intentioned) design issue in Python 2 is integer division. Dividing one integer by another returns only the whole-number part of the result, and the remainder is discarded, which is usually not what you want. In Python 2, it is encouraged to convert integers into floats before dividing.

Here is an example of the issue:

Python 2.7.16 (default, Dec 21 2020, 23:00:36) 
>>> 5/3
1

This issue has been fixed in Python 3:

Python 3.8.2 (default, Apr  8 2021, 23:19:18) 
>>> 5/3
1.6666666666666667
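
If the whole-number result is what you actually want in Python 3, the floor division operator is available. This short sketch shows the standard operators involved; the variable names are only illustrative.

```python
# True division always returns a float in Python 3.
true_result = 5 / 3

# Floor division (//) discards the remainder, matching the Python 2 result.
floor_result = 5 // 3

# The modulo operator (%) returns the remainder that // discards.
remainder = 5 % 3

# divmod returns both values at once as a tuple.
quotient_and_remainder = divmod(5, 3)

print(true_result, floor_result, remainder, quotient_and_remainder)
```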

Read more about integers in Python here: https://realpython.com/python-numbers/

Floating Numbers

Floating point numbers in Python are used to represent real numbers as 64-bit double-precision values. Because a binary system cannot represent every decimal fraction exactly, some results can look a bit odd (for example, 0.1 + 0.2 evaluates to 0.30000000000000004), so keep an eye out, but in general these will work as expected.

>>> x = 5.0
>>> x * 5
25.0
>>> x ** 5
3125.0
>>> x/2.3
2.173913043478261
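
The occasional oddness of binary floating point shows up most often when comparing values. Below is a minimal sketch of the problem and two common ways to work around it, using the math module from the standard library.

```python
import math

total = 0.1 + 0.2

# The sum is very close to 0.3 but not exactly equal to it.
exactly_equal = (total == 0.3)
close_enough = math.isclose(total, 0.3)

# Rounding is another common way to tidy a result for display.
rounded = round(total, 2)

print(total, exactly_equal, close_enough, rounded)
```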

Convert a string to a float

To convert a string (or an integer) to a float, use the float function:

>>> x = '5'
>>> y = float(x)
>>> type(y)
<class 'float'>

Read more about floating point numbers in Python here: https://www.geeksforgeeks.org/python-float-type-and-its-methods

Conversion between data types

Conversion between data types is possible in Python using built-in functions. To start, the type function is useful to find the data type of an object. Once identified, the data object can be converted from Integer (int function) to String (str function) to Float (float function), as long as the value would be valid in that data type.
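
The condition at the end of the paragraph above, that the value must be valid in the target data type, can be sketched as follows. The sample values are made up, and the try/except guard is one reasonable way to handle bad input rather than the only one.

```python
# A string holding a whole number converts cleanly.
count = int("42")

# A string holding a decimal cannot go straight to an integer;
# convert it to a float first, then to an integer (which truncates).
whole = int(float("5.7"))

# Text that is not a number raises a ValueError, so guard against it.
raw_value = "n/a"
try:
    parsed = float(raw_value)
except ValueError:
    parsed = None

print(count, whole, parsed)
```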

In these examples, a value is converted from String to Integer to Float to String using the int and str and float functions:

>>> str_var = "5"
>>> int_var = int(str_var)
>>> int_var
5
>>> float_var = float(int_var)
>>> float_var
5.0
>>> str_var = str(float_var)
>>> type(str_var)
<class 'str'>

Data Structures or Containers

Data structures, also called data containers and data collections, are special data types that can hold, in a retrievable order, any data item of any data type (including other data containers). Data containers are used to organize data items by index in tuples or lists, or by key:value pair in dictionaries.

Data retrieval from data containers

To get data out of data containers, square brackets are used to pass either indexes (lists and tuples) or keys (dictionaries). If there is more than one level of data container (i.e. one container contains another), first the data container inside is referenced using an index or key inside a first square bracket, and then the data inside the container is accessed using a second.

Data Container    Example

Tuple
    tuple_var = ("blue", 32, [5, 7, 2], 'plod', {'name': 'magnus'})
    plod_var = tuple_var[-2]
    magnus_var = tuple_var[-1]['name']

List
    list_var = ['fast', 'times', 89, 4.5, (3, 8), {'we': 'believe'}]
    times_var = list_var[1]

Dictionary
    dict_var = list_var[-1]
    believe_var = list_var[-1]['we']

Tuples

Tuples are ordered collections that can hold any data type, even in the same tuple. They are immutable, meaning they cannot be altered, and data cannot be added to or removed from the tuple once it has been created. They have length and the built-in len function can be used to get the length of the tuple.

In Python they are declared by using round brackets () or the tuple function. Data is accessed using zero-based indexing by passing the index to square brackets next to the tuple.

In this example, a tuple is assigned to the variable name tuple_var, and data is accessed using indexing:

>>> tuple_var = ("red",45,"left")
>>> type(tuple_var)
<class 'tuple'>
>>> ("red",45,"left")[0]
'red'
>>> tuple_var[0]
'red'
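
Immutability and length can be seen in a short sketch. The tuple reuses the values from the example above; the replacement value "green" is made up for illustration.

```python
tuple_var = ("red", 45, "left")

# len works on tuples just as it does on lists and strings.
tuple_length = len(tuple_var)

# Assigning to an index raises a TypeError because tuples are immutable.
try:
    tuple_var[0] = "green"
    changed = True
except TypeError:
    changed = False

# To "change" a tuple, build a new one from the pieces you want.
new_tuple = ("green",) + tuple_var[1:]

print(tuple_length, changed, new_tuple)
```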

Read more about tuples in Python here: https://www.geeksforgeeks.org/python-tuples/

Lists

Lists (often called Arrays in other programming languages) are data containers that can hold any type of data, even in the same list, meaning the items do not all have to be the same data type. Lists can be altered after they are created. In Python they are declared by using square brackets [] or the list function. Data is accessed using zero-based indexing by passing the index to square brackets next to the list.

In this example, a list is assigned to the variable name list_var, and data is accessed using indexing:

>>> list_var = ["blue",42,"right"]
>>> type(list_var)
<class 'list'>
>>> ["blue",42,"right"][0]
'blue'
>>> list_var[0]
'blue'

Read more about lists in Python here: https://www.geeksforgeeks.org/python-list/

Convert between lists and tuples

Lists can be copied into a new tuple object using the tuple function. Conversely, tuples can be copied into a list data type using the list function. Technically this does not convert the original data item, but instead creates a copy of the data item in the new data type.

In this example, the list is copied into a tuple data type, and then the tuple is copied into a list data type. Note that the brackets change with each new data type created:

>>> list_var = ["blue",42,"right","ankle"]
>>> tuple_copy = tuple(list_var)
>>> tuple_copy
('blue', 42, 'right', 'ankle')
>>> list_copy = list(tuple_copy)
>>> list_copy
['blue', 42, 'right', 'ankle']

List operations for both tuples and lists

Lists and tuples can be iterated using for loops. They can both be “sliced” as well, creating a subset of the list or tuple that will be operated on for the for loop or other operation.

Slicing

Slicing a list or tuple will create a new list or tuple. The slice is created by passing indexes to the list or tuple in square brackets, separated by a colon. The first index is the start index, and it can be left out if it is index 0 (i.e. the beginning of the original list). The second index is the index of the first value that you do NOT want to include (it can be left blank to include the rest of the original list).

In this example we see a tuple with three data items sliced to only include the first two items. The string “left” is at index 2 in the tuple, meaning that the last index in the slice will be 2. The slice is assigned to variable name tuple_slice:

>>> tuple_var = ("red",45,"left")
>>> tuple_slice = tuple_var[:2]
>>> tuple_slice
('red', 45)

In this example we see a list with four data items sliced to only include the last two items. The first index is the index of the first data item we want (the string “right”). The last index is blank:

>>> list_var = ["blue",42,"right","ankle"]
>>> list_slice = list_var[2:]
>>> list_slice
['right', 'ankle']

List operations for only lists

A list can be appended (one data item added) or extended (a list or tuple of data items are all added to the main list). The list order can be reversed or sorted. Built-in functions allow for the calculation of the maximum or minimum value of a list or even the sum of a list (given the data type of the items in the list is correct).
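
The operations described above can be sketched in a few lines. The numbers are arbitrary, and the comments show the state of the list after each step.

```python
numbers = [42, 7, 19]

numbers.append(3)          # add one item: [42, 7, 19, 3]
numbers.extend([88, 1])    # add several items: [42, 7, 19, 3, 88, 1]

numbers.sort()             # sorts in place: [1, 3, 7, 19, 42, 88]
numbers.reverse()          # reverses in place: [88, 42, 19, 7, 3, 1]

largest = max(numbers)
smallest = min(numbers)
total = sum(numbers)

print(numbers, largest, smallest, total)
```

Note that append, extend, sort, and reverse all change the original list rather than returning a new one.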

Sets

Sets represent a collection of distinct objects. In Python, sets are unordered, no duplicates are allowed, and all data items inside a set must be immutable.

Set operations

Sets are especially useful for getting all distinct members of a list. They cannot be accessed using indexing (they are unordered) but they can be iterated:

>>> orig_list = ["blue","pink","yellow","red","blue","yellow" ]
>>> set_var = set(orig_list)
>>> set_var
{'pink', 'yellow', 'blue', 'red'}
>>> set_var[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'set' object is not subscriptable
>>> for item in set_var:
...     print(item)
... 
pink
yellow
blue
red
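
Beyond removing duplicates, sets support comparison operations between two collections. This is a minimal sketch with made-up color sets; sorted is used only to give the printed output a predictable order.

```python
field_colors = {"blue", "pink", "yellow"}
map_colors = {"yellow", "red", "blue"}

in_both = field_colors & map_colors      # intersection
in_either = field_colors | map_colors    # union
only_field = field_colors - map_colors   # difference

# Membership tests do not depend on order.
has_red = "red" in in_either

print(sorted(in_both), sorted(in_either), sorted(only_field), has_red)
```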

Dictionaries

Dictionaries are key:value stores, meaning they are data containers that use key and value pairs to organize data. As of Python 3.7 the pairs keep their insertion order, but they are retrieved by key rather than by position. Keys are used as reference points for organization and retrieval. When a key is supplied to a dictionary in square brackets, the value is returned.

>>> dict_var = {"key":"value"}
>>> dict_var['key']
'value'
>>> dict_var = {"address":"123 Main St", "color":"blue"}
>>> dict_var["address"]
'123 Main St'
>>> dict_var["color"]
'blue'

Read more about dictionaries in Python here: https://www.geeksforgeeks.org/python-dictionary/

Keys and values

Keys can be any immutable data type (meaning lists cannot be used as keys, but strings and integers and floats and tuples can be used as keys). Values can be any type of data, including other dictionaries.

All keys in a dictionary can be accessed using the dictionary keys method. In Python 2.x this returns a list. In Python 3.x it returns a view object, which can be iterated directly or passed to the list function to create a list.

All values in a dictionary can be accessed using the dictionary values method. In Python 2.x this returns a list. In Python 3.x it returns a view object, which behaves the same way.
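
Here is a minimal sketch of the keys and values methods in Python 3, along with the related items method. The dictionary reuses the example above, and the "size" entry is made up to show that the returned objects stay in sync with the dictionary.

```python
dict_var = {"address": "123 Main St", "color": "blue"}

# keys() and values() return view objects in Python 3.
keys_view = dict_var.keys()
values_view = dict_var.values()

# Wrap a view in list() when an actual list is needed (e.g. for indexing).
key_list = list(keys_view)
value_list = list(values_view)

# items() yields key and value together, which is handy in a for loop.
for key, value in dict_var.items():
    print(key, value)

# Views reflect later changes to the dictionary.
dict_var["size"] = "large"
keys_after_update = list(keys_view)
```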

Functions

Functions are subroutines defined by code. When “called” or run, functions will do something (or nothing if written that way). Functions often accept parameters, and these can be required or optional.

Functions make it easy to perform the same action over and over without writing the same code over and over. This makes code cleaner, shorter and smarter. They are a good idea and should be used often.

Read more about functions here: https://realpython.com/defining-your-own-python-function/

Def keyword

Functions are defined using the “def” keyword, which is short for “define”. The keyword is written first, followed by the name of the function, round brackets () in which the expected parameters are listed, and a colon. The indented lines that follow make up the body of the function.

Return statement

Functions allow for data to be returned from the subroutine to the code that called it using return statements. These allow the user to calculate a value or perform some action in the function, and then return a value to the calling code.

Parameters

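Parameters are the names listed in a function definition for the values the function expects. Arguments are the actual values supplied for those parameters when the function is called at runtime. The two terms are often used interchangeably.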

Namespaces

In Python, there is a concept called namespaces. For our purposes, these are divided into two types: global and local.

All variables defined in the main part of a script (outside of any functions) are considered to be in the global namespace. Within a function, variables have a different “namespace”, meaning that variables inside a function are in a local namespace and are not the same as variables in the main script, which are in the global namespace. If a variable name inside a function is the same as one outside of the function, assigning a new value inside the function (in the local namespace) will not affect the variable outside the function (in the global namespace).
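
A short sketch can make the two namespaces concrete. The variable and function names here are made up for illustration.

```python
count = 10   # defined in the global namespace

def change_count():
    # This assignment creates a new local variable named count;
    # the global variable of the same name is untouched.
    count = 99
    return count

local_result = change_count()

print(local_result)
print(count)
```

The function returns its own local value, while the global count still holds its original value afterwards.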

Function Examples

In this example, a function is defined and written to return “hello world” every time it is called. There are no parameters, but the return keyword is used:

def new_function():
    return "hello world"

In this example, an expected parameter is defined in the brackets. When called, this value is supplied and the function then returns the value from the local namespace back to the global namespace in the main loop:

def accept_param(value):
    return value

In this example an expected parameter has a default value assigned, meaning it only has to be supplied if the function uses a non-default parameter:

def accept_param(value=12):
    return value
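
Defining a function does not run it. The sketch below repeats two of the definitions above so that it is self-contained, and then shows a few ways they might be called; the argument values are arbitrary.

```python
def new_function():
    return "hello world"

def accept_param(value=12):
    return value

greeting = new_function()
default_result = accept_param()              # no argument, so the default is used
positional_result = accept_param(5)          # argument passed by position
keyword_result = accept_param(value="blue")  # argument passed by keyword

print(greeting, default_result, positional_result, keyword_result)
```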

Doc strings

Functions allow for a string after the definition line that is used to declare the purpose of the function for documentation purposes.

def accept_param(value=12):
    'this function accepts a parameter if different from default'
    return value
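
Docstrings are stored on the function object and can be read back at runtime. This minimal sketch uses triple double quotes, which is the convention recommended by PEP 257, though a single-quoted string also works. The wording of the docstring is only an example.

```python
def accept_param(value=12):
    """Return the supplied value, or 12 when no value is passed."""
    return value

# The docstring is stored on the function as the __doc__ attribute.
doc_text = accept_param.__doc__
print(doc_text)

# help(accept_param) would show the same text in the interpreter.
```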

Classes

Classes are special blocks of code that organize multiple variables and functions into an object with its own attributes and methods. Classes make it easy to create code tools that can reference the same internal data lists and functions. The internal functions and variables are able to communicate across the class, so that variables defined in one part of the class are available in another.

Classes use the idea of “self” to allow for the different parts of the class to communicate. By introducing self as a parameter into each function inside a class, the data can be called.

Classes are called or “instantiated” to create a class object. This means the class definition is kind of like a factory for that class, and when you want one of those class objects, you call the class type and pass the correct parameters if required.

class Object():
    def __init__(self, name):
        'accepts a string'
        self.name = name
    def get_name(self):
        'return the name'
        return self.name
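
Instantiating a class might look like the sketch below. The class name NamedObject and the names passed to it are hypothetical; the point is that each instance keeps its own copy of the data stored on self.

```python
class NamedObject:
    def __init__(self, name):
        'accepts a string'
        self.name = name

    def get_name(self):
        'return the name'
        return self.name

# Calling the class creates a new object and runs __init__.
first = NamedObject("river")
second = NamedObject("road")

first_name = first.get_name()   # call a method on the object
second_name = second.name       # or read the attribute directly

print(first_name, second_name)
```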

Read more about classes here: https://www.geeksforgeeks.org/python-classes-and-objects/

 

Installing and importing modules

Python was built to be extensible beyond its included standard library of modules. Third-party modules are downloaded in some format from a provider (often PyPI, the Python Package Index, where most are held) using either the built-in pip program or another method. For us, modules such as arcpy and the ArcGIS API for Python are perfect examples: they extend the capabilities of Python so that it can control the tools available within ArcGIS Desktop or ArcGIS Pro.

Using pip

To make Python module installation easier, Python is now installed with a program called pip. This name is a recursive acronym which stands for Pip Installs Packages. It simplifies installation by allowing a one-line command line call to both locate the requested module in an online repository and run the installation commands.

Pip connects to the Python Package Index (or PyPI). Stored on this repository are hundreds of thousands of free modules written by other developers. It is worth checking the license of the module to confirm that it will allow for your use of its code.

Pip lives in the Scripts folder, where lots of executable files are stored.

Installing modules

We will cover the

The setup.py file

Often in Python 2.x, and sometimes in Python 3.x, a module includes a “setup.py” file. In this workflow the file is not run through pip; instead, it is run directly by Python.

Usually, a module will have a downloadable zip file that should be copied to the site-packages folder. This should be unzipped, and then the Python executable should be used to run the setup.py file using the install command: python setup.py install

Installing in virtual environments

Virtual environments are a bit of an odd concept at first, but they are extremely useful when programming in Python. Because you will probably have two different Python versions installed on your computer if you have ArcGIS Desktop and ArcGIS Pro, it is convenient to have these versions located in a virtual environment.

The core idea is to use one of the Python virtual environment modules to create a copy of your preferred Python version, which is then isolated from the rest of the Python versions on your machine. This avoids path issues when calling modules, allowing you to have more than one version of these important modules on the same computer.

Here are a few of the Python virtual environment modules:

Name | Description | Example virtual environment creation
venv | Built into Python 3.3+. | python3 -m venv namenv
virtualenv | Must be installed separately. It is very useful and my personal favorite. | virtualenv namenv --python=python3.6
pyenv | Used to isolate Python versions for testing purposes. Must be installed separately. | pyenv install 3.7.7
Conda/Anaconda | Used often in academic and scientific environments. Must be installed separately. | conda create --name snakes python=3.9
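
Once an environment has been created and activated, it can be useful to confirm from inside Python which interpreter is running. The following is a minimal sketch that applies to venv and virtualenv environments; it assumes nothing about your installation, and Conda environments are managed differently and may not be detected by this check.

```python
import sys

# sys.prefix points at the active environment; sys.base_prefix points at
# the Python installation it was created from. They differ inside a venv
# or virtualenv environment and match otherwise.
in_virtual_env = sys.prefix != sys.base_prefix

print("Interpreter:", sys.executable)
print("Inside a venv/virtualenv environment:", in_virtual_env)
```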

Read more about virtual environments here: https://towardsdatascience.com/python-environment-101-1d68bda3094d

Importing modules

To access the wide number of modules in the Python standard library, as well as third-party modules such as arcpy, we need to be able to import these modules in our script (or in the interpreter).

To do this you will use import statements. These declare the module or submodules (smaller components of the module) that you will use in the script. The import will succeed as long as the modules are in the site-packages folder in your Python installation, or elsewhere on the Python path (as arcpy is after it has been installed).

import csv
from datetime import timedelta
from arcpy.da import SearchCursor

Three ways to import

There are three different but related ways to import modules. These modules, from either the standard library or from third parties, are all imported the same way in a script.

Method 1: import the whole module

This is the simplest way to import a module, by importing its top-level object. Its functions and classes are accessed using dot notation (e.g. csv.reader, a function used to read CSV files):

import csv
reader = csv.reader

Method 2: import a sub module

Instead of importing a top-level object, you can import only the module or method you need, using the “from X import Y” format:

from datetime import timedelta
from arcpy.da import SearchCursor

Method 3: import all sub modules

Instead of importing one sub-object, you can import all the modules or methods, using the “from X import *” format:

from datetime import *
from arcpy import *
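
The first two import styles can be tried without ArcGIS installed. This sketch uses only standard library modules; io.StringIO and the sample CSV text are stand-ins for a real file and are not part of the original examples. The star import is left out because it hides where each name came from.

```python
# Method 1: import the whole module and use dot notation.
import csv
import io

# Method 2: import only the names you need.
from datetime import date, timedelta

# io.StringIO stands in for an open CSV file so the sketch is self-contained.
fake_file = io.StringIO("name,city\nLaura,Eureka\n")
rows = list(csv.reader(fake_file))

# Names imported with "from X import Y" are used without a prefix.
one_week_later = date(2021, 12, 1) + timedelta(days=7)

print(rows)
print(one_week_later)
```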

Read more about importing modules here: https://realpython.com/python-import/

Importing custom code

Modules don’t have to just come from “third-parties”: they can come from you as well. With the use of the special __init__.py file, you can convert a normal folder into an importable module.

The __init__.py file

This special file, which can contain code but mostly is just an empty file, indicates to Python that a folder is a module that can be imported into a script. The file itself is just a text file with a .py extension and the name __init__.py (that’s two underscores on each side), which is placed inside a folder. As long as the folder with the __init__.py is either next to the script or in the Python Path (e.g. in the site-packages folder), the code inside the folder can be imported.

Example custom module

In this example, we see some code in a script called example_module.py:

import csv
from datetime import timedelta
print('script imported')

Create a folder called mod_test. Copy this script into the folder. Then, create an empty text file called __init__.py in the same folder.

Import your module

Create a new script next to the mod_test folder. Call it “module_import.py”.

Inside the script you will import the function “test_function” from the example_module script in the mod_test folder using the format below:

from mod_test.example_module import test_function

Scripts inside the module are accessed using dot notation (e.g. mod_test.example_module). The functions and classes inside the script called example_module.py are able to be imported by name.

Because the module is sitting next to the script that is importing the function, this import statement will work. But if you move your script and don’t copy the module to somewhere that is on the Python Path, the import will fail.

That is because the way import statements work is based on the Python Path. This is a list of folder locations that Python will search for the module that you are requesting. By default, the first location is the local folder, meaning the folder containing your script. The next location is the site-packages folder.
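
The whole sequence can be sketched in one self-contained script. This version builds the mod_test folder in a temporary directory and adds that directory to sys.path (the Python Path) so the import succeeds. The body of test_function is hypothetical, since the script shown above does not define it, and the temporary location is used only so the sketch leaves your own folders alone.

```python
import importlib
import os
import sys
import tempfile

# Build the mod_test folder in a temporary location (illustrative only).
base_dir = tempfile.mkdtemp()
package_dir = os.path.join(base_dir, "mod_test")
os.mkdir(package_dir)

# An empty __init__.py marks the folder as importable.
open(os.path.join(package_dir, "__init__.py"), "w").close()

# test_function is a hypothetical function body, not taken from the book.
with open(os.path.join(package_dir, "example_module.py"), "w") as script:
    script.write("def test_function():\n    return 'module imported'\n")

# Adding the parent folder to sys.path lets Python find mod_test.
sys.path.insert(0, base_dir)
importlib.invalidate_caches()

from mod_test.example_module import test_function

result = test_function()
print(result)
```

Printing sys.path in your own interpreter is a quick way to see exactly which folders Python searches, and in what order.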

The site-packages folder

Most modules are installed in a folder inside the Python folder. This is called the site-packages folder and it sits at */Lib/site-packages.

To make your module available for import without needing it to be next to your script, put your module folder in the site-packages folder. When you run “from mod_test.example_module import test_function”, Python will locate the module called mod_test in the site-packages folder.

 

Basic style tips for writing scripts

To make clean, readable code, it is encouraged to follow these basic tips about how the code should be written and organized. The major rule enforced by Python is the required indentation, which is intended to make the code easier to read and write.

Read more about Python code style here: https://realpython.com/python-pep8/

Indentation

Python code has strict indentation rules that are enforced by the Python interpreter itself (most IDEs will also flag indentation problems as you type). These rules relate to functions and loops especially.

As a standard, 4 spaces are used after a function is declared or a loop is created. This is just a standard, as it could be only one space, but that indentation level becomes important when scripts get big and it helps to have 4 spaces for all indented lines so that they can be more easily read.

It is encouraged to not mix tabs and spaces when indenting.

Read more about indentation here: https://www.python.org/dev/peps/pep-0008/#indentation

Add a comment at the top with script details

This is an optional but recommended way to start your scripts: write a comment at the top with your name, the date, and some quick explanation about what the script is supposed to do. This is especially nice when other people have to read your code.

Add lots of other comments throughout the script as well, to make sure you know what is happening throughout the script.

Follow with Import statements

It is encouraged but not required to put the import statements at or near the top of the script. Imports must happen before the module objects are called in the script, but the import statements can be placed anywhere. It is best to put them at the top so that people reading the script can understand what is being imported.

Define global variables

After the import statements, define the necessary variables that will be used in this script. Sometimes it is necessary to define variables later in the script but it is best to put major variables near the top.

Define functions

By placing function definitions below the global variables, it is easy to read and understand what the functions do when reading them. It is sometimes hard to find a function that is called in another part of the script if the function is not in a known location in the script.

Include print statements

The built-in function called print is used to send messages from the script to the command window while the script is running. Pass any valid data to the print function and use it to track progress or to debug if there are issues.

>>> print("blueberry")
blueberry
>>> x = 0
>>> print(x)
0

Read more about print statements here: https://realpython.com/python-print/

Write the executable parts of the script

After importing modules and defining functions, the next part of the script is where the action takes place. The for loops are run, the functions are called, and the script is then done.

Make sure to add lots of comments to help yourself understand what is happening throughout the script, and print statements as well to help while the script is running.

If __name__ == '__main__'

Often at the end of scripts you will see this line: if __name__ == "__main__":. It means that the indented code below this line will run only if the script is executed directly. If the script is imported by another script, that code is skipped, and the functions in the script run only when the second script calls them.
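
Putting the style tips in this section together, a small script might be laid out as in the sketch below. The file name, the constant, and the circle_area function are all hypothetical; only the ordering (comment header, imports, global variables, functions, then the executable part) reflects the advice above.

```python
# Script: example_layout.py (hypothetical name)
# Purpose: show a common top-to-bottom script layout.

import math                      # imports first

RADIUS_METERS = 10               # then global variables

def circle_area(radius):         # then function definitions
    """Return the area of a circle with the given radius."""
    return math.pi * radius ** 2

def main():
    # The executable part of the script lives in one function.
    area = circle_area(RADIUS_METERS)
    print("Area:", round(area, 2))
    return area

# Runs only when the file is executed directly, not when it is imported.
if __name__ == "__main__":
    main()
```

Keeping the executable steps inside main means another script can import circle_area without triggering the print statement.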

Read more about this here: https://www.geeksforgeeks.org/what-does-the-if-__name__-__main__-do/

 

Summary

In this chapter, we did a fast but comprehensive overview of computer programming and the Python programming language. We reviewed the basics of computer programming, including variables and iteration and conditionals. We reviewed the Windows Path environment variable and the Python system path. We explored the data types of Python, including Integers and Strings and Float, and the data containers of Python such as lists and tuples and dictionaries. We learned some basic code structure for scripts, and how to execute those scripts.

In the next chapter we will discuss the basics of arcpy and the ArcGIS API for Python. We will learn how to import these modules and access their methods and submodules. We will begin to execute Python code to automate ArcGIS Desktop and ArcGIS Pro.

About the Authors

  • William Parker

    William Parker is a GIS Professional with over 15 years of GIS and Python experience. He is an ArcGIS Python programmer for ICF who has led GIS analysis for the EIR/S analysis of the California High Speed Rail Project's San Jose to Merced and San Francisco to San Jose sections. In addition, he has led GIS analysis on other large-scale environmental analysis projects, automating processes using ArcPy.

  • Silas Toms

    Silas Toms is a long-time geospatial professional and author who has previously published “ArcPy and ArcGIS” and “Mastering Geospatial Analysis with Python.” His career highlights include developing the real-time common operational picture used at Super Bowl 50, building geospatial software for autonomous cars, designing computer vision for next-gen insurance, and developing mapping systems for Zillow. He now works at Volta Charging, predicting the future of electric vehicle adoption.
