Python Geospatial Development

By Erik Westra

About this book

Open Source GIS (Geographic Information System) is a growing area with the explosion of applications such as Google Maps, Google Earth, and GPS. The GIS market is growing rapidly and as a Python developer you will find yourself either wanting grounding in GIS or needing to get up to speed to do your job. In today's location-aware world, all commercial Python developers can benefit from an understanding of GIS development gained using this book.

Working with geo-spatial data can get complicated because you are dealing with mathematical models of the Earth's surface. Since Python is a powerful programming language with high-level toolkits, it is well suited to GIS development. Python Geospatial Development will familiarize you with the Python tools required for geo-spatial development such as Mapnik, which is used for mapping in Python. It introduces GIS at the basic level with a clear, detailed walkthrough of the key GIS concepts such as location, distance, units, projections, datums, and GIS data formats. We then examine a number of Python libraries and combine these with geo-spatial data to accomplish a variety of tasks. The book provides an in-depth look at the concept of storing spatial data in a database and how you can use spatial databases as tools to solve a variety of geo-spatial problems.

It goes into the details of generating maps using the Mapnik map-rendering toolkit, and helps you to build a sophisticated web-based geo-spatial map-editing application using GeoDjango, Mapnik, and PostGIS. By the end of the book, you will be able to integrate spatial features into your applications and build a complete mapping application from scratch.

Publication date:
December 2010
Publisher
Packt
Pages
508
ISBN
9781849511544

 

Chapter 1. Geo-Spatial Development Using Python

This chapter provides an overview of the Python programming language and geo-spatial development. Please note that this is not a tutorial on how to use the Python language; Python is easy to learn, but the details are beyond the scope of this book.

In this chapter, we will cover:

  • What the Python programming language is, and how it differs from other languages

  • An introduction to the Python Standard Library and the Python Package Index

  • What the terms "geo-spatial data" and "geo-spatial development" refer to

  • An overview of the process of accessing, manipulating, and displaying geo-spatial data

  • Some of the major applications for geo-spatial development

  • Some of the recent trends in the field of geo-spatial development

 

Python


Python (http://python.org) is a modern, high-level language suitable for a wide variety of programming tasks. Technically, it is often referred to as a "scripting" language, though this distinction isn't very important nowadays. Python has been used for writing web-based systems, desktop applications, games, scientific programming, and even utilities and other higher-level parts of various operating systems.

Python supports a wide range of programming idioms, from straightforward procedural programming to object-oriented programming and functional programming.

While Python is generally considered to be an "interpreted" language, and is occasionally criticized for being slow compared to "compiled" languages such as C, the use of byte-compilation and the fact that much of the heavy lifting is done by library code means that Python's performance is often surprisingly good.

Open source versions of the Python interpreter are freely available for all major operating systems. Python is eminently suitable for all sorts of programming, from quick one-off scripts to building huge and complex systems. It can even be run in interactive (command-line) mode, allowing you to type in commands and immediately see the results. This is ideal for doing quick calculations or figuring out how a particular library works.

One of the first things a developer notices about Python compared with other languages such as Java or C++ is how expressive the language is—what may take 20 or 30 lines of code in Java can often be written in half a dozen lines of code in Python. For example, imagine that you have an array of latitude and longitude values you wish to process one at a time. In Python, this is trivial:

for lat,long in coordinates:
    ...

Compare this with how much work a programmer would have to do in Java to achieve the same result:

for (int i=0; i < coordinates.length; i++) {
    float lat = coordinates[i][0];
    float lon = coordinates[i][1];  // "long" is a reserved word in Java
    ...
}
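For reference, here is a minimal runnable version of the Python loop. It assumes that coordinates is a list of (latitude, longitude) tuples; the sample values and the describe() helper are made up for illustration.

```python
# Hypothetical sample data: a list of (latitude, longitude) tuples.
coordinates = [(37.809662, -122.410408),
               (37.784161, -122.401489)]

def describe(coordinates):
    """Return one formatted line of text per coordinate pair."""
    lines = []
    # Tuple unpacking names both values directly in the loop header.
    for lat, lon in coordinates:
        lines.append("lat=%0.4f, long=%0.4f" % (lat, lon))
    return lines

for line in describe(coordinates):
    print(line)
```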

While the Python language itself makes programming quick and easy, allowing you to focus on the task at hand, the Python Standard Libraries make programming even more efficient. These libraries make it easy to do things such as converting date and time values, manipulating strings, downloading data from websites, performing complex maths, working with e-mail messages, encoding and decoding data, XML parsing, data encryption, file manipulation, compressing and decompressing files, working with databases—the list goes on. What you can do with the Python Standard Libraries is truly amazing.

As well as the built-in modules in the Python Standard Libraries, it is easy to download and install custom modules, which can be written in either Python or C. The Python Package Index (http://pypi.python.org) provides thousands of additional modules that you can download and install. And, if that isn't enough, many other systems provide Python bindings to allow you to access them directly from within your programs. We will be making heavy use of Python bindings in this book.

Note

It should be pointed out that there are different versions of Python available. Python 2.x is the most common version in use today, while the Python developers have been working for the past several years on a completely new, non-backwards-compatible version called Python 3. Eventually, Python 3 will replace Python 2.x, but at this stage most of the third-party libraries (including all the GIS tools we will be using) only work with Python 2.x. For this reason, we won't be using Python 3 in this book.

Python is in many ways an ideal programming language. Once you are familiar with the language itself and have used it a few times, you'll find it incredibly easy to write programs to solve various tasks. Rather than getting buried in a morass of type-definitions and low-level string manipulation, you can simply concentrate on what you want to achieve. You end up almost thinking directly in Python code. Programming in Python is straightforward, efficient and, dare I say it, fun.

 

Geo-spatial development


The term geo-spatial refers to information that is located on the Earth's surface using coordinates. This can include, for example, the position of a cell phone tower, the shape of a road, or the outline of a country.

Geo-spatial data often associates some piece of information with a particular location. For example, here is a map of Afghanistan from the http://afghanistanelectiondata.org website showing the number of votes cast in each location in the 2009 elections:

Geo-spatial development is the process of writing computer programs that can access, manipulate, and display this type of information.

Internally, geo-spatial data is represented as a series of coordinates, often in the form of latitude and longitude values. Additional attributes such as temperature, soil type, height, or the name of a landmark are also often present. There can be many thousands (or even millions) of data points for a single set of geo-spatial data. For example, the following outline of New Zealand consists of almost 12,000 individual data points:

Because so much data is involved, it is common to store geo-spatial information within a database. A large part of this book will be concerned with how to store your geo-spatial information in a database, and how to access it efficiently.
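As a rough illustration of keeping coordinates and their attributes in a database, the following sketch uses the sqlite3 module from the Python Standard Library. This is plain SQLite with no spatial extensions, not the spatial databases covered later in the book. The landmarks table and the landmarks_in_box() helper are hypothetical names, and the sample rows reuse two San Francisco locations that appear later in this chapter.

```python
import sqlite3

# In-memory database; a real application would use a file or a database server.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE landmarks (
        name      TEXT,
        latitude  REAL,
        longitude REAL
    )
""")

conn.executemany(
    "INSERT INTO landmarks (name, latitude, longitude) VALUES (?, ?, ?)",
    [("Pier 39", 37.809662, -122.410408),
     ("Moscone Center", 37.784161, -122.401489)])

def landmarks_in_box(conn, min_lat, max_lat, min_long, max_long):
    """Return the names of all landmarks inside a latitude/longitude box."""
    cursor = conn.execute(
        "SELECT name FROM landmarks"
        " WHERE latitude BETWEEN ? AND ?"
        "   AND longitude BETWEEN ? AND ?"
        " ORDER BY name",
        (min_lat, max_lat, min_long, max_long))
    return [row[0] for row in cursor]

print(landmarks_in_box(conn, 37.80, 37.82, -122.42, -122.40))
```

A simple bounding-box test like this scans every row. A spatially-enabled database can answer the same kind of question far more efficiently and with real geometric operations.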

Geo-spatial data comes in many different forms. Different GIS (Geographical Information System) vendors have produced their own file formats over the years, and various organizations have also defined their own standards. It's often necessary to use a Python library to read files in the correct format when importing geo-spatial data into your database.

Unfortunately, not all geo-spatial data points are compatible. Just as a distance value of 2.8 can have a very different meaning depending on whether you are using kilometers or miles, a given latitude and longitude value can represent any number of different points on the Earth's surface, depending on which projection has been used.

A projection is a way of representing the Earth's surface in two dimensions. We will look at projections in more detail in Chapter 2, GIS, but for now just keep in mind that every piece of geo-spatial data has a projection associated with it. To compare or combine two sets of geo-spatial data, it is often necessary to convert the data from one projection to another.

Note

Latitude and longitude values are sometimes referred to as unprojected coordinates. We'll learn more about this in the next chapter.
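To make the idea of a projection more concrete, here is a small sketch that converts the same latitude and longitude into two-dimensional coordinates in two different ways. It uses only the math module and treats the Earth as a sphere. The function names are illustrative, and a real application would normally rely on a projection library rather than hand-written formulas.

```python
import math

EARTH_RADIUS = 6378137.0  # WGS84 equatorial radius, in meters

def equirectangular(lat, lon):
    """Project by scaling the angles directly: x from longitude, y from latitude."""
    return (EARTH_RADIUS * math.radians(lon),
            EARTH_RADIUS * math.radians(lat))

def spherical_mercator(lat, lon):
    """Project using the spherical Mercator formula, which stretches y
    more and more as the latitude moves away from the equator."""
    x = EARTH_RADIUS * math.radians(lon)
    y = EARTH_RADIUS * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return (x, y)

lat, lon = 37.809662, -122.410408
print("Equirectangular: x=%0.1f, y=%0.1f" % equirectangular(lat, lon))
print("Mercator:        x=%0.1f, y=%0.1f" % spherical_mercator(lat, lon))
```

Both functions return the same x value but different y values for the same location, which is why coordinates from two projections cannot be compared or combined until one set has been converted.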

In addition to the prosaic tasks of importing geo-spatial data from various external file formats and translating data from one projection to another, geo-spatial data can also be manipulated to solve various interesting problems. Obvious examples include the task of calculating the distance between two points, calculating the length of a road, or finding all data points within a given radius of a selected point. We will be using Python libraries to solve all of these problems, and more.
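The following sketch shows how these three tasks might look using nothing but the Standard Library. It swaps in the haversine formula, which treats the Earth as a perfect sphere, so the results are only approximate; the libraries used in this book work with a more accurate model of the Earth. All function names here are illustrative.

```python
import math

EARTH_RADIUS = 6371000.0  # mean Earth radius in meters (spherical approximation)

def haversine(lat1, lon1, lat2, lon2):
    """Approximate great-circle distance in meters between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS * math.asin(math.sqrt(a))

def path_length(points):
    """Length of a road given as a list of (lat, long) points."""
    return sum(haversine(a[0], a[1], b[0], b[1])
               for a, b in zip(points, points[1:]))

def points_within(points, center, radius):
    """All points no more than radius meters from center."""
    return [p for p in points
            if haversine(center[0], center[1], p[0], p[1]) <= radius]

# Hypothetical sample data: three points along a line of constant longitude.
road = [(37.78, -122.40), (37.79, -122.40), (37.80, -122.40)]
print("Road length: %0.0f meters" % path_length(road))
print(points_within(road, (37.78, -122.40), 1500))
```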

Finally, geo-spatial data by itself is not very interesting. A long list of coordinates tells you almost nothing; it isn't until those numbers are used to draw a picture that you can make sense of it. Drawing maps, placing data points onto a map, and allowing users to interact with maps are all important aspects of geo-spatial development. We will be looking at all of these in later chapters.

 

Applications of geo-spatial development


Let's take a brief look at some of the more common geo-spatial development tasks you might encounter.

Analyzing geo-spatial data

Imagine that you have a database containing a range of geo-spatial data for San Francisco. This database might include geographical features, roads and the location of prominent buildings and other man-made features such as bridges, airports, and so on.

Such a database can be a valuable resource for answering various questions. For example:

  • What's the longest road in Sausalito?

  • How many bridges are there in Oakland?

  • What is the total area of the Golden Gate Park?

  • How far is it from Pier 39 to the Moscone Center?

Many of these types of problems can be solved using tools such as the PostGIS spatially-enabled database. For example, to calculate the total area of the Golden Gate Park, you might use the following SQL query:

select ST_Area(geometry) from features
  where name = 'Golden Gate Park';
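To give a feel for what an area calculation involves, here is a sketch of the shoelace formula for a simple polygon whose vertices are in planar (projected) coordinates. This is only an illustration of the underlying idea; it is not a description of how PostGIS computes areas, and the rectangle used below is made up.

```python
def polygon_area(vertices):
    """Area of a simple polygon given as a list of (x, y) vertices.

    The vertices must be in planar coordinates (for example, meters),
    not latitude and longitude values.
    """
    total = 0.0
    count = len(vertices)
    for i in range(count):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % count]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Hypothetical 1000 x 500 rectangle, with coordinates in meters.
rectangle = [(0, 0), (1000, 0), (1000, 500), (0, 500)]
print("Area is %0.1f square meters" % polygon_area(rectangle))
```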

To calculate the distance between two places, you first have to geocode the locations to obtain their latitude and longitude. There are various ways to do this; one simple approach is to use a free geocoding web service such as this:

http://tinygeocoder.com/create-api.php?q=Pier 39,San Francisco,CA

This returns a latitude value of 37.809662 and a longitude value of -122.410408.

Note

These latitude and longitude values are in decimal degrees. If you don't know what these are, don't worry; we'll talk about decimal degrees in Chapter 2, GIS.

Similarly, we can find the location of the Moscone Center using this query:

http://tinygeocoder.com/create-api.php?q=Moscone Center, San Francisco, CA

This returns a latitude value of 37.784161 and a longitude value of -122.401489.
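Because place names contain spaces and commas, the query has to be URL-encoded before it can be sent from a program. The sketch below only builds the request URL and does not contact the service. It is written for Python 3, where urlencode lives in urllib.parse (in Python 2 it is urllib.urlencode), and the build_geocode_url() helper is a hypothetical name.

```python
from urllib.parse import urlencode

GEOCODER_URL = "http://tinygeocoder.com/create-api.php"

def build_geocode_url(place):
    """Return the geocoder URL for a place name, with the query encoded."""
    return GEOCODER_URL + "?" + urlencode({"q": place})

print(build_geocode_url("Pier 39,San Francisco,CA"))
print(build_geocode_url("Moscone Center, San Francisco, CA"))
```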

Now that we have the coordinates for the two desired locations, we can calculate the distance between them using the pyproj Python library:

import pyproj

lat1,long1 = (37.809662,-122.410408)
lat2,long2 = (37.784161,-122.401489)

geod = pyproj.Geod(ellps="WGS84")
angle1,angle2,distance = geod.inv(long1, lat1, long2, lat2)

print("Distance is %0.2f meters" % distance)

This prints the distance between the two points:

Distance is 2937.41 meters

Note

Don't worry about the "WGS84" reference at this stage; we'll look at what this means in Chapter 2, GIS.

Of course, you wouldn't normally do this sort of analysis on a one-off basis like this; it's much more common to create a Python program that will answer these sorts of questions for any desired set of data. You might, for example, create a web application that displays a menu of available calculations. One of the options in this menu might be to calculate the distance between two points; when this option is selected, the web application would prompt the user to enter the two locations, attempt to geocode them by calling an appropriate web service (and display an error message if a location couldn't be geocoded), then calculate the distance between the two points using pyproj, and finally display the results to the user.
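The core of such a workflow might be structured along the following lines. This is only a sketch: the geocoder and the distance calculation are passed in as functions so that the logic can be shown without a network connection, and every name here (GeocodeError, distance_between, lookup, rough_distance) is hypothetical. The stand-in distance function is a crude flat-Earth estimate; a real application would call a geodesic library such as pyproj instead.

```python
import math

class GeocodeError(Exception):
    """Raised when a location name cannot be converted to coordinates."""

def distance_between(name1, name2, geocode, calc_distance):
    """Geocode two place names and return the distance between them.

    geocode(name) must return a (lat, long) tuple, or None on failure.
    calc_distance(lat1, long1, lat2, long2) must return a distance.
    """
    coords = []
    for name in (name1, name2):
        result = geocode(name)
        if result is None:
            raise GeocodeError("Could not geocode %r" % name)
        coords.append(result)
    (lat1, lon1), (lat2, lon2) = coords
    return calc_distance(lat1, lon1, lat2, lon2)

# Stand-in geocoder: a lookup table in place of a web service.
KNOWN_PLACES = {"Pier 39": (37.809662, -122.410408),
                "Moscone Center": (37.784161, -122.401489)}

def lookup(name):
    return KNOWN_PLACES.get(name)

def rough_distance(lat1, lon1, lat2, lon2):
    # Crude flat-Earth estimate in meters, for illustration only.
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat)
    dy = math.radians(lat2 - lat1)
    return 6371000.0 * math.hypot(dx, dy)

try:
    meters = distance_between("Pier 39", "Moscone Center", lookup, rough_distance)
    print("Distance is roughly %0.0f meters" % meters)
except GeocodeError as error:
    print("Error: %s" % error)
```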

Alternatively, if you have a database containing useful geo-spatial data, you could let the user select the two locations from the database rather than typing in arbitrary location names or street addresses.

However you choose to structure it, performing calculations like this will usually be a major part of your geo-spatial application.

Visualizing geo-spatial data

Imagine that you wanted to see which areas of a city are typically covered by a taxi during an average working day. You might place a GPS recorder into a taxi and leave it to record the taxi's position over several days. The results would be a series of timestamp, latitude and longitude values like the following:

2010-03-21 9:15:23  -38.16614499  176.2336626
2010-03-21 9:15:27  -38.16608632  176.2335635
2010-03-21 9:15:34  -38.16604198  176.2334771
2010-03-21 9:15:39  -38.16601507  176.2333958
...
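Records in this form are straightforward to read into Python. The sketch below assumes each line holds a date, a time, a latitude, and a longitude separated by whitespace, as in the sample above; parse_gps_line() is a hypothetical helper name.

```python
from datetime import datetime

def parse_gps_line(line):
    """Split one GPS record into (timestamp, latitude, longitude)."""
    date, time, lat, lon = line.split()
    timestamp = datetime.strptime(date + " " + time, "%Y-%m-%d %H:%M:%S")
    return timestamp, float(lat), float(lon)

log = """\
2010-03-21 9:15:23  -38.16614499  176.2336626
2010-03-21 9:15:27  -38.16608632  176.2335635
2010-03-21 9:15:34  -38.16604198  176.2334771
2010-03-21 9:15:39  -38.16601507  176.2333958
"""

# Skip blank lines, then parse every remaining record.
track = [parse_gps_line(line) for line in log.splitlines() if line.strip()]
elapsed = track[-1][0] - track[0][0]
print("%d points recorded over %0.0f seconds"
      % (len(track), elapsed.total_seconds()))
```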

By themselves, these raw numbers tell you almost nothing. But, when you display this data visually, the numbers start to make sense:

You can immediately see that the taxi tends to go along the same streets again and again. And, if you draw this data as an overlay on top of a street map, you can see exactly where the taxi has been:

(Street map courtesy of http://openstreetmap.org).

While this is a very simple example, visualization is a crucial aspect of working with geo-spatial data. How data is displayed visually, how different data sets are overlaid, and how the user can manipulate data directly in a visual format are all going to be major topics of this book.

Creating a geo-spatial mash-up

The concept of a "mash-up" has become popular in recent years. Mash-ups are applications that combine data and functionality from more than one source. For example, a typical mash-up may combine details of houses for rent in a given city, and plot the location of each rental on a map, like this:

This example comes from http://housingmaps.com.

The Google Maps API has been immensely popular in creating these types of mash-ups. However, Google Maps has some serious licensing and other limitations. It is not the only option: tools such as Mapnik, OpenLayers, and MapServer, to name a few, also allow you to create mash-ups that overlay your own data onto a map.

Most of these mash-ups run as web applications across the Internet, running on a server that can be accessed by anyone who has a web browser. Sometimes, the mash-ups are private, requiring password access, but usually they are publicly available and can be used by anyone. Indeed, many businesses (such as the rental mash-up shown above) are based on freely-available geo-spatial mash-ups.

 

Recent developments


A decade ago, geo-spatial development was vastly more limited than it is today. Professional (and hugely expensive) Geographical Information Systems were the norm for working with and visualizing geo-spatial data. Open source tools, where they were available, were obscure and hard to use. What is more, everything ran on the desktop—the concept of working with geo-spatial data across the Internet was no more than a distant dream.

In 2005, Google released two products that completely changed the face of geo-spatial development: Google Maps and Google Earth made it possible for anyone with a web browser or a desktop computer to view and work with geo-spatial data. Instead of requiring expert knowledge and years of practice, even a four-year-old could instantly view and manipulate interactive maps of the world.

Google's products are not perfect—the map projections are deliberately simplified, leading to errors and problems with displaying overlays; these products are only free for non-commercial use; and they include almost no ability to perform geo-spatial analysis. Despite these limitations, they have had a huge effect on the field of geo-spatial development. People became aware of what was possible, and the use of maps and their underlying geo-spatial data has become so prevalent that even cell phones now commonly include built-in mapping tools.

The Global Positioning System (GPS) has also had a major influence on geo-spatial development. Geo-spatial data for streets and other man-made and natural features used to be an expensive and tightly controlled resource, often created by scanning aerial photographs and then manually drawing an outline of a street or coastline over the top to digitize the required features. With the advent of cheap and readily-available portable GPS units, anyone who wishes to can now capture their own geo-spatial data. Indeed, many people have made a hobby of recording, editing, and improving the accuracy of street and topographic data, which are then freely shared across the Internet. All this means that you're not limited to recording your own data, or purchasing data from a commercial organization; volunteered information is now often as accurate and useful as commercially-available data, and may well be suitable for your geo-spatial application.

The open source software movement has also had a major influence on geo-spatial development. Instead of relying on commercial toolsets, it is now possible to build complex geo-spatial applications entirely out of freely-available tools and libraries. Because the source code for these tools is often available, developers can improve and extend these toolkits, fixing problems and adding new features for the benefit of everyone. Tools such as PROJ.4, PostGIS, OGR, and Mapnik are all excellent geo-spatial toolkits that are beneficiaries of the open source movement. We will be making use of all these tools throughout this book.

As well as standalone tools and libraries, a number of geo-spatial-related Application Programming Interfaces (APIs) have become available. Google has provided a number of APIs that can be used to include maps and perform limited geo-spatial analysis within a website. Other services such as tinygeocoder.com and geoapi.com allow you to perform various geo-spatial tasks that would be difficult to do if you were limited to using your own data and programming resources.

As more and more geo-spatial data becomes available from an increasing number of sources, and as the number of tools and systems that can work with this data also increases, it has become essential to define standards for geo-spatial data. The Open Geospatial Consortium, often abbreviated to OGC (http://www.opengeospatial.org), is an international standards organization that aims to do precisely this: to provide a set of standard formats and protocols for sharing and storing geo-spatial data. These standards, including GML, KML, GeoRSS, WMS, WFS, and WCS, provide a shared "language" in which geo-spatial data can be expressed. Tools such as commercial and open source GIS systems, Google Earth, web-based APIs, and specialized geo-spatial toolkits such as OGR are all able to work with these standards. Indeed, an important aspect of a geo-spatial toolkit is the ability to understand and translate data between these various formats.
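As a small taste of what one of these formats looks like, the sketch below builds a minimal KML document containing a single placemark, using the xml.etree.ElementTree module from the Standard Library. It is written for Python 3, the placemark_kml() helper is a hypothetical name, and the document it produces is deliberately bare; real KML files usually carry much more information.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

def placemark_kml(name, lat, lon):
    """Return a minimal KML document, as a string, for a single point."""
    # Use KML as the default namespace so the output has no prefixes.
    ET.register_namespace("", KML_NS)
    kml = ET.Element("{%s}kml" % KML_NS)
    placemark = ET.SubElement(kml, "{%s}Placemark" % KML_NS)
    ET.SubElement(placemark, "{%s}name" % KML_NS).text = name
    point = ET.SubElement(placemark, "{%s}Point" % KML_NS)
    # KML lists the longitude first, then the latitude.
    coordinates = ET.SubElement(point, "{%s}coordinates" % KML_NS)
    coordinates.text = "%f,%f" % (lon, lat)
    return ET.tostring(kml, encoding="unicode")

print(placemark_kml("Pier 39", 37.809662, -122.410408))
```

Note the coordinate order: KML puts the longitude before the latitude, the reverse of how the values are usually spoken. Mismatches like this are a common source of errors when translating between formats.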

As GPS units have become more ubiquitous, it has become possible to record your location data as you are performing another task. Geolocation, the act of recording your location as you are doing something, is becoming increasingly common. The Twitter social networking service, for example, now allows you to record and display your current location as you enter a status update. As you approach your office, sophisticated To-do list software can now automatically hide any tasks that can't be done at that location. Your phone can also tell you which of your friends are nearby, and search results can be filtered to only show nearby businesses.

All of this is simply the continuation of a trend that started when GIS systems were housed on mainframe computers and operated by specialists who spent years learning about them. Geo-spatial data and applications have been democratized over the years, making them available in more places, to more people. What was possible only in a large organization can now be done by anyone using a handheld device. As technology continues to improve, and the tools become more powerful, this trend is sure to continue.

 

Summary


In this chapter, we briefly introduced the Python programming language and the main concepts behind geo-spatial development. We have seen:

  • That Python is a very high-level language eminently suited to the task of geo-spatial development.

  • That there are a number of libraries that can be downloaded to make it easier to perform geo-spatial development work in Python.

  • That the term "geo-spatial data" refers to information that is located on the Earth's surface using coordinates.

  • That the term "geo-spatial development" refers to the process of writing computer programs that can access, manipulate, and display geo-spatial data.

  • That the process of accessing geo-spatial data is non-trivial, thanks to differing file formats and data standards.

  • What types of questions can be answered by analyzing geo-spatial data.

  • How geo-spatial data can be used for visualization.

  • How mash-ups can be used to combine data (often geo-spatial data) in useful and interesting ways.

  • How Google Maps, Google Earth, and the development of cheap and portable GPS units have "democratized" geo-spatial development.

  • The influence the open source software movement has had on the availability of high quality, freely-available tools for geo-spatial development.

  • How various standards organizations have defined formats and protocols for sharing and storing geo-spatial data.

  • The increasing use of geolocation to capture and work with geo-spatial data in surprising and useful ways.

In the next chapter, we will look in more detail at traditional Geographic Information Systems (GIS), including a number of important concepts that you need to understand in order to work with geo-spatial data. Different geo-spatial formats will be examined, and we will finish by using Python to perform various calculations using geo-spatial data.

About the Author

  • Erik Westra

    Erik Westra has been a professional software developer for over 25 years, and has worked almost exclusively in Python for the past decade. Erik’s early interest in Graphical User Interface design led to the development of one of the most advanced urgent courier dispatch systems used by messenger and courier companies worldwide. In recent years, Erik has been involved in the design and implementation of systems matching seekers with providers of goods and services across a range of geographical areas, as well as real-time messaging, payment, and identity systems. This work has included the creation of real-time geocoders and map-based views of constantly changing data. Erik is based in New Zealand, and works for companies worldwide. Erik is the author of the following Packt books: Python Geospatial Development (third edition), Python Geospatial Analysis, Building Mapping Applications with QGIS, and Modular Programming with Python. Erik has also authored the video course entitled Introduction to QGIS Python Programming.


Latest Reviews

(2 reviews total)
About 1/3 through an initial read-before-serious-use and am favourably inclined towards the book.
Explains how to use Python for spatial data analysis, using MySQL, PostGIS or Oracle SpatiaLite.