You're reading from Mastering Geospatial Analysis with Python

Product type: Book
Published in: Apr 2018
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781788293334
Edition: 1st Edition
Authors (3):
Silas Toms

Silas Toms is a long-time geospatial professional and author who has previously published ArcPy and ArcGIS and Mastering Geospatial Analysis with Python. His career highlights include developing the real-time common operational picture used at Super Bowl 50, building geospatial software for autonomous cars, designing computer vision for next-gen insurance, and developing mapping systems for Zillow. He now works at Volta Charging, predicting the future of electric vehicle adoption and electric charging infrastructure.

Paul Crickard

Paul Crickard authored a book on the Leaflet JavaScript module. He has been programming for over 15 years and has focused on GIS and geospatial programming for 7 years. He spent 3 years working as a planner at an architecture firm, where he combined GIS with Building Information Modeling (BIM) and CAD. Currently, he is the CIO at the 2nd Judicial District Attorney's Office in New Mexico.

Eric van Rees

Eric van Rees was first introduced to Geographical Information Systems (GIS) when studying Human Geography in the Netherlands. For 9 years, he was the editor-in-chief of GeoInformatics, an international GIS, surveying, and mapping publication and a contributing editor of GIS Magazine. During that tenure, he visited many geospatial user conferences, trade fairs, and industry meetings. He focuses on producing technical content, such as software tutorials, tech blogs, and innovative new use cases in the mapping industry.

Chapter 4. Data Types, Storage, and Conversion

This chapter provides an overview of the major data types in GIS and shows how to use the previously covered Python code libraries to read and write geospatial data. Apart from reading and writing different geospatial data types, you'll learn how to use these libraries to convert files between data types and how to download data from geospatial databases and remote sources.

The following vector and raster data types will be covered in this chapter:

  • Shapefiles
  • GeoJSON
  • KML
  • GeoPackages
  • GeoTIFF

The following file actions will also be covered, using Python geospatial data libraries covered in Chapter 2, Introduction to Geospatial Code Libraries:

  • Opening existing files
  • Reading and displaying different attributes (spatial and non-spatial)
  • Creating and writing new geospatial data in different formats
  • Converting one file format to another
  • Downloading geospatial data

We'll provide...

Raster and vector data


Before diving into some of the most used GIS data types, a little background is required about what type of information geographical data represents. Earlier in this book, the distinction between raster and vector data was mentioned. All GIS data is composed of one or the other, though a combination of vectors and rasters is also possible. When deciding which data type to use, consider the scale and type of geographical information represented by the data, which in turn determines which Python data libraries to use. As the following examples illustrate, the choice of Python library can also depend on personal preference, and there may be several ways to do the same task.

In the geospatial world, raster data comes in the form of aerial imagery or satellite data, where each pixel has an associated value that corresponds to a different color or shade. Raster data is used for large continuous areas, such as differentiating between different temperature...
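To make the distinction concrete, here is a minimal sketch using only the Python standard library. The coordinates, the grid values, the grid origin, the cell size, and the helper cell_center are all illustrative assumptions, not data from the book. The point is only that a vector feature stores its coordinates explicitly, while a raster cell derives its location from its position in the grid.

```python
# A vector feature stores explicit coordinates plus attributes
# (here in a GeoJSON-like dictionary; all values are made up).
city = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-106.65, 35.08]},  # lon, lat
    "properties": {"name": "Example City"},
}

# A raster stores a grid of cell values; position comes from the grid origin
# and the cell size rather than from per-cell coordinates.
temperature_grid = [
    [12.5, 13.0, 13.4],
    [11.8, 12.2, 12.9],
]
origin_x, origin_y = -107.0, 36.0   # upper-left corner of the grid (assumed)
cell_size = 0.5                     # cell width and height in degrees (assumed)


def cell_center(row, col):
    """Return the (x, y) coordinate at the center of a grid cell."""
    x = origin_x + (col + 0.5) * cell_size
    y = origin_y - (row + 0.5) * cell_size
    return x, y


print(city["geometry"]["coordinates"])             # explicit vector coordinates
print(temperature_grid[1][2], cell_center(1, 2))   # raster value and derived location
```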

Raster data formats


These are some of the most popular raster data formats used for geographical information today:

  • ECW (Enhanced Compressed Wavelet): ECW is a compressed image format typically for aerial and satellite imagery. This GIS file type is known for its high compression ratios while still maintaining quality contrast in images.
  • Esri grid: A file format for adding attribute data to a raster file. Esri grid files are available as integer and floating point grids.
  • GeoTIFF (Geographic Tagged Image File Format): An industry image standard file for GIS and satellite remote sensing applications. Almost all GIS and image processing software packages have GeoTIFF compatibility.
  • JPEG 2000: An open standard for compressed raster images that allows both lossy and lossless compression. JPEG 2000 files typically have a JP2 file extension. JPEG 2000 can achieve a compression ratio of 20:1, which is similar to the MrSID format.
  • MrSID (Multi-Resolution Seamless Image Database): A compressed wavelet format that...
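As a rough worked example of what a 20:1 compression ratio means in practice, the sketch below estimates file sizes for an image of 10800 by 5400 pixels. The three 8-bit bands and the absence of any header or metadata overhead are assumptions made purely for illustration; real files will differ.

```python
width, height = 10800, 5400   # image dimensions in pixels
bands = 3                     # assumed: three bands (for example, RGB)
bytes_per_value = 1           # assumed: 8-bit values

uncompressed_bytes = width * height * bands * bytes_per_value
compressed_bytes = uncompressed_bytes / 20  # a 20:1 compression ratio

print(f"Uncompressed: {uncompressed_bytes / 1e6:.1f} MB")
print(f"At 20:1:      {compressed_bytes / 1e6:.1f} MB")
```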

Reading and writing vector data with GeoPandas


It's time for some hands-on exercises. We'll start with reading and writing some vector data in the form of GeoJSON using the GeoPandas library. The application used to demonstrate all examples is Jupyter Notebook, which comes preinstalled with Anaconda3. If you've installed all geospatial Python libraries from Chapter 2, Introduction to Geospatial Code Libraries, you're good to go. If not, do this first. You might decide to create virtual environments for different combinations of Python libraries because of different dependencies and versioning. Open up a new Jupyter Notebook and a browser window, head over to http://www.naturalearthdata.com/downloads/, and download the Natural Earth quick start kit to a convenient location. We'll examine some of that data for the rest of this chapter, along with some other geographical data files.

First, type the following code in a Jupyter Notebook with access to the GeoPandas library and run the...
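The original listing is cut off in this excerpt, and the sketch below is not that listing. It is a standard-library illustration of what a GeoJSON file contains and how it can be written and read back. The two point features, their names, and the file name example.geojson are made-up assumptions, not Natural Earth data.

```python
import json
import os
import tempfile

# A tiny, made-up GeoJSON FeatureCollection (not Natural Earth data).
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [4.9, 52.37]},
            "properties": {"name": "Point A"},
        },
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [-106.65, 35.08]},
            "properties": {"name": "Point B"},
        },
    ],
}

# Write the collection to a temporary .geojson file.
path = os.path.join(tempfile.mkdtemp(), "example.geojson")
with open(path, "w", encoding="utf-8") as f:
    json.dump(feature_collection, f)

# Read it back and list each feature's name and geometry type.
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

names = [feat["properties"]["name"] for feat in loaded["features"]]
geometry_types = {feat["geometry"]["type"] for feat in loaded["features"]}
print(names, geometry_types)

# With GeoPandas installed, geopandas.read_file(path) would load the same
# file into a GeoDataFrame with a "name" column and a geometry column.
```

Seeing the raw structure first can make the GeoDataFrame easier to interpret: each feature becomes a row, each property becomes a column, and the geometry goes into its own column.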

Reading and writing vector data with OGR


Now, let's turn to OGR for reading and writing a vector so that you can compare both OGR and GeoPandas functionality for performing the same kind of tasks. To follow the instructions that are mentioned as we proceed, you can download the MTBS wildfire data from: https://edcintl.cr.usgs.gov/downloads/sciweb1/shared/MTBS_Fire/data/composite_data/fod_pt_shapefile/mtbs_fod_pts_data.zip and store them on your PC. The file that will be analyzed here is the mtbs_fod_pts_20170501 shapefile's attribute table, which has 20,340 rows and 30 columns.

We'll start with the ogrinfo command which works in a terminal window and can be used for describing vector data. These are not Python commands, but we'll include them here as you can easily run them in a Jupyter Notebook with a simple prefix (adding an exclamation mark before the used command). Take, for instance, the following command, which is similar to the Fiona driver command:

In: !ogrinfo --formats

This command...
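Outside a notebook, the same command-line tools can be called from plain Python. The following is a minimal sketch, assuming ogrinfo is installed and on the PATH; the helper name run_tool is hypothetical and introduced only for this illustration. It returns None rather than raising an error when the tool is missing.

```python
import shutil
import subprocess


def run_tool(args):
    """Run a command-line tool and return its text output, or None if it is not installed."""
    if shutil.which(args[0]) is None:
        return None
    result = subprocess.run(args, capture_output=True, text=True)
    return result.stdout


# Equivalent of `!ogrinfo --formats` in a notebook cell.
output = run_tool(["ogrinfo", "--formats"])
if output is None:
    print("ogrinfo was not found on the PATH")
else:
    print(output)
```

Passing the command as a list avoids shell quoting problems, which matters once file paths with spaces or backslashes are involved.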

Reading and writing raster data with Rasterio


After covering how to read and write various vector data formats in Python, we'll now do the same for raster data. We'll start with the Rasterio library and have a look at how we can read and write raster data. Open up a new Jupyter Notebook where you have access to the Rasterio library and type the following code:

In: import rasterio
    dataset = rasterio.open(r"C:\data\gdal\NE\50m_raster\NE1_50M_SR_W\NE1_50M_SR_W.tif")

This imports the rasterio library and opens a GeoTIFF file. We can now perform some simple data description commands, such as printing the number of image bands.

Note

Raster images contain either a single band or multiple bands. All bands are contained in a single file, with each band covering the same area. When the image is read by your computer, these bands are overlaid on top of each other so that you'll see one single image. Each band contains a 2D array with rows and columns of data. Each data cell of each array contains...
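The band structure described in this note can be sketched with plain nested lists. The three tiny bands and the helper pixel below are made-up assumptions for illustration; a real GeoTIFF opened with Rasterio would hold far larger arrays.

```python
# Three made-up 2 x 3 bands covering the same area (red, green, blue).
red = [[255, 0, 0], [10, 20, 30]]
green = [[0, 255, 0], [10, 20, 30]]
blue = [[0, 0, 255], [10, 20, 30]]
bands = [red, green, blue]

band_count = len(bands)   # rasterio exposes this as dataset.count
height = len(red)         # number of rows
width = len(red[0])       # number of columns


def pixel(row, col):
    """Return the value of every band at one cell, in band order."""
    return tuple(band[row][col] for band in bands)


print(band_count, height, width)
print(pixel(0, 1))   # the cell's red, green, and blue values
```

Because every band has the same shape, one row and column index picks out the matching cell in each band, which is what lets the bands be combined into a single displayed image.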

Reading and writing raster data using GDAL


Here are some commands for reading and writing raster data with GDAL:

In: !gdalinfo --formats

This command lists all file formats supported by GDAL. For a summary of a file, including its CRS, pass the file path to !gdalinfo without any additional options:

In: !gdalinfo "C:\data\gdal\NE\50m_raster\NE1_50M_SR_W\NE1_50M_SR_W.tif"

Out: Driver: GTiff/GeoTIFF
     Files: C:\data\gdal\NE\50m_raster\NE1_50M_SR_W\NE1_50M_SR_W.tif
     Size is 10800, 5400
     Coordinate System is:
     GEOGCS["WGS 84",
     DATUM["WGS_1984", ...
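If you capture this report as text, individual values can be pulled out with a few regular expressions. The sketch below works on a sample copied from the first lines of the output shown above; the function name parse_gdalinfo is hypothetical, and the patterns assume the report uses exactly this layout.

```python
import re

# Sample text copied from the first lines of the gdalinfo report (truncated).
sample_output = r"""Driver: GTiff/GeoTIFF
Files: C:\data\gdal\NE\50m_raster\NE1_50M_SR_W\NE1_50M_SR_W.tif
Size is 10800, 5400
Coordinate System is:
"""


def parse_gdalinfo(text):
    """Extract the driver name and raster size from gdalinfo's text report."""
    driver = re.search(r"^Driver: (\S+)", text, re.MULTILINE)
    size = re.search(r"^Size is (\d+), (\d+)", text, re.MULTILINE)
    return {
        "driver": driver.group(1),
        "width": int(size.group(1)),
        "height": int(size.group(2)),
    }


info = parse_gdalinfo(sample_output)
print(info)
```

Parsing free text like this is fragile. gdalinfo also accepts a -json option, which is likely a more robust choice when the values feed into a script.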

You can convert a GeoTIFF to a JPEG file as follows:

In: !gdal_translate -of JPEG "C:\data\gdal\NE\50m_raster\NE1_50M_SR_W\NE1_50M_SR_W.tif" NE1_50M_SR_W.jpg

Out: Input file size is 10800, 5400
     0...10...20...30...40...50...60...70...80...90...100 - done.

The output, NE1_50M_SR_W.jpg, will look like this:

Now, let's open a GeoPackage using GDAL. GeoPackages can be either vector or raster-based, but in this case, we'll open a raster-based one, which becomes...

Summary


This chapter provided an overview of major data types in GIS. After explaining the difference between vector and raster data, the following vector and raster data types were covered—Esri shapefiles, GeoJSON, KML, GeoPackages, and GeoTIFF files. Next, we explained how to use some of the earlier described Python code libraries to read and write geospatial data. The following geospatial Python libraries for reading and writing raster and vector data were covered in particular—GeoPandas, OGR, GDAL, and Rasterio. Apart from reading and writing different geospatial data types, you learned how to use these libraries to perform file conversion between different data types and how to upload and download data from geospatial databases and remote sources.

The next chapter will cover geospatial analysis and processing. Python libraries covered are OGR, Shapely and GeoPandas. The reader will learn how to use these libraries and write scripts for geospatial analysis, using real-world examples.

...
