Analytics for the Internet of Things (IoT)


Product type Book
Published in Jul 2017
Publisher Packt
ISBN-13 9781787120730
Pages 378 pages
Edition 1st Edition
Author: Andrew Minteer

Table of Contents (20) Chapters

Title Page
Credits
About the Author
About the Reviewer
www.PacktPub.com
Customer Feedback
Preface
  1. Defining IoT Analytics and Challenges
  2. IoT Devices and Networking Protocols
  3. IoT Analytics for the Cloud
  4. Creating an AWS Cloud Analytics Environment
  5. Collecting All That Data - Strategies and Techniques
  6. Getting to Know Your Data - Exploring IoT Data
  7. Decorating Your Data - Adding External Datasets to Innovate
  8. Communicating with Others - Visualization and Dashboarding
  9. Applying Geospatial Analytics to IoT Data
  10. Data Science for IoT Analytics
  11. Strategies to Organize Data for Analytics
  12. The Economics of IoT Analytics
  13. Bringing It All Together

Chapter 9. Applying Geospatial Analytics to IoT Data

"I know it has been tough since Willard left, but I have some good news; we will be getting you some help." These words are coming to you accompanied by a concerned, paternal look from the Vice President of Connected Product Development, who is standing in your cubicle.

Willard was your boss, now your former boss. He left recently to join another company as Head of their IoT division. They were highly impressed with his accomplishments building up a cloud-based IoT analytics capability, which was all your idea, of course, but c'est la vie.

Now, with your boss gone, the VP and other executives are worried about the momentum stalling. You are a little bit miffed at this since you had to twist your boss's arm to get him to go along with everything in the first place.

Look on the bright side, you tell yourself, at least you are getting some help.

He continues, "We want to up the ante, none of the competition is aggregating their thermostat data...

Why do you need geospatial analytics for IoT?


Imagine that your company sells a device that measures airborne pollutants. It is internet-enabled and reports data back to your company at regular intervals using MQTT. The target market for this product is environmentally-minded consumers who want to both measure pollutants near their home and contribute to the collective monitoring of the environment.

The value proposition is that they get free analysis of their local air quality in exchange for donating their data to support a cause they probably believe in anyway. Your company is planning to aggregate and package analytics of high-quality air pollution data to sell it to government and private organizations.

Since the device is sold to consumers indirectly through various retail outlets, your company is not initially aware of the location of the devices. The consumer connects the device to the internet after purchasing it, and then enters their address. At this point, the location can...

The basics of geospatial analysis


Before we jump into the fun stuff, we will cover some basic concepts. This will give context to how the analytics work behind the scenes.

Welcome to Null Island

If you have devices that report GPS location data, you will soon start to notice that many are visiting an area off the west coast of Africa. A new vacation destination, perhaps? Turns out this is a place called Null Island. If you have not heard of Null Island, it is located at precisely 0 latitude and 0 longitude. There is even a tourism website for it where you can get to know the culture and buy a T-shirt, as shown in the following screenshot:

Null Island location and website. Source: www.nullisland.com

But the place does not exist; it is an inside joke in the geospatial community. Missing or null coordinate values are stored as 0 and 0 (latitude and longitude). Besides giving insight into the sense of humor of the geospatial community, Null Island helps to introduce a key concept for geospatial...
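As a minimal sketch of how this shows up in practice, the following Python snippet separates readings that sit exactly at 0 latitude and 0 longitude from the rest before any analysis is run. The record layout, the device names, and the `is_null_island` helper are hypothetical and chosen only for illustration.

```python
# Hypothetical device readings: (device_id, latitude, longitude)
readings = [
    ("sensor-a", 41.5868, -93.6250),
    ("sensor-b", 0.0, 0.0),          # missing location stored as 0, 0
    ("sensor-c", 51.5074, -0.1278),
]

def is_null_island(lat, lon, tolerance=1e-9):
    """Return True when a coordinate pair sits at 0 latitude, 0 longitude."""
    return abs(lat) <= tolerance and abs(lon) <= tolerance

# Split the readings into usable locations and suspect ones.
valid = [r for r in readings if not is_null_island(r[1], r[2])]
suspect = [r for r in readings if is_null_island(r[1], r[2])]

print(len(valid), "valid,", len(suspect), "suspect")
```

Whether suspect readings are dropped or routed to a separate review queue depends on the use case; the point is only that they should not be treated as real locations.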

Vector-based methods


There are two main categories of geospatial analysis and file types: vector and raster. Vectors are all about shapes, while rasters are more about grids. Vector is more common due to its flexibility and efficient storage. Vectors can be defined simply by using a set of points. There are three main types of vector geometry:

  • Points: A point can be defined in two or three dimensions. It is the common latitude, longitude pair you are probably very familiar with. The airport locations used in the R code previously are examples of points.
  • Lines or LineString: A LineString is defined by a set of points and order is important. More than one LineString can be stored together; in that case it is called, unsurprisingly, a MultiLineString. A river system or roadways network is an example of a MultiLineString. A file that contains a MultiLineString for the US Interstate roadways network can be downloaded from the University of Iowa GIS Library (ftp://ftp.igsb.uiowa.edu/gis_library/USA/us_interstates...
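The point and line geometries described above can be sketched with plain Python tuples and lists. This is only an illustration of the structure, not how any particular library stores geometry; the coordinates are made up and `vertex_count` is a hypothetical helper.

```python
# A point: a (longitude, latitude) pair; a third value could hold elevation.
point = (-93.6250, 41.5868)

# A LineString: an ordered list of points. Reversing the order gives a
# different geometry that traces the same path in the opposite direction.
line = [(-93.62, 41.58), (-93.60, 41.60), (-93.55, 41.61)]
reversed_line = list(reversed(line))

# A MultiLineString: several LineStrings stored together.
multi_line = [line, [(-93.70, 41.50), (-93.65, 41.55)]]

def vertex_count(multi):
    """Count the points across every LineString in a MultiLineString."""
    return sum(len(ls) for ls in multi)

print(line == reversed_line)
print(vertex_count(multi_line))
```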

Raster-based methods


Raster consists of a grid of cells arranged in rows and columns. Think of raster like pixels on a screen, except each pixel is defined using a set ground distance. There is a lot in common between raster files and image files. Raster files are sometimes saved using the same formats as image files. Images are often created straight from raster files; you see such examples all the time, from weather forecasts to terrain maps.

The size of the cells in the grid is similar in concept to the resolution of an image. Unlike vector data, a raster contains information for the entire area it covers. It is useful for things that have values for an entire area, such as elevation and temperature. The downside is the resulting large file sizes.
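To make the grid idea concrete, here is a small sketch that treats a raster as a list of rows and looks up the cell that contains a given coordinate. The elevation values, the origin, the 30-unit cell size, and the `cell_for` helper are all assumptions made for illustration.

```python
# Illustrative 3 x 4 raster of elevation values in meters (made-up numbers).
elevation = [
    [310, 312, 315, 320],
    [305, 308, 311, 316],
    [300, 303, 307, 312],
]

# Assumed georeferencing: coordinates of the top-left corner and the
# ground distance covered by one cell, in the same projected units.
origin_x, origin_y = 500000.0, 4600000.0
cell_size = 30.0

def cell_for(x, y):
    """Return the (row, col) of the cell containing the point (x, y)."""
    col = int((x - origin_x) // cell_size)
    row = int((origin_y - y) // cell_size)  # rows count downward from the top
    if not (0 <= row < len(elevation) and 0 <= col < len(elevation[0])):
        raise ValueError("point lies outside the raster")
    return row, col

row, col = cell_for(500065.0, 4599950.0)
print(row, col, elevation[row][col])
```

Halving `cell_size` would quadruple the number of cells needed to cover the same area, which is where the large file sizes come from.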

Multiple values per cell can be stored as different bands in the dataset. This is similar in concept to RGB values for a color image. The SRTM and Digital Elevation Model (DEM) datasets discussed in Chapter 7, Decorating Your Data - Adding External...

Storing geospatial data


There are many ways to store geospatial data. Depending on your intended use, a filesystem format or a relational database may be the most appropriate. We will cover an introduction to both.

File formats

There are hundreds of file formats for storing geospatial data. The most common for vector data is the ESRI shapefile. A shapefile actually consists of several files; the main file has the .shp extension. Most geospatially-aware software and Python packages know to look for the other needed files when given the location of the .shp file.

GeoJSON is another storage format that is human readable. It uses a defined JSON format to store vector data definitions as text. It is easily readable but can get large in size.
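A short sketch using Python's standard `json` module shows what a GeoJSON document looks like. The property names (`device_id`, `pm25`) and the values are hypothetical; the `Feature` and `FeatureCollection` structure and the longitude-first coordinate order come from the GeoJSON format itself.

```python
import json

# A GeoJSON Feature: a geometry plus free-form properties.
# GeoJSON positions are ordered [longitude, latitude].
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-93.6250, 41.5868]},
    "properties": {"device_id": "sensor-a", "pm25": 12.4},  # hypothetical fields
}

collection = {"type": "FeatureCollection", "features": [feature]}

# Serialize to human-readable text.
text = json.dumps(collection, indent=2)
print(text)

# Reading it back yields ordinary dictionaries and lists.
loaded = json.loads(text)
lon, lat = loaded["features"][0]["geometry"]["coordinates"]
```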

Another way to represent vector data, whether in a file or in code, is using the Well-known text (WKT) and Well-known binary (WKB) formats. WKT is human readable, while WKB is not. WKB offers significant compression in size, so is often a good choice...
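The difference between the two encodings can be sketched for a single 2D point using only the standard library. The helper names are hypothetical, and the sketch handles only little-endian points; in practice a geospatial library would normally do this conversion.

```python
import struct

def point_to_wkt(x, y):
    """Format a 2D point as Well-known text."""
    return f"POINT ({x} {y})"

def point_to_wkb(x, y):
    """Encode a 2D point as little-endian Well-known binary.

    Layout: 1 byte for byte order (1 = little-endian), a 4-byte unsigned
    integer geometry type (1 = Point), then x and y as 8-byte doubles.
    """
    return struct.pack("<BIdd", 1, 1, x, y)

def wkb_to_point(data):
    """Decode the little-endian point produced by point_to_wkb."""
    byte_order, geom_type, x, y = struct.unpack("<BIdd", data)
    if byte_order != 1 or geom_type != 1:
        raise ValueError("only little-endian 2D points are handled here")
    return x, y

wkt = point_to_wkt(-93.625, 41.5868)
wkb = point_to_wkb(-93.625, 41.5868)
print(wkt)
print(wkb.hex())
```

The binary form stores every coordinate in a fixed eight bytes regardless of how many decimal digits it has, whereas the text form grows with the printed precision.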

Processing geospatial data


Specialized software can help in processing and visualizing geospatial data. This can be useful for small data and one-time analyses. Even if you have a big data solution, using these tools can help you communicate your findings more effectively to others.

Geospatial analysis software

We will review the most popular Geographic Information System (GIS) tools, so you have some familiarity with them. They are useful support tools for geospatial analytics.

ArcGIS

ArcGIS is the de facto standard for paid GIS software. It was developed and is maintained by the ESRI corporation. It has an awe-inspiring amount of functionality and is used by most professional geospatial analysts. ESRI provides world-class support, and training options abound. ArcGIS links to useful datasets and geospatial analytic capabilities, which are also maintained by ESRI.

ArcGIS is available as a desktop application or as a cloud service. You can sign up for a 60-day free trial (https://www.arcgis.com...

Solving the pollution reporting problem


From what you have learned in this chapter, you can now solve the problem introduced earlier: reporting IoT pollution sensor data by congressional district. Follow these general steps using either Python code or spatial query functions in a database such as PostGIS:

  1. Download a shapefile for US Interstates such as the US National Transportation Atlas Interstate Highways shapefile available from the University of Iowa (ftp://ftp.igsb.uiowa.edu/gis_library/USA/us_interstates.htm).
  2. Download a shapefile for US congressional districts such as the TIGER/Line Shapefile available from the US Census (https://www.census.gov/cgi-bin/geo/shapefiles/index.php?year=2016&layergroup=Congressional+Districts+%28115%29).
  3. Load the shapefiles into a geospatial database using ogr2ogr or into Python using the fiona package.
  4. Add a 1 km buffer to the Interstates MultiLineString using the shapely package or ST_Buffer in PostGIS.
  5. Use a mapping API such as Google Maps to geocode each...
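The geometric tests that these steps set up can be sketched without any geospatial library. The snippet below is a simplified stand-in, not the shapely or PostGIS workflow named in the steps: it swaps in a point-to-segment distance check for the buffer operation and a ray-casting test for polygon containment, and it assumes all coordinates are already projected to planar meters. The data, names, and helper functions are hypothetical.

```python
import math

def distance_to_segment(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def within_buffer(p, multi_line, distance):
    """True when p lies within `distance` of any segment of any line."""
    return any(
        distance_to_segment(p, a, b) <= distance
        for line in multi_line
        for a, b in zip(line, line[1:])
    )

def polygon_contains(polygon, p):
    """Ray-casting test: is p inside the polygon (a list of vertices)?"""
    x, y = p
    inside = False
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Made-up planar data in meters.
interstates = [[(0, 0), (10000, 0)]]
districts = {
    "district-1": [(0, -5000), (5000, -5000), (5000, 5000), (0, 5000)],
    "district-2": [(5000, -5000), (10000, -5000), (10000, 5000), (5000, 5000)],
}
sensors = {"sensor-a": (2000, 500), "sensor-b": (7000, 3000)}

# For each sensor: which district contains it, and is it within 1 km
# of an interstate?
report = {}
for sensor_id, location in sensors.items():
    district = next(
        (name for name, poly in districts.items()
         if polygon_contains(poly, location)),
        None,
    )
    report[sensor_id] = (district, within_buffer(location, interstates, 1000))

print(report)
```

With real shapefiles, the same two questions would be answered by the buffer and containment functions of shapely or PostGIS, which also handle coordinate reference systems and edge cases that this sketch ignores.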

Summary


In this chapter, you learned how to use geospatial analytics to find insights and answer complex questions about IoT data. The importance of geospatial analysis for geographically distributed IoT devices was discussed. The concept of CRS was introduced along with haversine distance and its limitations.
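As a reminder of what the haversine calculation involves, here is a minimal sketch in Python. It treats the Earth as a sphere with an assumed mean radius of 6371.0088 km, which is exactly the simplification behind its limitations; the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean radius; a sphere is an approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# One degree of longitude along the equator.
print(round(haversine_km(0.0, 0.0, 0.0, 1.0), 1))
```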

The world is not a perfect sphere. Methods to adjust for that in order to accurately measure distance were covered. Python functions for geospatial analytics, such as buffer and contains, were discussed, along with some examples.

Storing and processing geospatial data requires some specialized handling. Some geospatial databases and GIS software tools were reviewed. PostGIS spatial functions were also reviewed. We went over some tips for leveraging geospatial analytics in a big data world.

Geospatial analytics offers a huge opportunity to analyze IoT data in new and innovative ways. It can help discover patterns in noisy data. New services can then be created as another way to extract...
