Home

Programming

Mastering PostGIS

By Dominik Mikiewicz , Michal Mackiewicz , Tomasz Nycz

Book

eBook $29.99 $20.98

Print $38.99

Subscription $15.99 $10 p/m for three months

BUY NOW

$10 p/m for first 3 months. $15.99 p/m after that. Cancel Anytime!

What do you get with a Packt Subscription?

This book & 7000+ ebooks & video courses on 1000+ technologies

60+ curated reading lists for various learning paths

50+ new titles added every month on new and emerging tech

Early Access to eBooks as they are being written

Personalised content suggestions

Customised display settings for better reading experience

50+ new titles added every month on new and emerging tech

Playlists, Notes and Bookmarks to easily manage your learning

Mobile App with offline access

What do you get with a Packt Subscription?

This book & 6500+ ebooks & video courses on 1000+ technologies

60+ curated reading lists for various learning paths

50+ new titles added every month on new and emerging tech

Early Access to eBooks as they are being written

Personalised content suggestions

Customised display settings for better reading experience

50+ new titles added every month on new and emerging tech

Playlists, Notes and Bookmarks to easily manage your learning

Mobile App with offline access

What do you get with eBook + Subscription?

Download this book in EPUB and PDF formats, plus a monthly download credit

This book & 6500+ ebooks & video courses on 1000+ technologies

60+ curated reading lists for various learning paths

50+ new titles added every month on new and emerging tech

Early Access to eBooks as they are being written

Personalised content suggestions

Customised display settings for better reading experience

50+ new titles added every month on new and emerging tech

Playlists, Notes and Bookmarks to easily manage your learning

Mobile App with offline access

What do you get with a Packt Subscription?

This book & 6500+ ebooks & video courses on 1000+ technologies

60+ curated reading lists for various learning paths

50+ new titles added every month on new and emerging tech

Early Access to eBooks as they are being written

Personalised content suggestions

Customised display settings for better reading experience

50+ new titles added every month on new and emerging tech

Playlists, Notes and Bookmarks to easily manage your learning

Mobile App with offline access

What do you get with eBook?

Download this book in EPUB and PDF formats

Access this title in our online reader

DRM FREE - Read whenever, wherever and however you want

Online reader with customised display settings for better reading experience

What do I get with Print?

Get a paperback copy of the book delivered to your specified Address*

Download this book in EPUB and PDF formats

Access this title in our online reader

DRM FREE - Read whenever, wherever and however you want

Online reader with customised display settings for better reading experience

What do I get with Print?

Get a paperback copy of the book delivered to your specified Address*

Access this title in our online reader

Online reader with customised display settings for better reading experience

What do you get with video?

Download this video in MP4 format

Access this title in our online reader

DRM FREE - Watch whenever, wherever and however you want

Online reader with customised display settings for better learning experience

What do you get with video?

Stream this video

Access this title in our online reader

DRM FREE - Watch whenever, wherever and however you want

Online reader with customised display settings for better learning experience

What do you get with Audiobook?

Download a zip folder consisting of audio files (in MP3 Format) along with supplementary PDF

What do you get with Exam Trainer?

Flashcards, Mock exams, Exam Tips, Practice Questions

Access these resources with our interactive certification platform

Mobile compatible-Practice whenever, wherever, however you want

BUY NOW $10 p/m for first 3 months. $15.99 p/m after that. Cancel Anytime!

eBook $29.99 $20.98

Print $38.99

Subscription $15.99 $10 p/m for three months

What do you get with a Packt Subscription?

This book & 7000+ ebooks & video courses on 1000+ technologies

60+ curated reading lists for various learning paths

50+ new titles added every month on new and emerging tech

Early Access to eBooks as they are being written

Personalised content suggestions

Customised display settings for better reading experience

50+ new titles added every month on new and emerging tech

Playlists, Notes and Bookmarks to easily manage your learning

Mobile App with offline access

What do you get with a Packt Subscription?

This book & 6500+ ebooks & video courses on 1000+ technologies

60+ curated reading lists for various learning paths

50+ new titles added every month on new and emerging tech

Early Access to eBooks as they are being written

Personalised content suggestions

Customised display settings for better reading experience

50+ new titles added every month on new and emerging tech

Playlists, Notes and Bookmarks to easily manage your learning

Mobile App with offline access

What do you get with eBook + Subscription?

Download this book in EPUB and PDF formats, plus a monthly download credit

This book & 6500+ ebooks & video courses on 1000+ technologies

60+ curated reading lists for various learning paths

50+ new titles added every month on new and emerging tech

Early Access to eBooks as they are being written

Personalised content suggestions

Customised display settings for better reading experience

50+ new titles added every month on new and emerging tech

Playlists, Notes and Bookmarks to easily manage your learning

Mobile App with offline access

What do you get with a Packt Subscription?

This book & 6500+ ebooks & video courses on 1000+ technologies

60+ curated reading lists for various learning paths

50+ new titles added every month on new and emerging tech

Early Access to eBooks as they are being written

Personalised content suggestions

Customised display settings for better reading experience

50+ new titles added every month on new and emerging tech

Playlists, Notes and Bookmarks to easily manage your learning

Mobile App with offline access

What do you get with eBook?

Download this book in EPUB and PDF formats

Access this title in our online reader

DRM FREE - Read whenever, wherever and however you want

Online reader with customised display settings for better reading experience

What do I get with Print?

Get a paperback copy of the book delivered to your specified Address*

Download this book in EPUB and PDF formats

Access this title in our online reader

DRM FREE - Read whenever, wherever and however you want

Online reader with customised display settings for better reading experience

What do I get with Print?

Get a paperback copy of the book delivered to your specified Address*

Access this title in our online reader

Online reader with customised display settings for better reading experience

What do you get with video?

Download this video in MP4 format

Access this title in our online reader

DRM FREE - Watch whenever, wherever and however you want

Online reader with customised display settings for better learning experience

What do you get with video?

Stream this video

Access this title in our online reader

DRM FREE - Watch whenever, wherever and however you want

Online reader with customised display settings for better learning experience

What do you get with Audiobook?

Download a zip folder consisting of audio files (in MP3 Format) along with supplementary PDF

What do you get with Exam Trainer?

Flashcards, Mock exams, Exam Tips, Practice Questions

Access these resources with our interactive certification platform

Mobile compatible-Practice whenever, wherever, however you want

About this book

PostGIS is open source extension onf PostgreSQL object-relational database system that allows GIS objects to be stored and allows querying for information and location services. The aim of this book is to help you master the functionalities offered by PostGIS- from data creation, analysis and output, to ETL and live edits. The book begins with an overview of the key concepts related to spatial database systems and how it applies to Spatial RMDS. You will learn to load different formats into your Postgres instance, investigate the spatial nature of your raster data, and finally export it using built-in functionalities or 3th party tools for backup or representational purposes. Through the course of this book, you will be presented with many examples on how to interact with the database using JavaScript and Node.js. Sample web-based applications interacting with backend PostGIS will also be presented throughout the book, so you can get comfortable with the modern ways of consuming and modifying your spatial data.

Publication date:: May 2017
Publisher: Packt
Pages: 328
ISBN: 9781784391645
Download code from GitHub

Chapter 1. Importing Spatial Data

Learning database tools means working with data, so we need to cover that aspect first. There are many ways of importing data to PgSQL/PostGIS; some are more database-specific, some are PostGIS-specific, and some use external tools. To complicate things a bit more, quite often real-world data import processes are wrapped into programs that perform different tasks and ops in order to maintain the data quality and integrity when importing it. The key though is that even very complex import tools usually use simpler procedures or commands in order to achieve their goals.

Such simple import procedures are described in this chapter. We specifically focus on:

Importing flat data through both psql and pgAdmin and extracting spatial information from flat data
Importing shape files using shp2pgsql
Importing vector data using ogr2ogr
Importing vector data using GIS clients
Importing OpenStreetMap data
Connecting to external data sources with data wrappers
Loading rasters using raster2pgsql
Importing data with pgrestore

Obtaining test data

Before we start importing, let's get some data examples in different formats, specifically these:

AddressBase in CSV and GML
Code-Point Polygons in SHP, TAB and MIF
Points of Interest in TXT format

110M coastlines
110M land
50M Gray Earth

Earthquake data in CSV and KML format (https://earthquake.usgs.gov/earthquakes/map/)
UK Ordnance Survey sample data (https://www.ordnancesurvey.co.uk/business-and-government/licensing/sample-data/discover-data.html)
NaturalEarth (http://www.naturalearthdata.com/downloads/110m-physical-vectors/)

Note

You may either download the data using the links provided or find it in this chapter's resources. The location you extract the data to is not important really, as you can later address it using either relative or absolute file paths.

Setting up the database

All the examples in this chapter use a database named mastering_postgis. This database has been created off the postgis template.

The PgSQL on my dev machine listens on port 5434, which is not the default port for the Postgres database (default is 5432); so when using a default DB setup, you may have to adjust some of the examples a bit.

Note

If you need to change the port your db listens on, you should locate the db data directory, where you will find a postgresql.conf file. This is a text file, so you can edit it with an ordinary text editor. In order to adjust the port, find a port configuration in the Connections and Authentication section.

Schemas are a great way of managing the data and splitting it into meaningful collections. In most scenarios, one will have some production data, archive data, incoming data, and so on sensibly kept in separate schemas. Using additional schemas will depend on your requirements, but we do encourage you to introduce using schemas into your daily practice if you do not yet do so. The following examples import the data into tables defined in the data_import schema.

Importing flat data

Loading flat data may seem to be a bit dull initially but it is important to understand that many popular and interesting datasets often contain the spatial information in very different formats, such as:

Coordinates expressed in Lon/Lat or projected coordinates
Encoded geometry, for example WKT, TopoJSON, GeoJSON
Location in the form of an address
Location in non-cartesian coordinates, for example start point, angle and direction
While the earlier examples indicate the data would require further processing in order to extract the spatial content into a usable form, clearly ability to import flat datasets should not be underestimated

Note

Flat data in our scenario is data with no explicitly expressed geometry - non-spatial format, text-based files

Importing data using psql

Psql is the pgsql's command-line tool. While one can achieve quite a lot with GUI based database management utilities, psql is very useful when one needs to handle database backups, management and alike via scripting. When there is no GUI installed on the server, psql becomes pretty much the only option so it is worth being familiar with it even if you're not a fan.

In order to import the data in psql we will use a \COPY command. This requires us to define the data model for the incoming data first.

Defining the table data model from a text file may be prone to errors that will prevent data from being imported. If for of some reason you are not sure what data types are stored in the particular columns of your source file you can import all the data as text and then re-cast it as required at a later time.

Importing data interactively

In this example we will import the earthquakes data from USGS. So let's fire up psql and connect to the database server:

F:\mastering_postgis\chapter02>psql -h localhost -p 5434 -U postgres

You should see a similar output:

    psql (9.5.0)
    Type "help" for help.
    postgres=#

Then we need to connect to the mastering_postgis database:

    postgres=# \c mastering_postgis

The following output should be displayed:

    You are now connected to database "mastering_postgis" as user 
    "postgres".
    mastering_postgis=#

Note

In the psql examples I am using postgres user. As you may guess, it is a superuser account. This is not the thing you would normally do, but it will keep the examples simple. In a production environment, using a db user with credentials allowing access to specific resources is a sensible approach.

If you have not had a chance to create our data_import schema, let's take care of it now by typing the following command:

    mastering_postgis=# create schema if not exists data_import;

You should see a similar output:

    NOTICE:  schema "data_import" already exists, skipping
    CREATE SCHEMA

Once the schema is there, we create the table that will store the data. In order to do so just type or paste the following into psql:

  create table data_import.earthquakes_csv ( 
     "time" timestamp with time zone, 
     latitude numeric, 
     longitude numeric, 
     depth numeric, 
     mag numeric, 
     magType varchar, 
     nst numeric, 
     gap numeric, 
     dmin numeric, 
     rms numeric, 
     net varchar, 
     id varchar, 
     updated timestamp with time zone, 
     place varchar, 
     type varchar, 
     horizontalError numeric, 
     depthError numeric, 
     magError numeric, 
     magNst numeric, 
     status varchar, 
     locationSource varchar, 
     magSource varchar 
  );

You should see the following output:

    mastering_postgis=# create table data_import.earthquakes_csv (
    mastering_postgis(# "time" timestamp with time zone,
    mastering_postgis(# latitude numeric,
    mastering_postgis(# longitude numeric,
    mastering_postgis(# depth numeric,
    mastering_postgis(# mag numeric,
    mastering_postgis(# magType varchar,
    mastering_postgis(# nst numeric,
    mastering_postgis(# gap numeric,
    mastering_postgis(# dmin numeric,
    mastering_postgis(# rms numeric,
    mastering_postgis(# net varchar,
    mastering_postgis(# id varchar,
    mastering_postgis(# updated timestamp with time zone,
    mastering_postgis(# place varchar,
    mastering_postgis(# type varchar,
    mastering_postgis(# horizontalError numeric,
    mastering_postgis(# depthError numeric,
    mastering_postgis(# magError numeric,
    mastering_postgis(# magNst numeric,
    mastering_postgis(# status varchar,
    mastering_postgis(# locationSource varchar,
    mastering_postgis(# magSource varchar
    mastering_postgis(# );
    CREATE TABLE

Now, as we have our data table ready, we can finally get to the import part. The following command should handle importing the data into our newly created table:

\copy data_import.earthquakes_csv from data\2.5_day.csv with DELIMITER ',' CSV HEADER

You should see a similar output:

mastering_postgis=# \copy data_import.earthquakes_csv from data\2.5_day.csv with DELIMITER ',' CSV HEADER
    COPY 25

Note

If you require a complete reference on the \COPY command, simply type in: \h COPY into the cmd.

While you can customize your data after importing, you may wish to only import a subset of columns in the first place. Unfortunately \COPY command imports all the columns (although you may specify where to put them) so the solution to this would be:

Create a table that models the input CSV
Import all the data
Create a table with a subset of columns
Copy data over
Delete the input table

Even though everything said earlier is possible in psql, it requires quite a lot of typing. Because of that we will take care of this scenario in the next stage. Demonstrating the non-interactive psql mode.

Importing data non-interactively

For the non-interactive psql data import example we'll do a bit more than in the interactive mode. We'll:

Import the full earthquakes dataset
Select a subset of earthquakes data mentioned in the previous example and insert it into its own table
Import another dataset - in this case the Ordnance Survey's POIs

Basically the non-interactive usage of psql means we simply provide it with an SQL to execute. This way we can put together many statements without having to execute them one by one.

Once again we will need the data model prior to loading the data, and then a \COPY command will be used.

If you're still in psql, you can execute a script by simply typing:

\i path\to\the\script.sql

For example:

\i F:/mastering_postgis/chapter02/code/data_import_earthquakes.sql

You should see a similar output:

mastering_postgis-# \i F:/mastering_postgis/chapter02/code/data_import_earthquakes.sql
    CREATE SCHEMA
    psql:F:/mastering_postgis/chapter02/code/data_import_earthquakes.sql:5: NOTICE:  table "earthquakes_csv" does not exist, skipping
    DROP TABLE
    CREATE TABLE
    COPY 25
    psql:F:/mastering_postgis/chapter02/code/data_import_earthquakes.sql:58: NOTICE:  table "earthquakes_csv_subset" does not exist, skipping
    DROP TABLE
    SELECT 25
    mastering_postgis-#

If you quit psql already, type the following command into cmd:

psql -h host -p port -U user -d database -f path\to\the\script.sql

For example:

psql -h localhost -p 5434 -U postgres -d mastering_postgis -f F:\mastering_postgis\chapter02\code\data_import_earthquakes.sql

You should see a similar output:

F:\mastering_postgis\chapter02>psql -h localhost -p 5434 -U postgres -d mastering_postgis -f F:\mastering_postgis\chapter02\code\data_import_earthquakes.sql
    psql:F:/mastering_postgis/chapter02/code/data_import_earthquakes.sql:2: NOTICE:  schema "data_import" already exists, skipping
    CREATE SCHEMA
    DROP TABLE
    CREATE TABLE
    COPY 25
    DROP TABLE
  SELECT 25

The script executed earlier is in the book's code repository under Chapter02/code/ data_import_earthquakes.sql.

Loading OS POI data is now a piece of cake. This dataset is in a bit of a different format though, so it requires slight adjustments. You can review the code in Chapter02/code/ data_import_gb_poi.sql.

Importing data using pgAdmin

In this section we'll import some new data we have not interacted with before - this time we'll have a look at the Ordnance Survey's address data we obtained in the CSV format.

Note

Depending on the pgAdmin version, the UI may differ a bit. The described functionality should always be present though. For the examples involving pgAdmin, screenshots were taken using pgAdmin III (1.22.2).

PgAdmin's import functionality is basically a wrapper around the \COPY so it does require a data model in order to work. Because of that, let's quickly create a table that will be populated with the imported data. You can do it with the GUI by simply right-clicking a schema node you want to create the table in and choosing New Object | New Table and then providing all the necessary model definitions in the displayed window:

You can also type some SQL which in many cases is a bit quicker:

drop table if exists data_import.osgb_addresses; 
create table data_import.osgb_addresses( 
   uprn bigint, 
   os_address_toid varchar, 
   udprn integer, 
   organisation_name varchar, 
   department_name varchar, 
   po_box varchar, 
   sub_building_name varchar, 
   building_name varchar, 
   building_number varchar, 
   dependent_thoroughfare varchar, 
   thoroughfare varchar, 
   post_town varchar, 
   dbl_dependent_locality varchar, 
   dependent_locality varchar, 
   postcode varchar, 
   postcode_type varchar, 
   x numeric, 
   y numeric, 
   lat numeric, 
   lon numeric, 
   rpc numeric, 
   country varchar, 
   change_type varchar, 
   la_start_date date, 
   rm_start_date date, 
   last_update_date date, 
   class varchar 
);

Once our table is ready, importing data is just a matter of right clicking the table node in PgAdmin and choosing Import. An import wizard that assists with the import process will be displayed:

Note

All the earlier could obviously be achieved with pure SQL and in fact we have done this already in the previous section on importing data in psql in non-interactive mode. You can review the SQL code available in Chapter02/code for details.

Extracting spatial information from flat data

As we have some flat data already in our database, it's time to extract the spatial information. So far all the datasets, used Cartesian coordinate systems so our job is really straightforward:

drop table if exists data_import.earthquakes_subset_with_geom; 
select  
   id, 
   "time", 
   depth, 
   mag, 
   magtype, 
   place,Points of Interest in TXT format 
   ST_SetSRID(ST_Point(longitude, latitude), 4326) as geom 
into data_import.earthquakes_subset_with_geom 
from data_import.earthquakes_csv;

This example extracts a subset of data and puts data into a new table with coordinates being expressed as a geometry type, rather than two columns with numeric data appropriate for Lon and Lat.

In order to quickly preview the data, we dump the table's content to KML using ogr2ogr (this is a little spoiler on the next chapter on exporting the data from PostGIS indeed):

ogr2ogr -f "KML" earthquakes_from_postgis.kml PG:"host=localhost port=5434 user=postgres dbname=mastering_postgis" data_import.earthquakes_subset_with_geom -t_srs EPSG:4326

Such KML can be viewed for example in Google Earth (you can use the original KML downloaded from USGS just as a cross check for the output data):

Note

More examples of extracting the spatial data from different formats are addressed in the ETL chapter.

Importing shape files using shp2pgsql

ESRI shapefile (SHP) is still the most common exchange format for sharing GIS data. The format itself is made of a few files such as SHP, SHX, DBF, andPRJ, where the first three are the required files and the file with projection information is not obligatory.

The standard PostGIS tool for loading shapefiles is shp2pgsql - you will find it in the bin folder of your postgres installation. shp2pgsql is a command-line utility that can either extract the shapefile data into SQL or pipe the output directly into psql (we'll see both approaches). shp2pgsql also has a GUI version that can be accessed directly in PgAdmin.

In this example, we'll use some NaturalEarth shapefiles we downloaded earlier. We will import the coastlines shapefile using the CMD version of shp2pgsql and then we'll add land masses using the GUI version.

shp2pgsql in cmd

shp2pgsql has an extensive list of parameters that can be accessed by simply typing shp2pgsql in the cmd. We will not use all the options but rather explain the most common ones.

The basic usage of the utility is as follows:

shp2pgsql [<options>] <shapefile> [[<schema>.]<table>]

For example:

shp2pgsql -s 4326 ne_110m_coastline data_import.ne_coastline

Basically you specify what shapefile you want to import and where. If you executed the earlier command, you would just see the shp2pgsql plain SQL output logged to the console, similar to this:

...
INSERT INTO "data_import"."ne_coastline" ("scalerank","featurecla",geom) VALUES ('3','Country','0105000020E6100000010000000102000000060000006666666666A65AC06766666666665240713D0AD7A3505AC0295C8FC2F56852400000000000205AC07B14AE47E15A5240B91E85EB51585AC0713D0AD7A33052405C8FC2F528BC5AC03E0AD7A3705D52406666666666A65AC06766666666665240');
    COMMIT;
    ANALYZE "data_import"."ne_coastline";

So basically, we need to do something with the utility output in order to make use of it. Let's save the output to an SQL file and let psql read it first:

shp2pgsql -s 4326 ne_110m_coastline data_import.ne_coastline > ne_110m_coastline.sql
psql -h localhost -p 5434 -U postgres -d mastering_postgis -f ne_110m_coastline.sql

You should see a similar output:

    SET
    SET
    BEGIN
    CREATE TABLE
    ALTER TABLE

                            addgeometrycolumn
  ---------------------------------------------------------------------
  data_import.ne_coastline.geom SRID:4326 TYPE:MULTILINESTRING DIMS:2
    (1 row)

    INSERT 0 1
    ...
    INSERT 0 1
    COMMIT
    ANALYZE

I suggest you have a look at the generated SQL so you get an idea of what is actually happening behind the scenes.

Now let's pipe the shp2pgsql output directly to psql:

shp2pgsql -s 4326 ne_110m_coastline data_import.ne_coastline | psql -h localhost -p 5434 -U postgres -d mastering_postgis

The cmd output will be exactly the same as the one we have already seen when reading data from the SQL file.

Note

You will have to drop the data_import.ne_coastline table before importing the data again; otherwise the command in its current shape will fail.

There are some shp2pgsql options that are worth remembering:

-s SRID: Specifies the shp data projection identifier. When used in the following form: -s SRID1:SRID2 makes the shp2pgsql apply a coordinate system transformation, so projection of the data changes is required.
-p: Turns on the 'prepare' mode - only a table definition is output.
-d: Drops and recreates a table.
-a: Appends data to the existing table, provided its schema is exactly the same as the schema of the incoming data.
-g: Allows specifying of the geometry column name; the default is geom (or geog if you decide to use geography with the -G param).
-m <filename>: Specifies a file name that contains column mapping for the DBF file. This way, you can remap dbf column names to your preference.
-n: Only imports DBF and no spatial data.

Importing data with SRID transformation: -s SRID1:SRID2.

The shp2pgsql GUI version

shp2pgsql also has a GUI version. In order to use it, when in PgAdmin, simply choose Plugins | PostGIS Shapefile and DBF loader 2.2 (the version may vary); the following import wizard will be displayed:

Note

In pgAdmin 4, accessing the shapefile loader GUI may not be so obvious. To trigger the tool, try typing shp2pgsql-gui in the shell/command line.

Similar to the cmd version of the utility, you can specify the schema, and you should specify the SRID.

The nice thing about the GUI version of shp2pgsql is that it lets one import multiple files at once.

In the options dialogue, you can specify data encoding, decide whether or not you would like to create a spatial index after importing, choose geography over geometry, and so on:

Importing vector data using ogr2ogr

ogr2ogr is the GDAL's vector transform utility. It is - not without reason - considered a Swiss Army knife for vector transformations. Despite its size, ogr2ogr can handle a wide range of formats and this makes it a really worthy tool.

We'll use ogr2ogr to import a few data formats other than SHP, although ogr2ogr can obviously import SHP too. For this scenario, we'll use some data downloaded earlier:

OS GB address base in GML format
OS GB code point polygons in MapInof MIF & TAB formats
USGS earthquakes in KML format

Some of the most common ogr2ogr params are:

-f: The format of the output (when importing to PostGIS it will be PostgreSQL).
-nln: Assigns a name to the layer. In the case of importing the data to PostGIS this will be the table name.
-select: Lets you specify a comma separated list of columns to pick.
-where: Lets you specify a sql like query to filter out the data.
append: Appends data to the output dataset.
overwrite: Overwrites the output datasource - in case of PostgreSQL it will drop and re-create a table.
s_srs: Specifies the input SRID.
t_srs: Transforms coordinates to the specified SRID.
a_srs: Specifies the output SRID.
- lco NAME=VALUE: Layer creation options - these are driver specific; for pgsql options, see http://www.gdal.org/drv_pg.html. The most commonly used layer creation options are:
- LAUNDER: This defaults to YES. It is responsible for converting column names into pgsql compatible ones (lower case, underscores).
- PRECISION: This defaults to YES. It is responsible for using numeric and char types over float and varchar.
- GEOMETRY_NAME: Defaults to wkb_geometry.

Note

For a full list of ogr2ogr params, just type ogr2ogr.Ogr2ogr has an accompanying utility called ogrinfo. This tool lets one inspect the metadata of a dataset. Verifying the metadata of any dataset prior to working with it is considered good practice and one should get into the habit of always using it before importing or exporting the data.

Importing GML

Let's start with importing the GML of the OS GB address base. First we'll see what data we're dealing with exactly:

    ogrinfo sx9090.gml

The following should be the output:

    Had to open data source read-only.
    INFO: Open of `sx9090.gml'
          using driver `GML' successful.
    1: Address (Point)

We can then review the layer information:

    ogrinfo sx9090.gml Address -so

You should see a similar output:

    Had to open data source read-only.
    INFO: Open of `sx9090.gml'
          using driver `GML' successful.

    Layer name: Address
    Geometry: Point
    Feature Count: 42861
    Extent: (-3.560100, 50.699470) - (-3.488340, 50.744770)
    Layer SRS WKT:
    GEOGCS["ETRS89",
        DATUM["European_Terrestrial_Reference_System_1989",
            SPHEROID["GRS 1980",6378137,298.257222101,
                AUTHORITY["EPSG","7019"]],
            TOWGS84[0,0,0,0,0,0,0],
            AUTHORITY["EPSG","6258"]],
        PRIMEM["Greenwich",0,
            AUTHORITY["EPSG","8901"]],
        UNIT["degree",0.0174532925199433,
            AUTHORITY["EPSG","9122"]],
        AUTHORITY["EPSG","4258"]]
    gml_id: String (0.0)
    uprn: Real (0.0)
    osAddressTOID: String (20.0)
    udprn: Integer (0.0)
    subBuildingName: String (25.0)
    buildingName: String (36.0)
    thoroughfare: String (27.0)
    postTown: String (6.0)
    postcode: String (7.0)
    postcodeType: String (1.0)
    rpc: Integer (0.0)
    country: String (1.0)
    changeType: String (1.0)
    laStartDate: String (10.0)
    rmStartDate: String (10.0)
    lastUpdateDate: String (10.0)
    class: String (1.0)
    buildingNumber: Integer (0.0)
    dependentLocality: String (27.0)
    organisationName: String (55.0)
    dependentThoroughfare: String (27.0)
    poBoxNumber: Integer (0.0)
    doubleDependentLocality: String (21.0)
    departmentName: String (37.0)

Note

-so param makes ogrinfo display the data summary only; otherwise, info on a full dataset would be displayed.

Once we're ready to import the data, let's execute the following command:

ogr2ogr -f "PostgreSQL" PG:"host=localhost port=5434 user=postgres dbname=mastering_postgis" sx9090.gml -nln data_import.osgb_address_base_gml -geomfield geom

At this stage, the address GML should be available in our database.

Note

We did not specify the SRID of the GML data. This is because this information is present in GML and the utility picks it up automatically.

Importing MIF and TAB

Both MIF and TAB are MapInfo formats. TAB is the default format that contains formatting, while MIF is the interchange format.

We'll start with reviewing metadata:

    ogrinfo EX_sample.mif

And then:

    ogrinfo EX_sample.mif EX_Sample -so
    Had to open data source read-only.
    INFO: Open of `EX_sample.mif'
          using driver `MapInfo File' successful.

    Layer name: EX_sample
    Geometry: Unknown (any)
    Feature Count: 4142
    Extent: (281282.800000, 85614.570000) - (300012.000000, 
    100272.000000)
    Layer SRS WKT:
    PROJCS["unnamed",
        GEOGCS["unnamed",
            DATUM["OSGB_1936",
                SPHEROID["Airy 1930",6377563.396,299.3249646],
                TOWGS84[375,-111,431,-0,-0,-0,0]],
            PRIMEM["Greenwich",0],
            UNIT["degree",0.0174532925199433]],
        PROJECTION["Transverse_Mercator"],
        PARAMETER["latitude_of_origin",49],
        PARAMETER["central_meridian",-2],
        PARAMETER["scale_factor",0.9996012717],
        PARAMETER["false_easting",400000],
        PARAMETER["false_northing",-100000],
        UNIT["Meter",1]]
    POSTCODE: String (8.0)
    UPP: String (20.0)
    PC_AREA: String (2.0)

Note

Please note that ogrinfo projection metadata for our MIF file does not specify the EPSG code. This is fine, as the projection definition is present. But it will result in ogr2ogr creating a new entry in the spatial_ref_sys, which is not too good, as we'll end up with the wrong coordsys identifiers; the coordinate reference id will be the next available. This is because ogr2ogr expands the coordinate reference into a WKT string and then does a string comparison against the coordsys identifiers definitions in the spatial_ref_sys table; minor differences in formatting or precision will result in ogr2ogr failing to match coordsys. In such a scenario, a new entry will be created; for example, if you happen to use the EPSG:3857 coordinate system and the system's definition is slightly different and cannot be matched, the assigned SRID will not be 3857, but the next available ID will be chosen. A solution to this is to specify the exact coordinate system; ogr2ogr should output the data via the a_srs parameter.

Once ready, we can import the data:

ogr2ogr -f "PostgreSQL" PG:"host=localhost port=5434 user=postgres dbname=mastering_postgis" EX_sample.mif -nln data_import.osgb_code_point_polygons_mif -lco GEOMETRY_NAME=geom -a_srs EPSG:27700

If you followed the very same procedure for TAB file and loaded the data, both datasets are now in their own tables in the data_import schema:

ogr2ogr -f "PostgreSQL" PG:"host=localhost port=5434 user=postgres dbname=mastering_postgis" EX_sample.tab -nln data_import.osgb_code_point_polygons_tab -lco GEOMETRY_NAME=geom -a_srs EPSG:27700

Importing KML

As usual, we'll start with the dataset's metadata checkup:

    ogrinfo 2.5_day_age.kml

The output shows that there is more than one layer:

    INFO: Open of `2.5_day_age.kml'
          using driver `LIBKML' successful.
    1: Magnitude 5
    2: Magnitude 4
    3: Magnitude 3
    4: Magnitude 2

Therefore, in order to review metadata for each layer at once, the following command should be used:

    ogrinfo 2.5_day_age.kml -al -so

The output of the previous command is rather longish, so we'll truncate it a bit and only show the info for the first layer:

    INFO: Open of `2.5_day_age.kml'
          using driver `LIBKML' successful.

    Layer name: Magnitude 5
    Geometry: Unknown (any)
    Feature Count: 2
    Extent: (-101.000100, -36.056300) - (120.706400, 13.588200)
    Layer SRS WKT:
    GEOGCS["WGS 84",
        DATUM["WGS_1984",
            SPHEROID["WGS 84",6378137,298.257223563,
                AUTHORITY["EPSG","7030"]],
            TOWGS84[0,0,0,0,0,0,0],
            AUTHORITY["EPSG","6326"]],
        PRIMEM["Greenwich",0,
            AUTHORITY["EPSG","8901"]],
        UNIT["degree",0.0174532925199433,
            AUTHORITY["EPSG","9108"]],
        AUTHORITY["EPSG","4326"]]
    Name: String (0.0)
    description: String (0.0)
    timestamp: DateTime (0.0)
    begin: DateTime (0.0)
    end: DateTime (0.0)
    altitudeMode: String (0.0)
    tessellate: Integer (0.0)
    extrude: Integer (0.0)
    visibility: Integer (0.0)
    drawOrder: Integer (0.0)
    icon: String (0.0)
    snippet: String (0.0)

This time, EPSG information is available, so we do not have to worry; ogr2ogr will create extra SRID definition in the database.

Once we've confirmed that this is the exact dataset we'd like to import, we can continue with the following command:

ogr2ogr -f "PostgreSQL" PG:"host=localhost port=5434 user=postgres dbname=mastering_postgis" 2.5_day_age.kml -nln data_import.usgs_earthquakes_kml -lco GEOMETRY_NAME=geom -append

Note

Note the append param in the command earlier. This is required because our KML has more than one layer and ogr2ogr is importing them one by one. Without the append param, only the first layer would be imported and then ogr2ogr would fail with a similar output:FAILED: Layer data_import.usgs_earthquakes_kml already exists, and -append not specified. Consider using -append, or -overwrite. ERROR 1: Terminating translation prematurely after failed translation of layer Magnitude 4 (use -skipfailures to skip errors)

The cmd output should be similar to:

WARNING: Layer creation options ignored since an existing layer is
    being appended to.
WARNING: Layer creation options ignored since an existing layer is
    being appended to.
WARNING: Layer creation options ignored since an existing layer is
    being appended to.

At this stage, the KML dataset should have made it to our PostGIS database.

ogr2ogr GUI (Windows only)

For those preferring GUI over CMD, there is an alternative to plain old ogr2ogr--ogr2gui available from http://www.ogr2gui.ca/.

Simply download the required archive, extract it, and launch the appropriate .exe. After having played with ogr2ogr a bit already, the GUI should be rather self-explanatory:

Importing data using GIS clients

Many GIS software packages can directly connect to databases for reading and writing the data; from our perspective, they are just more database clients. In this section, we'll have a quick look at the very well-known QGIS and the certainly less popular but very powerful Manifold GIS.

Both can export data to databases and read it back. QGIS has a specialized PostGIS export module called SPIT; Manifold's export facility is built in into the GUI and follows the same export routines as other formats handled by the software.

Exporting a shapefile to PostGIS using QGIS and SPIT

QGIS offers a PostGIS export module called SPIT. You can access it by choosing Database\Spit\Import Shapefiles to PostGIS:

SPIT's GUI is very clear and easy to understand. You can import many files at once and you can specify the destination schema and table names. If required you can change the default Spit's geometry name (the_geom) to your liking. SRID also can be changed, but it will be applied to all the imported files. Once you provide all the required information and click on OK, a progress window is displayed and the data is exported to PostGIS.

Note

In newer versions of QGIS, you may not find SPIT anymore. In such cases, you can use a DbManager instead.

Exporting shapefile to PostGIS using QGIS and DbManager

Before using DbManager, you should load a shapefile you want to import to a database.

When ready, launch DbManager by going to Database\DB Manager\DB Manager. When the DbManager UI displays, locate your database, expand its node and select the schema you want to import the data to and then click the Import button (arrow down). You should be prompted with an import dialog; the Input dropdown lets you choose the layer to import.

Once you are happy with the import options (you may want to click the Update options button to populate the dialog), click on OK. When the import finishes you should see a confirmation dialogbox:

Exporting spatial data to PostGIS from Manifold GIS

In order to export the data, Manifold needs to import it internally first. This means that we do not export a particular format but rather we simply export spatial data. Once you bring a shapefile into Manifold, MapInfo TAB or MIF, SQL Server spatial table or any other supported vector format, exporting it to PostGIS is exactly the same for all of them.

In order to export a vector component, right-click on its node in the project tree and choose the export option. You will see an export dialog, where you should pick the appropriate export format; in this scenario, you need to choose Data Source. You will then be presented with a data source setup window, where you can either pick an existing connection or configure a new one:

Once you choose the appropriate connection, you can then set up the actual export parameters:

You can choose which columns you want to export, and the name of the identity, geometry, and version columns.

Note

There is a minor inconvenience with the exporter: it does not allow for adjusting of the destination schema and always exports to the public schema.

Manifold tries to find the PostGIS projection that best matches the Manifold projection. Unfortunately, it is not always possible as Manifold as such does not rely on EPSG coordinates systems definitions, but rather uses its internal mechanisms for handling projections. If Manifold does not match the PostGIS side projection, you can select it manually.

Export dialogue also offers an option to transform coordinates upon export and create indexes and CREATE/UPDATE triggers when data gets to the database.

Note

In order to enable PostgreSQL connections in Manifold, you may have to copy some PgSQL DLLs over to the Manifold installation directory. The exact information on how to do this can be easily found at georeference.org, the manifold user community forum.

Importing OpenStreetMap data

For importing OSM data into PostGIS, we'll use a command line utility called osm2pgsql. Apparently, making a Linux build of osm2pgsql is straightforward; getting one that runs on Windows may require some more effort as described here: https://github.com/openstreetmap/osm2pgsql/issues/17, https://github.com/openstreetmap/osm2pgsql/issues/472.

I have used a Cygwin build as mentioned here:

http://wiki.openstreetmap.org/wiki/Osm2pgsql#Cygwin

Once we have the osm2pgsql ready, we'll need some data. For the sake of simplicity, I have downloaded the Greenwich Park area from https://www.openstreetmap.org/export#map=16/51.4766/0.0003 and saved the file as greenwich_observatory.osm (you will find it in the data accompanying this chapter).

The downloaded file is actually an XML file. Do have a look what's inside to get an idea of the data osm2pgsql is dealing with.

In order to take advantage of the OSM tags used to describe the data, we will need the PostgreSQL hstore extension. Basically it allows for storing key-value pairs in a column, so data with flexible schema can easily be stored. In order to install it, you need to execute the following query in either PgAdmin or psql:

CREATE EXTENSION hstore;

In order to import OSM data, issue the following command, making sure you adjust the paths and db connection details to your environment:

osm2pgsql.exe -H localhost -P 5434 -U postgres -W -d mastering_postgis -S default.style ../data/greenwich_observatory.osm -hstore

Note

If you happen to receive a message such as Default style not found, please make sure to provide a valid path to the styles definition such as /usr/share/osm2pgsql/default.style.

You should see a similar output:

osm2pgsql SVN version 0.85.0 (64bit id space)                                                                                                                   Password:                                                                       Using projection SRS 900913 (Spherical Mercator)                                Setting up table: planet_osm_point
NOTICE:  table "planet_osm_point" does not exist, skipping
NOTICE:  table "planet_osm_point_tmp" does not exist, skipping
Setting up table: planet_osm_line
NOTICE:  table "planet_osm_line" does not exist, skipping
NOTICE:  table "planet_osm_line_tmp" does not exist, skipping
Setting up table: planet_osm_polygon                                            NOTICE:  table "planet_osm_polygon" does not exist, skipping
NOTICE:  table "planet_osm_polygon_tmp" does not exist, skipping                Setting up table: planet_osm_roads                                              NOTICE:  table "planet_osm_roads" does not exist, skipping                      NOTICE:  table "planet_osm_roads_tmp" does not exist, skipping                  Using built-in tag processing pipeline                                          Allocating memory for sparse node cache                                         Node-cache: cache=800MB, maxblocks=0*102400, allocation method=8192
Mid: Ram, scale=100                                                                                                                                             !! You are running this on 32bit system, so at most                             !! 3GB of RAM can be used. If you encounter unexpected
!! exceptions during import, you should try running in slim
!! mode using parameter -s.

Reading in file: ../data/greenwich_observatory.osm
Processing: Node(4k 4.7k/s) Way(0k 0.55k/s) Relation(41 41.00/s)  parse time: 0s
Node stats: total(4654), max(4268388189) in 0s                                  Way stats: total(546), max(420504897) in 0s                                     Relation stats: total(41), max(6096780) in 0s                                   Committing transaction for planet_osm_point                                     Committing transaction for planet_osm_line                                      Committing transaction for planet_osm_polygon                                   Committing transaction for planet_osm_roads
Writing relation (41)
Sorting data and creating indexes for planet_osm_point                          Analyzing planet_osm_point finished                                             Sorting data and creating indexes for planet_osm_line                           Sorting data and creating indexes for planet_osm_polygon                        Analyzing planet_osm_line finished                                              node cache: stored: 4654(100.00%), storage efficiency: 50.00% (dense blocks: 0, sparse nodes: 4654), hit rate: 2.00%                                            Sorting data and creating indexes for planet_osm_roads                          Analyzing planet_osm_polygon finished
Analyzing planet_osm_roads finished
Copying planet_osm_point to cluster by geometry finished
Creating geometry index on  planet_osm_point
Creating indexes on  planet_osm_point finished
All indexes on  planet_osm_point created  in 0s
Completed planet_osm_point                                                      Copying planet_osm_line to cluster by geometry finished
Creating geometry index on  planet_osm_line                                     Creating indexes on  planet_osm_line finished                                   Copying planet_osm_polygon to cluster by geometry finished                      Creating geometry index on  planet_osm_polygon                                  All indexes on  planet_osm_line created  in 0s                                  Completed planet_osm_line                                                       Creating indexes on  planet_osm_polygon finished                                Copying planet_osm_roads to cluster by geometry finished                        Creating geometry index on  planet_osm_roads                                    All indexes on  planet_osm_polygon created  in 0s                               Completed planet_osm_polygon                                                    Creating indexes on  planet_osm_roads finished                                  All indexes on  planet_osm_roads created  in 0s                                 Completed planet_osm_roads                                                                                                                                      Osm2pgsql took 1s overall

At this stage, you should have the OSM data imported to the public schema. Thanks to using the hstore datatype for tags column, we can now do the following type of queries:

select name FROM planet_osm_point where ((tags->'memorial') = 'stone');

When executed in psql with the dataset used in this example, you should see the following output:

            name
-----------------------------
 Prime Meridian of the World
(1 row)

Note

You may want to index the tags columns in order to optimize the query performance.

Connecting to external data sources with foreign data wrappers

Since PostgreSQL 9.1, one can use Foreign Data Wrappers (FDW) in order to connect to the external data sources that are then treated as they were local tables. More information can be found on the PostgreSQL wiki: https://wiki.postgresql.org/wiki/Foreign_data_wrappers.

Querying the external files or databases seems to be standard these days. But how about querying WFS services or OSM directly? Now, this sounds cool, doesn't it? You should certainly have a look at some of the clever GEO data wrappers:

ogr_fdw: https://github.com/pramsey/pgsql-ogr-fdw
osm_pbf_fdw: https://github.com/vpikulik/postgres_osm_pbf_fdw

In this example, we'll use ogr_fdw to connect to some external data sources. Starting with PostGIS 2.2, it is a part of the bundle and there is no need to install it as it should already be available.

Note

Examples shown in this section can be executed in both psql or in PgAdmin.

Connecting to SQL Server Spatial

First we need to create a server:

CREATE SERVER fdw_sqlserver_test 
  FOREIGN DATA WRAPPER ogr_fdw 
  OPTIONS ( 
    datasource 'MSSQL:server=CM_DOM\MSSQLSERVER12;database=hgis;UID=postgres_fdw;PWD=postgres_fdw', 
    format 'MSSQLSpatial');

Note

You may have noticed I have created a postgres_fdw user with the same password.

If you're using Postgre SQL 9.5+, you can use the IMPORT SCHEMA command:

IMPORT FOREIGN SCHEMA "dbo.Wig100_skorowidz" 
FROM SERVER fdw_sqlserver_test INTO data_linked;

Otherwise you will have to specify the table schema explicitly:

CREATE FOREIGN TABLE data_linked.dbo_wig100_skorowidz 
   (fid integer , 
    geom public.geometry , 
    oid integer , 
    gid integer , 
    version integer , 
    godlo character varying , 
    nazwa character varying , 
    nazwa2 character varying , 
    kalibracja character varying , 
    pas real , 
    slup real ) 
   SERVER fdw_sqlserver_test 
   OPTIONS (layer 'dbo.Wig100_skorowidz');

Note

By default, PgAdmin does not display foreign tables, so you may have to go to File | Options and tick the Foreign Tables checkbox in the Browser node. In PgAdmin 4, foreign tables seem to be visible by default.

At this stage, you should be able to query the foreign table as if it was local.

Connecting to WFS service

This example is based on the ogr_fwd documentation, so it only shows the required stuff. A full example can be reviewed here:

https://github.com/robe2/pgsql-ogr-fdw

First let's create a foreign server:

CREATE SERVER fdw_wfs_test_opengeo 
  FOREIGN DATA WRAPPER ogr_fdw 
  OPTIONS ( 
    datasource 'WFS:http://demo.opengeo.org/geoserver/wfs', 
    format 'WFS');

Automagically bring in the schema:

IMPORT FOREIGN SCHEMA "topp:tasmania_cities" 
FROM SERVER fdw_wfs_test_opengeo INTO data_linked;

And issue a query against the foreign WFS table:

select city_name from data_linked.topp_tasmania_cities;

Since this dataset contains only one record, our result should be Hobart.

Loading rasters using raster2pgsql

raster2pgsql is the default tool for importing rasters to PostGIS. Even though GDAL itself does not provide means to load rasters to the database, raster2pgsql is compiled as a part of PostGIS and therefore supports the very same formats as the GDAL version appropriate for given version of PostGIS.

raster2pgsql is a command-line tool. In order to review its parameters, simply type in the console:

raster2pgsql

While taking a while to get familiar with the raster2pgsql help is an advised approach, here are some params that worth highlighting:

-G: Prints a list of GDAL formats supported by the given version of the utility; the list is likely to be extensive.
-s: Sets the SRID of the imported raster.
-t: Tile size - expressed as width x height. If not provided, a default is worked out automatically in the range of 32-100 so it best matches the raster dimensions. It is worth remembering that when importing multiple files, tiles will be computed for the first raster and then applied to others.
-P: Pads tiles right / bottom, so all the tiles have the same dimensions.
-d|a|c|p: These options are mutually exclusive:
- d: Drops and creates a table.
- a: Appends data to an existing table.
- c: Creates a new table.
- p: Turns on prepare mode. So no importing is done; only a table is created.
-F: A column with raster name will be added.
-l: Comma-separated overviews; creates overview tables named o_<overview_factor>_raster_table_name.
-I: Creates GIST spatial index on the raster column.
-C: Sets the standard constraints on the raster column after the raster is imported.

For the examples used in this section, we'll use Natural Earth's 50M Gray Earth raster.

As you remember, ogr2ogr has a ogrinfo tool that can be used to obtain the information on a vector dataset. GDAL's equivalent for raster files is called gdalinfo and is as worthy as its vector brother:

gdalinfo GRAY_50M_SR_OB.tif

You should get a similar output:

Driver: GTiff/GeoTIFF
Files: GRAY_50M_SR_OB.tif
       GRAY_50M_SR_OB.tfw
Size is 10800, 5400
Coordinate System is:
GEOGCS["WGS 84",
    DATUM["WGS_1984",
        SPHEROID["WGS 84",6378137,298.257223563,
            AUTHORITY["EPSG","7030"]],
        AUTHORITY["EPSG","6326"]],
    PRIMEM["Greenwich",0],
    UNIT["degree",0.0174532925199433],
    AUTHORITY["EPSG","4326"]]
Origin = (-179.999999999999970,90.000000000000000)
Pixel Size = (0.033333333333330,-0.033333333333330)
Metadata:
  AREA_OR_POINT=Area
  TIFFTAG_DATETIME=2014:10:18 09:28:20
  TIFFTAG_RESOLUTIONUNIT=2 (pixels/inch)
  TIFFTAG_SOFTWARE=Adobe Photoshop CC 2014 (Macintosh)
  TIFFTAG_XRESOLUTION=342.85699
  TIFFTAG_YRESOLUTION=342.85699
Image Structure Metadata:
  INTERLEAVE=BAND
Corner Coordinates:
Upper Left  (-180.0000000,  90.0000000) (180d 0' 0.00"W, 90d 0' 0.00"N)
Lower Left  (-180.0000000, -90.0000000) (180d 0' 0.00"W, 90d 0' 0.00"S)
Upper Right ( 180.0000000,  90.0000000) (180d 0' 0.00"E, 90d 0' 0.00"N)
Lower Right ( 180.0000000, -90.0000000) (180d 0' 0.00"E, 90d 0' 0.00"S)
Center      (  -0.0000000,   0.0000000) (  0d 0' 0.00"W,  0d 0' 0.00"N)
Band 1 Block=10800x1 Type=Byte, ColorInterp=Gray

Before we get down to importing the raster, let's splits into four parts using gdalwarp utility. This way, we'll be able to show how to import a single raster and a set of rasters:

gdalwarp -s_srs EPSG:4326 -t_srs EPSG:4326 -te -180 -90 0 0 GRAY_50M_SR_OB.tif gray_50m_partial_bl.tif
gdalwarp -s_srs EPSG:4326 -t_srs EPSG:4326 -te -180 0 0 90 GRAY_50M_SR_OB.tif gray_50m_partial_tl.tif
gdalwarp -s_srs EPSG:4326 -t_srs EPSG:4326 -te 0 -90 180 0 GRAY_50M_SR_OB.tif gray_50m_partial_br.tif
gdalwarp -s_srs EPSG:4326 -t_srs EPSG:4326 -te 0 0 180 90 GRAY_50M_SR_OB.tif gray_50m_partial_tr.tif

For each command, you should see a similar output:

Creating output file that is 5400P x 2700L.
Processing input file GRAY_50M_SR_OB.tif.
0...10...20...30...40...50...60...70...80...90...100 - done.

Having prepared the data, we can now move onto importing it.

Importing a single raster

In order to import a single raster file, let's issue the following command:

raster2pgsql -s 4326 -C -l 2,4 -I -F -t 2700x2700 gray_50m_sr_ob.tif data_import.gray_50m_sr_ob | psql -h localhost -p 5434 -U postgres -d mastering_postgis

You should see a similar output:

Processing 1/1: gray_50m_sr_ob.tif
BEGIN
CREATE TABLE
CREATE TABLE
CREATE TABLE
INSERT 0 1
(...)
INSERT 0 1
CREATE INDEX
ANALYZE
CREATE INDEX
ANALYZE
CREATE INDEX
ANALYZE
NOTICE:  Adding SRID constraint
CONTEXT:  PL/pgSQL function addrasterconstraints(name,name,name,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean) line 53 at RETURN
NOTICE:  Adding scale-X constraint
(...)
----------------------
 t
(1 row)

 addoverviewconstraints
------------------------
 t
(1 row)

 addoverviewconstraints
------------------------
 t
(1 row)

COMMIT

The executed command created 3 tables: the main raster table called data_import.gray_50m_sr_ob and two overview tables called data_import.o_2_gray_50m_sr_ob and data_import.o_4_gray_50m_sr_ob. The command also created the GIST index and brought in the filename. The raster has been split into tiles of 2700 x 2700 pixels.

Importing multiple rasters

Let's import a directory of rasters now. We have four files with the file name mask gray_50m_partial*.tif. In order to import all the files at once, we'll issue the following command:

raster2pgsql -s 4326 -C -l 2,4 -I -F -t 2700x2700 gray_50m_partial*.tif data_import.gray_50m_partial | psql -h localhost -p 5434 -U postgres -d mastering_postgis

You should see a similar output:

Processing 1/4: gray_50m_partial_bl.tif
BEGIN
CREATE TABLE
CREATE TABLE
CREATE TABLE
INSERT 0 1
INSERT 0 1
INSERT 0 1
Processing 2/4: gray_50m_partial_br.tif
(...)
Processing 3/4: gray_50m_partial_tl.tif
(...)
Processing 4/4: gray_50m_partial_tr.tif
(...)
CONTEXT:  PL/pgSQL function addrasterconstraints(name,name,name,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean) line 53 at RETURN
NOTICE:  Adding maximum extent constraint
CONTEXT:  PL/pgSQL function addrasterconstraints(name,name,name,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean) line 53 at RETURN
 addrasterconstraints
----------------------
 t
(1 row)

 addoverviewconstraints
------------------------
 t
(1 row)

 addoverviewconstraints
------------------------
 t
(1 row)

COMMIT

The command used to import multiple rasters was very similar to the one we used to import a single file. The difference was a filename mask used in place of a filename: gray_50m_partial*.tif. If we had used a bit more griddy pattern such as *.tif, all the TIF files present in a directory would be imported.

When processing multiple files, one can pipe the output to psql without the connection info specified as psql params, but in such a case, equivalent environment variables will have to be set (on Windows, use the set command, and on Linux, export):

set PGPORT=5434
set PGHOST=localhost
set PGUSER=postgres
set PGPASSWORD=somepass
set PGDATABASE=mastering_postgis
raster2pgsql -s 4326 -C -l 2,4 -I -F -t 2700x2700 gray_50m_partial*.tif data_import.gray_50m_partial | psql

Importing data with pgrestore

Just to make the data import complete, it is worth mentioning the restore command. After all, it is not very an uncommon scenario to receive some data in the form of a database, schema, or even a single table backup.

For this scenario, let's create a backup of one of the tables imported before:

pg_dump -h localhost -p 5434 -U postgres -t data_import.earthquakes_subset_with_geom -c -F c -v -b -f earthquakes_subset_with_geom.backup mastering_postgis

Since there was a -v option specified, you should get a similarly verbose output:

pg_dump: reading schemas
pg_dump: reading user-defined tables
pg_dump: reading extensions
pg_dump: reading user-defined functions
pg_dump: reading user-defined types
pg_dump: reading procedural languages
pg_dump: reading user-defined aggregate functions
pg_dump: reading user-defined operators
pg_dump: reading user-defined operator classes
pg_dump: reading user-defined operator families
pg_dump: reading user-defined text search parsers
pg_dump: reading user-defined text search templates
pg_dump: reading user-defined text search dictionaries
pg_dump: reading user-defined text search configurations
pg_dump: reading user-defined foreign-data wrappers
pg_dump: reading user-defined foreign servers
pg_dump: reading default privileges
pg_dump: reading user-defined collations
pg_dump: reading user-defined conversions
pg_dump: reading type casts
pg_dump: reading transforms
pg_dump: reading table inheritance information
pg_dump: reading event triggers
pg_dump: finding extension members
pg_dump: finding inheritance relationships
pg_dump: reading column info for interesting tables
pg_dump: finding the columns and types of table "data_import.earthquakes_subset_with_geom"
pg_dump: flagging inherited columns in subtables
pg_dump: reading indexes
pg_dump: reading constraints
pg_dump: reading triggers
pg_dump: reading rewrite rules
pg_dump: reading policies
pg_dump: reading row security enabled for table "data_import.earthquakes_subset_with_geom"
pg_dump: reading policies for table "data_import.earthquakes_subset_with_geom"
pg_dump: reading large objects
pg_dump: reading dependency data
pg_dump: saving encoding = UTF8
pg_dump: saving standard_conforming_strings = on
pg_dump: dumping contents of table "data_import.earthquakes_subset_with_geom"

Having backed up our table, let's drop the original one:

DROP TABLE data_import.earthquakes_subset_with_geom;

And see if we can restore it:

pg_restore -h localhost -p 5434 -U postgres -v -d mastering_postgis earthquakes_subset_with_geom.backup

You should see a similar output:

pg_restore: connecting to database for restore
pg_restore: creating TABLE "data_import.earthquakes_subset_with_geom"
pg_restore: processing data for table "data_import.earthquakes_subset_with_geom"
pg_restore: setting owner and privileges for TABLE "data_import.earthquakes_subset_with_geom"
pg_restore: setting owner and privileges for TABLE DATA "data_import.earthquakes_subset_with_geom"

At this stage, we have successfully imported data by using the PostgreSQL backup / restore facilities.

Note

If you happen to get some errors on the pg_dump version, do make sure you're using the one appropriate for the DB you are exporting from. You can find it in the bin folder of the PostgreSQL directory.

Summary

There are many different ways of importing data into a PostGIS database. It is more than likely that you will not have to use all of them all the time but rather have your preferred ways of loading the data in your standard routines.

It is worth knowing different tools, though considered as different scenarios, may have to be addressed with special care. When you add scripting to the stack (see the chapter on ETL), you are equipped with some simple yet very powerful tools that may constitute a decent ETL toolbox.

Having fed our database with data, next, we'll have a look at spatial data analysis and find our way through a rich function set of PostGIS.

About the Authors

Dominik Mikiewicz

Dominik Mikiewicz is a senior GIS consultant and the owner of one-person software shop Cartomatic. When not coding, he spends time with wife and kids, trying to make the little ones enjoy mountain trekking. He is also a long-distance cycling and running enthusiast.
Browse publications by this author
Michal Mackiewicz

Michal Mackiewicz has been working as a software engineer at GIS Support for five years. His main job is to orchestrate various open source geospatial components and creating application-specific GIS systems. PostgreSQL and PostGIS are among his favorite tools, and are used in almost every project. Apart from developing, he also runs PostGIS training courses. When not at work, he volunteers for OpenStreetMap and a local railway preservation society.
Browse publications by this author
Tomasz Nycz

Tomasz Nycz is a geographer and cartographer. He initiated the implementation of GIS in the State Fire Service units in Poland. He works with recognized GIS companies in the emergency management industry. In practice, he uses QGIS and PostGIS. He has been an OpenStreetMap contributor for years. He also develops his scientific interests in the use of new technologies in geomorphology and remote sensing. He is also an avid drone pilot and mountain explorer.
Browse publications by this author

Un bon achat. Je le recommande...

La compra fue rápida y sin contratiempos

Too many programs to manage. A more focused approach to spatial SQL and OGR would be preferable. Your cookies and pop-ups are horrible. They infect everything I touch. A real TURN-OFF when considering my education resources. You "guys" have to understand your limits. You have crossed them with me and everytime I see your add on a page I think, maybe I WON'T buy from them again! BACK OFF CREEPS.