In this chapter, we will cover the following recipes:
Finding geospatial data on your computer
Describing data sources
Importing data from text files
Importing KML/KMZ files
Importing DXF/DWG files
Opening a NetCDF file
Saving a vector layer
Saving a raster layer
Reprojecting a layer
Batch format conversion
Loading vector layers into SpatiaLite
Loading vector layers into PostGIS
If you want to work with QGIS, the first thing you need is spatial data. Whether you want to prepare a nice-looking map layout or perform spatial analysis, you need to open some data to work with. This chapter deals with the basic input and output commands, which will allow you to use data in several different formats and also export to the most convenient format in case you want to use it in different applications or share with others.
Automation is possible for many of the operations that you will see in this cookbook. This chapter contains some recipes that use automation to process a set of input files.
Before you start working, make sure that you have copied the sample dataset to your filesystem and that you know where it is located.
There are several ways of locating and opening a data file in QGIS, but the most convenient of these is the QGIS browser:
To enable this, go to the View | Panels menu and enable the Browser checkbox in it. The browser will be shown by default in the left-hand side of the QGIS window, as shown in the following screenshot:
Multiple selections are allowed. In that case, select the Add selected layer menu.
There are a few more things that you need to know that are related to this recipe. They are explained in the following sections.
As an alternative to the browser, the Layer menu contains a set of entries. Each of them deals with a different type of data. They give you some additional options, and they might allow you to work with formats that are not directly supported by the browser.
Navigating to the folder where your data is located can be tedious. If you use a given folder regularly, you can right-click on it and select Add as favorite. The folder will appear on the Favorites section at the top of the browser tree.
The browser also shows non-file data, such as remote services. Services have to be defined before they appear on the corresponding section in the browser. To add a service, right-click on the service name and select New connection.... A dialog will appear to define the service connection parameters.
A new entry will appear, containing the layers offered by the service, as shown in the following screenshot:
Before you start working, make sure that you have copied the sample dataset to your filesystem and that you know where it is located.
In the QGIS browser, navigate to the folder with your sample dataset. Select the elev_lid792_1m file and right-click on it. In the context menu, select Properties. A dialog like the one in the following screenshot will appear:
This dialog displays the properties of a raster layer.
Now, let's select a vector layer instead. Select the elev_lid792_randpts.shp file, right-click on it, and select Properties. The information dialog will look like the following:
In the upper part of the description window, you will see a field named Provider. The provider defines the type of data origin and which component takes care of reading the data and passing it to QGIS. For raster layers, you will see gdal as Provider. For most file-based vector layers, ogr will be the provider that appears. These names refer to the GDAL and OGR libraries, two open source libraries that are used by many GIS programs to access raster and vector data, respectively.
If the data is already loaded in QGIS, you can access the information about it in the Properties section of the layer (right-click on the layer name to select the Properties entry in the context menu). In the sections displayed in the left-hand side, select the Metadata section. You will see a box containing all the information corresponding to the layer data origin:
Functionality provided by the GDAL library, which, as mentioned earlier, acts as a provider for raster layers, is also available in the Raster menu. This includes processing and data analysis methods, but it also includes the information tool that is used to describe a raster data source. You will find it by navigating to Raster | Miscellaneous | Info:
This is a more flexible way to retrieve properties, as you can adjust the parameters of the tool to get additional information. To know more, check the gdalinfo help page at http://www.gdal.org/gdalinfo.html.
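As a rough illustration of what the information tool reports, the following Python sketch calls gdalinfo with its -json option and picks out a few properties. The function names are hypothetical, and the JSON keys used here (driverShortName, size, geoTransform, bands) are assumed to match the report produced by your GDAL version, so treat this as a starting point rather than a reference.

```python
import json
import shutil
import subprocess


def gdalinfo_command(path):
    """Build the gdalinfo call; -json asks for a machine-readable report."""
    return ["gdalinfo", "-json", path]


def summarize(info):
    """Pick a few properties out of a parsed gdalinfo -json report.

    The key names are an assumption about the report layout.
    """
    gt = info["geoTransform"]
    return {
        "driver": info["driverShortName"],
        "size": tuple(info["size"]),             # (columns, rows)
        "cell_size": (abs(gt[1]), abs(gt[5])),   # (x resolution, y resolution)
        "bands": len(info["bands"]),
    }


def describe_raster(path):
    """Run gdalinfo if it is on the PATH; return None otherwise."""
    if shutil.which("gdalinfo") is None:
        return None
    result = subprocess.run(
        gdalinfo_command(path), capture_output=True, text=True, check=True
    )
    return summarize(json.loads(result.stdout))
```

Calling describe_raster() on a raster from the sample dataset should return a small dictionary with the driver name, the raster size, and the cell size, provided that the GDAL command-line tools are installed.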
In the upper field, enter the path to the elev_lid792_randpts.csv file in the sample dataset. That file contains a points layer as text.
Once you enter the file path or select it in the file browser that can be opened by clicking on the Browse button, the fields in the lower part of the dialog will be filled, as shown in the following screenshot:
We are using a CSV file that has values separated by commas, so you must select the CSV option in the Format field.
The X field and Y field drop-down lists will be populated with the fields that are available, which are described in the first line of the text file. Select X for X field and Y for Y field. Now, QGIS knows how to create the geometries and has enough information to create a new layer from the text file.
Enter a name for the layer in the Layer name field and click on OK. The layer will be added to the QGIS project, as shown in the following screenshot:
No information about the CRS is contained in the text file or entered in the parameters dialog, so it must be added manually. In this case, the CRS used is
EPSG:3358. To set this as the CRS of the layer, right-click on the layer name and select Set layer CRS:
Data is read from the text file and processed to create geometries. All the fields in the table (all data in a row in the text file) are also added, including the ones used to create the geometries, as you will see by right-clicking on the layer and selecting Open attribute table, as shown in the following screenshot:
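The way rows are turned into features can be sketched in a few lines of Python. This is only an illustration of the idea, not the code QGIS runs: the sample values and the value column are made up, and only the X and Y column names come from the recipe.

```python
import csv
import io

# Hypothetical sample content; the real file has its own columns and values.
SAMPLE = "X,Y,value\n638000.5,220000.25,101.3\n638010.0,220005.0,99.8\n"


def rows_to_points(text, x_field="X", y_field="Y"):
    """Create one point feature per row, keeping every column as an attribute."""
    features = []
    for row in csv.DictReader(io.StringIO(text)):
        wkt = "POINT({} {})".format(row[x_field], row[y_field])
        # The coordinate columns stay in the attributes, as in the QGIS layer.
        features.append({"geometry": wkt, "attributes": dict(row)})
    return features


for feature in rows_to_points(SAMPLE):
    print(feature["geometry"], feature["attributes"])
```

Note that the X and Y values appear twice in each feature: once in the geometry and once in the attributes, which mirrors what you see in the attribute table.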
Along with the CSV file, the dataset may contain a CSVT file, which describes the types of the fields. QGIS uses this to set the appropriate types in the attribute table of the layer. If the CSVT file is missing, as in our example, QGIS will try to figure out the type based on the values for each field.
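If you prefer not to rely on guessing, you can write the CSVT file yourself. The sketch below infers a type for each column and produces the single line of quoted type names that a CSVT file contains. The guessing rule is a simplified assumption and is not necessarily the one QGIS applies; the function names are hypothetical.

```python
import csv
import io


def guess_type(values):
    """Return Integer, Real, or String depending on what every value parses as."""
    for caster, name in ((int, "Integer"), (float, "Real")):
        try:
            for value in values:
                caster(value)
            return name
        except ValueError:
            continue
    return "String"


def csvt_line(text):
    """Build the content of a CSVT file for the given CSV text."""
    rows = list(csv.reader(io.StringIO(text)))
    data = rows[1:]  # skip the header row
    return ",".join('"{}"'.format(guess_type(col)) for col in zip(*data))


print(csvt_line("X,Y,id,name\n1.5,2.5,1,a\n3.0,4.0,2,b\n"))
```

The resulting line would be saved next to the CSV file, using the same base name and the .csvt extension.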
Layers created from text files are not restricted to point files. Any geometry can be created from the text data. However, if it is not a point, instead of selecting two columns, you must place all the geometry information in a single one and enter a text representation of the geometry. QGIS uses the Well-Known Text (WKT) format, which is a text markup language for vector geometries, to describe geometries as strings. Here is an example of a very simple CSV file with line features and two attributes:
geom,id,elevation
"LINESTRING(0 1, 0 2, 1 3)",1,50
"LINESTRING(0 -1, 0 -2, 1 -3)",2,60
"LINESTRING(0 1, 0 3, 5 4)",3,70
To know more about the WKT format, you can go to http://en.wikipedia.org/wiki/Well-known_text
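One practical detail when preparing such a file: WKT strings contain commas, so in a comma-separated file the geometry field typically has to be quoted to keep it in a single column. The following sketch uses Python's csv module, which adds the quotes automatically; the records repeat the example above, and the function name is hypothetical.

```python
import csv
import io

RECORDS = [
    ("LINESTRING(0 1, 0 2, 1 3)", 1, 50),
    ("LINESTRING(0 -1, 0 -2, 1 -3)", 2, 60),
    ("LINESTRING(0 1, 0 3, 5 4)", 3, 70),
]


def write_wkt_csv(records):
    """Return CSV text with a header row; fields containing commas get quoted."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["geom", "id", "elevation"])
    writer.writerows(records)
    return buffer.getvalue()


text = write_wkt_csv(RECORDS)
print(text)

# Reading the text back yields three fields per row, geometry included.
rows = list(csv.reader(io.StringIO(text)))
```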
To open a KML layer, select Layer | Add vector layer.... In the dialog that opens, click on the Browse button to open the file selector dialog. Select the Keyhole Markup Language (KML) format and then select the file that you want to load. In the example dataset, you can find several KML files. Select the elcontour1m.kml file. Click on OK in the vector layer selector dialog, and the layer will be added to your project, as shown in the following screenshot:
Go to Layer | Add vector layer.... In the dialog that opens, click on the Browse button to open the file selector dialog. Select the All files option to view all the files and then select the elcontour1m.kmz file. There is no KMZ file type defined in QGIS, but QGIS supports it because the underlying OGR library can read KMZ files as well.
Click on OK on the open layer dialog to open the selected layer.
If the KMZ file contains several layers, you must select one of them. In this case, the elcontour1m.kmz file contains only one layer, so it is loaded automatically. The layer will be added to your QGIS project.
KMZ files are compressed files that contain a set of layers. When you select it, the OGR library will unzip the content of this file and then open the layers that it contains.
If just a single layer is contained, you will not see the layer selection dialog. QGIS will automatically open the only layer in the KMZ file.
As KMZ is not recognized as a supported format, the KMZ file will not appear in the QGIS browser. However, the browser supports zipped files, and a KMZ file is actually a zipped file with KML files inside it. Unzip it in a folder and then you will be able to use the QGIS Browser to open the layers it contains.
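Since a KMZ file is a ZIP archive, the unzipping step can also be scripted. Here is a minimal sketch using Python's zipfile module; the function names are hypothetical, and the paths in the usage comment are placeholders.

```python
import io
import zipfile


def list_kml_members(kmz_file):
    """Return the names of the KML documents stored inside a KMZ archive."""
    with zipfile.ZipFile(kmz_file) as archive:
        return [n for n in archive.namelist() if n.lower().endswith(".kml")]


def extract_kml(kmz_file, target_dir):
    """Extract only the KML documents to target_dir and return their names."""
    with zipfile.ZipFile(kmz_file) as archive:
        names = [n for n in archive.namelist() if n.lower().endswith(".kml")]
        archive.extractall(target_dir, members=names)
        return names


# Example call (placeholder paths):
# extract_kml("elcontour1m.kmz", "unzipped_kmz")
```

After extraction, the target folder can be opened in the QGIS browser like any other folder containing KML files.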
To open a DXF layer, select Add vector layer... in the Layer menu. In the dialog that opens, click on the Browse button to open the file selector dialog. Select the Autocad DXF format and then the file that you want to load.
In the example dataset, you can find several DXF files. Select the Wake_ApproxContour_100.dxf file. Click on OK in the vector layer selector dialog and the layer will be added to your project, as shown in the following screenshot:
The example DXF file that you opened contained just one type of geometry. DXF files can, however, contain several of them: in this case, they cannot be added to QGIS in one layer. When this happens, QGIS will ask you to select the type of geometry that you want to open.
In the sample dataset, you will find a file named
CSS-SITE-CIV.dxf. Open it and you will see the following dialog:
Select one of the available geometries, and a layer will be added to your QGIS project.
DWG is a closed format of Autodesk. This means that the specification of the format is not available. For this reason, QGIS, like other open source applications, does not support DWG files. To open a DWG file in QGIS, you need to convert it. Converting it to a DXF file is a good option as this will let you open your file in QGIS without any problem. There are many tools to do this. The Teigha converter can be found at http://opendesign.com/guestfiles/TeighaFileConverter and is a popular and reliable option.
Another option is using the free service offered by Autodesk, called Autocad 360, which can be found at https://www.autocad360.com/.
NetCDF is a data format designed for array-oriented scientific data, and it is frequently used for climate or ocean data, among others. This recipe shows you how to open a NetCDF file in QGIS.
NetCDF files are raster files, and they can be opened using the Add raster layer menu. Select
NGMT NetCDF Grid for CDF as the file format in the file selection dialog that you will see, and select the
rx5dayETCCDI_yr_MIROC5_rcp45_r2i1p1_2006-2100.nc file from the example dataset. Click on OK.
The proposed NetCDF file contains a single variable, which is opened as a regular raster layer.
When only one layer is available, it is opened directly, as in the previously described example.
Another way of opening NetCDF files is using the NetCDF Browser plugin. Select the Manage and install plugins... menu to open the plugin manager. Go to the Not installed section and type
netcdf in the search field to filter the list of available plugins. Select the NetCDF Browser plugin and click on Install plugin to install it. Close the plugin manager.
The plugin is now installed, and you can open it by selecting NetCDF Browser in the Plugins menu:
Select the NetCDF file in the upper field. The other fields will be updated with the content of the selected file. Select a layer from the available ones and click on Add to add the layer to your QGIS project.
You will use the layer named
poi_names_wake.shp in this recipe. Make sure that it is loaded in your QGIS project.
Let's suppose that you want to use this layer to create a web map. A popular format supported by libraries such as Leaflet or OpenLayers 3 is the GeoJSON format. Select GeoJSON in the format field and enter a path and filename in the Save as field.
In the Save as dialog, click on OK. The GeoJSON file will be created.
The OGR library, which is used by QGIS to read and open files, is also used to write them. Not all of the formats that are supported for reading purposes are also supported for writing purposes.
You can export even the layers that are not originally file-based to a file, such as a layer coming from a PostGIS database or a WFS connection. Just select the layer in the table of contents and proceed as just explained.
The Save as dialog allows additional configuration beyond what you have seen in the example in this recipe.
The options are shown by clicking on the More options button. Select GeoJSON as the export format and then display the options for that particular format. The COORDINATE PRECISION option controls the number of decimal places to write in the output GeoJSON file. The default precision is too high for almost all cases, and most of the time, having three or four decimal places is more than enough. Set the precision to
4, enter a valid path and filename, and export the layer by clicking on OK. Your points layer will now be saved in a smaller GeoJSON file. You can open this with a text editor to verify that the coordinates are expressed with the selected precision or compare its size with the one created without specifying a precision value.
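To get a feeling for how much the precision setting affects file size, the following sketch rounds the coordinates of a GeoJSON feature collection with plain Python. It only imitates the effect of the export option on an already-exported file; the sample coordinates and function names are made up for illustration.

```python
import json


def round_coords(coords, ndigits):
    """Round a coordinate, or recurse into nested coordinate lists."""
    if isinstance(coords, (int, float)):
        return round(coords, ndigits)
    return [round_coords(c, ndigits) for c in coords]


def reduce_precision(collection, ndigits=4):
    """Return a copy of a FeatureCollection with rounded coordinates."""
    out = json.loads(json.dumps(collection))  # simple deep copy
    for feature in out["features"]:
        geometry = feature["geometry"]
        geometry["coordinates"] = round_coords(geometry["coordinates"], ndigits)
    return out


# Hypothetical one-feature collection with far more decimals than needed.
original = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"name": "sample"},
        "geometry": {"type": "Point",
                     "coordinates": [-78.123456789012, 35.987654321098]},
    }],
}

reduced = reduce_precision(original, 4)
print(len(json.dumps(original)), len(json.dumps(reduced)))
```

The saving per coordinate is small, but it adds up quickly in layers with many vertices.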
In the Resolution fields, replace both values with 2. The original resolution (the size of the cell) is 1, as you saw in a previous recipe.
Enter an output file path in the Save as field.
Click on OK. The layer will be saved with a coarser resolution than the original one.
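The effect of the new cell size on the raster dimensions is simple arithmetic: doubling the cell size halves the number of columns and rows, so the output holds about a quarter of the cells. The sketch below works this out for a hypothetical 500 x 400 raster; rounding up partial cells is an assumption, not a statement about how QGIS handles the edges.

```python
import math


def resampled_size(width, height, old_cell, new_cell):
    """Return (columns, rows) after changing the cell size over the same extent."""
    factor = new_cell / old_cell
    # Assumption: a partially covered cell at the edge still counts as a cell.
    return math.ceil(width / factor), math.ceil(height / factor)


cols, rows = resampled_size(500, 400, old_cell=1, new_cell=2)
print(cols, rows, (cols * rows) / (500 * 400))
```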
The layer can be exported with a reduced extent. In the QGIS canvas, zoom to a small part of the raster layer. Then open the Save as dialog. In the Extent section, click on the Map view extent button. The bounding coordinates of the current map view will be placed in the four coordinate fields.
Enter a file path to save the file to and click on OK. A layer with a reduced extent covering only the region shown in the map view will be exported.
Layers may be in a CRS other than the one that is best for a given task. Although QGIS supports on-the-fly reprojection when rendering, other tasks, such as performing spatial analysis, may require using a given CRS or having all input layers in the same one. This recipe shows you how to reproject a vector layer.
The Davis_DBO_centerline.shp layer uses a CRS with feet as the unit, which makes it unsuitable for certain operations. We plan to use this layer in future recipes to calculate routes and work in metric units, so converting it to a CRS that uses them is a much better option:
Right-click on the layer name in the table of contents and select Save as....
Select Selected CRS in the drop-down list to specify a different output CRS. Click on the Browse button to select a CRS. You will see the CRS selector dialog.
You will be converting the layer to the EPSG:26911 CRS. Use the filter box to find it in the list of available CRSs and select it. Then click on OK.
Reprojecting is done by the OGR library when it saves the file because this is one of the options that it supports.
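The same operation is available outside QGIS through the ogr2ogr command-line tool that ships with the OGR library. The sketch below only builds and prints the command, using the -t_srs option to set the target CRS. The output filename is hypothetical, and running the command requires the GDAL/OGR tools to be installed.

```python
import shlex


def reproject_command(src, dst, target_crs="EPSG:26911"):
    """Build an ogr2ogr call; note that the destination comes before the source."""
    return ["ogr2ogr", "-t_srs", target_crs, dst, src]


command = reproject_command(
    "Davis_DBO_centerline.shp",
    "Davis_DBO_centerline_26911.shp",  # hypothetical output name
)
print(" ".join(shlex.quote(part) for part in command))

# To actually run it: subprocess.run(command, check=True)
```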
Raster layers can be reprojected in a similar way:
In the Save as dialog, for raster layers, you can find a CRS field with a Browse button.
Click on it to open the CRS selector, and select the destination CRS.
When you click on OK, the raster layer will be exported using the selected CRS instead of its original one.
The Save as dialog can be used to convert the format of a single layer. When several layers have to be converted, it is a better idea to use some automation. This recipe shows you how to easily convert an arbitrary number of layers.
No previous preparation is needed. Batch conversion is not performed based on open layers but performed directly on files, so there is no need to open layers in QGIS before converting them.
Open the Processing Toolbox menu by selecting Toolbox in the Processing menu. The Processing Toolbox menu is the main element of the QGIS Processing framework, and it is used to call its algorithms:
In the filter box of the Processing Toolbox menu, type save to filter the list of available algorithms. Locate the Save selected features algorithm, right-click on it, and select Execute as batch process. The batch processing interface will be displayed, as shown in the following screenshot:
In the upper cell in the Input layer column, click on the ... button and select Select from filesystem. A file selector dialog will appear. Select the content of the batch_conversion folder in the dataset. It should have a total of three files. Click on OK on the file selection dialog. The batch processing interface should now have all these selected files, one in each row in the parameters table.
In the Output layer column, click on the button in the first row. A dialog for saving the file will be opened. Select a file path in your filesystem where you want to save the output files and type converted.geojson as the output filename. Click on OK and a new dialog like the one shown in the following screenshot will appear:
Select Fill with parameter values in the first field and Input layer in the second one. Click on OK. All the rows in the table will now have an output value, which was created using the entered filename as a prefix, followed by the name of the input layer.
To avoid layers being loaded after they are created, set the first cell in the Load into QGIS column to No. Then, double-click on the column header to automatically copy this value to all the rows below.
With the table already complete, you can launch the batch conversion process by clicking on Run. The GeoJSON files will be created in the specified paths.
The conversion is performed by an algorithm from the QGIS Processing framework. Processing algorithms can be run either as individual algorithms or, in this case, in a batch process.
Outputs of Processing algorithms can be created in all formats supported by QGIS. The format is selected using the corresponding extension in the filename and, unlike in the case of saving a single layer, does not have to be selected in a field or list. By using geojson as the extension for your output files, you tell Processing that you want to generate a file in this format.
Although the algorithm saves only the selected features of the layer, if there is no selection, it will use all the layer features. This is the default behavior of all algorithms in processing. As there is no selection in the layers that you have converted, all of their features will have been used.
When converting files this way, the additional options from the Save as dialog are not available, and the default configuration values are used.
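If you need a batch conversion outside the QGIS interface, a short script can reproduce the same pattern: one output per input, named with a common prefix followed by the input name, with the format derived from the extension. The sketch below builds ogr2ogr commands without running them. The input filenames are hypothetical, and the extension-to-driver table covers only a few formats.

```python
from pathlib import Path

# OGR driver names for a few common extensions (deliberately incomplete).
FORMATS = {".geojson": "GeoJSON", ".shp": "ESRI Shapefile", ".kml": "KML"}


def batch_commands(inputs, out_dir, prefix="converted", ext=".geojson"):
    """Build one ogr2ogr command per input file."""
    commands = []
    for src in inputs:
        src = Path(src)
        # Output name: prefix followed by the input name, as in the batch dialog.
        dst = Path(out_dir) / "{}{}{}".format(prefix, src.stem, ext)
        commands.append(["ogr2ogr", "-f", FORMATS[ext], str(dst), str(src)])
    return commands


# Hypothetical input names; use the files in your batch_conversion folder.
for command in batch_commands(["roads.shp", "rivers.shp"], "out"):
    print(command)

# To actually run them: subprocess.run(command, check=True) for each command.
```

Unlike the batch interface, this approach lets you append options such as layer creation options to every command.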
You can also convert vector layers with another more complex algorithm from the Processing Toolbox menu, which allows you to enter the configuration parameters used by the underlying OGR library that takes care of the process. It's called Export vector. Find it in the toolbox, right-click on it, and select Execute as batch process:
Layers can be reprojected in a batch operation without having to enter parameters individually on the Save as dialog. This recipe shows you how to reproject a set of layers to a different CRS using an algorithm from the Processing Toolbox menu. You will see how to reproject all the files accompanying the
Davis_DBO_centerline.shp file that you reprojected in the Reprojecting a layer recipe.
In the filter box of the Processing Toolbox menu, type Reproject to filter the list of available algorithms. Locate the Reproject layer algorithm, right-click on it, and select Execute as batch process. The batch processing interface will be shown, as follows:
In the upper cell of the Input layer column, click on the ... button and select Select from filesystem. A file selector dialog will appear. Select the content of the davis folder in the dataset and add the files to the table.
In the first cell in the Target CRS column, click on the ... button. A CRS selector will appear. Select the EPSG:26911 CRS, as you did in a previous recipe when converting a single layer. Copy the value to the rest of the rows in the column by double-clicking on the column header.
Set all the values in the Reprojected layer column. Select a file in the first cell, and then use the Fill with parameter values option to automatically fill the rest of the rows.
Once the table is complete, click on Run to reproject the layers.
The reprojection algorithm is a part of the Processing framework, so you can select the output format by changing the output file extension. You can use this to not only reproject a set of input layers but to also convert their format, all in a single step.
Raster layers can also be reprojected with another algorithm from the Processing Toolbox menu named Warp (reproject). Its inputs are rather similar to the ones in the reprojection tool for vector layers, with some additional parameters that are specific to raster layers. Select the algorithm, right-click on it, and select Execute as batch process to run it and convert a set of raster layers.
SpatiaLite is a single file relational database that is built on top of the well-known SQLite database. It can store many layers of various types, including nonspatial tables. Interfaces to the format also allow the ability to run spatial queries of various kinds. It's a highly-flexible and portable format that is great for everyday use, especially when working on standalone projects or with only one user at a time. SpatiaLite works in a similar manner to PostGIS without the need to configure or run a database server.
Create a SpatiaLite database if you don't already have one and name it
cookbook.db. The easiest way to do this is with the Browser tab, as shown in the following screenshot:
Then, pick one of the following methods to import your data. The first option is faster, but the second option gives you more control over the import settings:
Import method 1—the fast method
In the QGIS Browser tab, find the layer that you want to copy to the database.
Drag and drop this layer on the Spatialite DB entry.
If you have a lot of files listed, this will be quite difficult as the browser doesn't scroll during the drag operation. You can optionally open a second browser window and drag the layer across. Also, note that this defaults to multi-type geometry. If you need to control the options, use the next method.
Open DB Manager from the Database menu.
Expand the Spatialite item to list your databases. Expand the database that you want to connect to.
Click on the following import layer icon:
A dialog will pop up, providing you with import options.
Select the layer to import from the drop-down list.
Fill in a name for the new table.
In most cases, the only thing left to do is check the Create spatial index checkbox.
If this works, great. Now, you can load the layer to the map and verify that it's identical to the input.
QGIS converts your geometry to a format that is compatible with SpatiaLite and inserts it, along with the attribute table. Afterwards, it updates the metadata tables in SpatiaLite to register the geometry column and build the spatial index on it. These two postprocesses make the database table appear as a spatial layer to QGIS and speed up the loading of data from the table when panning and zooming.
The import dialog contained a few other features that are often useful. You can reproject data as part of the import process if you want, or you can specify the projection if QGIS didn't detect it properly. You can also name the geometry column something different than the default,
geom; for example,
utmz10n83 (this is normally not recommended). You can specify the character encoding of the text in the event that it's not handled correctly.
You can even use the dialog to append data to an existing table; for example, you have multiple counties with the same data structure that come as separate files, but you want them all in one layer.
If, for some reason, the layer didn't import the way that you want, delete it and redo the import. If you delete layers, make sure to learn how to vacuum the database to recover the now empty space in the file and shrink its total size (this is not automatic).
Look for the Vacuum option, which is available as a button in many graphical tools. If you don't see it, no worries, just run the SQL VACUUM statement.
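The effect of vacuuming can be observed with a small experiment. A SpatiaLite database is a SQLite file, so the sketch below uses Python's built-in sqlite3 module on a throwaway database with an ordinary (nonspatial) table; the table name and contents are made up.

```python
import os
import sqlite3
import tempfile


def vacuum_demo():
    """Return the file size when full, after dropping a table, and after VACUUM."""
    path = os.path.join(tempfile.mkdtemp(), "cookbook.db")
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE layer (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO layer (payload) VALUES (?)",
        (("x" * 1000,) for _ in range(2000)),
    )
    conn.commit()
    full = os.path.getsize(path)

    # Dropping the table frees pages inside the file but does not shrink it.
    conn.execute("DROP TABLE layer")
    conn.commit()
    after_drop = os.path.getsize(path)

    # VACUUM rebuilds the file and releases the unused space.
    conn.execute("VACUUM")
    conn.close()
    after_vacuum = os.path.getsize(path)
    return full, after_drop, after_vacuum


print(vacuum_demo())
```

The second and third numbers show the point of the exercise: the file only shrinks once VACUUM has run.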
What happens if this fails? Databases can be really picky sometimes. Here are some common issues and solutions:
It could be character encoding (accents, non-Latin languages), which requires that you specify the encoding.
It could be picky about mixing multilayers with regular layers. Multilayers is when you have several separate geometries that are part of one record. For example, Hawaii is actually many islands. So, if you only have one row representing Hawaii, you need to cram all the island polygons into one geometry field. However, if you mix this with North Dakota, which is just a polygon, the import will fail. If you have this problem, you'll need to perform the import on the command-line using ogr2ogr and its newish feature,
-nlt PROMOTE_TO_MULTI, which converts all single items to multi-items to fix this.
Depending on your original source, you may have a mix of points, lines, and polygons. You'll either need to convert this to a Geometry Collection, or you need to split each type of geometry into a separate layer. Geometry Collections are currently poorly-supported in many GIS viewers, so this is only recommended for advanced users.
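To make the single versus multi distinction concrete, the following sketch shows what promoting a geometry to its multi counterpart means at the WKT level. It is a simplified text manipulation for illustration only, not the ogr2ogr implementation, and the function name is hypothetical.

```python
def promote_to_multi(wkt):
    """Wrap a single POINT, LINESTRING, or POLYGON in its MULTI equivalent."""
    wkt = wkt.strip()
    head, sep, body = wkt.partition("(")
    kind = head.strip().upper()
    if sep and kind in ("POINT", "LINESTRING", "POLYGON"):
        # One extra pair of parentheses turns the geometry into a one-part multi.
        return "MULTI{}({}{})".format(kind, sep, body)
    return wkt  # already a multi geometry, or something this sketch ignores


print(promote_to_multi("POLYGON((0 0, 1 0, 1 1, 0 0))"))
print(promote_to_multi("MULTIPOLYGON(((0 0, 1 0, 1 1, 0 0)))"))
```

After promotion, a state stored as one polygon and a state stored as many islands share the same geometry type, so they can live in the same table.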
If you need more advanced settings or can't get the QGIS tool to work, you may need to use the QspatiaLite Plugin (install this with Manage Python Plugins under the Plugins menu), the spatialite-gui application (download this from https://www.gaia-gis.it/fossil/spatialite_gui/index), or the ogr2ogr command-line tool (this comes with QGIS; run it from the OSGeo4W shell on Windows, or from the terminal on Mac or Linux).
PostGIS is the spatial add-on to the popular PostgreSQL database. It's a server-style database with authentication, permissions, schemas, and handling of simultaneous users. When you want to store large amounts of vector data and query them efficiently, especially in a multicomputer networked environment, consider PostGIS. This works fine for small data too, but many users find its configuration too much work when SpatiaLite may be better suited.
BostonGIS maintains a decent tutorial on installing PostGIS on Windows and getting it set up. You can find this at http://www.bostongis.com/?content_name=postgis_tut01#316.
You should configure QGIS to be aware of your database and its connection parameters by creating a new database item in the PostGIS load dialog or by right-clicking on PostGIS in the Browser tab and selecting New Connection:
You can find more information about PostGIS at http://docs.qgis.org/2.8/en/docs/user_manual/working_with_vector/supported_data.html#postgis-layers.
Now that you can connect to a PostGIS database, you are ready to try importing data:
Open DB Manager from the Database menu.
Click on the following import layer icon:
A dialog will pop up, providing you with import options.
Select the layer to import from the drop-down list.
Fill in a name for the new table.
Check whether schema is set to public.
In most cases, the only thing left to do is check the Create spatial index checkbox:
QGIS converts your geometries to a format that is compatible with PostGIS, and inserts it, along with importing the attributes. Afterwards, it updates the metadata views in PostGIS to register the geometry column and build the spatial index on it. These two post-processes make the database table appear as a spatial layer to QGIS and speed up the loading of data from the table when panning and zooming.
The dialog does not present all the options that are available. If you need more control or advanced options, you'll likely be looking at the command-line tools: shp2pgsql (a graphical plugin for pgadmin3 is available on some platforms) and ogr2ogr. The shp2pgsql tool generally only handles shapefiles. If you have other formats, ogr2ogr can handle everything that QGIS is capable of loading. You can also use these tools to develop batch import scripts.
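As a starting point for such a script, the sketch below assembles an ogr2ogr call that loads a file into PostGIS. The connection parameters, table name, and function name are placeholders that you must adapt, and the command is only built, not run. The -nln option sets the target table name, -nlt PROMOTE_TO_MULTI converts single geometries to multi geometries, and the GEOMETRY_NAME layer creation option names the geometry column.

```python
def postgis_import_command(src, table, dbname,
                           host="localhost", user="postgres", schema="public"):
    """Build an ogr2ogr call that imports src into schema.table."""
    # Placeholder connection string; add port or password settings as needed.
    connection = "PG:host={} dbname={} user={}".format(host, dbname, user)
    return [
        "ogr2ogr", "-f", "PostgreSQL", connection, src,
        "-nln", "{}.{}".format(schema, table),
        "-nlt", "PROMOTE_TO_MULTI",
        "-lco", "GEOMETRY_NAME=geom",
    ]


command = postgis_import_command("poi_names_wake.shp", "poi_names_wake", "cookbook")
print(command)

# To actually run it: subprocess.run(command, check=True)
```

Looping over a list of files and calling this function for each one gives you a simple batch import.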
To import large or complicated CSV or text files, you sometimes will need to use the pgadmin3 or psql command-line interface to Postgres.
Need even more control? Then, consider scripting. OGR and Postgres both have very capable Python libraries.
Another option is using the OpenGeo Suite plugin, which has some additional options, such as allowing importing multiple layers into a single table or into one table per layer. To learn more about this, including how to install it, refer to http://qgis.boundlessgeo.com/static/docs/intro.html.
What happens if this fails? Databases can be really picky sometimes:
It could be character encoding (accents, non-Latin languages), which requires specifying the encoding.
It could be picky about mixing multilayers with regular layers. Multilayers is when you have several separate geometries that are part of one record. For example, Hawaii is actually many islands. So, if you only have one row representing Hawaii, you need to cram all the island polygons into one geometry field. However, if you mix this with North Dakota that is just a polygon, the import will fail. If you have this problem, you'll need to perform the import on the command-line using ogr2ogr and its new feature,
-nlt PROMOTE_TO_MULTI, which converts all single items to multi-items, to fix this.
Depending on your original source, you may have a mix of points, lines, and polygons. You'll either need to convert this to a Geometry Collection, or you need to split each type of geometry into a separate layer. Geometry Collections are currently poorly supported in many GIS viewers, so this is only recommended for advanced users.
For more information on PostGIS installation and setup, refer to http://postgis.net/install.
For a more in-depth text on using PostGIS, there are many books available, including Packt Publishing's PostGIS Cookbook.