In this chapter, you will learn the foundation of Geographic Information System (GIS) and spatial data. Although you do not need to understand these subjects in great depth to take advantage of the features of GeoServer, we will give you the basic information required to understand what you will be learning in this book. This chapter will introduce you to the magic of spatial data and processing.
In this chapter, we will cover the following topics:
By the end of this chapter, you will have the basic skills to identify which spatial data format best suits your needs.
Since you were a kid at school, you have been exposed to many maps: maps of countries where you spent hours memorizing the boundaries, rivers, and capitals; historical maps, with the rise and fall of ancient empires, where you dreamed of being a great conqueror; economic maps, with the locations and amounts of goods and services. Every day in newspapers, on TV, or, in a far more accurate way, in books and academic papers, you look at data represented on a map. Maps are a spatial representation of data and are often the main output of a GIS.
GIS is an acronym for Geographic Information System. Does it sound too complicated to you? Do not be afraid; it is not so different from many other systems to manage the information you probably already know. The main difference is the spatial component of information. All the data contained in a GIS has a spatial dimension or a link to another object with spatial attributes.
So what is GIS? In a nutshell, we can define it as a system to acquire, store, and process data, and to produce representations of that data, that is, maps. In this book, you will learn that working with GeoServer requires you to prepare your data, process it to render a beautiful map, and build up a set of functions that enable a user to interact with your data. So, building up a GeoServer instance may be described as GIS-building.
A detailed understanding of GIS is far beyond the scope of this book, and it is not required to start with GeoServer. However, you will need to have some basic skills in spatial data, maps, and spatial reference systems.
If you want to dig deeper into the topic, there is a lot of online material available. A couple of excellent sources of information are: https://www.ordnancesurvey.co.uk/support/understanding-gis/ and http://www.esri.com/what-is-gis
Let's go; we will turn you into a neo-cartographer!
Spatial data is the foundation of any GIS. You know that a building is likely to fall down unless it is sitting atop a strong foundation. So, you need to understand spatial data or you will be producing a poor map output.
So what is spatial data, in simple words? Let us start by considering, from a general point of view, what a piece of spatial information is. Each description of an object contains a reference to its position on the Earth's surface. Although this is not a rigorous formal definition, it reminds you of the mandatory requirement for any spatial data: irrespective of its format, it should contain enough information to determine where it is located on the Earth's surface. For now, we are fine with this simplistic definition.
Think of some lists of familiar objects:
- A list of bookshops with addresses
- A list of places you visited during your trips
- A list of points of interest, for example, restaurants, museums, and hotels you collected with your mobile phone
- An aerial photo with a view of a city, where you can recognize notable places
You can say where each element is located in a more or less precise way. They are real objects represented with spatial data. As you may have noted, spatial information is represented in quite a heterogeneous way. Most people are able to recognize spatial information in any group from the previous list. Unfortunately, GIS software and GeoServer are an exception to this and tend to prefer a strongly structured piece of information. If you are using your spatial data with GeoServer, you need to organize it more accurately. We will talk specifically about GeoServer's data connectors in Chapter 4, Adding Your Data, but, for now, it is important that you understand how spatial data is commonly organized and stored. As you keep on making maps, you will deal with lots of different spatial data.
Spatial data are references for an object's position on the Earth's surface. How can you measure and store them in a numeric format? An elementary model of the Earth could be a sphere. On a sphere's surface, you can measure positions with angular units called latitude (ϕ) and longitude (λ). Latitude measures the angle between the equatorial plane and a line that passes through a point and is normal to the surface, whereas longitude measures the angle east or west from a reference meridian (for example, the one passing through the Greenwich observatory) to the meridian that passes through that point. Angular measures can be expressed in decimal degrees or in degrees, minutes, and seconds.
If you want to store the location of the Statue of Liberty, you can express it in the decimal degree form, as shown here:
Alternatively, you can use the degrees, minutes, and seconds form as follows:
Lat. 40° 41′ 21″ N, Long. 74° 2′ 40″ W
In the decimal degrees form, you don't need to indicate the North, South, West, or East direction; this is represented by the plus/minus sign (+/-). A positive latitude is for the North direction and a positive longitude is for the East direction.
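The relationship between the two notations can be sketched in a few lines of Python. This is only an illustrative helper, not part of GeoServer; the function name `dms_to_decimal` is our own. It converts the degrees, minutes, and seconds given above for the Statue of Liberty into signed decimal degrees.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees, minutes, seconds and a hemisphere letter to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and West are negative; North and East are positive.
    return -value if hemisphere in ("S", "W") else value


# Statue of Liberty: Lat. 40° 41′ 21″ N, Long. 74° 2′ 40″ W
lat = dms_to_decimal(40, 41, 21, "N")
lon = dms_to_decimal(74, 2, 40, "W")
print(f"{lat:.4f}, {lon:.4f}")  # prints 40.6892, -74.0444
```

Note how the W hemisphere turns into a negative longitude, which is exactly the sign convention described above.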
Consider the image of the model of the Earth given as follows:
(Image from http://en.wikipedia.org/wiki/Latitude)
We normally think of the Earth as a sphere, but this is not its real shape. Geodesy, the science that studies the shape of the Earth, represents it with a geoid, an ideal surface defined by the level the sea would take if the oceans covered the whole Earth. For practical purposes, such as projections, the geoid is too complicated to use, so the Earth is approximated by an ellipsoid. The ellipsoid is described by its semi-major axis (equatorial radius) and flattening.
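To make the two ellipsoid parameters concrete, here is a minimal sketch that derives the semi-minor (polar) axis from the semi-major axis and the flattening. The numbers are those of the WGS 84 ellipsoid, which appear again in the SPHEROID entry of the WKT definition later in this chapter; there the flattening is stored as its inverse, 1/f. The variable names are our own.

```python
# WGS 84 ellipsoid parameters (semi-major axis in meters, inverse flattening)
SEMI_MAJOR_AXIS_M = 6378137.0
INVERSE_FLATTENING = 298.257223563

# Flattening f = (a - b) / a, so the semi-minor axis is b = a * (1 - f)
flattening = 1 / INVERSE_FLATTENING
semi_minor_axis_m = SEMI_MAJOR_AXIS_M * (1 - flattening)

print(f"flattening: {flattening:.9f}")
print(f"semi-minor axis: {semi_minor_axis_m:.3f} m")
print(f"difference: {SEMI_MAJOR_AXIS_M - semi_minor_axis_m:.0f} m")
```

The polar radius turns out to be roughly 21 km shorter than the equatorial one, which is why a sphere is a reasonable first approximation but not good enough for accurate mapping.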
Does it sound a little bit complicated? Do not be afraid and explore locations on Earth with latitude and longitude coordinates. In the following table, there are a few famous places with coordinates in decimal degrees. Point your browser to http://maps.google.com, insert coordinates in the search textbox, and then press Enter. Your map will shift to the location.
Google Maps enables you to query for coordinates of any place on Earth; find that function and look for some great places.
Colorado Grand Canyon, USA
Iguazú National Park, Argentina
Ayers Rock, Australia
Did you ever play with orange peel? I did it a lot when I was a child, often pressing the peel in the hope of flattening it almost perfectly. It's a hopeless challenge, but kids are stubborn and ambitious. Many years later, I found a similar analogy in a geography book. It was about cartographic projection and used an orange as a model of the Earth. If you think of the orange's peel as the Earth's surface, it is suddenly clear why you can't have a planar representation of the Earth's surface without a great amount of distortion.
All the maps you will ever find are on a flat sheet of paper. Curved digital screens are quite uncommon in GeoGeek's nests. So, how do cartographers represent a curved surface on a plane? This is done by means of a mathematical operation called projection. Consider the following image:
Indeed, there are several different projections developed in the last few centuries by cartographers and mathematicians. There is no mathematical method to transfer a sphere or an ellipsoid to a two-dimensional space without distortion. Hence, projections modify the data and include some deformations about lengths, areas, or shapes you can observe and measure on maps.
We can classify projections according to the geographical features and properties they preserve, as shown here:
- Conformal projections preserve angles locally. Meridians and parallels intersect at 90-degree angles.
- Equal-area projections preserve proportions between areas. In a map with equal-area projections, each part has the same proportional area as the corresponding part of the Earth.
- Equidistant projections maintain a scale along one or more lines, or from one or two points to all other points on the map. Lines along which the scale (distance) is correct are of the same proportional length as the lines they refer to on the globe.
It is important that you understand there is no best projection; choosing one for your map is a trade-off. Pick the projection that best suits the portion of the Earth's surface your map will contain and the use you are designing it for. Let's explore some widely used projections.
You learned about Earth's shape and projection. Coordinate systems use these concepts to build a frame of reference to place objects on the Earth's surface. There are two types of coordinate systems: projected coordinate systems and geographic coordinate systems. Let's understand these as follows:
- Geographic coordinate systems: These use latitude and longitude as angles measured from the Earth's center, as we saw previously. A geographic coordinate system is essentially defined by the ellipsoid used to model the Earth and by the datum, that is, the position of the ellipsoid relative to the center of the Earth.
- Projected coordinate systems: These are defined on a flat two-dimensional surface. A projected coordinate system is always based on a geographic coordinate system; hence, it uses an ellipsoid and a datum. Besides, a projected coordinate system includes a projection method to project coordinates from the Earth's spherical surface onto a two-dimensional Cartesian coordinate plane.
Universal Transverse Mercator, commonly known as UTM, is not really a projection. It is a system based on the Transverse Mercator projection, which uses a cylinder tangent to a meridian to unwrap the Earth's surface. Distortion is considered acceptable up to about 5° from the central meridian. UTM splits the world into a series of zones, each 6° of longitude wide. As you may guess, there are 60 zones, numbered from longitude 180° W toward the east. Note that you cannot have a map representing more than one UTM zone. Indeed, UTM is well suited for large-scale maps. Consider the following image:
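The zone numbering described for UTM reduces to simple arithmetic on the longitude. The following sketch, with function names of our own choosing, computes the zone number and its central meridian. It assumes the regular 6° grid only and ignores the special zones around Norway and Svalbard.

```python
import math


def utm_zone(longitude_deg):
    """Return the UTM zone number (1-60) for a longitude in decimal degrees."""
    if not -180.0 <= longitude_deg <= 180.0:
        raise ValueError("longitude must be within [-180, 180]")
    # Zone 1 starts at 180° W; each zone spans 6° toward the east.
    zone = int(math.floor((longitude_deg + 180.0) / 6.0)) + 1
    # A longitude of exactly +180 would yield 61, so clamp it to zone 60.
    return min(zone, 60)


def central_meridian(zone):
    """Return the longitude of the central meridian of a UTM zone."""
    return -183.0 + 6.0 * zone


zone = utm_zone(-74.0444)  # longitude of the Statue of Liberty
print(zone, central_meridian(zone))  # prints 18 -75.0
```

Each zone's central meridian sits 3° from the zone edges, which keeps every location in the zone close to the line where the cylinder touches the Earth.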
Web Mercator is a variant of the Mercator projection. It maps ellipsoidal latitude and longitude coordinates onto a plane using spherical Mercator equations. This projection was popularized by Google in Google Maps, and it is now widely used in online mapping systems. Because it applies spherical formulas to ellipsoidal coordinates, it slightly stretches areas in the north-south direction and, unlike the Transverse Mercator, it is not strictly conformal. Consider the following image:
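As a rough illustration of the spherical Mercator equations mentioned for Web Mercator, the sketch below projects a longitude/latitude pair in decimal degrees to planar coordinates in meters. It assumes the sphere radius equals the WGS 84 semi-major axis; the function name is our own, and a real application would rely on a projection library rather than hand-written formulas.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, used here as the sphere radius


def to_web_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to spherical Mercator x/y in meters."""
    if not -90.0 < lat_deg < 90.0:
        raise ValueError("latitude must be strictly between -90 and 90")
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


x, y = to_web_mercator(-74.0444, 40.6892)  # Statue of Liberty
print(f"x = {x:.1f} m, y = {y:.1f} m")
```

The y value grows without bound toward the poles, which is why web maps based on this projection cut the world off at roughly 85° of latitude north and south.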
A spatial reference system identifier is a code to easily reference a spatial reference system (SRS). An SRS contains parameters about projection, ellipsoid, and datum. It can be defined using the Open Geospatial Consortium's (OGC) well-known text (WKT) representation. The SRS for the geographic WGS84 reference system is as follows:
GEOGCS["WGS 84",
  DATUM["WGS_1984",
    SPHEROID["WGS 84",6378137,298.257223563,
      AUTHORITY["EPSG","7030"]],
    AUTHORITY["EPSG","6326"]],
  PRIMEM["Greenwich",0,
    AUTHORITY["EPSG","8901"]],
  UNIT["degree",0.01745329251994328,
    AUTHORITY["EPSG","9122"]],
  AUTHORITY["EPSG","4326"]]
The last line contains the number 4326; this is the SRID uniquely identifying this SRS. The long form should also contain the authority, that is, EPSG:4326, but you will often find it indicated only by the number.
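To show where the SRID sits inside the WKT text, here is a small sketch that pulls the authority and code out of the WGS 84 definition with a regular expression. It is a simplification that treats the last AUTHORITY entry in the text as the outermost one, which holds for this definition but is an assumption for WKT in general; the function name is our own.

```python
import re

# The WGS 84 definition from this chapter, written as one string
WGS84_WKT = (
    'GEOGCS["WGS 84",'
    'DATUM["WGS_1984",'
    'SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],'
    'AUTHORITY["EPSG","6326"]],'
    'PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],'
    'UNIT["degree",0.01745329251994328,AUTHORITY["EPSG","9122"]],'
    'AUTHORITY["EPSG","4326"]]'
)


def srid_from_wkt(wkt):
    """Return (authority, code) from the last AUTHORITY entry, or None if absent."""
    matches = re.findall(r'AUTHORITY\["([^"]+)","([^"]+)"\]', wkt)
    # Nested elements close before their parent, so the last entry belongs to the SRS itself.
    return matches[-1] if matches else None


authority, code = srid_from_wkt(WGS84_WKT)
print(f"{authority}:{code}")  # prints EPSG:4326
```

The other AUTHORITY entries identify the building blocks of the SRS (ellipsoid, datum, prime meridian, and unit), each with its own code in the same registry.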
EPSG is the acronym for European Petroleum Survey Group. Several European Oil companies founded it in 1986 to collect and maintain geodetic information. In 2005, EPSG was absorbed by OGP (an international forum for Oil and Gas producers) that formed the OGP Geomatics Committee. The committee maintains the registry and publishes it as a public web interface or a downloadable database.
We described a couple of common and widely used SRSs, but there are a lot of them. There are several archives on the internet where you can find detailed information about SRSs and their elements, that is, ellipsoids, datums, units of measurement, and projected or geographic reference systems. One of the most authoritative and complete data sets is the EPSG Geodetic Parameter Registry. If you are curious about it, you can open your browser and point it to http://epsg-registry.org. Then, try a simple search by inserting a location name in the Area textbox as shown in the following screenshot:
There are two main approaches when building a spatial database: modeling vector data or raster data. Vector data uses a set of discrete locations to build basic geometrical shapes, such as points, polylines, and polygons. This is shown in the following image:
Of course, real objects are not points, polylines, or polygons. In your model, you have to decide which basic shape best suits the real object. For example, a town can be represented as a point if you draw a map of the world showing the countries' capitals. On the other hand, if you publish a map of a single country, a polygon will enable you to draw the city boundaries and give a more realistic representation.
The simplest geometric object is a point. Points are defined as single coordinate pairs (x,y) when we work in two-dimensional space, or coordinate triplets (x,y,z) if you want to take height into account. In the following example, we use point features to store the location of active volcanoes:
Name of volcano
Did you guess the units and projections used? The coordinates are in decimal degrees and SRS is WGS84 geographic, that is,
Points are simple to understand but do not give you many details about the spatial extent of an object. If you want to store rivers, you need more than a coordinate pair. Indeed, you have to store an array of coordinate pairs for each feature, in a structure called a polyline, shown as follows:
Colorado; (40.472 -105.826, ... , 31.901 -114.951)
Nile; (-2.282 29.331, ... , 30.167 31.101)
Danube; (48.096 8.155, ... , 45.218 29.761)
If you need to model an area feature, such as an island, you can extend the polyline object, adding the constraint that it must be closed; that is, the first and the last coordinate pairs must be coincident. This is the polygon shape:
Ellis Island; (-74.043 40.699, -74.041 40.700, -74.040 40.700, -74.040 40.701, -74.037 40.699, -74.038 40.699, -74.038 40.698, -74.039 40.698, -74.041 40.700,-74.042 40.699, -74.040 40.698, -74.042 40.696, -74.044 40.698, -74.043 40.699)
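The closure constraint on polygons is easy to check in code. The sketch below stores the Ellis Island ring as a list of coordinate pairs, assuming x is the longitude and y the latitude, verifies that the first and last pairs coincide, and writes the shape in the well-known text form for geometries. The helper names are our own, and the check covers closure only, not the other validity rules a GIS applies.

```python
ELLIS_ISLAND = [
    (-74.043, 40.699), (-74.041, 40.700), (-74.040, 40.700), (-74.040, 40.701),
    (-74.037, 40.699), (-74.038, 40.699), (-74.038, 40.698), (-74.039, 40.698),
    (-74.041, 40.700), (-74.042, 40.699), (-74.040, 40.698), (-74.042, 40.696),
    (-74.044, 40.698), (-74.043, 40.699),
]


def is_closed(ring):
    """A ring is closed when it has at least four pairs and the first equals the last."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def to_wkt_polygon(ring):
    """Format a closed ring as a WKT POLYGON with a single outer boundary."""
    if not is_closed(ring):
        raise ValueError("the first and last coordinate pairs must coincide")
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON (({coords}))"


print(is_closed(ELLIS_ISLAND))        # prints True
print(is_closed(ELLIS_ISLAND[:-1]))   # prints False: this is just a polyline
print(to_wkt_polygon(ELLIS_ISLAND))
```

Dropping the last pair turns the same coordinates back into an open polyline, which is the only difference between the two structures at this level of detail.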
The feature model used in GIS is a little bit more complex than what we have discussed. There are some more constraints regarding vertex ordering, line intersections, and areal shapes with holes. Different GIS specify different rules, often in proprietary formats. The Open Geospatial Consortium (OGC) defined a standard for simple features and, lately, most systems, open source ones first, are compliant with it. If you are curious about it, you can point your browser at http://www.opengeospatial.org/standards/is and look for The OpenGIS® Simple Features Interface Standard.
Raster data uses a regular tessellation, defining cells where one or more values are uniform. Usually, the cells are square; although, this is not a constraint. Raster data is generally used to represent values continuously changing in the space, that is, a field. You can use a regular tessellation to build a digital elevation model of the Earth's surface. In the following figure, each cell has a height and width of 20 meters, and the value stored is the height above sea level in meters:
Can you use raster data to model real features, such as a river? Yes, you can, but there are some drawbacks you have to consider. The following figure shows a linear feature represented as vector data (the red line) and as raster data (the black and white cells). If your purpose is drawing the shapes on a map, raster data is not a good choice, as raster graphics are resolution dependent. They cannot be scaled up to an arbitrary resolution without an apparent loss of quality.
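The regular tessellation behind a raster can be sketched as a plain two-dimensional array plus an origin and a cell size. In the example below, the 20-meter cell size matches the elevation model described earlier, while the origin and the elevation values are hypothetical, chosen only to illustrate how a coordinate is turned into a cell lookup.

```python
CELL_SIZE_M = 20.0
# Hypothetical coordinates of the top-left corner of the grid, in meters
ORIGIN_X, ORIGIN_Y = 0.0, 60.0
# Hypothetical heights above sea level in meters; the first row is the northernmost
ELEVATION_M = [
    [120, 122, 125],
    [118, 121, 124],
    [115, 119, 123],
]


def elevation_at(x, y):
    """Return the value of the cell containing the point (x, y)."""
    col = int((x - ORIGIN_X) // CELL_SIZE_M)
    row = int((ORIGIN_Y - y) // CELL_SIZE_M)  # rows count downward from the top edge
    if not (0 <= row < len(ELEVATION_M) and 0 <= col < len(ELEVATION_M[0])):
        raise ValueError("point is outside the raster")
    return ELEVATION_M[row][col]


print(elevation_at(10.0, 50.0))  # prints 120
print(elevation_at(45.0, 5.0))   # prints 123
```

Every point inside a cell returns the same value, whatever its exact position; this is the resolution dependence discussed above, seen from the data side.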
In the previous sections, we explored spatial data and SRS. They are the key elements you need to build your map. Indeed, maps are a planar representation of spatial data. You need to collect the appropriate data to represent the real objects you want to include in your map, and you need to choose an SRS to organize your data onto the map.
Keep in mind that maps are representations, a proposition of yours. They are the way you express your knowledge and your vision of the world. To fully accomplish this, there is a third basic ingredient for your map: symbology.
Symbology enables you to add information to the features shown on a map. For example, colors can be used to indicate a classification of roads. Imagine you need to produce a map of a country with a road network. You have a vector dataset containing road polylines. A simple approach is to render all features with the same symbol, as shown in the following figure. The map is not really informative unless you are a transportation expert. You won't extract any information from the map and it looks ugly too.
Let's take a look at a similar map produced with ArcGIS Online (http://www.esri.com/software/arcgis/arcgisonline).
It contains the road network symbolized with different colors and line widths, labels showing you highway codes, and major towns represented with small circles and labels. Besides, there is a background depicting heights with colors and shading. Does it now look more familiar to you?
In Chapter 6, Styling Your Layers, you will learn how to apply symbols in GeoServer to produce maps like the previous one. For now, you need to familiarize yourself with simple and thematic maps.
- Open your browser and go to http://www.openstreetmap.org.
- The website offers you a small scale map centered on your actual location, as derived from the browser information:
- Center your map on London, UK, and zoom in with the tool shown on the left-hand side. You can see that many more road types and locations are now shown in this map:
- Now, enter the Piccadilly Circus, London, UK address in the Search textbox on the left and click on the Go button. A list of results matching your search is presented on the left side of the map. Pick the first item:
- The map is now at a large scale (look at the scale bar at the bottom-left of the map panel), and the symbols have changed to show you more detailed information about roads and locations. You can find street names, directions for car traffic, buildings' footprints, and icons for points of interest. The general look and feel resembles a printed city map you can pick up at a tourist office.
OpenStreetMap does not require you to register for browsing or exporting the data. Anyway, if you are interested in maps and open source data, you may consider getting involved in the project. OSM is a collaborative project to create a free editable map of the world, currently involving over half a million users all around the world. You may add data or find errors on locations you know well.
You explored several maps representing the same data set in quite different ways. Different symbols and hiding subsets of data are powerful tools to produce clear and nice looking maps. You are now ready to discover a different kind of map.
In the previous paragraphs, we encountered some simple maps. Geographers define these kinds of maps as general maps. General maps focus on the description of the physical, political, and human features on the territory. All this data is portrayed for its own sake. In a nutshell, it can be said that general maps tell you where objects are located on the Earth's surface, while thematic maps talk about things happening on the Earth's surface. Thematic maps focus on displaying a single topic and portray spatial distribution and variation. They still contain general data, such as administrative boundaries or road networks, but only as a base layer for general reference.
Among thematic maps, those using choropleth or dot representations are by far the most common type you will be using GeoServer for.
Choropleth maps show statistical data aggregated over predefined regions, such as counties or states, by coloring or shading these regions. You can draw states according to their population, gross domestic product, number of car owners, or number of national parks. You are not limited to a single variable; indeed, you can merge different values from more than one attribute associated with spatial objects.
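The step that turns statistical values into colors can be sketched as a classification followed by a lookup in a color ramp. The example below uses equal-width intervals, which is only one of several possible classification methods; the region names, values, and colors are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical values per region, for example an index normalized to an average of 100
VALUES = {"Region A": 50, "Region B": 80, "Region C": 100, "Region D": 120, "Region E": 150}
# Hypothetical five-step color ramp, from light to dark
COLOR_RAMP = ["#ffffcc", "#c2e699", "#78c679", "#31a354", "#006837"]


def equal_interval_breaks(values, classes):
    """Return the upper bounds that split the value range into equal-width classes."""
    lowest, highest = min(values), max(values)
    width = (highest - lowest) / classes
    return [lowest + width * i for i in range(1, classes)]


def classify(value, breaks):
    """Return the zero-based class index of a value given the sorted breaks."""
    return bisect_right(breaks, value)


breaks = equal_interval_breaks(VALUES.values(), len(COLOR_RAMP))
for region, value in VALUES.items():
    print(region, value, COLOR_RAMP[classify(value, breaks)])
```

Changing the number of classes or the way the breaks are computed can change the look of the map considerably, even though the underlying data stays the same.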
The following figure shows a map of European countries colored according to gross domestic product values. The legend on the right shows the five classification intervals. Values were normalized to the EU-27 average (EU stands for the European Union, which had 27 member countries in the period 2007-2013):
(Image courtesy of http://epp.eurostat.ec.europa.eu)
In proportional maps, symbols of different sizes represent data associated with different areas or locations within the map. As an example, the countries' capitals can be represented with a circle proportional to their population:
This map contains a representation of European countries, all drawn using the same symbology. The information is conveyed by the circles, a nongeographical feature, with a radius proportional to the number of residents. For the reader's convenience there are also some labels, but readers may also guess the name of each capital from its position.
Are you ready to build some maps? We can do this without the use of GeoServer since we have not yet discussed how to install it; we will cover that in the next chapter. For the moment we will play with an online map engine to assist your understanding of thematic map concepts:
The World Bank is an international financial institution that provides loans to countries of the world for capital programs. It also distributes a lot of social and economic data under an open data license. The data used in this section is available at http://datacatalog.worldbank.org/.
- To build the thematic map, we will use an online engine. Although it's built on open source software, it's a commercial solution. You need to register to use it, but, for the purpose of this section, and for other small maps you may want to create, you can use the free of charge account. Point your browser to https://carto.com/:
- Click on the Sign up link from the home page and complete your application for a free-of-charge account. After signing up, log in to Carto and you will arrive at the front dashboard, the starting point for building your maps:
- Select the WorldBank.csv file and drag it onto the dashboard to create your first map. The engine will process your data, trying to georeference it, and then a new map will be shown for you:
- The map you just created does not seem interesting. All the countries use the same orange symbol; what about the economic data from the World Bank? Locate the toolbar on the right side of the user interface and click the paintbrush symbol; this will show you a custom interface to change the rendering of your data:
- Select the choropleth category and leave the other settings at their defaults. Now your map shows the countries with a color ramp, according to the GDP value. You can explore the settings; try to change the classification and the color ramp used:
You built a brand new thematic map, selecting data and symbol colors. You will need to set these parameters exactly in GeoServer to produce beautiful maps. This time we did it without exploring the technical details behind feature rendering. In Chapter 6, Styling Your Layers, you will learn how to use SLD (Styled Layer Descriptor) to make thematic maps.
We had a brief but complete introduction to spatial data and maps in this chapter. It was somewhat a theoretical chapter, but we promise you it was the first and last of this kind! From now on, we will run real stuff with GeoServer.
Specifically, you learned how an object is referenced to its location and which storage models you can use with spatial data (for example, vector versus raster) and, finally, you learned to represent spatial features on a map.
You are now ready to pick up GeoServer, unpack, and install it on your computer.