Using the Union join
The PostgreSQL UNION clause is used to combine the results of two or more SELECT statements without returning any duplicate rows.
The basic rules to combine two or more queries using the UNION join are as follows:
- The number and order of columns of all queries must be the same
- The data types of the columns on involving table in each query must be same or compatible
- Usually, the returned column names are taken from the first query
By default, the UNION join behaves like DISTINCT, that is, eliminates the duplicate rows; however, using the ALL keyword with the UNION join returns all rows, including the duplicates, as shown in the following example:
SELECT <column_list> FROM table WHERE condition GROUP BY <column_list> [HAVING ] condition UNION SELECT <column_list> FROM table WHERE condition GROUP BY <column_list> [HAVING ] condition ORDER BY column list;
The queries are all executed independently, but their output is merged. The Union operator may place rows in the first query, before, after, or in between the rows in the result set of the second query. To sort the records in a combined result set, you can use ORDER BY.
Let's consider an example where you combine the data of customers belonging to two different sites. The table structure of both the tables is the same, but they have data of the customers from two different sites:
SELECT customer_id,customer_name,location_id FROM customer_site1 UNION SELECT customer_id,customer_name,location_id FROM customer_site2 ORDER BY customer_name asc;
Both the SELECT queries would run individually, combine the result set, remove the duplicates (as we are using UNION), and sort the result set according to the condition, which is customer_name in this case.