The UNION is used to combine two or more separate queries by placing it between them to return a single result. The syntax of UNION operator is as below:
SELECT col_1, col_2, ....
UNION [ALL | DISTINCT]
SELECT col_1, col_2, ....
[UNION [ALL | DISTINCT]
SELECT col_1, col_2, ....
Take an example:
SELECT * FROM table_1 ....
UNION ALL
SELECT * FROM table_2 ....
UNION ALL
SELECT * FROM table_3 ...
UNION is useful under the following circumstances:
- You have similar information in multiple tables and you want to retrieve rows from all of them at once.
- You want to select several sets of rows from the same table, but the conditions that characterize each set aren't easy to write as a single WHERE clause. UNION allows retrieval of each set with a simpler WHERE clause in its own SELECT statement; the rows retrieved by each are combined and produced as the final query result.
Things to note about UNION Operator
- The column names kept in first SELECT statement are used as the column names for the results returned. In other words, all the rows from the first query are turned first, followed by all the rows from the second query.
- Selected columns listed in corresponding positions of each SELECT statement should have the same data type.
- ALL and DISTINCT keywords are optional. By default UNION uses DISTINCT option. If no one key is given the UNION uses DISTINCT and remove all duplicate rows from the result.
- In the case of optional ALL keyword, removal of duplicate rows does not occur and the result has all matching rows from all the SELECT statements.
- The terminating semicolon (;) must only appear once, after the final query.
Sorting a UNION
To specify a sort order on a UNION, you must use the column name from the first query in the ORDER BY Clause. View the following example:
SELECT first_name FROM buyers
UNION DISTINCT
SELECT second_name FROM buyers ORDER BY first_name;
ORDER BY and LIMIT clauses can be used to sort or limit a UNION result set as a whole. To do this, surround each SELECT with parentheses and then add ORDER BY or LIMIT after the last parenthesis. Columns named in such an ORDER BY should refer to columns in the first SELECT of the statement. (This is a consequence of the fact that the first SELECT determines the result set column names.) The following statement sorts the result of the UNION by email address and returns the first 10 rows of the combined result:
(SELECT subscriber, email FROM list1) UNION
(SELECT name, address FROM list2) UNION
(SELECT real_name, email FROM list3) ORDER BY email LIMIT 10;
ORDER BY and LIMIT clauses also can be applied to individual SELECT statements within a UNION. Surround each SELECT with parentheses and add ORDER BY or LIMIT to the end of the appropriate SELECT. In this case, an ORDER BY should refer to columns of the particular SELECT with which it's associated. (Also, although LIMIT may be used by itself in this context, ORDER BY has no effect unless combined with LIMIT. The optimizer ignores it otherwise.) The following query sorts the result of each SELECT by email address and returns the first five rows from each one:
(SELECT subscriber, email FROM list1 ORDER BY email LIMIT 5) UNION
(SELECT name, address FROM list2 ORDER BY address LIMIT 5) UNION
(SELECT real_name, email FROM list3 ORDER BY email LIMIT 5);