Itís sometimes difficult to know which SQL syntax to use when combining data that spans multiple tables. The following are a few of the more frequently used methods for consolidating queries on multiple tables into a single statement.
A D V E R T I S E M E N T
A simple SELECT statement is the most basic way to query multiple tables. More than one table can be called in the FROM clause to combine results from multiple tables. Hereís an example of how this works:
SELECT table1.column1, table2.column2 FROM table1, table2 WHERE table1.column1 = table2.column1;
In this example, dot notation (table1.column1) has been used to specify which table the column comes from. If the column in question only appears in one of the referenced tables, the fully qualified name need not be included, but it may be useful to do so for readability.
Tables are separated in the FROM clause by commas. As many tables as needed can be included, although some databases have a limit to what they can efficiently handle.
This syntax is, in effect, a simple INNER JOIN. Some databases treat it exactly the same as an explicit JOIN. The WHERE clause tells the database which fields to correlate, and it returns results as if the tables listed were combined into a single table based on the provided conditions. The conditions for comparison need not necessarily be the same columns as the result set. In the example above, table1.column1 and table2.column1 are used to combine the tables, but table2.column2 is returned.
This functionality can be extended to more than two tables using AND keywords in the WHERE clause.
A combination of tables can be used to restrict the results without actually returning columns from every table. In the example below, table3 is matched up with table1, but nothiong actually has been returned from table3 for display. It merely checks to make sure the relevant column from table1 exists in table3. Also table3 needs to be referenced in the FROM clause for this example.
SELECT table1.column1, table2.column2 FROM table1, table2, table3 WHERE table1.column1 = table2.column1 AND table1.column1 = table3.column1;
This method of querying multiple tables is effectively an implied JOIN. Different databases handle things differently, depending on the optimization engine it uses. Also, neglecting to define the nature of the correlation with a WHERE clause can give undesirable results, such as returning the rogue field in a column associated with every possible result from the rest of the query, as in a CROSS JOIN. A simple SELECT statement is sufficient to combine two or just a few tables.
JOIN works in the same way as the SELECT statement aboveóit returns a result set with columns from different tables. The advantage of using an explicit JOIN over an implied one is greater control over the result set, and possibly improved performance when many tables are involved.
There are several types of JOINóLEFT, RIGHT, and FULL OUTER; INNER; and CROSS. The type of JOI to be used is determined by the results to be returned. For example, using a LEFT OUTER JOIN will return all relevant rows from the first table listed, while potentially dropping rows from the second table listed if they donít have information that correlates in the first table.
This differs from an INNER JOIN or an implied JOIN. An INNER JOIN will only return rows for which there is data in both tables. The following JOIN statement is equivalent to the first SELECT query above:
SELECT table1.column1, table2.column2 FROM table1 INNER JOIN table2
ON table1.column1 = table2.column1;
Subqueries, or subselect statements, are a way to use a result set as a resource in a query. These are often used to limit or refine results rather than run multiple queries or manipulate the data in an application. With a subquery, tables can be referenced to determine inclusion of data or, in some cases, return a column that is the result of a subselect.
The following example uses two tables. One table actually contains the data to be returned, while the other gives a comparison point to determine what data is actually required.
SELECT column1 FROM table1 WHERE EXISTS ( SELECT column1 FROM table2 WHERE table1.column1 = table2.column1);
One important factor about subqueries is performance. Convenience comes at a price and, depending on the size, number, and complexity of tables and the statements to use, one may want to allow an application to handle processing. Each query is processed separately in full before being used as a resource for your primary query. If possible, creative use of JOIN statements may provide the same information with less lag time.
The UNION statement is another way to return information from multiple tables with a single query. The UNION statement allows to perform queries against several tables and return the results in a consolidated set, as in the following example.
SELECT column1, column2, column3 FROM table1 UNION SELECT column1, column2, column3 FROM table2;
This will return a result set with three columns containing data from both queries. By default, the UNION statement will omit duplicates between the tables unless the UNION ALL keyword is used. UNION is helpful when the returned columns from the different tables donít have columns or data that can be compared and joined, or when it prevents running multiple queries and appending the results in your application code.
If the column names donít match while using UNION statement, aliases are used to give results meaningful headers:
SELECT column1, column2 as Two, column3 as Three FROM table1 UNION SELECT column1, column4 as Two, column5 as Three FROM table2;
As with subqueries, UNION statements can create a heavy load on the database server, but for occasional use they can save a lot of time.
When it comes to database queries, there are usually many ways to approach the same problem. These are some of the more frequently used methods for consolidating queries on multiple tables into a single statement. While some of these options may affect performance, practice will help to know when itís appropriate to use each type of query.