The previous sections of this chapter demonstrated the use of the SELECT statement to query rows from one table of a database. If the Transact-SQL language supported only such simple SELECT statements, the attachment of two or more tables to retrieve data would not be possible. Consequently, all data of a database would have to be stored in one table. Although the storage of all the data of a database inside one table is possible, it has one main disadvantage—the stored data are highly redundant.
Transact-SQL provides the join operator, which retrieves data from more than one table. This operator is probably the most important operator for relational database systems, because it allows data to be spread over many tables and thus achieves a vital property of database systems—nonredundant data.
The join operator is applied to base tables and views. In this chapter, joins between base tables are discussed, while joins concerning views will be discussed in Chapter “Views”.
There are several different forms of the join operator. This section discusses the following fundamental types:
- Natural join
- Cartesian product (cross join)
- Outer join
- Theta join, self-join, and semi-join
Before explaining different join forms, this section describes the different syntax forms of the join operator.
Two Syntax Forms to Implement Joins
To join tables, you can use two different forms:
- Explicit join syntax (ANSI SQL:1992 join syntax)
- Implicit join syntax (old-style join syntax)
The ANSI SQL:1992 join syntax was introduced in the SQL92 standard and defines join operations explicitly—that is, using the corresponding name for each type of join operation. The keywords concerning the explicit definition of join are
- CROSS JOIN
- [INNER] JOIN
- LEFT [OUTER] JOIN
- RIGHT [OUTER] JOIN
- FULL [OUTER] JOIN
CROSS JOIN specifies the Cartesian product of two tables. INNER JOIN defines the natural join of two tables, while LEFT OUTER JOIN and RIGHT OUTER JOIN characterize the join operations of the same names, respectively. Finally, FULL OUTER JOIN specifies the union of the right and left outer joins. (All these different join operations are explained in the following sections.)
The implicit join syntax is “old-style” syntax, where each join operation is defined implicitly via the WHERE clause, using the so-called join columns (see the second statement in Example 6.57).