TOP

SQL-Lesson 11. Advanced combination of tables (OUTER JOIN)

YouLibreCalc for Excel logo

In the previous section, we considered the simplest ways to combine tables - using WHERE and INNER JOIN sentences. These combinations are called internal combinations or equivalence combinations. However, SQL has in its arsenal many more options for joining tables, namely, there are also other types of joins: outer joins, natural joins, and self-joins. But first, let's consider how we can assign aliases to tables, since later we will be forced to use full names of fields (Table.Field), which without abbreviations will be very difficult to operate due to their large length.


1. Use of table aliases (Aliases)

In the previous section, we learned how aliases can be used to refer to specific table fields or calculated fields. SQL also allows us to use aliases instead of table names. This gives us advantages such as shorter SQL syntax and allows us to use the same table multiple times in the SELECT statement.

Run SQLSELECT Seller_name, SUM(Amount) AS Sum1 
FROM Sellers AS S, Sumproduct AS SP 
WHERE S.City = SP.City 
GROUP BY Seller_name

We displayed the total amount of sold goods for each seller. In our SQL query, we used the following aliases: for the calculated field SUM(Amount) the alias Sum1, for the table Sellers the alias S and for Sumproduct the alias SP. Note that table aliases can also be used in other sentences, such as ORDER BY, GROUP BY and others.

2. Self-connection (SELF JOIN)

Let's consider an example. Suppose we need to know the address of sellers who trade in the same country as John Smith. To do this, we will create the following request:

Run SQLSELECT Address, City, Country, Seller_name
FROM Sellers
WHERE Country = (SELECT Country 
                 FROM Sellers 
                 WHERE Seller_name = 'John Smith')

Also, we can solve this problem through self-connection by writing the following code:

Run SQLSELECT S1.Address, S1.City, S1.Country, S1.Seller_name
FROM Sellers AS S1, Sellers AS S2 
WHERE S1.Country = S2.Country AND S2.Seller_name = 'John Smith'

Pseudonyms were used to solve this problem. The first time the table Sellers was assigned the alias S1, the second time - the alias S2. These aliases can then be used as table names. In the WHERE operator, we add the prefix S1 to the name of each field, so that the DBMS understands the fields of which table should be output (since we made two virtual tables from one table). The WHERE clause first joins the tables and then filters the data of the second table by the Seller_name field to return only the necessary values.

Self-joins are often used to replace subqueries that select data from the same table as the outer operator SELECT. Although the end result is the same, many DBMSs process joins much faster than subqueries. It is worth experimenting to determine which query works faster.

3. Natural combination

A natural join is a join in which you select only those columns that do not repeat. This is usually done by writing (SELECT *) for one table and specifying a list of fields for the rest of the tables. Example:

Run SQLSELECT SP.*, S.Country
FROM Sumproduct AS SP, Sellers AS S 
WHERE SP.City = S.City

In this example, the wildcard (*) is used only for the first table. All other columns are specified explicitly, so duplicate columns are not selected.

4. External combination (OUTER JOIN)

Usually, when merging, the rows of one table are linked with the corresponding rows of another table, however, in some cases, it may be necessary to include in the result rows that do not have related rows in another table (that is, all rows from one table are selected and only added related lines from another). A connection of this type is called external. For this, the keywords OUTER JOIN ... ON ... with the prefix LEFT or RIGHT are used.

Let's consider an example, having previously added a new seller - Semuel Piter, who does not yet have sales, to the Sellers table:

Run SQLSELECT Seller_name, SUM(Quantity) AS Qty
FROM Sellers2 
LEFT OUTER JOIN Sumproduct 
ON Sellers2.City = Sumproduct.City
GROUP BY Seller_name

With this request, we extracted the list of all sellers in the database and calculated the total amount of goods sold for them for all months. We see that there are no sales for the new seller Semuel Piter. If we used an inner join, then we would not see the new seller, because he has no records in the Sumproduct table. We can also change the direction of the combination not only by writing LEFT or RIGHT, but also by simply changing the order of the tables (that is, the following two records will give the same result: Sellers LEFT OUTER JOIN Sumproduct and Sumproduct RIGHT OUTER JOIN Sellers).

Also, some DBMSs allow external joins on a simplified record using the signs *= and =*, which corresponds to LEFT OUTER JOIN and RIGHT OUTER JOIN, respectively. Thus, the previous query could be rewritten as follows:

SELECT Seller_name, SUM(Quantity) AS Qty
FROM Sellers, Sumproduct
WHERE Sellers.City *= Sumproduct.City

Unfortunately MS Access does not support a shortened record for an outer join.

5. Full outer join (FULL OUTER JOIN)

There is also another type of outer join - a full outer join, which displays all rows from both tables and joins only those that can be related. The syntax of a complete outer join is as follows:

SELECT Seller_name, Product
FROM Sellers 
FULL OUTER JOIN Sumproduct 
ON Sellers.City = Sumproduct.City

Again, full outer join is not supported by the following DBMS: MS Access, MySQL, SQL Server, and Sybase. How to get around this injustice, we will consider in the next section.