SQL JOIN Tables

Section 1

SQL JOIN Tables

Learning Objectives: This presentation explains how to use SQL JOINS with syntax, visual illustrations, and examples. By the end of this lesson, learners will be able to:

  • Describe the JOIN Predicate.

  • Demonstrate the inner JOIN, left JOIN, right JOIN, self JOIN, cross JOIN, and Union.

Prerequisite

Note: We will utilize a “classicmodels” database for demonstrations and examples in this presentation.

Table of Contents

  • Overview of SELF JOIN

  • Overview of FULL JOIN

  • Overview of Union operator

  • UNION vs. JOIN

  • Joining without JOIN

  • Overview of Joining Tables

  • Types of Joins

  • General JOIN Syntax

  • Overview of INNER JOIN clause

  • Syntax of the INNER JOIN

  • Avoid ambiguous column error in JOIN

    • Challenge for learners

  • Overview of LEFT JOIN

  • Using LEFT JOIN

  • Syntax of the LEFT JOIN

  • Overview of RIGHT JOIN

  • Overview of CROSS JOIN

Overview of Joining Tables

!https://i.imgur.com/8UcGQjT.png

Question: We have used SELECT statement to query data from a single table. How do we query data from multiple tables that are related by one or more primary key/foreign key relationships? For example, in the database diagrammed, there are nine tables; all are related, either directly or indirectly. Some of the connections go through junction tables, defining many-to-many relationships. Answer: A JOIN predicate is used to combine rows from two or more tables, based on a related column between them.

Types of JOINS

  • JOIN predicate allow us to walk through the relationships between two or more tables in the FROM clause.

  • JOINS are queries that combine the data of multiple tables based on their common columns (primary key and foreign key) and constraints to produce a combined result set.

  • The following JOINS are supported by MySQL:

    1. Inner JOIN

    2. Left JOIN

    3. Right JOIN

    4. Cross JOIN

    5. Self JOIN Note that MySQL does not support the full Outer JOIN.

General JOIN Syntax

Select <column-names>
FROM <left table> [join type] JOIN <right table> ON <join predicate> ....

The Join predicate often requires us to use table aliases to distinguish field names. The join predicate is a boolean expression, specifying criteria for matching rows between two or more tables. Most often, the predicate is based on key relationships, but other boolean expressions can be used.

Overview of INNER JOIN Clause

  • The INNER JOIN clause joins two or more tables based on a condition, which is known as a join predicate.

  • For inner joins, rows from the left and right tables will appear in the output if and only if they both satisfy the join predicate.

  • The INNER JOIN includes only matching rows from both tables.

Visual illustration

select a.value, b.value from TableA a join TableB b on a.id=b.id;

!https://i.imgur.com/7BnfqKg.png

Syntax of the INNER JOIN

The following shows the basic syntax of the “INNER JOIN” clause that joins tables: table_1 and table_2.

SELECT
column_list
FROM
INNER JOIN   ON join_condition;
table_1
table_2

The INNER JOIN clause compares each row from the first table with every row from the second table. When the join predicate is based on equality between two columns with the same name, SQL gives us a shortcut. You can use the USING clause instead as shown below:

SELECT
column_list
FROM
INNER JOIN   USING (column_name);
table_1
table_2

Both queries will give the identical result.

Ambiguous Column Errors in JOIN

  • If you join multiple tables that have the same column name, you have to use a table qualifier to refer to that column in the SELECT statement to avoid ambiguous column errors.

  • For example, if both and Table2 have the same column named City in the SELECT statement City column using the table qualifiers as Table1.City or Table2.City

  • To save time typing the table qualifiers, you can use table aliases in the query. For example, you can give the verylongtablename table an alias T and refer to its columns using T.column instead of verylongtablename.column.

Example 1: INNER JOIN

If you want to get the product code and product name from the products table, or the text description of product lines productlines table, you need to select data from both tables and match rows by comparing the from the productline column from the products table with the productline column from the productlines table, as shown in the following query.

SELECT  T1.productCode,  T1.productName,  T2.textDescription
FROM  products T1
INNER JOIN productlines T2 ON T1.productline = T2.productline;

Example 2: INNER JOIN with GROUP BY Clause

Consider the orders and orderdetails tables. We can get the order number, order status, and total sales from the orders and orderdetails tables using the INNER JOIN clause with the GROUP BY clause, as shown in the query below:

SELECT  T1.orderNumber,  STATUS, SUM(quantityOrdered * priceEach) total
FROM orders AS T1
INNER JOIN orderdetails AS T2 ON T1.orderNumber = T2.orderNumber
GROUP BY T1.orderNumber;

We can write the above query with the “using” keyword. See the next slide for a demonstration.

Example 2: INNER JOIN with GROUP BY Clause (continued)

We can write the query on the previous page with the “using” keyword:

SELECT T1.orderNumber, STATUS, SUM(quantityOrdered * priceEach) total FROM orders AS T1
INNER JOIN orderdetails AS T2 using(orderNumber)
GROUP BY T1.orderNumber;

Example 3: INNER JOIN – Join Three Tables

Consider the products, orders and orderdetails tables. The query below is using two INNER JOIN to join three tables: and products.

SELECT orderNumber, orderDate, orderLineNumber, productName, quantityOrdered, priceEach FROM orders
INNER JOIN
INNER JOIN
orderdetails USING (orderNumber)
products USING (productCode)
ORDER BY orderNumber, orderLineNumber;

This picture shows the partial output.

Example 4: INNER JOIN With GROUP BY and Having Clause

Let’s take a look at the orders and orderdetails tables from the classicmodels database. The following query finds sales orders whose value total is greater than $60K.

SELECT orderNumber, SUM(priceEach * quantityOrdered) as total
FROM orderdetails
INNER JOIN orders USING (orderNumber)
GROUP BY orderNumber HAVING SUM(priceEach * quantityOrdered) > 60000;

It returns three rows, which means that there are three sales orders whose total values total greater than $60K.

Natural JOIN

The INNER JOIN is also called a Natural JOIN. We could also write the previous query without INNER or using a keyword. See below:

select a.value, b.value from A a natural join B b;

For a Natural JOIN, the JOIN predicate is formed “naturally” based on fields from both tables with matching names. Tables A and B both have an “id” field; therefore, the Natural JOIN forms the predicate based on the equality of A.id and B.id. You should be aware of Natural JOIN syntax in case you encounter it, but its use is strongly discouraged. As databases evolve, it is not uncommon to rename fields, and this process will cause Natural JOINS, based on those fields, to return incorrect results without generating an error message.

Challenge!

Take a look at the image on the right; it shows us how the tables for orders, orderdetails, customers, and products are related in the "classicmodels" database. Problem statement: Write a query to display the customer number, customer name, order number, order date, product code, product name, and price. You can use the JOIN, GROUP By, and HAVING clauses.

Overview of LEFT JOIN Clause

In a LEFT JOIN, all rows from the left table are guaranteed to be represented in the result set whether or not a row from the right table is found to satisfy the predicate. When no matching row from the right table is found, those fields will be NULL:

Visual illustration

select a.value, b.value from TableA a LEFT JOIN TableB b on a.id=b.id;

!https://i.imgur.com/7BnfqKg.png

Using LEFT JOIN

One common usage for LEFT JOIN is to count records for which there is no match. Consider our previous example:

select count(*) from A a LEFT JOIN B b using(id) where b.value is NULL;

Syntax - LEFT JOIN

The syntax for the LEFT JOIN in SQL is:

SELECT columns
FROM table1
LEFT JOIN
table2
ON   column_name = table2.column_name;

You can use the USING keyword syntax like this:

SELECT
FROM
LEFT JOIN
USING(column_name);
columns
table1
table2

Example 1: LEFT JOIN Clause - Join Two Tables

Let’s take a look at the image on the right, which shows the and orders tables.

  • Each order in the orders table must belong to a customer in the customers table.

  • Each customer in the customers table can have zero or more orders in the orders table.

To find all customers regardless of their order status, you can use the LEFT JOIN clause:

SELECT c.customerNumber, c.customerName, o.orderNumber, o.status
FROM customers c
LEFT JOIN orders o ON c.customerNumber = o.customerNumber;

This image shows the partial output:

!https://i.imgur.com/HVJjWJp.png

Example 2: LEFT JOIN Clause - Find Unmatched Rows

The LEFT JOIN clause is very useful when you want to find the rows in the left table that do not match with the rows in the right table. To find the rows between two tables, you add a WHERE statement to select only rows whose column values in the right table contain the NULL values.

For example, to find all customers who have not ordered any products, you can use the following query:

SELECT c.customerNumber, c.customerName, orderNumber, o.STATUS
FROM customers c
LEFT JOIN orders o ON c.customerNumber = o.customerNumber
WHERE orderNumber IS NULL;

This image shows the partial output:

!https://i.imgur.com/9J2di3v.png

Example 3: LEFT JOIN Clause - Find Unmatched Rows

Take a look at the right-hand figure, which shows how the tables for employees, customers, and payments are related in the "classicmodels" database.

The below query uses two LEFT JOIN clauses to join the three tables: employees, customers, and payments.

SELECT lastName, firstName, customerName, checkNumber, amount
FROM employees
LEFT JOIN customers ON employeeNumber = salesRepEmployeeNumber
LEFT JOIN payments ON payments.customerNumber = customers.customerNumber
ORDER BY customerName, checkNumber;

This image shows the output:

!https://i.imgur.com/fW9v9jJ.png

Overview of RIGHT JOIN Clause

  • In a RIGHT JOIN, all rows from the right table are guaranteed to be represented in the result set, whether or not a row from the left table is found to satisfy the predicate.

  • When no matching row from the left table is found, those fields will be NULL:

    !https://i.imgur.com/Eu7tLH1.png

Syntax - RIGHT JOIN

The syntax for the RIGHT JOIN in SQL is:

SELECT columns
FROM table1
RIGHT JOIN table2
ON table1.column_name = table2.column_name;

You can use the USING syntax like this:

SELECT columns
FROM table1
RIGHT JOIN table2
USING(column_name);

Example 1: Right JOIN Clause - Join Two Tables

Take a look at the image on the right; it shows how the tables for employees and customers are related in the "classicmodels" database.

The below query uses RIGHT JOIN to find all records from the right table, and the matched records from the left table.

SELECT customerNumber, salesRepEmployeeNumber
FROM employees
RIGHT JOIN customers
ON employeeNumber = salesRepEmployeeNumber
ORDER BY customerName;

This image shows the output:

!https://i.imgur.com/6Q1kuW1.png

Overview of CROSS JOIN Clause

  • The CROSS JOIN is an unfiltered join – it uses a no join predicate. The output of a CROSS JOIN is a cartesian product – every possible combination of rows are in the left and right tables.

  • CROSS JOIN on very large tables can generate a huge result set.

    !https://i.imgur.com/5dgpE2N.png

Example: CROSS JOIN Clause

The example below uses the CROSS JOIN Clause to join customers with payments:

SELECT * FROM customers e
CROSS JOIN payments p;

Overview of SELF JOIN

  • A SELF JOIN allows you to join a table to itself. It helps to query hierarchical data or compare rows within the same table.

  • SELF JOINS require the use of a table alias to distinguish the field names.

Example 1: SELF JOIN

You can use the SELF JOIN to determine who reports to whom; to do so, we can use the INNER JOIN.

SELECT m.employeeNumber AS ManagerID, m.lastName AS Manager, e.lastName AS 'employee', e.employeeNumber AS EmployeeID
FROM employees e
INNER JOIN employees m ON m.employeeNumber = e.reportsTo
ORDER BY m.employeeNumber;

This image shows the output:

!https://i.imgur.com/5L7Z5Dk.png

Example 2: SELF JOIN - Compare Successive Rows

By using the SELF JOIN, you can display a list of customers who are located in the same city by joining the customers table to itself.

SELECT c1.city, c1.customerName, c2.customerName
FROM customers c1
INNER JOIN customers c2 ON c1.city = c2.city AND c1.customername > c2.customerName
ORDER BY c1.city;

Overview of FULL JOIN Clause

The FULL JOIN Clause combines the effects of left and right JOINS – all rows of both left and right tables are guaranteed to appear in the result set, regardless of whether the JOIN predicate is satisfied or not.

Overview of Union Operator

The UNION operator allows you to combine two or more result sets of queries into a single result set.

Joining Without JOIN

Joins can be achieved without the JOIN keyword by using the WHERE clause to define the join predicate.

Hands-On Lab

Complete the following Labs:

  • GLAB - 304.6.1 - Joins and Clauses - Classicmodels Database

  • GLAB - 304.6.2 - Joins and Clauses - Banking Database

Practice Assignments

Complete the following practice assignments:

  • The JOIN operation

  • JOIN and clauses

  • Self join

  • SQL JOINS - Exercises, Practice, Solution - JOINS

Check Your Knowledge

  • What Are SQL JOINs?

  • What are the different types of JOINS in SQL?

  • How can you join a table to itself?

Summary

In this presentation, we discussed JOIN queries, which allow us to walk through the relationships between two or more tables in the FROM clause. JOINS are queries that combine the data of multiple tables based on their common columns and constraints to produce a combined result set.

References:

Questions?

Last updated