Article

Simply SQL: The FROM Clause

Page: 1 2 3 4 5 6 Next

Real World Joins

Chapter 2, An Overview of the SELECT Statement introduced the Content Management System entries table, which we’ll continue to use in the following queries to demonstrate how to write joins. Figure 3.7, “The entries table” shows some—but not all—of its contents. The content column, for example, is missing.

The entries table

Within our CMS web site, the aim is to give each category its own area on the site, linked from the site's main menu and front page. The science area will contain all the entries in the science category, the humor area will contain all the entries in the humor category, and so on, as shown in Figure 3.8, “A suggested CMS site structure”. To this end, each entry is given a category, stored in the category column of each row.

A suggested CMS site structure

The main category pages themselves would need more than just the one word category name that we see in the entries table. Site visitors will want to understand what each section is about, so we’ll need a more descriptive name for each category. But where to store this in the site? We could hardcode the longer name directly into each main section page of the web site. A better solution, however, would be to save the names in the database. Another table will do the job nicely, and so we create the categories table for this purpose; we’ll give it two columns—category and name—as shown in Figure 3.9, “The categories table”.

The categories table

The category column is the key to each row in the categories table. It’s called a key because the values in this column are unique, and are used to identify each row. This is the column that we’ll use to join to the entries table. We’ll learn more about designing tables with keys in Chapter 10, Relational Integrity. Right now, let’s explore the different ways to join the categories and entries tables.

Creating the Categories Table

The script to create the categories table can be found in Appendix C, Sample Scripts and in the download for the book in a file called CMS_05_Categories_INNER_JOIN_Entries.sql.

Inner Join: Categories and Entries

The first join type we’ll look at is an inner join:

SELECT  
 categories.name, entries.title, entries.created  
FROM  
 categories  
   INNER JOIN entries  
     ON entries.category = categories.category

The figure below shows the results of this query.

The results of the inner join

Let’s walk through the query clause by clause and examine what it’s doing, while comparing the query to the results it produces. The first part of the query to look at, of course, is the FROM clause:

FROM  
 categories  
   INNER JOIN entries  
     ON entries.category = categories.category

The categories table is joined to the entries table using the keywords INNER JOIN. The ON clause specifies the join condition, which dictates how the rows of the two tables must match in order to participate in the join. It uses dot notation (tablename.rowname) to specify that rows of the categories table will match rows of the entries table only when the values in their category columns are equal. We'll look in more detail at dot notation later in this chapter.

The figure below shows in detail how the result set of the query is produced by the inner join of the categories table to the entries table. Because it’s an inner join, each of the rows of the categories table is joined only to the rows of the entries table that have matching values in their respective category columns.

The inner join in detail

Some of the Entries Table is Hidden

The entries table actually has several additional columns that are not shown: id, updated, and content. These columns are also available, but were omitted to keep the diagram simple. In fact, the diagram would’ve been quite messy if the content column had been included, as it contains multiple lines of text. Since these columns were not mentioned in the query at all, including them in the diagram might have made it confusing. Some readers would surely ask, “Hey, where did these come from?”

  • Regarding the matching of rows of the categories and entries tables, notice that: The categories row for humor matched two entries rows, and both instances of matched rows are in the results, with the name of the humor category appearing twice.
  • The categories row for blog matched no entries rows. Consequently, as this is an inner join, this category does not appear in the results.
  • The other categories rows matched one entries row each, and these matched rows are in the result.

Stating these observations in a slightly different way, we can see that a single row in the categories table can match no rows, one row, or more than one row in the entries table.

One-to-Many Relationships

The more than one aspect of the relationship between a row in the categories table and matching rows in the entries table is the fundamental characteristic of what we call a one-to-many relationship. Each (one) category can have multiple (many) entries.

Even though a given category (blog) might have no matching entries, and only one of the categories (humor) has more than one entry, the relationship between the categories and entries tables is still a one-to-many relationship in structure. Once the tables are fully populated with live data, it’s likely that all categories will have many entries.

Looking at this relationship from the other direction, as it were, we can see that each entry can belong to only one category. This is a direct result of the category column in the entries table having only one value, which can match only one category value in the categories table. Yet more than one entry can match the same category, as we saw with the humor entries. So a one-to-many relationship is also a many-to-one relationship. It just depends on the direction of the relationship being discussed.

Now we've examined the FROM clause and seen how the INNER JOIN and its ON condition have specified how the tables are to be joined, we can look at the SELECT clause:

SELECT    
 categories.name  
, entries.title  
, entries.created

As you would expect, the SELECT clause simply specifies which columns from the result of the inner join are to be included in the result set.

Leading Commas

Notice that the SELECT clause has now been written with one line per column, using a convention called leading commas; this places the commas used to separate the second and subsequent items in a list at the front of their line. This may look unusual at first, but the syntax is perfectly okay; remember, new lines and white space are ignored by SQL just as they are by HTML. Experienced developers may be more used to having trailing commas at the end of the lines, like this:

SELECT    
 categories.name,  
 entries.title,  
 entries.created

I use leading commas as a coding style convention to make SQL queries more readable and maintainable. The importance of readability and maintainability can’t be overstated. For example, see if you can spot the two coding errors in this hypothetical query:

SELECT  
 first_name,  
 last_name,  
 title  
 position,  
 staff_id,  
 group,  
 region,  
FROM  
 staff

Now see if you can spot the coding errors here:

SELECT  
 first_name  
, last_name  
, title  
 position  
, pay_scale  
, group  
, region  
,  
FROM  
 staff

The query is missing a comma in the middle of the column list and has an unneeded, additional comma at the end of the list. In which example were the errors easier to spot?

In addition, leading commas are easier to handle if you edit your SQL in a text editor with the keyboard. Sometimes you need to move or delete a column from the SELECT clause, and it's easier to select (highlight) the single line with the keyboard's Shift and Arrow keys. Similarly, removing the last column requires also removing the trailing comma from the previous line, which is easy to forget. A dangling comma in front of the FROM keyword is a common error that’s difficult to make using leading commas.

If you liked this article, share the love:
Print-Friendly Version Suggest an Article