ANSI Joins | Database Journal (original) (raw)

Introduction

Are you confused by all this "inner join" style
syntax that is becoming more and more prominent lately?

It’s happening because Microsoft is moving towards using
ANSI syntax. The old syntax still works, but it is strongly recommended you
learn the new syntax, Microsoft have stated that they will not support the old
syntax indefinitely.

It’s fairly easy to translate from the old to the new
syntax, and correctly translated results will usually produce identical results,
but there are exceptions that you should be aware of.

Inner or natural joins

Inner joins return rows where data matching exists in the
tables you are joining. This is the simplest type of join, and moving between
the old and new syntax should present no problems. In general, this:

Click here for code example 1.

Is equivalent to this:

Click here for code example 2.

Note that the components of the where clause that
define the joins between tables are now defined in the from clause
instead. The where is now only used for specifying selection criteria. To
illustrate this:

Click here for code example 3.

Becomes:

Click here for code example 4.

Incidentally, what the query above does is to display table
names and index names in the current database where the table has a Clustered
Index. The sysindexes.indid column is always set to one for clustered indexes.

I will use similar examples later on, and if you are not too
clear on these system tables, then now would be a good time for you to review
the documentation on sysobjects and sysindexes in Books Online.

Update and Delete

These statements also support the ANSI join syntax, although
it’s not very well documented in Books Online for some versions of SQL Server

Click here for code example 5.

Outer joins

When two tables are joined with an inner join, data will only
be returned if matching data exists in both tables. An outer join is like saying
"and also include the rows from one table if there are no matching rows in
the other one."

With an outer join the columns from the table where data is
"missing" are returned as NULL values.

Outer joins come in two basic flavours, called Left and Right.
Left outer joins mean that the data must be contained in the table defined to
the left side of the equivalence, but not necessarily the right hand side. Right
outer joins, of course, work the other way around.

To illustrate this, cut and paste the code below into a Query
Analyser window and try running it. I have used the newer ANSI syntax here, and
the older equivalents are included but commented out using the "–"
comment notation. Comment them back in if you want to try them.

set nocount on
go
create table #left (leftI int)
create table #right (rightI int)

insert #left select 1 union select 2 union select 3
insert #right select 2 union select 3 union select 4 union select 5 union select 6

print "*** Inner join ***"
select * from #left inner join #right on leftI = rightI
-- select * from #left, #right where leftI = rightI

print " "
print "*** left join ***"
select * from #left left outer join #right on leftI = rightI
-- select * from #left, #right where leftI *= rightI

print " "
print "*** right join ***"
select * from #left right outer join #right on leftI = rightI
-- select * from #left, #right where leftI =* rightI

drop table #left
drop table #right
set nocount off
go

Outer Join Gotchas

Unlike natural joins, outer joins have a few traps you can
fall into if you translate your query from the old syntax to the new and expect
it to produce the same results in both formats.

Returning to our sysobjects and sysindexes examples, the
following two queries should produce the same results

select o.name, i.name
from sysobjects o, sysindexes i
where o.id *= i.id and
o.type = "S"
select o.name, i.name
from sysobjects o left outer join sysindexes i on o.id = i.id
where o.type = "S"

However, if we qualify the select criteria on sysindexes as
follows, in order to look for details on clustered indexes, the two queries
should return different results.

select o.name, i.name
from sysobjects o, sysindexes i
where o.id *= i.id and
o.type = "S" and
i.indid = 1

select o.name, i.name
from sysobjects o left outer join sysindexes i on o.id = i.id
where o.type = "S" and
i.indid = 1

The reason this happens is that the SQL Server reacts
differently to the "Where" clause when the outer joined table does not
have any data.

When no data is found in the sysindexes table that matches the
sysobjects table, indid will be returned as NULL. In this case:

Full outer joins

Full outer joins effectively combine the left and right outer
joins so that data will be returned if it matches in both tables, or if it exist
in either one.

The old join syntax has no direct equivalent of the full outer
join.

Try out the SQL below to illustrate the full outer join.

set nocount on
go
create table #left (leftI int)
create table #right (rightI int)
insert #left select 1 union select 2 union select 3
insert #right select 2 union select 3 union select 4 union select 5 union select 6

print "*** FULL join ***"
select * from #left full outer join #right on leftI = rightI

drop table #left
drop table #right
set nocount off
go

Cross Joins

Cross Joins create a Cartesian Product, rather like when you forget to
include a "where" clause to join two tables. There are not many cases
where you would want to use a cross join, but this article discusses one possible use for them.

Orphan hunting

Prior to ANSI SQL, if you wanted to find records in one table
that did not match a record in another, the recommended solution was to use a
"NOT IN" or "NOT EXISTS" based query. It is now recommended
that you use outer joins instead. Here is a simple example using all three
possibilities-they should all produce the same results:

set nocount on
go
create table #left (leftI int)
create table #right (rightI int)
insert #left select 1 union select 2 union select 3
insert #right select 2 union select 3 union select 4 union select 5 union select 6

select *
from #right
where rightI not in (
    select leftI from #left
)

select *
from #right
where not exists (
    select * from #left where leftI = rightI
)

select #right.*
from #left right outer join #right on leftI = rightI
where leftI IS NULL

drop table #left
drop table #right
set nocount off
go

Beware of falling into the trap we discussed earlier. Looking
at our previous "Clustered Index" example, either of these two
following old-style queries would produce a list of system tables that do not
have clustered indices:

select *
from sysobjects o
where type = "S" and
id not in (
    select id
    from sysindexes
    where indid = 1
)

select *
from sysobjects o
where type = "S" and
not exists (
    select id
    from sysindexes i
    where indid = 1 and
    o.id = i.id
)

Here is a simple translation of the above queries into the new
"left outer join" style. Try it out and see what happens.

select *
from sysobjects o left outer join sysindexes i on o.id = i.id
where type = "S" and
i.indid = 1 AND i.id IS NULL

No rows are returned.

The query now fails because the where i.indid = 1
clause makes no sense when we are looking for rows that do not exist in
sysindexes.

To get around this, the following example uses a "derived
query" which forces a select on sysindexes to return a subset of rows with indid=1, which is subsequently used in the outer join.

select *
from sysobjects o left outer join
(select * from sysindexes i where indid = 1) as i on o.id = i.id
where type = "S" and
i.id IS NULL

In effect, the derived query–

(select * from sysindexes i where indid = 1) as i

creates a notional table called "i", which contains
a copy of all the sysindexes rows with an indid of 1. It is this notional table
that is then fed to the rest of the query. The final results should now match
the NOT EXISTS and NOT IN examples shown above.

About the author

Neil Boyle is an independant SQL Server consultant working out of London, England. Neil’s free SQL Server guide is available on-line at http://www.impetus-sql.co.uk.

Neil Boyle

Neil Boyle

Neil Boyle left school at the age of sixteen thinking that computers were things that only existed in Star Trek. After failed careers as a Diesel Mechanic, Industrial Cleaner, Barman and Bulldozer Driver he went back to college to complete his education. Since graduating from North Staffs Poly he has worked up through the ranks from Trainee COBOL Programmer to SQL Server Consultant, a role in which he has specialised for the past seven years.