Sunday, May 20, 2012

Revisiting LEFT OUTER JOIN


It’s time to revisit our old friend LEFT OUTER JOIN.
You ask “why???”
I say because it can give you unexpected results (rows) if you are not careful about where you put the conditions.
Let’s first create a few tables

--Stores Product Data (we are going to store different cars)
CREATE TABLE [dbo].[Product](
      ProductID INT IDENTITY(1,1)  PRIMARY KEY
      ,  Name VARCHAR(200)
      , Description VARCHAR(MAX)
)

--Stores Parts of different cars
CREATE TABLE [dbo].[Part](
      PartID INT IDENTITY(1,1) PRIMARY KEY
      , Name VARCHAR(30)   
)

--Stores relationship between Product and its Parts
CREATE TABLE [dbo].[ProductPart](
      ProductID INT FOREIGN KEY REFERENCES Product(ProductID)
      , PartID INT FOREIGN KEY REFERENCES Part(PartID)
)

And now populate these tables with some data

INSERT INTO Product(Name, Description)
VALUES('Car', 'it will take you to work'), ('Sports Car', 'it will make u feel great'), ('Imaginary Car', 'it flys and stays in your dreams')


INSERT INTO Part(Name)
VALUES('Wheel'),('Door'), ('Boot'), ('55 inch tv'), ('Speed'), ('Thrill')


INSERT INTO ProductPart(ProductID, PartID)
VALUES(1,1),(1,2),(1,3),(2,1),(2,2),(2,5),(2,6)


Let us examine what the left outer join really means based on the conditions:

SELECT DISTINCT P.Name AS Product
    , Pt.Name AS Part
FROM Product P
LEFT OUTER JOIN ProductPart PPT
ON P.ProductID = PPT.ProductID
LEFT OUTER JOIN Part PT
ON PPT.PartID = PT.PartID
AND PT.Name = 'Thrill'

In the above query you are trying to get names of all the products and the products which have the thrill part.
when you run this query the following result is returned


now if you modify the above query a little bit and place the condition in the where clause:

SELECT DISTINCT P.Name AS Product
    , Pt.Name AS Part
FROM Product P
LEFT OUTER JOIN ProductPart PPT
ON P.ProductID = PPT.ProductID
LEFT OUTER JOIN Part PT
ON PPT.PartID = PT.PartID
WHERE
      PT.Name = 'Thrill'




If you can guess it, this query even though is written using LEFT OUTER JOIN, actually its an INNER JOIN because of the condition in the where clause. The result of the above query is 



If you look at the execution plan, the query optimizer has actually run this as an INNER JOIN 


lets modify the above query a little bit 

SELECT P.Name AS Product
    , Pt.Name AS Part
FROM Product P
LEFT OUTER JOIN ProductPart PPT
ON P.ProductID = PPT.ProductID
LEFT OUTER JOIN Part PT
ON PPT.PartID = PT.PartID
WHERE
     PT.Name IS NULL OR PT.Name = 'Thrill'



When you run this query, because we have specified a NULL condition in the where clause, the query optimizer will run it as a LEFT OUTER JOIN. As expected it will return all the cars which have either no parts specified or one of its parts is Thrill


So its really important that you put the condition at the right place (depending on your needs).
The purpose of this article was just to brush up on the LEFT OUTER JOIN. 
If you have any questions/comments please don't hesitate to comment.



   




2 comments: