Sunday, May 20, 2012

Revisiting LEFT OUTER JOIN


It’s time to revisit our old friend LEFT OUTER JOIN.
You ask “why???”
I say because it can give you unexpected results (rows) if you are not careful about where you put the conditions.
Let’s first create a few tables

--Stores Product Data (we are going to store different cars)
CREATE TABLE [dbo].[Product](
      ProductID INT IDENTITY(1,1)  PRIMARY KEY
      ,  Name VARCHAR(200)
      , Description VARCHAR(MAX)
)

--Stores Parts of different cars
CREATE TABLE [dbo].[Part](
      PartID INT IDENTITY(1,1) PRIMARY KEY
      , Name VARCHAR(30)   
)

--Stores relationship between Product and its Parts
CREATE TABLE [dbo].[ProductPart](
      ProductID INT FOREIGN KEY REFERENCES Product(ProductID)
      , PartID INT FOREIGN KEY REFERENCES Part(PartID)
)

And now populate these tables with some data

INSERT INTO Product(Name, Description)
VALUES('Car', 'it will take you to work'), ('Sports Car', 'it will make u feel great'), ('Imaginary Car', 'it flys and stays in your dreams')


INSERT INTO Part(Name)
VALUES('Wheel'),('Door'), ('Boot'), ('55 inch tv'), ('Speed'), ('Thrill')


INSERT INTO ProductPart(ProductID, PartID)
VALUES(1,1),(1,2),(1,3),(2,1),(2,2),(2,5),(2,6)


Let us examine what the left outer join really means based on the conditions:

SELECT DISTINCT P.Name AS Product
    , Pt.Name AS Part
FROM Product P
LEFT OUTER JOIN ProductPart PPT
ON P.ProductID = PPT.ProductID
LEFT OUTER JOIN Part PT
ON PPT.PartID = PT.PartID
AND PT.Name = 'Thrill'

In the above query you are trying to get names of all the products and the products which have the thrill part.
when you run this query the following result is returned


now if you modify the above query a little bit and place the condition in the where clause:

SELECT DISTINCT P.Name AS Product
    , Pt.Name AS Part
FROM Product P
LEFT OUTER JOIN ProductPart PPT
ON P.ProductID = PPT.ProductID
LEFT OUTER JOIN Part PT
ON PPT.PartID = PT.PartID
WHERE
      PT.Name = 'Thrill'




If you can guess it, this query even though is written using LEFT OUTER JOIN, actually its an INNER JOIN because of the condition in the where clause. The result of the above query is 



If you look at the execution plan, the query optimizer has actually run this as an INNER JOIN 


lets modify the above query a little bit 

SELECT P.Name AS Product
    , Pt.Name AS Part
FROM Product P
LEFT OUTER JOIN ProductPart PPT
ON P.ProductID = PPT.ProductID
LEFT OUTER JOIN Part PT
ON PPT.PartID = PT.PartID
WHERE
     PT.Name IS NULL OR PT.Name = 'Thrill'



When you run this query, because we have specified a NULL condition in the where clause, the query optimizer will run it as a LEFT OUTER JOIN. As expected it will return all the cars which have either no parts specified or one of its parts is Thrill


So its really important that you put the condition at the right place (depending on your needs).
The purpose of this article was just to brush up on the LEFT OUTER JOIN. 
If you have any questions/comments please don't hesitate to comment.



   




Monday, April 23, 2012

Dynamically create a temp table with variable number of columns

While programming with sql we usually come across a situation where we want to pass data to stored procedure where the data is actually a string concatenation of different values and the stored procedure would construct a table out of this string to perform relational operations. Let’s look at one such scenario.
Scenario:
                Say you have the following tables
CREATE TABLE [dbo].[Product](
ProductID INT IDENTITY(1,1)  PRIMARY KEY
,  Name VARCHAR(200)
, Description VARCHAR(MAX)
)

CREATE TABLE [dbo].[Part](
      PartID INT IDENTITY(1,1) PRIMARY KEY
      , Name VARCHAR(30)     
)

CREATE TABLE [dbo].[ProductPart](
      ProductID INT FOREIGN KEY REFERENCES Product(ProductID)
      , PartID INT FOREIGN KEY REFERENCES Part(PartID)
)

Let’s populate these tables with test data
INSERT INTO Product(Name, Description)
VALUES('Car', 'it will take you to work')


INSERT INTO Part(Name)
VALUES('Wheel'),('Door'), ('Boot'), ('Monitor')
These tables will now have the following data

Let us suppose you want to store the data into ProductPart table. You want to store the information where you want to say a car has wheels, Doors and a Boot.

Traditionally you will have to write a stored procedure where it will take partID and ProductID as parameters and saves it into the ProductPart table one at a time.
It would look like this
CREATE PROCEDURE [dbo].[ProductPartSave]
(
      @ProductID INT
      , @PartID INT
)
AS
INSERT INTO [ProductPart]
SELECT @ProductID, @PartID

It will work fine. Except that you will have to make three calls to the database in our case.
There is one more way. You can create a stored procedure that will take two parameters. The first parameter would be the @ProductID and the second @PartIDs.



Definition of our new stored procedure is

CREATE PROCEDURE [dbo].[ProductMultiplePartsSave]
(
      @ProductID INT
      , @PartIDs VARCHAR(MAX)
)
AS

DECLARE @Parts TABLE(PartID INT)

INSERT INTO @Parts(PartID)
EXEC [GetDynamicTable] @DataList = @PartIDs, @ColumnSeperator = ',', @RowSeperator = '|', @NumberOfColumns = '1'

INSERT INTO [ProductPart](ProductID, PartID)
SELECT @ProductID
      , PartID
FROM @Parts

As you can see, we have a used a stored procedure called [dbo].[GetDynamicTable] to get a table from string.
The code for this sp is

CREATE PROCEDURE [dbo].[GetDynamicTable]
(
      @DataList AS VARCHAR(MAX),
      @ColumnSeperator AS CHAR(1),
      @RowSeperator AS CHAR(1),
      @NumberOfColumns INT
)
AS   
DECLARE @DynamicTableSQL NVARCHAR(MAX)
DECLARE @ColumnCount INT
DECLARE @Row NVARCHAR(MAX)


SET @DynamicTableSQL = 'DECLARE @DynamicTable TABLE('
SET @ColumnCount = 1
WHILE @ColumnCount <= @NumberOfColumns
BEGIN
      SET @DynamicTableSQL = @DynamicTableSQL + 'Column' + CAST(@ColumnCount AS VARCHAR(10))+ ' VARCHAR(MAX) ,'
      SET @ColumnCount = @ColumnCount + 1
END
SET @DynamicTableSQL = SUBSTRING(@DynamicTableSQL, 0, LEN(@DynamicTableSQL))
SET @DynamicTableSQL = @DynamicTableSQL + ') '

SET @DataList = REPLACE(@DataList, '''', '''''')
     
SET @DynamicTableSQL = @DynamicTableSQL + ' INSERT INTO @DynamicTable '
     
SELECT @DataList = @DataList + @RowSeperator

DECLARE @PosA INT, @PosB INT, @PosC INT

SELECT @PosA = 1
SELECT @PosC = CHARINDEX(@RowSeperator, @DataList, @PosA)

SET @Row = SUBSTRING(@DataList, @PosA, @PosC - @PosA)      

IF (LEN(@Row) > 0)
BEGIN
      SET @Row = REPLACE( @Row, @ColumnSeperator, ''',''')
      SET @DynamicTableSQL = @DynamicTableSQL + 'SELECT ''' + @Row + ''' '   
END

SELECT @PosA = @PosC + 1
SELECT @PosC = CHARINDEX(@RowSeperator, @DataList, @PosA)


WHILE @PosC > 0
BEGIN
     
      SET @Row = SUBSTRING(@DataList, @PosA, @PosC - @PosA)
      IF (LEN(@Row) > 0)
      BEGIN
            SET @DynamicTableSQL = @DynamicTableSQL + ' UNION ALL '
           
            SET @Row = REPLACE( @Row, @ColumnSeperator, ''',''')
            SET @DynamicTableSQL = @DynamicTableSQL + 'SELECT ''' + @Row + ''' '               
      END
     
      SELECT @PosA = @PosC + 1
      SELECT @PosC = CHARINDEX(@RowSeperator, @DataList, @PosA)
END
     
SET @DynamicTableSQL = @DynamicTableSQL + ' SELECT * FROM @DynamicTable '

EXEC SP_EXECUTESQL @DynamicTableSQL

The above stored procedure takes four parameters
1.       @DataList a string containing the concatenated data
2.  @ColumnSeperator a character which is the column delimiter
3.  @RowSeperator a character which is the row delimiter
4.       @NumberOfColumns – number of columns the dynamic table should have

Let’s first see how we can insert values into ProductPart table using our new stored procedures.

As you can see, we inserted multiple rows into ProductPart table with just one call to the [ProductMultiplePartsSave] stored procedure.


[dbo].[GetDynamicTable] is very powerful in terms of creating temp tables on the fly with just change of parameters. Let’s look at a few different uses of this sp:
In the following image you can see how it can be used to get a two column table

In the following image you can see it returning a table with 5 columns

I hope this article was informative and please don’t hesitate to comment or ask questions.

Tuesday, January 3, 2012

Finding out where (in code) the connections to sql server are not getting closed

We had a problem where lots of connections were being created to the SQL Server and it just kept going up. The web application is just too big to manually check all the code pages to see where the connections were not getting closed properly.

(if you are using .net it’s better to use ‘USING’ so the connection is properly disposed)

I wanted to find out what query was run on the connections that got opened and didn’t get closed. That way I could check where in code the sql was getting called and make sure the connection was getting closed there.

For this we are going to use the dynamic management view dm_exec_connections and dynamic management function dm_exec_sql_text.

  • dm_exec_connections returns information about the established connections to this sql server instance. The column that we are most interested in is the most_recent_sql_handle, as this would have an handle to the sql that was run last before the connection was abandoned.
  • dm_exec_sql_text is a table valued function which takes a sql_handle(or plan_handle) parameter and returns the database id and text(this is null if the if the object is encrypted) among other things.

The following query will give you list of connections on which the read or write was done 5 mins ago along with the last sql that was run on it.

SELECT *
FROM sys.dm_exec_connections C
CROSS APPLY sys.dm_exec_sql_text(C.most_recent_sql_handle) T
where
last_read < DATEADD(minute, -5, getdate()) or last_write < DATEADD(minute, -5, getdate())

here is a image of running the above query on my sql server




Once you know what was run last on the sql server for a connection, you just need to check where in code it’s getting called and fix it.

Hope this little article was helpful.