GROC (Getting Rid Of Cursors) part four: the T-SQL Pivot operator

This is part four of the GROC-files. I guess it's time to discover some syntactical constructions that make the use of cursors (and also temporary tables) obsolete. One of these constructions was added in SQL server 2005: the PIVOT operator. This operator allows you to write crosstab or matrix queries in a more-or-less relational way.

Let's say we have a relatively normalized data model that has a DailySales table or a view like the following:

  SELECT * 
    FROM DailySales
ORDER BY ProductName, [DayofWeek]



In most cases our business analysts/managers/sales people would like this information to be presented in a less normalized structure, like this:



Each value in the original DayOfWeek column is presented as a column name in the new resultset. The content of that column is the SUM of OrderQuantity after a GROUP BY ProductName is applied. I can imagine that a large number of database developers would attack this problem by creating a table variable -or worse: a temporary table- and then run through the DailySales table with a cursor to do INSERTs and UPDATEs. Well, this huge overkill is not necessary: the PIVOT statement can take care of everything. Here's the T-SQL statement that generates the pivoted result:

SELECT ProductName, 
       Monday, 
       Tuesday, 
       Wednesday, 
       Thursday,
       Friday, 
       Saturday, 
       Sunday
  FROM DailySales
 PIVOT (
          SUM(OrderQuantity) 
          FOR [DayOfWeek] IN 
          (
             [Monday], [Tuesday], [Wednesday], [Thursday],[Friday], [Saturday], [Sunday]
          )
       ) As pvt

The only disadvantage is that the list of columns should be enumerated in your query, but with some dynamic SQL you can get around this.

By the way, there is not much magic involved under the hood. The Pivot statement is just syntactic sugar for CASE constructions. The following query uses more traditional T-SQL, but returns exactly the same result, with the same performance (in SQL Server 2008 even with the exact same query plan steps):

  SELECT ProductName,
         SUM(CASE WHEN [DayOfWeek] = 'Monday' THEN OrderQuantity END) AS Monday,
         SUM(CASE WHEN [DayOfWeek] = 'Tuesday' THEN OrderQuantity END) AS Tuesday,
         SUM(CASE WHEN [DayOfWeek] = 'Wednesday' THEN OrderQuantity END) AS Wednesday,
         SUM(CASE WHEN [DayOfWeek] = 'Thursday' THEN OrderQuantity END) AS Thursday,
         SUM(CASE WHEN [DayOfWeek] = 'Friday' THEN OrderQuantity END) AS Friday,
         SUM(CASE WHEN [DayOfWeek] = 'Saturday' THEN OrderQuantity END) AS Saturday,
         SUM(CASE WHEN [DayOfWeek] = 'Sunday' THEN OrderQuantity END) AS Sunday         
    FROM DailySales
GROUP BY ProductName

If you want to testdrive the queries, then here's the full script:

/*********************************************/
/*                                           */
/* SQL Server 2005/2008 Pivot Statement Demo */
/*                                           */
/*********************************************/
 
-- If you use the SQL2005 AdventureWorksDW 
-- then replace DimDate by DimTime         
--          and DateKey by TimeKey         
 
USE AdventureWorksDW2008
GO
 
/*********************************/
/* What we have in the database  */
/* A 3rd Normal Form(-ish) model */
/*********************************/
 
;WITH DailySales AS
(
    SELECT p.EnglishProductName AS ProductName, 
           SUM(f.OrderQuantity) AS OrderQuantity, 
           d.EnglishDayNameOfWeek AS [DayofWeek] 
      FROM FactInternetSales f 
INNER JOIN DimProduct p 
        ON f.ProductKey = f.ProductKey 
INNER JOIN DimDate d 
        ON f.OrderDateKey = d.DateKey 
  GROUP BY p.EnglishProductName, 
           d.EnglishDayNameOfWeek, 
           d.DayNumberOfWeek
)
  SELECT * 
    FROM DailySales
ORDER BY ProductName, [DayofWeek]
 
/*********************************/
/* What our boss wants to see    */
/* A heavily denormalized report */
/*********************************/
 
;WITH DailySales AS
(
    SELECT p.EnglishProductName AS ProductName, 
           SUM(f.OrderQuantity) AS OrderQuantity, 
           d.EnglishDayNameOfWeek AS [DayofWeek] 
      FROM FactInternetSales f 
INNER JOIN DimProduct p 
        ON f.ProductKey = f.ProductKey 
INNER JOIN DimDate d 
        ON f.OrderDateKey = d.DateKey 
  GROUP BY p.EnglishProductName, 
           d.EnglishDayNameOfWeek, 
           d.DayNumberOfWeek
)
SELECT ProductName, 
       Monday, 
       Tuesday, 
       Wednesday, 
       Thursday,
       Friday, 
       Saturday, 
       Sunday
  FROM DailySales
 PIVOT (
          SUM(OrderQuantity) 
          FOR [DayOfWeek] IN 
          (
             [Monday], [Tuesday], [Wednesday], [Thursday],[Friday], [Saturday], [Sunday]
          )
       ) As pvt
 
/***************************************/       
/* What SQL Server does under the hood */
/* A Select Case T-SQL statement       */
/***************************************/
 
;WITH DailySales AS
(
    SELECT p.EnglishProductName AS ProductName, 
           SUM(f.OrderQuantity) AS OrderQuantity, 
           d.EnglishDayNameOfWeek AS [DayofWeek] 
      FROM FactInternetSales f 
INNER JOIN DimProduct p 
        ON f.ProductKey = f.ProductKey 
INNER JOIN DimDate d 
        ON f.OrderDateKey = d.DateKey 
  GROUP BY p.EnglishProductName, 
           d.EnglishDayNameOfWeek, 
           d.DayNumberOfWeek
)
  SELECT ProductName,
         SUM(CASE WHEN [DayOfWeek] = 'Monday' THEN OrderQuantity END) AS Monday,
         SUM(CASE WHEN [DayOfWeek] = 'Tuesday' THEN OrderQuantity END) AS Tuesday,
         SUM(CASE WHEN [DayOfWeek] = 'Wednesday' THEN OrderQuantity END) AS Wednesday,
         SUM(CASE WHEN [DayOfWeek] = 'Thursday' THEN OrderQuantity END) AS Thursday,
         SUM(CASE WHEN [DayOfWeek] = 'Friday' THEN OrderQuantity END) AS Friday,
         SUM(CASE WHEN [DayOfWeek] = 'Saturday' THEN OrderQuantity END) AS Saturday,
         SUM(CASE WHEN [DayOfWeek] = 'Sunday' THEN OrderQuantity END) AS Sunday         
    FROM DailySales
GROUP BY ProductName


In this article Devin Knight explains how to pivot the same result set, using SQL Server Integration Services.

No comments:

Post a Comment