Refactoring SQL

time to read 17 min | 3282 words

Okay, I just went through a SQL refactoring tha I consider interesting. We'll start with this code:

Cursor transactions IS

      SELECT L.CustomerId ,

         Sum(Y.Debt)    Sum_Debt ,

         Sum(Y.Credit)  Sum_Credit

      FROM    Transactions           Y  join 

              Customers              L

      ON L.CustomerId    = Y.CustomerId

      AND L.Status          NOT IN (1 , 2 , 3)

      GROUP BY L.CustomerId

 

For rec0 IN transactions Loop

 

      UPDATE CustomerStatus

        SET Debt = rec0.Sum_Debt,

            Credit = rec0.Sum_Credit

      WHERE CustomerId = rec0.CustomerId

 

End Loop;

The above is using Oracle syntax, which I find much saner for cursors than what SQL Server requires you to do. This is undoubtly one of the reasons that cursors are so much more popular in Oracle than in SQL Server.

From now on I'm using SQL Server syntax, btw. The first thing that I did was to remove the cursor. I did it by simply adding a FROM clasue to the update, like this:

UPDATE CustomerStatus

  SET Debt = rec0.Sum_Debt,

      Credit = rec0.Sum_Credit

FROM

(

      SELECT L.CustomerId ,

         Sum(Y.Debt)    Sum_Debt ,

         Sum(Y.Credit)  Sum_Credit

      FROM    Transactions       Y  join 

              Customers          L

      ON L.CustomerId    = Y.CustomerId

      AND L.Status          NOT IN (1 , 2 , 3)

      GROUP BY L.CustomerId

) rec0

WHERE CustomerId = rec0.CustomerId

 

This is much better performance wise, but I challange you to understand it when the query grows just a little bit more complex. Add a few more where clauses, do some more processing, and it's Write Once Hope You'll Never Read type of thing.

 

I then decided to use a Common Table Expression to simplify the statement, the end result looks like this:

 

WITH rec0 (CustomerId, Sum_Debt, Sum_Credit) AS

(

      SELECT L.CustomerId ,

         Sum(Y.Debt)    Sum_Debt ,

         Sum(Y.Credit)  Sum_Credit

      FROM  Transactions            Y  join 

            Customers               L

      ON L.CustomerId    = Y.CustomerId

      AND L.Status          NOT IN (1 , 2 , 3)

      GROUP BY L.CustomerId

)

UPDATE CustomerStatus

  SET Debt = rec0.Sum_Debt,

      Credit = rec0.Sum_Credit

FROM rec0

WHERE CustomerId = rec0.CustomerId

 

I find this much easier to understand than the previous one, and it's performance it just fine. One thing I wish I could do is either nest WITH statements or cascade them.

 

By nesting I mean something like:

 

WITH DrunkCustomers (CustomerId) AS

(

      WITH Bar(CustomerId)

      AS

      {

            SELECT CustomerId FROM Transactions

            WHERE Payee like '%bar%'

      }

      SELECT L.CustomerId ,

         Sum(Y.Debt)    Sum_Debt ,

         Sum(Y.Credit)  Sum_Credit

      FROM  Bar                           Y  join 

                  Customers               L

      ON L.CustomerId    = Y.CustomerId

      AND L.Status          NOT IN (1 , 2 , 3)

      GROUP BY L.CustomerId

)

 

By cascading I mean:

 

 

WITH Bar(CustomerId)

AS

{

      SELECT CustomerId FROM Transactions

      WHERE Payee like '%bar%'

}

WITH DrunkCustomers (CustomerId) AS

(

      SELECT L.CustomerId ,

         Sum(Y.Debt)    Sum_Debt ,

         Sum(Y.Credit)  Sum_Credit

      FROM  Bar                           Y  join 

                  Customers               L

      ON L.CustomerId    = Y.CustomerId

      AND L.Status          NOT IN (1 , 2 , 3)

      GROUP BY L.CustomerId

)

 

Sadly, none of those options works.