Using Data Ranges

time to read 76 min | 15161 words

Okay, let's see some use for having a date range, let's assume that I've the following employees table:

EmpId Name
1 Ayende
2 Rahien
3 Bar

And an EmpHours tables with the following data:

EmpHourId EmpId Start End
3 1 01/01/2006 08:00 01/01/2006 17:00
4 1 02/01/2006 08:00 02/01/2006 17:00
5 1 03/01/2006 09:00 03/01/2006 16:00
6 2 01/01/2006 09:00 01/01/2006 16:00
7 2 02/01/2006 07:00 02/01/2006 16:00
8 2 03/01/2006 07:00 03/01/2006 18:00
9 3 03/01/2006 07:00 03/01/2006 18:00

Our goal is to produce a report that shows how much hours a day each of those employees worked for the first week of 2006.

Anatoly suggested something like this (Refactored a bit to match my table structure):

SELECT e.Name, AVG(DATEDIFF(hh, StartHour, EndHour)) AS AvgHoursPerDay

  FROM EmpHours eh join Employees e on eh.EmployeeId = e.EmployeeId

 WHERE DATEPART(week, StartHour) = 1

 GROUP BY eh.EmployeeId, e.Name

The result of this query is this table:

Name AvgHoursPerDay
Ayende

 8

Rahien

 9

Bar

 11

Okay, this gives me the average, but it doesn't gives me the number of hours per day, for that I need to join against a date range, which is where my previous post comes into play, here is the query:

SELECT

      EmpInDate.EmployeeId,

      EmpInDate.Name,

      EmpInDate.Date,

      SUM(DATEDIFF(hour,eh.StartHour, eh.EndHour)) HoursPerDay

FROM

(

      SELECT

            EmployeeId,

            Name,

            CurrentDate as Date

      FROM Employees e, DateRange('20060101','20060107')

)     EmpInDate LEFT JOIN EmpHours eh

ON    EmpInDate.Date = dbo.JustDate(eh.StartHour)

AND EmpInDate.EmployeeId = eh.EmployeeId

GROUP BY EmpInDate.EmployeeId, EmpInDate.Name, EmpInDate.Date

This simply return the number of hours that an employee worked in a spesific date in the range, in this case, from the 01/01/2006 to 06/01/2006 (DateRange does not include the last date), then we join against the number of hours an employee worked in that date, and sum the result.

Here is the result of this query:

EmployeeId Name Date HoursPerDay
1 Ayende 01/01/2006 9
2 Rahien 01/01/2006 7
3 Bar 01/01/2006 NULL
1 Ayende 02/01/2006 9
2 Rahien 02/01/2006 9
3 Bar 02/01/2006 NULL
1 Ayende 03/01/2006 7
2 Rahien 03/01/2006 11
3 Bar 03/01/2006 11
1 Ayende 04/01/2006 NULL
2 Rahien 04/01/2006 NULL
3 Bar 04/01/2006 NULL
1 Ayende 05/01/2006 NULL
2 Rahien 05/01/2006 NULL
3 Bar 05/01/2006 NULL
1 Ayende 06/01/2006 NULL
2 Rahien 06/01/2006 NULL
3 Bar 06/01/2006 NULL

If we add another entry for three more hours on the 01/01/2006 to Employee Ayende, we will get the same number of hours, but the first row will have HoursPerDay of 11. Still, this isn't something that is nice to look at (nor does it let you see which of your workers is lazy :-)), so let's keep making the query better by pivoting it:

SELECT

      Name,

      [1] as [Day #1],[2] as [Day #2],[3] as [Day #3],

      [4] as [Day #4],[5] as [Day #5],[6] as [Day #6]

FROM

(

      SELECT

            EmpInDate.Name,

            day(EmpInDate.Date) day,

            DATEDIFF(hour,eh.StartHour, eh.EndHour) Hours

      FROM

      (

            SELECT

                  EmployeeId,

                  Name,

                  CurrentDate as Date

            FROM Employees e, DateRange('20060101','20060107')

      )     EmpInDate LEFT JOIN EmpHours eh

      ON    EmpInDate.Date = dbo.JustDate(eh.StartHour)

      AND EmpInDate.EmployeeId = eh.EmployeeId

) Source

PIVOT

(

      SUM(Hours)

      FOR [DAY] IN

      (

             [1],[2],[3],[4],[5],[6]

      )

) as PVT

And the result:

Name Day #1 Day #2 Day #3 Day #4 Day #5 Day #6
Ayende 11 9 7 NULL NULL NULL
Bar NULL NULL 11 NULL NULL NULL
Rahien 7 9 11 NULL NULL NULL

I talked about pivoting before, btw. Now, like I said, in a serious application, you'll want to do this only on business days, and that isn't likely to be calculated in a function, most likely you'll have a table of pre-calculated dates that you'll use, but it's can be cool to do it on an ad-hoc basis, or maybe when you really need all the dates. One thing that I can easily think about is having a MonthlyDateRange that you use for joining to create reports for things that happened within that month (that is assuming that you're not working in a place where it's customary for the whole company to take a month long vacations).