In log4net 1.2.9 there is a bug in the AdoNetAppender: it will not log null values appropriately. This bug is fixed in log4net 1.2.10, but it is not always possible to move to the next version. (In my case, both NHibernate and Castle use log4net 1.2.9.)
I tried recompiling everything to use 1.2.10, but it took too long, and eventually I simply wrote this little trigger:
CREATE TRIGGER [ReplaceNullLookAlikeWithNulls]
ON [dbo].[Logs]
INSTEAD OF INSERT
AS
BEGIN
	-- SET NOCOUNT ON added to prevent extra result sets from
	-- interfering with SELECT statements.
	SET NOCOUNT ON;
	INSERT INTO [dbo].[Logs]
		([Date]
		,[Thread]
		,[Level]
		,[Logger]
		,[Message]
		,[Exception]
		,[Filename])
	SELECT
		[Date]
		,[Thread]
		,[Level]
		,[Logger]
		,[Message]
		,[Exception]
		,CASE [Filename] WHEN '(null)' THEN NULL ELSE [Filename] END
	FROM Inserted
END
Not the best solution, but it will hold water.
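A quick way to check that the trigger is doing its job (the column values here are made up; only [Filename] matters):

```sql
-- Insert a row the way log4net 1.2.9 would, with the '(null)' marker
INSERT INTO [dbo].[Logs]
	([Date], [Thread], [Level], [Logger], [Message], [Exception], [Filename])
VALUES
	(GETDATE(), '1', 'ERROR', 'MyLogger', 'Something failed', '', '(null)');

-- The trigger should have stored a real NULL instead of the marker
SELECT [Filename] FROM [dbo].[Logs] WHERE [Filename] IS NULL;
```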
I need to be able to ask the database for some rows to work on, and I need to be sure that I am the only one that works on those rows. After thinking about it, I came up with this SQL:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
INSERT INTO Locks(LockedBy, LockedAt, LockedItemId, LockedItemType)
OUTPUT inserted.Id
SELECT TOP 100 @lockedBy, getdate(), item.Id, @itemType
FROM Items item
WHERE NOT EXISTS (SELECT 1 FROM Locks lock
WHERE lock.LockedItemId = item.id AND lock.LockedItemType = @itemType)
ORDER BY item.Id
COMMIT TRANSACTION
This is the first time I need to implement pessimistic locking, and I am not sure if this is the right way to go. The work I am doing is disconnected, so I can't just lock the rows and continue to work.
Any suggestions?
Update: This is what I have now:
CREATE PROCEDURE lockAndGetItems AS
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
DECLARE @lockedBy uniqueIdentifier, @countOfItemsToLock int
SET @lockedBy = '68A6D54C-557D-428D-8B82-5D68C9C1B33E'
-- Get number of rows already locked by me
SELECT @countOfItemsToLock = 100 - COUNT(Id) From Locks where LockedBy = @lockedBy
-- Lock rows to complete to an even 100
INSERT INTO Locks(LockedBy, LockedAt, LockedItemId)
SELECT TOP (@countOfItemsToLock) @lockedBy, getdate(), item.Id
FROM Items item
WHERE NOT EXISTS (SELECT 1 FROM Locks lock
WHERE lock.LockedItemId = item.id and lock.LockedBy = @lockedBy)
ORDER BY item.Id
-- Return row data
SELECT Items.* FROM Items JOIN Locks ON Items.Id = Locks.LockedItemId AND LockedBy = @lockedBy
COMMIT TRANSACTION
Basically, it fetches the first 100 items designated for this SP. Each SP has a single caller, so I don't worry about expiring the lock; I will always get the locked rows. If the client crashes while rows are locked, when it is restarted it will get the same rows it already locked, and can start working on them. The reason I implemented this as an SP with a hard coded Guid is that each SP locks a different table.
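One thing the procedure does not show is releasing the locks once a batch is done. This is not part of the original code; a minimal sketch, assuming the Locks table above and the same hard coded Guid:

```sql
-- Hypothetical cleanup step: release all locks held by this client
-- once it has finished processing its batch
DELETE FROM Locks
WHERE LockedBy = '68A6D54C-557D-428D-8B82-5D68C9C1B33E'
```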
Moran has pointed me to the CROSS APPLY syntax in SQL Server 2005, which allows joining against a table valued function. Using this, the query goes down to this:
SELECT
EmpName,
CONVERT(NVARCHAR, CurrentDate,103) Date,
HasWorked = CASE SUBSTRING(Roster,
dbo.IndexInRoster(StartDate,CurrentDate,LEN(Roster)), 1)
WHEN '_' THEN 0
WHEN '*' THEN 1
ELSE NULL
END
FROM Schedules CROSS APPLY DateRange(StartDate,EndDate)
This is a much nicer way to deal with it. Considering that I am using similar techniques all over the place, this is a really good thing to know.
I'm starting to get quite a bit of mail from this blog. Some of those questions are about subjects I can answer immediately, some require a fair amount of work (which can be had, if you really want), and the more interesting ones are those that require some thinking, but do not require too much time. This question from Dave is the best one so far, and I got his permission to blog about it, so I'm doubly happy.
The issue is working against a legacy database to get the data for further processing. I'll let Dave explain the issue, since he does it much better:
I have to write a query to generate a report over some interesting data. It's basically scheduling which days people are working. The data looks like this:
Id | EmpName | StartDate | EndDate | Roster |
1 | Bob | 12/06/2006 | 18/06/2006 | _*___** |
2 | Mary | 12/06/2006 | 18/06/2006 | *_*__*_ |
The trick is, the roster field contains a string with a _ or * depending on whether or not the person is scheduled to work that day, but the first character always starts on the Sunday. The startdate and enddate can be any day of the week.
In the example above, 12-Jun is a Monday, so Monday corresponds to the second character in the roster string, so Bob's working and Mary's not.
The roster string wraps around, so the first character of the roster string actually corresponds with the enddate here! Now, this roster string could be 7, 10, or 14 days long.
I could get the report out if I can write a query to get it to this:
Employee | DateWorking |
Bob | 12/06/2006 |
Bob | 16/06/2006 |
Mary | 13/06/2006 |
Mary | 16/06/2006 |
By the way, I haven't asked, but I'll bet that this schema originated from a mainframe, if not currently, then in its recent past.
First I created the schema I needed:
CREATE TABLE Schedules
(
Id INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
EmpName NVARCHAR(255) NOT NULL,
StartDate DATETIME NOT NULL,
EndDate DATETIME NOT NULL,
Roster NVARCHAR(50) NOT NULL
);
GO
INSERT INTO Schedules
SELECT 'Bob','12-Jun-06','18-Jun-06','_*___**'
UNION ALL
SELECT 'Mary','12-Jun-06','18-Jun-06','*_*__*_'
GO
Then I started playing with DatePart(), getting the day of the week of StartDate from each row. This gave me the index I needed into the Roster column. But this only told me whether or not the employee worked on the start date, which isn't very helpful. What I needed was a way to check all the values between StartDate and EndDate.
I posted about this issue a while ago, and I made use of this technique here:
CREATE FUNCTION DateRange ( @start datetime, @end datetime )
RETURNS @DateRange TABLE ( CurrentDate datetime )
AS
BEGIN
WHILE (@start <= @end)
BEGIN
INSERT INTO @DateRange(CurrentDate) VALUES(@start)
SELECT @start = DATEADD(day,1,@start)
END
RETURN
END
GO
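A quick sanity check of the function, using the date range from the sample data:

```sql
-- One row per day from 12-Jun-06 through 18-Jun-06, inclusive (7 rows)
SELECT CurrentDate FROM dbo.DateRange('12-Jun-06', '18-Jun-06')
```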
Conceptually, what I wanted was this:
SELECT
IndexInRoster = DatePart(dw,StartDate) + DateDiff(day,StartDate, CurrentDate)
FROM Schedules, DateRange(StartDate,EndDate)
Unfortunately, DateRange() is a table valued function, and what this query asks SQL Server to do is join each row in the Schedules table against a table built from that row's values. This is not possible with a regular join, of course.
I settled on faking it using this approach:
WITH AllDatesInTable(CurrentDate) AS
(
SELECT CurrentDate FROM dbo.DateRange(
(SELECT MIN(StartDate) FROM Schedules),
(SELECT Max(EndDate) FROM Schedules) )
)
SELECT
TestIndexInRoster = DatePart(dw,StartDate) + DateDiff(day,StartDate, CurrentDate)
FROM Schedules JOIN AllDatesInTable
ON CurrentDate BETWEEN StartDate AND EndDate
This query uses a Common Table Expression to define a table that has all the dates in the Schedules table. Notice that I then constrained it to the dates in the current row's range. In essence, this gives me a row for each date in the date range of each row. This is the basis for solving this problem.
The other issue is the wrapping of the day index in the roster. This is a bit complicated because we need to take three things into account: SQL Server string handling is 1-based, not 0-based (argh!); we are shifting based on the day of the week of the start date; and we need to wrap around correctly. In order to handle this issue I created this function:
CREATE FUNCTION IndexInRoster(@StartDate DATETIME, @CurrentDate DATETIME, @RosterLen INT)
RETURNS INT AS
BEGIN
DECLARE @Result int
SET @Result = (DATEDIFF(day,@StartDate,@CurrentDate) + DATEPART(dw,@StartDate)) % (@RosterLen)
IF @Result = 0
RETURN @RosterLen
RETURN @Result
END
GO
The check for @Result equals 0 is there because SQL Server uses 1-based string handling.
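To see the function in action, here is Bob's row from the sample data. Note that DATEPART(dw, ...) depends on the DATEFIRST setting; the values below assume the default of 7, where Sunday is 1:

```sql
-- 12-Jun-06 is a Monday: DATEDIFF = 0, DATEPART(dw) = 2, so (0 + 2) % 7 = 2;
-- the second character of '_*___**' is '*', so Bob works that day
SELECT dbo.IndexInRoster('12-Jun-06', '12-Jun-06', 7)

-- 18-Jun-06 is the Sunday at the end of the range: (6 + 2) % 7 = 1,
-- so it wraps around to the first character of the roster
SELECT dbo.IndexInRoster('12-Jun-06', '18-Jun-06', 7)
```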
Bringing it all together, we get this:
WITH AllDatesInSchedulesTable(CurrentDate) AS
(
SELECT CurrentDate FROM dbo.DateRange(
(SELECT MIN(StartDate) FROM Schedules),
(SELECT Max(EndDate) FROM Schedules) )
)
SELECT
EmpName,
CONVERT(NVARCHAR, CurrentDate,103) Date,
HasWorked = CASE SUBSTRING(Roster,
dbo.IndexInRoster(StartDate,CurrentDate,LEN(Roster)), 1)
WHEN '_' THEN 0
WHEN '*' THEN 1
ELSE NULL
END
FROM Schedules JOIN AllDatesInSchedulesTable
ON CurrentDate BETWEEN StartDate AND EndDate
And the result of this query:
EmpName | Date | HasWorked |
Bob | 12/06/2006 | 1 |
Mary | 12/06/2006 | 0 |
Bob | 13/06/2006 | 0 |
Mary | 13/06/2006 | 1 |
Bob | 14/06/2006 | 0 |
Mary | 14/06/2006 | 0 |
Bob | 15/06/2006 | 0 |
Mary | 15/06/2006 | 0 |
Bob | 16/06/2006 | 1 |
Mary | 16/06/2006 | 1 |
Bob | 17/06/2006 | 1 |
Mary | 17/06/2006 | 0 |
Bob | 18/06/2006 | 0 |
Mary | 18/06/2006 | 1 |
And from here it is trivial to get to whatever format you want.
There is another holy war converging about the SP vs. Parameterized SQL topic. I got into one not long ago, and I wouldn't get into this one if there wasn't some new stuff brewing in it.
Eric Wise started this with The Pragmatic Adhoc SQL vs Stored Procedures Discussion, Jeremy D. Miller responded with Why I do not use Stored Procedures, and then Eric posted a new post, this time with In Response To No Stored Procedures, which has some new questions that I have not seen before.
1. About security: naturally, security is a multi layered approach. In the database, I would grant permissions to views or tables according to the user's needs. Granting SELECT permission is not the same as giving the user SA access to the database, after all. I also tend to add another layer at the application level, which is completely transparent to my code and my business objects; you can see a high level overview here. Then there is the application logic itself.
2. Transactional Database Structures: A good coverage of the topic can be found in Roy's What payroll system are a pain to design – Temporal data and in Eric's "Transactional" Database Structures. I am currently working on a big payroll system which is using NHibernate, and this issue is a major PITA. The problem is that both the relations between objects and the objects themselves change as time goes by. We solved this issue by deciding that each object in the system is composed of an Id, which cannot be changed, and a history, which contains immutable versions of the data for specific dates. A highly simplified version of this will look like this:
This way, I can ask any object for a view of how it was at a point in time. The views also handle the associations, so if I ask an employee for its view at 2000-12-31, I can easily get any data that I need without dealing with the complex timed model. For instance, if I want to check whether the employee had a car on a certain date, I can use this code:
EmployeeView employeeView = emp.For(new DateTime(2000, 12, 31));
if (employeeView.BenefitPackage.HasCar)
    Console.WriteLine("The employee {0} had a car in {1}", /* ... */);
This model translates very well into something NHibernate can handle. It removes a lot of the problems that you have when you deal with such a system, since the infrastructure code takes care of them for you. To set a new property, I ask the employee to give me its current status, change whatever I want, then give it back and save. A new record is inserted into the database, and the employee's current state moves to the newest piece of data.
3. O/RM in complex database structures: The system above gets its schema from a mainframe 30 years old or more, written by a bunch of guys that thought that transliterating Hebrew into English is fun (with spelling mistakes!). Enough said, I think.
4. Stored Procedures make TDD slower: The problem with SPs and TDD is that you don't get any seam to verify the expected behavior. If I have a procedure that takes a long time to run, and I need to test another procedure that eventually calls it, I'm in more than a bit of a bind. I can replace the long running procedure on the fly, and spend the rest of my life running from angry DBAs, but that is about all I can do. There is no way to put mocks in the database. And database access takes time. Often a lot of time.
Note about security: if an attacker is in a position where he can run arbitrary code in your system, you are screwed no matter what. O/RM, ADO.Net or SP, he can make the appropriate calls to do whatever he wants.
Oh, and an unrelated tidbit: in the last six months I wrote over 600 stored procedures (data warehousing project), so I feel confident that I know my stored procedures.
I will give some weirder problems next time; Scott solved the problem (the simple one and the harder one) in a really nice way. What's more, he posted about how he solved the problem, and this is great.
I often don't bother to explicitly think about the steps to a solution, so it was a pleasant surprise to see how he got to the answer, not to mention that anyone reading it gains some valuable insight into how to solve these kinds of problems.
Great job Scott!
I have the following table:
FromDate | ToDate |
01/01/2000 | 31/01/2000 |
01/02/2000 | 28/02/2000 |
05/03/2000 | 31/03/2000 |
What must I do to get to this one?
FromDate | ToDate |
01/01/2000 | 28/02/2000 |
05/03/2000 | 31/03/2000 |
Basically, the first two date ranges were merged into a single date range because they were adjacent to each other.
Update: It looks like I barely get a chance to post a problem before it is solved. Let's try increasing the complexity of this one. Good job, Scott.
Now I want to merge overlapping and contained date ranges as well. So for this data:
FromDate | ToDate |
01/01/2000 | 31/01/2000 |
01/02/2000 | 28/02/2000 |
05/03/2000 | 31/03/2000 |
01/02/2000 | 15/02/2000 |
15/03/2000 | 15/04/2000 |
And the result should be:
FromDate | ToDate |
01/01/2000 | 28/02/2000 |
05/03/2000 | 15/04/2000 |
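This one was solved in the comments as well, but for reference, here is one possible approach (not necessarily Scott's solution). The idea is that a merged range begins at a FromDate that no other range covers or touches, and ends at a ToDate that no other range extends; the table and column names follow the problem statement, the rest is an assumption:

```sql
-- Assumes a table DateRanges(FromDate, ToDate); merges overlapping,
-- contained and adjacent (next-day) ranges into maximal ranges
SELECT starts.FromDate, MIN(ends.ToDate) AS ToDate
FROM
	-- range starts that no other range covers or touches
	(SELECT FromDate FROM DateRanges r
	 WHERE NOT EXISTS (SELECT 1 FROM DateRanges other
	                   WHERE r.FromDate > other.FromDate
	                     AND r.FromDate <= DATEADD(day, 1, other.ToDate))) starts
	JOIN
	-- range ends that no other range covers or touches
	(SELECT ToDate FROM DateRanges r
	 WHERE NOT EXISTS (SELECT 1 FROM DateRanges other
	                   WHERE r.ToDate < other.ToDate
	                     AND DATEADD(day, 1, r.ToDate) >= other.FromDate)) ends
	ON starts.FromDate <= ends.ToDate
GROUP BY starts.FromDate
```

Each qualifying start is paired with the earliest qualifying end at or after it, which yields exactly the two rows in the expected result above.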
Chris Bilson gives a nearly complete answer on how to find the valid / invalid commands for an inmate in the comments of the post. It is a very elegant solution, in my opinion. I took it upon myself to improve his solution a bit. The result is this query:
SELECT
inmate.FirstName,
inmate.Surname,
command.IssuedAt AS Start,
dbo.CommandEnds(command.CommandType, command.ValidFor, command.IssuedAt) AS [End],
nextCommand.IssuedAt AS StartOfNextCommand,
CASE
WHEN DateDiff(ms,
dbo.CommandEnds(command.CommandType, command.ValidFor, command.IssuedAt) ,
nextCommand.IssuedAt) > 0 THEN 'Gap In Commands!'
WHEN getdate() > dbo.CommandEnds(command.CommandType, command.ValidFor, command.IssuedAt)
AND nextCommand.Id is null THEN 'Command ended!'
ELSE 'Fine' END AS Status
FROM dbo.Commands command INNER JOIN dbo.Inmates inmate
ON command.InmateID = inmate.ID
LEFT OUTER JOIN dbo.Commands nextCommand
ON nextCommand.InmateID = inmate.Id
AND nextCommand.IssuedAt = (SELECT MIN(IssuedAt)
FROM dbo.Commands c3
WHERE c3.InmateId = inmate.Id
AND c3.IssuedAt > command.IssuedAt )
ORDER BY inmate.Id, Start, [End]
The nice part about this solution is how the query matches a command with the (logical) next command. Here is the result of the query:
FirstName | Surname | Start | End | StartOfNextCommand | Status |
Bad | Man | 2/5/2006 6:39 PM | 10/5/2006 6:39 PM | 10/5/2006 6:40 PM | Gap In Commands! |
Bad | Man | 10/5/2006 6:40 PM | 11/5/2006 6:40 PM | NULL | Fine |
Nasty | Guy | 3/5/2006 12:00 AM | 8/5/2006 12:00 AM | 7/5/2006 5:00 PM | Fine |
Nasty | Guy | 7/5/2006 5:00 PM | 9/5/2006 5:00 PM | NULL | Fine |
I wish I had had something similar back when I was building this kind of software.
No, I'm not abandoning NHibernate for greener pastures; I'm talking about this post.
I'm not sure what the point of the author was, but on first reading, it read like FUD to me. Maybe it is the general tone of the post, or that there are a lot of numbers without any clue how the author got them, but that was my first feeling. A lot of the points that are raised are valid, but I just don't understand where the numbers came from.
I had a math teacher that used to look at a student's exercise and say, "And suddenly there is a QED", when we did something stupid. I feel like saying it here.
Take, for instance, the part about entities and methods. I present to you my most common method of dealing with an O/RM:
public interface IRepository<T>
{
void Save(T item);
void Delete(T item);
ICollection<T> Find(params Predicate[] preds);
T Load(int id);
T TryLoad(int id);
}
That is it. The implementation is generic as well, so I get five methods, no matter how many entities I have. In theory I can switch from O/RM to DAO to SOA to AOM (Acronym Of Month) with very little effort. I would replace the implementation of IRepository&lt;T&gt;, and I'm done. Five methods, about 2.5 hours if I understand the math correctly; we'll add a couple of days to be safe, but we are still under 20 hours. I don't even need to touch the UI, since it works against the interface, and not the implementation.
The problem with this math is that it is based on nothing in particular. I recently had to evaluate moving from one O/RM to another, and I had to survey the number of places where I was dependent on the O/RM. I counted 23 major features that I had to have if I didn't want to basically re-write my application from scratch.
Those ranged from lazy loading to cascades to reference equality for same entities in the same session, etc.
Replacing the O/RM would have been hard at best, impossible at worst, and painful whatever I did. And I used a very similar interface to isolate myself from the O/RM (even though I never had any intention of replacing it).
Frankly, the issue is not the support of the interface, but far smaller interactions. Knowing that you can map the same field in a class to different properties in a hierarchy (an integer in one case, an object in the other, etc.) is a very powerful tool, and one that will cause quite a bit of pain if you need to stop using it. That is where the cost of switching O/RMs comes into play.
If you are doing anything at all interesting with your O/RM of choice (or hand written DAL), then you are going to get into some serious trouble if you think about switching.
For instance, I'm using NHibernate's Table Hierarchy to save strategies to the database, so when I load the object later, I'll get all its saved strategies with their parameters. It is making my life so much easier, but I don't want to even think what I would have to do if I had to write it all on my own.
To conclude, I agree with the first two observations, and disagree with the third. You can certainly optimize a good O/RM as much as you like; usually you don't even need to see the O/RM code to do so. I really don't agree with his conclusions, though.