Ayende @ Rahien

filter by tags archive

architecture (614) rss
bugs (451) rss
challanges (123) rss
community (380) rss
databases (481) rss
design (896) rss
development (642) rss
hibernating-practices (71) rss
miscellaneous (592) rss
performance (397) rss
programming (1086) rss
raven (1454) rss
ravendb.net (538) rss
reviews (184) rss

2025
- July (4)
- June (7)
- May (10)
- April (10)
- March (10)
- February (7)
- January (12)
2024
- December (3)
- November (2)
- October (1)
- September (3)
- August (5)
- July (10)
- June (4)
- May (6)
- April (2)
- March (8)
- February (2)
- January (14)
2023
- December (4)
- October (4)
- September (6)
- August (12)
- July (5)
- June (15)
- May (3)
- April (11)
- March (5)
- February (5)
- January (8)
2022
- December (5)
- November (7)
- October (7)
- September (9)
- August (10)
- July (15)
- June (12)
- May (9)
- April (14)
- March (15)
- February (13)
- January (16)
2021
- December (23)
- November (20)
- October (16)
- September (6)
- August (16)
- July (11)
- June (16)
- May (4)
- April (10)
- March (11)
- February (15)
- January (14)
2020
- December (10)
- November (13)
- October (15)
- September (6)
- August (9)
- July (9)
- June (17)
- May (15)
- April (14)
- March (21)
- February (16)
- January (13)
2019
- December (17)
- November (14)
- October (16)
- September (10)
- August (8)
- July (16)
- June (11)
- May (13)
- April (18)
- March (12)
- February (19)
- January (23)
2018
- December (15)
- November (14)
- October (19)
- September (18)
- August (23)
- July (20)
- June (20)
- May (23)
- April (15)
- March (23)
- February (19)
- January (23)
2017
- December (21)
- November (24)
- October (22)
- September (21)
- August (23)
- July (21)
- June (24)
- May (21)
- April (21)
- March (23)
- February (20)
- January (23)
2016
- December (17)
- November (18)
- October (22)
- September (18)
- August (23)
- July (22)
- June (17)
- May (24)
- April (16)
- March (16)
- February (21)
- January (21)
2015
- December (5)
- November (10)
- October (9)
- September (17)
- August (20)
- July (17)
- June (4)
- May (12)
- April (9)
- March (8)
- February (25)
- January (17)
2014
- December (22)
- November (19)
- October (21)
- September (37)
- August (24)
- July (23)
- June (13)
- May (19)
- April (24)
- March (23)
- February (21)
- January (24)
2013
- December (23)
- November (29)
- October (27)
- September (26)
- August (24)
- July (24)
- June (23)
- May (25)
- April (26)
- March (24)
- February (24)
- January (21)
2012
- December (19)
- November (22)
- October (27)
- September (24)
- August (30)
- July (23)
- June (25)
- May (23)
- April (25)
- March (25)
- February (28)
- January (24)
2011
- December (17)
- November (14)
- October (24)
- September (28)
- August (27)
- July (30)
- June (19)
- May (16)
- April (30)
- March (23)
- February (11)
- January (26)
2010
- December (29)
- November (28)
- October (35)
- September (33)
- August (44)
- July (17)
- June (20)
- May (53)
- April (29)
- March (35)
- February (33)
- January (36)
2009
- December (37)
- November (35)
- October (53)
- September (60)
- August (66)
- July (29)
- June (24)
- May (52)
- April (63)
- March (35)
- February (53)
- January (50)
2008
- December (58)
- November (65)
- October (46)
- September (48)
- August (96)
- July (87)
- June (45)
- May (51)
- April (52)
- March (70)
- February (43)
- January (49)
2007
- December (100)
- November (52)
- October (109)
- September (68)
- August (80)
- July (56)
- June (150)
- May (115)
- April (73)
- March (124)
- February (102)
- January (68)
2006
- December (95)
- November (53)
- October (120)
- September (57)
- August (88)
- July (54)
- June (103)
- May (89)
- April (84)
- March (143)
- February (78)
- January (64)
2005
- December (70)
- November (97)
- October (91)
- September (61)
- August (74)
- July (92)
- June (100)
- May (53)
- April (42)
- March (41)
- February (84)
- January (31)
2004
- December (49)
- November (26)
- October (26)
- September (6)
- April (10)

Think inside the database - RavenDB with native GenAI integration

Aug 31 2006

Spot the WTF

time to read 1 min | 20 words

Tweet Share Share 0 comments

Tags:

Thankfully, I haven't run into this one yet, but can you spot the WTF here?

Jul 24 2006

Fixing log4net 1.2.9 AdoNetAppender NULL bug

time to read 8 min | 1449 words

Tweet Share Share 0 comments

Tags:

In log4net 1.2.9 there is a bug in the AdoNetAppender. It will not log null values appropriately. This bug is fixed in log4net 1.2.10, but it is not always possible to move to the next version. (In my case, both NHibernate and Castle uses log4net 1.2.9)

I tried recompiling everything to use 1.2.10, but it took too long, and eventually I simply wrote this little trigger:

CREATE TRIGGER [ReplaceNullLookAlikeWithNulls]

ON [dbo].[Logs]

INSTEAD OF INSERT

BEGIN

-- SET NOCOUNT ON added to prevent extra result sets from

-- interfering with SELECT statements.

SET NOCOUNT ON;

INSERT INTO .[dbo].[Logs]

([Date]

,[Thread]

,[Level]

,[Logger]

,[Message]

,[Exception]

,[Filename]

SELECT

[Date]

,[Thread]

,[Level]

,[Logger]

,[Message]

,[Exception]

,case [Filename] when '(null)' then NULL else [Filename] end

FROM Inserted

END

Not the best solution, but it will hold water.

Jul 01 2006

Pessimistic Locking

time to read 12 min | 2212 words

Tweet Share Share 0 comments

Tags:

I need to be able to ask the database for some rows to work on, and I need to be sure that I am the only one that works on those rows. After thinking about it, I came up with this SQL:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

BEGIN TRANSACTION

INSERT INTO Locks(LockedBy, LockedAt, LcokedItemId, LockedItemType)

OUTPUT inserted.Id

SELECT TOP 100 @lockedBy, getdate(), item.Id, @itemType

FROM Items item

WHERE NOT EXISTS (SELECT 1 FROM Locks lock

WHERE lock.LcokedItemId = item.id AND lock.LockedItemType = @itemType)

ORDER BY item.Id

COMMIT TRANSACTION

This is the first time I need to implement pessimistic locking, and I am not sure if this is the right way to go. The work I am doing is disconnected, so I can't just lock the rows and continue to work.

Any suggestions?

Update: This is what I have now:

CREATE PROCEDURE lockAndGetItems AS

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

BEGIN TRANSACTION

DECLARE @lockedBy uniqueIdentifier, @countOfItemsToLock int

SET @lockedBy = '68A6D54C-557D-428D-8B82-5D68C9C1B33E'

-- Get number of rows already locked by me

SELECT @countOfItemsToLock = 100 - COUNT(Id) From Locks where LockedBy = @lockedBy

-- Lock rows to complete to an even 100

INSERT INTO Locks(LockedBy, LockedAt, LockedItemId)

SELECT TOP (@countOfItemsToLock) @lockedBy, getdate(), item.Id

FROM Items item

WHERE NOT EXISTS (SELECT 1 FROM Locks lock

WHERE lock.LockedItemId = item.id and lock.LockedBy = @lockedBy)

ORDER BY item.Id

-- Return row data

SELECT Items.* FROM Items JOIN Locks ON Items.Id = Locks.LockedItemId AND LockedBy = @lockedBy

COMMIT TRANSACTION

Basically it fetch the first 100 items designated for this SP. Each SP has a single caller, so I don't worry about expiring the lock. I will always get the locked rows. In the case of a crash when the rows are locked, when the client is restarted, it will get the same rows that it already locked, and can start working on them. The reason that I implement is as a SP with hard coded Guid is that each SP locks a different table.

Jun 23 2006

Better solution to the Tricky SQL problem

time to read 4 min | 614 words

Tweet Share Share 0 comments

Tags:

Moran has pointed me to the CROSS APPLY syntax in T-SQL 2005, which does allows joining against a table valued function. Using this, the query goes down to this:

SELECT

EmpName,

CONVERT(NVARCHAR, CurrentDate,103) Date,

HasWorked = CASE SUBSTRING(Roster,

dbo.IndexInRoster(StartDate,CurrentDate,LEN(Roster)), 1)

WHEN '_' THEN 0

WHEN '*' THEN 1

ELSE NULL

END

FROM Schedules CROSS APPLY DateRange(StartDate,EndDate)

This is much nicer way to deal with it. Considerring that I am using similar techniques all over the place, this is a really good thing to know.

Jun 22 2006

Solving a Tricky SQL Problem

time to read 55 min | 10901 words

Tweet Share Share 0 comments

Tags:

I'm starting to get quite a bit of mail from this blog. Some of those questions are about subjects I can answer immediately, some require a fair amount of work (which can be had, if you really want), and the more interesting ones are those that require some thinking, but does not require too much time. This question from Dave is the best one so far, and I got his permission to blog about it, so I'm doubly happy.

The issue is working against a legacy database to get the data for further processing. I'll let Dave explain the issue, since he does it much better:

I have to write a query to generate a report over some interesting data. It's basically scheduling which days people are working. The data looks like this:

Id	EmpName	StartDate	EndDate	Roster
1	Bob	12/06/2006	18/06/2006	____*
2	Mary	12/06/2006	18/06/2006	___*_

The trick is, the roster field contains a string with a _ or * depending on wether the person is scheduled to work that day or not, but the first character always starts on the sunday. The startdate and enddate can be any day of the week.
In the example above, the 12-jun is a monday, so monday corresponds to the second character in the roster string, so Bob's working and Mary's not.
The roster string wraps around, so the first character of the roster string actually corresponds with the enddate here! Now, this roster string could be 7, 10, 14 days long.

I could get the report out if I can write a query to get it to this:

Employee	DateWorking
Bob	12/06/2006
Bob	16/06/2006
Mary	13/06/2006
Mary	16/06/2006

By the way, I haven't asked, but I'll bet that this schema has originated from a MainFrame, if not currently, than in its recent past.

First I created the schema I needed:

CREATE TABLE Schedules

(

Id INT IDENTITY(1,1) NOT NULL PRIMARY KEY,

EmpName NVARCHAR(255) NOT NULL,

StartDate DATETIME NOT NULL,

EndDate DATETIME NOT NULL,

Roster NVARCHAR(50) NOT NULL

);

INSERT INTO Schedules

SELECT 'Bob','12-Jun-06','18-Jun-06','_*___**'

UNION ALL

SELECT 'Mary','12-Jun-06','18-Jun-06','*_*__*_'

Then, I started playing with DatePart(), getting the day of the week of StartDate from each row. This gave me the index I needed into the Roster column. But, this only told me whatever the employee worked or didn't work on the start date, which isn't very helpful. What I needed was a way to check for all the values between StartDate and EndDate.

I posted about this issue a while ago, and I made use of this techqnique here:

CREATE FUNCTION DateRange ( @start datetime, @end datetime )

RETURNS @DateRange TABLE ( CurrentDate datetime )

BEGIN

WHILE (@start <= @end)

BEGIN

INSERT INTO @DateRange(CurrentDate) VALUES(@start)

SELECT @start = DATEADD(day,1,@start)

END

RETURN

END

Conceptually, what I wanted was this:

SELECT

IndexInRoster = DatePart(dw,StartDate) + DateDiff(day,StartDate, CurrentDate)

FROM Schedules, DateRange(StartDate,EndDate)

Unfortantely, DateRange() is a table valued function, and what this query ask from SQL Server is to join each row in the Schedules table to another table. This is not possible, of course.

I settled on faking it using this appraoch:

WITH AllDatesInTable(CurrentDate) AS

(

SELECT CurrentDate FROM dbo.DateRange(

(SELECT MIN(StartDate) FROM Schedules),

(SELECT Max(EndDate) FROM Schedules) )

)

SELECT

TestIndexInRoster = DatePart(dw,StartDate) + DateDiff(day,StartDate, CurrentDate)

FROM Schedules JOIN AllDatesInTable

ON CurrentDate BETWEEN StartDate AND EndDate

This query uses Common Table Expression to define a table that has all the dates in the Schedules table. Notice that I constrained it to all the dates in the current row. In essense, this give me a row per each date in the date range of each row. This is the basis of solving this problem.

The other issue is the wrapping of the day index in the roster. This is a bit complicated because we need to take into account three things. SQL Server string handling is 1 base, not 0 based (argh!), we are shifting based on the start date functionality, and we need to wrap around correctly. In order to handle this issue I created this function:

CREATE FUNCTION IndexInRoster(@StartDate DATETIME, @CurrentDate DATETIME, @RosterLen INT)

RETURNS INT AS

BEGIN

DECLARE @Result int

SET @Result = (DATEDIFF(day,@StartDate,@CurrentDate) + DATEPART(dw,@StartDate)) % (@RosterLen)

IF @Result = 0

RETURN @RosterLen

RETURN @Result

END

The check for @Result equals 0 is there because SQL Server is using 1 based string handling.

Brining it all together, we get this:

WITH AllDatesInSchedulesTable(CurrentDate) AS

(

SELECT CurrentDate FROM dbo.DateRange(

(SELECT MIN(StartDate) FROM Schedules),

(SELECT Max(EndDate) FROM Schedules) )

)

SELECT

EmpName,

CONVERT(NVARCHAR, CurrentDate,103) Date,

HasWorked = CASE SUBSTRING(Roster,

dbo.IndexInRoster(StartDate,CurrentDate,LEN(Roster)), 1)

WHEN '_' THEN 0

WHEN '*' THEN 1

ELSE NULL

END

FROM Schedules JOIN AllDatesInSchedulesTable

ON CurrentDate BETWEEN StartDate AND EndDate

And the result of this query:

EmpName	Date	HasWorked
Bob	12/06/2006	1
Mary	12/06/2006	0
Bob	13/06/2006	0
Mary	13/06/2006	1
Bob	14/06/2006	0
Mary	14/06/2006	0
Bob	15/06/2006	0
Mary	15/06/2006	0
Bob	16/06/2006	1
Mary	16/06/2006	1
Bob	17/06/2006	1
Mary	17/06/2006	0
Bob	18/06/2006	0
Mary	18/06/2006	1

And from here it is trivial to get to whatever format you want.

May 26 2006

Advanced Uses For OR/M

time to read 5 min | 998 words

Tweet Share Share 0 comments

Tags:

There is another holy war converging about the SP vs. Parameterized SQL topic. I got into one not long ago, and I wouldn't get into this one if there wasn't some new stuff brewing in it.

Eric Wise started this with a The Pragmatic Adhoc SQL vs Stored Procedures Discussion, Jeremy D. Miller responded with Why I do not use Stored Procedures and then Eric posted a new post, this time with In Response To No Stored Procedures which has some new questions that I have not seen before.

1. About the security, naturally, security is a multi layered approach. In the database, I would grant permissions to views or tables according to the user's need. Granting a SELECT permissions does not equals to letting the user SA access to the database, after all. I also tend to add another layer in the application level, which is completely transparent to my code and my business objects, you can see a high level overview here. Then there is the application logic itself.

2. Transactional Database Structures : A good coverage of the topic can be found in Roy 's What payroll system are a pain to design – Temporal data and in Eric's "Transactional" Database Structures. I am currently working on a big payroll system which is using NHibernate, and this issue is a major PITA. The problem is that both the relations between objects and the objects themselves are changed as time goes by. We solved this issue by deciding that each object is the system is composed of Id, which cannot be changed, and a history, which contains immutable version of the data for a specific date. A highly simplified version of this will look like this:

This way, I can ask any object for a view of how it was in a point in time. The views also handle the associations, so if I ask an employee to see its view in 2000-12-31, I can easily get any data that I need without dealing with the complex timed model. For instance, if I want to check if the employee had a car in a certain date, I can use this code:

EmployeeView employeeView = emp.For(new DateTime(2000,12,31));

if(employeeView.BenefitPackage.HasCar)

Console.WriteLine("The employee {0} had a car in {1}",
employeeView.Name, employeeView.Date);

This model translate very well into something NHibernate can handle. It reduce a lot of the problem that you have when you deal with such system, since the infrastructure code remove a lot of the problems that you get in such models. To set a new property, I ask the employee to give me its current status, change whatever I want, and then I give it back and saves. A new record is inserted to the database, and the employee's current state moves to the newest piece of data.

3. OR/M in complex databases structures. The system above gets its schema from a main frame 30 years old or more, written by a bunch of guys that thought that transliteration Hebrew in English is fun (with spelling mistakes!). Enough said, I think.

4. Stored Procedures makes TDD slower. The problem with SP and TDD is that you don't get any seam to verify the expected behavior. I may have a procedure that takes a long time to run, and I need to call it from another procedure that eventually calls that procedure, I'm in more than a bit of a bind. I can replace the long running procedure on the fly, and spend the rest of my life running from angry DBAs, but that about all I can do. There is no way to put mocks in the database. And database access takes time. Often a lot of time.

Note about security: If an attacker is in the position where he can run arbitrary code in your system, you are screwed no matter what. OR/M, ADO.Net or SP, he can make the appropriate calls to do whatever he wants.

Oh, and unrelated tidbit, in the last six months I wrote over 600 stored procedures (data warehousing project), so I feel confident that I know my stored procedures.

May 19 2006

Date Merging Challange Solved

time to read 1 min | 98 words

Tweet Share Share 0 comments

Tags:

I will give some wierder problem next time, Scott solved the problem (the simple one and the harder) in a really nice way. What more, he posted about how he solved the problem, and this is great.

I often don't bother to explicitly think about the step to a solution, so it was a pleasant surprise to see how he got to the answer, not to mention that anyone reading it gains some valuable insight about how to solve those kind of problems.

Great job Scott!

May 18 2006

SQL Challange, Date Merging

time to read 15 min | 2938 words

Tweet Share Share 0 comments

Tags:

FromDate	ToDate
01/01/2000	31/01/2000
01/02/2000	28/02/2000
05/03/2000	31/03/2000

I have the following table, what must I do to get to this one:

FromDate	ToDate
01/01/2000	28/02/2000
05/03/2000	31/03/2000

Basically, the first two date ranges were merge into a single date range because they were adjunct to each other.

Update: It looks like I barely get a chance to post a problem before it is solved. Let try increasing the complexity of this one. Good job, Scott.

Now I want to get all the inclusive date range. So for this data:

FromDate	ToDate
01/01/2000	31/01/2000
01/02/2000	28/02/2000
05/03/2000	31/03/2000
01/02/2000	15/02/2000
15/03/2000	15/04/2000

And the result should be:

FromDate	ToDate
01/01/2000	28/02/2000
05/03/2000	15/04/2000

May 03 2006

The Prison Domain Model Challange: Answers

time to read 20 min | 3895 words

Tweet Share Share 0 comments

Tags:

Chris Bilson give a nearly complete answer of how to find the valid / invalid commands for an inmate in the comments of the post. It is a very elegant solution, in my opinion. I took upon myself to improve his solution a bit. The result is this query:

SELECT

inmate.FirstName,

inmate.Surname,

command.IssuedAt AS Start,

dbo.CommandEnds(command.CommandType, command.ValidFor, command.IssuedAt) AS [End],

nextCommand.IssuedAt AS StartOfNextCommand,

CASE

WHEN DateDiff(ms,

dbo.CommandEnds(command.CommandType, command.ValidFor, command.IssuedAt) ,

nextCommand.IssuedAt) > 0 THEN 'Gap In Commands!'

WHEN getdate() > dbo.CommandEnds(command.CommandType, command.ValidFor, command.IssuedAt)

AND nextCommand.Id is null THEN 'Command ended!'

ELSE 'Fine' END AS Status

FROM dbo.Commands command INNER JOIN dbo.Inmates inmate

ON command.InmateID = inmate.ID

LEFT OUTER JOIN dbo.Commands nextCommand

ON nextCommand.InmateID = inmate.Id

AND nextCommand.IssuedAt = (SELECT MIN(IssuedAt)

FROM dbo.Commands c3

WHERE c3.InmateId = inmate.Id

AND c3.IssuedAt > command.IssuedAt )
ORDER BY inmate.Id, Start, [End]

The nice part about this solution that how the query match a command with the (logical) next command. Here is the result of the query:

FirstName	Surname	Start	End	StartOfNextCommand	Status
Bad	Man	2/5/2006 6:39 PM	10/5/2006 6:39 PM	10/5/2006 6:40 PM	Gap In Commands!
Bad	Man	10/5/2006 6:40 PM	11/5/2006 6:40 PM	NULL	Fine
Nasty	Guy	3/5/2006 12:00 AM	8/5/2006 12:00 AM	7/5/2006 5:00 PM	Fine
Nasty	Guy	7/5/2006 5:00 PM	9/5/2006 5:00 PM	NULL	Fine

I wish that I had something similar when I needed to do this kind of software when I needed it.

Apr 30 2006

Replacing O/RM Is Hard

time to read 4 min | 601 words

Tweet Share Share 0 comments

Tags:

No, I'm not abandoning NHibernate to greener pastors, I'm talking about this post.
I'm not sure what the point of the author was, but on first reading, it read like FUD to me. Maybe it is the general tone of the post, or that there are a lot of numbers without any clue how the author got them, but that was my first feeling. A lot of the points that are raised are valid, but I just don't understand where the numbers came from.
I had a math teacher that used to look at a student excersize and say: "And suddenly there is a QED." when we did something stupid. I fill like saying it here.

Take for instance the part about entities and methods. I present you my most common method of dealing with O/RM:

public interface IRepository<T>
{
    void Save(T item);
    void Delete(T item);
    ICollection<T> Find(params Predicate[] preds);
    T Load(int id);
    T TryLoad(int id);
}

That is it. The implementation is generic as well, so I go five methods, not matter how many entities I have. In theory I can switch from O/RM to DAO to SOA to AOM (Acronym Of Month) with very little effort. I would replace the implementation of the IRepository<T>, and I'm done. Five methods, about 2.5 hours if I understand the math correctly, we'll add a couple of days to be safe, but we are still under 20 hours. I don't even need to touch the UI since it works against the interface, and not the implementation.
The problem is that this math is that it is based on nothing in particular. I recently had to evaluate moving from one O/RM to another, and I had to survey the number of places where I was dependant on the O/RM. I counted 23 major features that I had to have if I didn't want to basically re-write my application from scratch.
Those ranged from lazy loading to cascades to reference equality for same entities in the same session, etc.
Replacing the O/RM would've have been hard at best, impossible at worst, and painful whatever I would have done. And I used a very similar interface to isolate myself from the O/RM (even though I never had any intention of replacing it).

Frankly, the issue is not the support of interface, but far smaller interactions. From knowing that you can map the same fields in a class to different properties in a hierarchy (an integer in one case, an object in the other, etc) is very powerful tool that will cause quite a bit of pain if you need to stop using it. That is where the cost of switching O/RM will come into play.
If you are doing anything at all interesting with your O/RM of choice (or hand written DAL), then you are going to get into some serious trouble if you think about switching.

For instance, I'm using NHibernate's Table Hierarchy to save strategies to the database, so when I load the object later, I'll get all its saved strategies with their parameters. It is making my life so much easier, but I don't want to even think what I would have to do if I had to write it all on my own.
To conclude, I agree with the first two observations, and disagree with the third. You can certainly optimize a good O/RM as much as you like. Usually you don't even need to see the O/RM code for this.
I really don't agree with his conclustions, though.

Oren Eini

Oren Eini

CEO of RavenDB

Spot the WTF

Fixing log4net 1.2.9 AdoNetAppender NULL bug

Pessimistic Locking

Better solution to the Tricky SQL problem

Solving a Tricky SQL Problem

Advanced Uses For OR/M

Date Merging Challange Solved

SQL Challange, Date Merging

The Prison Domain Model Challange: Answers

Replacing O/RM Is Hard

FUTURE POSTS

RECENT SERIES

RECENT COMMENTS

Syndication

Main feed
Comments feed