Piping, Primary Keys and Three Hours I'll not get back
Okay, so I mentioned that I'd a query that violated a primary key constraint, and I couldn't figure out why. Well, I finally got a hold of what was going on. It's simple and non-obvious at the same time. Let's start with a code example that shows the problem and then I'll put in a couple of words about the reasons behind it. First, let's create a table with a unique constraint (easier than a primary key):
CREATE TABLE Foo
(
Bar INT UNIQUE
)
Now, let's insert some data into the table:
INSERT INTO Foo
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 3
So far, everything is nice and cozy, but now I want to add more numbers, so I want to make sure that I won't break the code, so now I'll add a guard clause to the next inserts:
INSERT INTO Foo
SELECT Kabum FROM
(
SELECT 4 as Kabum
SELECT 5 as Kabum
SELECT 5 as Kabum
) Bang
WHERE NOT EXISTS
(SELECT 1 FROM Foo where Bang.Kabum = Bar)
Running the above result in this error:
Violation of UNIQUE KEY constraint 'UQ__Foo__5832119F'. Cannot insert duplicate key in object 'dbo.Foo'.
Do you realize what happened here? We failed because when we checked it was just fine to put 5 into Foo, but when we actually did put the second 5 into Foo, that caused a constraint violation. Go figure that when you're dealing with sixteen million records.
The solution to this problem is to add a distinct to the select statement, that would make sure that your own query isn't causing the very same query to fail.
On that note, I wonder if I can call it a concurrency bug.
Comments
Comment preview