SQL Injection & Optimizations Path
It is silly, but I just had a conversation with one of our developers on SQL Injection. In RavenDB we support replicating to a relational database, which obviously require using SQL. We are doing things properly, with parameters and everything.
No chance for SQL Injection from there. Great, and end of a very short blog post if it was everything.
As it turned out, there is a significant performance difference between:
@p1 = 'users/1' @p2 = 'users/2' DELETE FROM Users WHERE Id IN (@p1,@p1)
And:
DELETE FROM Users WHERE Id IN ('users/1', 'users/2')
Enough that we added that as an option. The reason why related to the vagaries of the database query optimizer, and not really relevant.
This is off by default, obviously. And we use parameters by choice & preference. But we still added a minimal “protection” by adding:
sqlValue.Replace("'", "''")
Considering that this isn’t meant for user’s input (it is for document ids), that is something that is annoying.
Any suggestions on how to improve this?
Comments
If you use SQL Server you can try OPTION(RECOMPILE), see: http://stackoverflow.com/questions/4459425/sql-server-query-fast-with-literal-but-slow-with-variable
The "significant performance difference"; any chance that is due to parameter sniffing and sub-optimal plan caches? if so, some vendors offer mechanisms to avoid that, for example
OPTION (OPTIMIZE FOR UNKNOWN)
on sql server.But yes, I feel your pain; for similar reasons, I added literal injection into "dapper" recently (http://blog.marcgravell.com/2014/06/dapper-some-minor-but-useful-tweaks.html) - at the moment it is limited to integer-based types, for the same injection concern reasons.
How about building a temp table (ar using a table variable passed in from .NET) and calling DELETE using a join or subselect? Does that also have the same performance problems?
I'd be really interested to see how the execution plan differs for these two operations.
Your Id column in your database doesn't happen to be a varchar does it. Cause if you pass a string in as a parameter variable it will assume its of nvarchar type.
This doesn't seem like a big deal until you realise that sql server will not use the index for the query and do a full table scan when you mix and match these types.
Placing the string inline avoids this issue.
Turns out this minimal protection is strictly enough. You can test this by running
SELECT {literalHere}
for all possible two-char strings. Takes a few hours and confirms that this is safe.The SMO libraries do it just like this.
How about DELETE FROM Users WHERE (Id=@p1 OR Id=@p2) ?
Watch out for HEX injection in that string.
Keep http://siderite.blogspot.com/2013/01/why-doubling-single-quotes-is-not.html in mind - there are unicode strings that will be treated as single quotes, but aren't actually single quotes (thus won't get picked up by the Replace call).
Chris, Thanks, that is great to know. This seems to be 2008 and up only, though.
Ayende, "OPTION (RECOMPILE)" is also supported on SQL Server 2005. See this URL:
http://msdn.microsoft.com/en-us/library/ms181714(v=sql.90).aspx
Could you also tell something about what difference occurs that makes the query with the argument inline vs parameter differ? That would be very interesting.
Frank, Thanks, we'll try it.
Are the sproc parameters of type varchar or nvarchar? If it is the second, the comparable query would be
DELETE FROM Users WHERE Id IN (N'users/1', N'users/2')
Comment preview