Solving an OutOfMemoryException
The following code generates an OutOfMemoryException in certain circumstances (see my previous post about it):
    private string ReplaceParametersWithValues(string statement, bool useComment)
    {
        if (sqlStatement.Parameters == null)
            return statement;

        foreach (var parameter in sqlStatement.Parameters
            .Where(x => x != null && x.Name.Empty() == false)
            .OrderByDescending(x => x.Name.Length))
        {
            var patternNameSafeForRegex = Regex.Escape(parameter.Name);
            var pattern = patternNameSafeForRegex + @"(?![\d|_])";
            //static Regex methods will cache the regex, so we don't need to worry about that
            var replacement = useComment ?
                parameter.Value + " /* " + parameter.Name + " */" :
                parameter.Value;
            statement = Regex.Replace(statement, pattern, replacement);
        }
        return statement;
    }
When trying to resolve this, I had to adjust my assumptions. The code above was written to handle statements of 1 – 5 kilobytes with up to a dozen or two parameters.
But it started crashing badly with a statement of 190 kilobytes and 4 thousand parameters. It is fairly obvious that this method generates a lot of temporary strings, probably leading to GC pressure and a lot of other nasty stuff besides. Unfortunately, there isn’t a Regex API that works on a StringBuilder, so I had to make do with my own approach.
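To get a feel for the scale, here is a back-of-the-envelope sketch. It assumes the statement is roughly 190,000 characters long, that every parameter matches at least once (so each Regex.Replace call returns a full new copy of the statement), and it ignores the growth caused by the replacement values. The class and method names are made up for illustration.

```csharp
using System;

public static class AllocationEstimate
{
    // Rough number of bytes allocated for temporary strings when every
    // parameter produces one full copy of the statement.
    public static long TemporaryStringBytes(long statementChars, long parameterCount)
    {
        const long bytesPerChar = 2; // .NET strings are UTF-16
        return statementChars * bytesPerChar * parameterCount;
    }

    public static void Main()
    {
        // Assumed sizes: 190,000 characters, 4,000 parameters
        long bytes = TemporaryStringBytes(190000, 4000);
        Console.WriteLine(bytes + " bytes of temporary strings");
    }
}
```

Under those assumptions the loop allocates about 1.5 GB of short-lived strings. Each copy is also far above the 85,000 byte threshold at which the runtime puts objects on the large object heap, so heap fragmentation is a plausible contributor as well, although I have not measured that here.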
The approach I chose is fairly brute force, and I am sure it can be made better, but basically it just uses a StringBuilder and does the work manually.
    private string ReplaceParametersWithValues(string statement, bool useComment)
    {
        if (sqlStatement.Parameters == null)
            return statement;

        var sb = new StringBuilder(statement);
        foreach (var parameter in sqlStatement.Parameters
            .Where(x => x != null && x.Name.Empty() == false)
            .OrderByDescending(x => x.Name.Length))
        {
            var replacement = useComment ?
                parameter.Value + " /* " + parameter.Name + " */" :
                parameter.Value;

            int i;
            for (i = 0; i < sb.Length; i++)
            {
                // cheap check: does the first character match?
                if (sb[i] != parameter.Name[0])
                    continue;

                // compare the rest of the parameter name
                int j;
                for (j = 1; j < parameter.Name.Length && (j + i) < sb.Length; j++)
                {
                    if (sb[i + j] != parameter.Name[j])
                        break;
                }
                if (j != parameter.Name.Length)
                    continue;

                // replace only at the end of the statement or when the
                // name is not followed by a digit
                if ((i + j) >= sb.Length || char.IsDigit(sb[i + j]) == false)
                {
                    sb.Remove(i, parameter.Name.Length);
                    sb.Insert(i, replacement);
                    // skip past the inserted text so it is not scanned again
                    i += replacement.Length - 1;
                }
            }
        }
        return sb.ToString();
    }
The code is somewhat complicated because of the check that I need to make. I can’t just use sb.Replace(), because I need to replace the name only when it isn’t followed by a digit.
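A small sketch of what goes wrong without that check. The statement text and the parameter names @p1 and @p10 are invented for the example, and NaiveReplace is a hypothetical helper, not part of the code above.

```csharp
using System;
using System.Text;

public static class NaiveReplaceDemo
{
    // Replaces every occurrence of name, with no look at what follows it.
    public static string NaiveReplace(string statement, string name, string value)
    {
        return new StringBuilder(statement).Replace(name, value).ToString();
    }

    public static void Main()
    {
        // @p1 is also a prefix of @p10, so @p10 gets mangled as well
        Console.WriteLine(NaiveReplace("WHERE a = @p1 AND b = @p10", "@p1", "42"));
        // prints: WHERE a = 42 AND b = 420
    }
}
```

Ordering the parameters by descending name length hides this when @p10 is itself a parameter, but the digit check is what protects a longer name that is not in the parameter list.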
At any rate, this code is much more complicated, but it is also much more conservative in terms of memory usage.
Comments
You could use Regex.Replace on the pattern @"@(?<paramname>[\w\d_]{1,128})(?!\d|_)" and in the match handler you figure out which parameter was matched. That would do a single scan over the string.
Tobi,
This is a nice one :-)
Another nice thing with Tobi's suggestion is that you won't get "double replaces". E.g. if the value of @a contains '...@b...', @b shouldn't be replaced and won't be with Tobi's way...
Tobi, [\w\d_] can be \w instead. \w includes both digits and underscore...
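For reference, a minimal sketch of the single-scan idea from the comments above, using \w as suggested. It assumes parameter names start with @ and that the values are available in a dictionary keyed by the full name; SinglePassReplacer and its signature are hypothetical, not the code from the post.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SinglePassReplacer
{
    // '@' followed by 1 to 128 word characters (\w covers letters, digits and underscore).
    // The quantifier is greedy, so a whole name such as @a10 is matched at once.
    private static readonly Regex ParameterPattern = new Regex(@"@\w{1,128}");

    public static string Replace(string statement, IDictionary<string, string> values, bool useComment)
    {
        // One scan over the statement; the evaluator runs once per match
        return ParameterPattern.Replace(statement, match =>
        {
            string value;
            if (!values.TryGetValue(match.Value, out value))
                return match.Value; // unknown name, leave it untouched
            return useComment ? value + " /* " + match.Value + " */" : value;
        });
    }

    public static void Main()
    {
        var values = new Dictionary<string, string>
        {
            { "@a", "'see @b'" },
            { "@b", "2" }
        };
        Console.WriteLine(Replace("SELECT @a, @b, @a1", values, false));
        // prints: SELECT 'see @b', 2, @a1
    }
}
```

The replaced text is never rescanned, so the @b inside the value of @a stays as it is, and @a1 is left alone because the whole name is matched before the lookup.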
Why do you need to check that values aren't being followed by a digit?
Is the problem if you've got a parameter @a1 and another parameter @a10? Because then @a10 would be replaced first anyway.
nice comments in the code too
In a similar situation, I've used a regular expression and a match evaluator to do the replacing.
My expression looks for patterns that are a reference to a tree (a dictionary of dictionaries) and replaces the references with its values.
What is interesting, and unexpected for many, is that you can actually implement manually what a regex does. It is like building a compiler: many think it is nearly impossible without a PhD, but those who have actually done it know that it is not brain surgery.
And I want to add that this is a nice candidate to be unit-tested with Microsoft Pex. Testing that this code does not crash (at the very least) for all relevant inputs would be an exercise of 2 minutes with Pex.