For Experts Only
There are some things that programmers really shouldn't do. I was once called in to help figure out why a batch process would run for a few hours at 100% CPU. The previous guy claimed that this was a natural consequence of the task at hand (loading an XML file into a DB), and that he had already optimized it as far as possible.
int count = 0;
XmlDocument xdoc = new XmlDocument();
xdoc.Load(filePath);
foreach (XmlNode node in xdoc.SelectNodes("data/row"))
{
    count++;
    new Thread(delegate(object state)
    {
        XmlNode n = (XmlNode)state;
        using (SqlConnection con = new SqlConnection("... "))
        using (SqlCommand cmd = con.CreateCommand())
        {
            cmd.CommandText = "INSERT INTO .... ";
            cmd.ExecuteNonQuery();
        }
        Interlocked.Decrement(ref count);
    }).Start(node);
}
while (count != 0);
After de-optimizing the code, I managed to get a 10,000% performance improvement, and you could actually use the server for more than a single task.
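The de-optimized code itself isn't shown here, so what follows is only a guess at its general shape: one thread, one pass over the rows, no counters. `SequentialLoader` and the `insertRow` callback are hypothetical names; the callback stands in for the elided INSERT so the sketch runs without a database.

```csharp
using System;
using System.Xml;

static class SequentialLoader
{
    // Walks every data/row node in document order and hands it to insertRow.
    // insertRow is a hypothetical stand-in for the elided INSERT command.
    // Returns the number of rows processed.
    public static int Load(XmlDocument xdoc, Action<XmlNode> insertRow)
    {
        int count = 0;
        foreach (XmlNode node in xdoc.SelectNodes("data/row"))
        {
            insertRow(node); // one row at a time, on the calling thread
            count++;         // no other thread touches count, so no Interlocked needed
        }
        return count;
    }
}
```

Against a real database, the callback would presumably reuse a single open connection for the whole file rather than creating one per row.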
Comments
That's some funny code. I'm going to make a wild guess that guy has never heard of "the simplest thing that could possibly work."
I don't use threading very much but wouldn't you run out of threads pretty quickly on large jobs?
Not only is the thread pool limited; the connection pool is, too.
Well, even if the client code was not experiencing any bottlenecks of its own (an unlimited thread pool and connection pool), the SQL server would probably be pushed to the edge by the concurrent inserts to the DB.
If you do not know sh5t about threading/reflection/(programming)/etc, you should not use threading/reflection/(programming)/etc in production code.
That's just beautiful. It makes me feel sick and giddy.
I was reeling so much that it took me ages to see the tiny little busy-wait at the end. That's a good way to make sure the CPU usage is nailed to the ceiling.
All the threading/connection issues are huge, but Rik has spotted the really nice piece at the end.
It tries to execute ~1,000 threads (all doing a single insert) while it has a thread that does an endless busy wait.
Can you spot the interesting race condition?
Lovely code.
The count is incremented without using Interlocked.Increment and then decremented in the Thread. The last line attempts to wait until all threads have performed their work, which is not only an inefficient method of waiting, but count may miss 0 altogether due to the lack of locking on the increment.
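If the work really had to be spread across threads, one possible way to avoid both problems is to keep every update to the counter under a single lock and to block on that lock instead of spinning. This is a minimal sketch, not the fix that was actually applied; `PooledRunner`, `RunAll` and `work` are hypothetical names, and it borrows thread pool threads rather than creating a thread per row.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

static class PooledRunner
{
    // Runs work(item) for every item on the thread pool, then blocks
    // (without burning CPU) until all of them have finished.
    public static void RunAll<T>(IEnumerable<T> items, Action<T> work)
    {
        object gate = new object();
        int pending = 0;

        foreach (T item in items)
        {
            T captured = item; // copy, so each work item sees its own value
            lock (gate) { pending++; } // increment under the same lock as the decrement
            ThreadPool.QueueUserWorkItem(delegate
            {
                try
                {
                    work(captured);
                }
                finally
                {
                    lock (gate)
                    {
                        pending--;
                        if (pending == 0) Monitor.PulseAll(gate); // wake the waiter
                    }
                }
            });
        }

        lock (gate)
        {
            // Monitor.Wait releases the lock and sleeps until pulsed,
            // so the waiting thread uses no CPU while the work runs.
            while (pending != 0) Monitor.Wait(gate);
        }
    }
}
```

Because the increment and the decrement take the same lock, no update can be lost, and the waiter rechecks the counter each time it wakes up. An exception thrown by `work` is not handled here and would still bring the process down.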
ken,
Perhaps I too am not very familiar with sh5t, but the example does not use a thread pool.
:-)
Oren, there is a problem with your blog app. http://ayende.com/Blog/archive/2007/05/12/For-Experts-Only.aspX
(casing of X).
I'd turn off those remote error msgs!
Pete.
Sumo,
There is no use of the thread pool here, so this basically chokes the system with hundreds of threads all trying to run concurrently.
I think that was what Ken meant.
Removed the error messages
Well, I guess the main problem is that the guy just stood up and said it was optimized without writing a simple Console.WriteLine to find out what was happening... A debugger just won't tell you what has happened... a good log does.