Benchmarks are useless, yes, again
I was informed about ORMBattle.NET existence a few minutes ago. This is a site that:
ORMBattle.NET is devoted to direct ORM comparison. We compare quality of essential features of well-known ORM products for .NET framework, so this site might help you to:
- Compare the performance of your own solution (based on a particular ORM listed here) with peak its performance that can be reached on this ORM, and thus, likely, to improve it.
- Choose the ORM for your next project taking its performance and LINQ implementation quality into account.
NHibernate is faring rather poorly in the tests there, and I thought that I might respond to that.
First, I want to point out that the company behind the website is Xtensive, which makes another ORM Product. This is not to discredit them, they make it very clear who is behind the site and that they have an ORM product in the benchmark. And they were the one who contacted me about it.
The problem with benchmarks, especially when you are trying to compare a wide variety of products that have different feature sets and different capabilities, is that they are essentially useless. The problem is that in order to be able to measure anything useful, you have to resort to the common denominators, and they are pretty bad.
In the case of the benchmark scenarios used for NHibernate, it shows the problem rather clearly.
I don’t even need to think about anything to know that this is going to perform badly. There are several reasons for that. First, NHibernate was never intended to be a batch processing tool. It is an OLTP tool. The benchmark (all the tests in the benchmark) are aimed specifically at measuring batch processing. That is problem number one.
Second, NHibernate actually contains a lot of features that are aimed to give us great performance in many scenarios, including batch processing. For example, in this case, using a stateless session alone would generate a significant performance boost, not to mention that there is no use of batching whatsoever.
This two tests are also showing examples of “this isn’t how we do things”. For one thing, calling the database in a loop is a bug. For another, I would generate the same result, which greatly improved performance using Executable DML:
   session.CreateQuery("update Simplest s set s.Value = s.Value + 1").ExecuteUpdate();
session.CreateQuery("delete Simplest").ExecuteUpdate();
And so on, and so forth.
To summarize, because I have talked about this before. Trying to compare different products without taking into account their differences is a flawed approach. Moreover, this benchmark is intentionally trying to measure something that NHibernate was never meant to perform. We don’t need to try to optimize those things, because they are meaningless, we have much better ways to resolve things.
And you know what? The problem is that even if you did put those in, they would still be invalid. Benchmarks tend to ignore such things as the impact of the builtin caching features, or the optimization options that are available in the mapping.
In short, if you want me to “admit” that NHibernate isn’t a batch processing tool, I will do so gladly, it was never meant to be. And benchmarks that try to show how it is in batch processing are going to show it being slow. For real world application development, however, NHibernate is a great fit, and show excellent performance.
Oh, and because management told me that I must, if you find perf problems with NHibernate, that is why we have the NHibernate Profiler for you :-)
 




Comments
what tool do you recommend for batching processing?
While I do agree that things can be done better in the code for comparing with Nhibernate, I do not like that you dismiss the whole project as useless because you can't compare different products with different capabilities.
All these products are meant to solve only one and very common problem most of the software project have: data persistence.
And all the developers (and their manager) know that the performance of the solution influence greatly the performance of the whole application.
It's just too easy to say that you should not compare.
Join the project, modify and add some tests, join the battle and show that Nhibernate is better!
The thing I found most interesting was that the high score for DataObjects.Net ... I hadn't heard about it, and it turns out, the creators of the website are also the developers of that product ( http://ormbattle.net/index.php/about.html) ....
So the benchmarks are made in the way the developers of that product think an ORM should perform well... In my opinion this is a bit biased...
I'm not saying there hiding their association, but still it should have been a good idea to attract some outside members to setup the benchmark suite..
Should have read better, you did notice it... (and they did contact you)
That's rather disgraceful way to advertise own tool. I will never use a product of the company who uses such dirty tricks.
fanny, I think comparing ORM 'performance' is a flawed idea because all ORM will have very similar performance in typical scenarios such as load, update or insert, at least they should have because they execute almost the same sql. The test cases should be more tricky, to check additional performance-enhancing features such as caching, query batching, intelligent prefetching and so on, now this would give us an idea of what benefits can be expected by choosing ORM over raw ADO.Net. I would also like to see the comparison of ORM 'usability' like the ease to set up and use in existing projects, 'integrability' with other platform components, learning curve, ugliness of data access code etc.
We use the delete case with the foreach in our project. Just because we did not find any alternative. And off course we suffer the bad performance.
from now on we ' ll use
"session.CreateQuery("delete Simplest where xxx = yyy").ExecuteUpdate();"
Thank for the tip.
With Nhibernate U can do things in several ways and lots of them are wrong.
I d really like to see somewhere a list of Nhibernate begginers common mistakes, because Nhibernate is so not beginner friendly.
@junior programmer: Are you sure you really need to download all that data from your persistent storage, do a little update and than upload it all again?
We use MSSQL2008 in about 98% of our projects. We don't write generic products, so we don't need cross platform database support and that means you can optimize for your database. But for every database their one very important step. Do as much on the database server itself.
Take an import for example. I parse the import file and put all records in an database and use SqlBulkCopy.WriteToServer to send it all to a global temporary table. Than we perform multi queries on that table and finally update the target table.
We use NHibernate in OLTP applications like websites, desktop applications and middleware (ESB, WCF, etc). We don't have much batch processing there.
@Fanny: There nothing wrong with comparing ORM framework, but use REAL tests (like you do with your unittests) and not a batch update benchmark. We didn't they write a simple blog application and written an implementation for every ORM framework that want to compare. That you can write stories and measure how long an certain use case takes. Like creating a blog, creating a blog entry, select the last 10 entries for syndication, etc. Than put all those timings in a matrix table and you have a good comparison of ORM tools because you benchmark something real.
If you want high performance, you don't use a ORM framework to import a massive CSV file. It can certainly make things easier, that than YOU have to trade performance for convenience..
Oren, can I quote my e-mail conversation with you? I disagree with you, and it explains my position very well.
Alex: I also mailed you because you use simply unfair code. I also mailed you how to fix them so you get much higher results with our framework for example.
I've seen these kind of sites before, and although I don't like the results we got (which I know are due to your lack of understanding of our framework, so I am not surprised the results of other frameworks like nhibernate is also poor), I agree with Oren in this: they're actually pretty useless.
Not only can I show you your testing method is seriously flawed, it's also a case of what would the user use in practise? Like Oren showed in this post: if a user has to do ABC, he will use the feature of the framework which suits ABC best. Which very likely will perform much better, because it 1) fits the job to do better as the feature is tailored for the job and 2) doesn't fall into the trap of doing ABC very inefficiently to begin with
However at the end of the day, I can only shrug and move on. I thank you for the 100 extra test queries for linq which showed 3 bugs in our provider which caused many queries to fail (which is a signal you're not benching very good) so I'll fix these for our customers. But you have to understand: these kind of sites have little value, as what Oren described: they don't show how a framework will behave in real-life, in real world projects with real world problems: like updating entities using an expression. Some frameworks can do that with 1 query (like we do, and nhibernate too, can yours?), so a user will do that with ... 1 query in practise not with a loop!
Despite what you might think: customers don't suddenly think "Oh! I have to use dataobjects.net!", as they haven't done that in massive droves in the past, so why would they do that now? You can pat yourself on the back for the most feature-rich framework there is (and I think many of my fellow o/r mapper developers will agree with that), but honestly... if you want to sell your work, you have to invest time in documentation, features your customers want and need, instead of comparison sites which have little value other than "oh look Ma, in my own benchmarks I'm the best!"-kind of ego boosting ;).
Oh, and Alex, where are these 'CUD' tests / calculation methods ? You babble about pages, but you don't use paging code and the Materialization 'test' is equal to the 'Query' test (except a where clause)... which makes no difference for systems other than the ones who use prepared commands.
You see, prepared commands are only useful for these kind of tests, in real life, you don't need them that much, as you need many different commands per batch anyway if you're saving many entities (as they're often changed on different places) or you have many different connections, and a prepared command is gone once the connection is gone :)
Let's hope for the O/R mapper users in the world that we, O/R mapper writers, aren't forced to add optimized code paths for these silly loops, like videocard driver writers have to do to their driver code to show up best in the used benchmarks (which thus says nothing about reallife performance, only that the code does great in the particular benchmark).
Frans, I just e-mailed you about unfair code. Certainly we'll fix all the issues in these particular test scenarios, but won't use anything like single statement table updates in these tests, because they test different case. I'll explain this here further.
And I confirm for LLBLGen there are some, but until now Oren didn't point to any issue to fix for NHibernate.
Concerning NHibernate-like frameworks: do you agree EF is NHibernate-like (POCO, similar DataContext behavior, etc.)? I didn't get any claims related to its tests yet (but let's wait, of course). Anyway, I'm 99.9% sure we know how such frameworks work very well.
Sorry, there must be "And I confirm there are some issues to fix for LLBLGen, but until now Oren didn't point to any issue to fix for NHibernate."
And I'll reply you later in the evening about everything else (e.g. paging). Should get some sleep :(
About prepared statements: for now we don't use them at all, because as you said, it's gone when connection is closed. Later - may be. But for now I'm not sure about this. They don't significantly reduce query time, if it is a query which plan can be cached (in our case this is always true).
AFAIK only OpenAccess uses prepared statements ("sp_prepexec" is frequently shown in SQL Profiler output on its tests).
Fanny,
The underlying premise of the project is flawed. It is trying to measure raw database calls, and that is just not something that is interesting in any OR/M that uses Unit of Work.
A real benchmark would be building a real world application on top of each, and optimize each story, not each specific test that they created. None of the created tests would ever be used in real code.
"About prepared statements: for now we don't use them at all"
err... how can you then be faster than sqlclient (see 30K batch tests for example) ?
That's batching, nothing else. And for batching there 's more clever code to be used. You know, the only FAIR way to test what you want without doing batching is to create for every insert a new session/context, open a transaction, save, commit/flush context/session, repeat.
Anyway, I'll mail you about the legal side of things about this, as management here isn't very pleased with you using our brand name for your marketing and want our name, results and any hint towards our framework removed from your site.
Alex,
Sure, you can use the email converstaion
Alex,
I did point out issues to fix in your NH tests, they are trying to test the wrong thing.
As such, I don't see much to bother with anything else.
When your underlying premise is flawed, you get GIGO
But here is a hint, set the batch size to a high level a see what is going on
Alex, you have email (about the removal of us from your marketing site)
Also, I found the derivatives calculations, but these are wrong: you insert/update/delete in a batch (single session/context, 1 commit/flush), and therefore can use optimized codepaths for that. However, you can't use that for extrapolation over single client actions. E.g. doing 30K updates using a single context, is a batch of 30K updates by a single client. This isn't the same as 30,000 clients doing 1 update however you seem to think it is and your site seem to suggest the tests are about that. They're not (otherwise you can't be faster than sqlclient)
@Alex
"Anyway, I'm 99.9% sure we know how such frameworks work very well."
But you are showing the code and seeing it I'm sure you don't know how work NH. We can change NH results without touch the wrong code of the test (only touching the cfg).
@Oren, have a look to the NH conf. they are using and compare it to some of their results.
Btw, good work @Alex. The time will say us which is the future of DO.NET.
Good luck.
I'll be again out of the blame chorus here.
Just to clarify things I'm not in anyway related to DO.Net and discovered it today with the benchmark site.
I'm an Nhibernate user (and btw also a Nhprofiler user too).
In the past I used several frameworks like Ibatis.Net and LLBLGen.
Reading the comments seems to me that this site scares...
It has touched one pain points we have when using ORM.
The tests may be biased in favor of creators ORM so what?
They released the source code and said you are free to modify and test by your own.
You do not like the code? Change it.
You would like a more real life example: add it. We would like it too.
We would love to have it because we could see all the application of best practices not all of us may be aware of.
On the same damn example.
Now we also have tools to stress test an application and see how it profiles with many different clients.
This would show how to apply an ORM in particular domain and how easy (or not) all the ORM in the .Net space solve it the same situation.
And Ayende yes: Nhibernate is hard (but for you) if you want to do in the best performing way and you know that as you created a tool for this (and btw I bought it).
You are creating an example app if I am not mistaken, use it as the example to test with the other ORM.
Just do not tell me that if I need to do some batch updates I must not use the ORM I am already using because is not intended to do that.
And Frans: your reaction with this threatening "legal department stuff" removal doesn't help you either. At least for the public image your tool has gained.
The only problem this site has was that it came from a "contestant" ORM producer so it is obviously in favor of that particular ORM.
This can be (and I would say it should be) changed.
But the "battle" is good.
Every battle shows us how to improve: Frans proved it in catching bugs.
Benchmarks are good, they are not all, but it is worth to have them.
fanny,
I think you are missing the main point.
It isn't a case of needing to optimize some code. It is a case of a benchmark that is specifically built to show off non real world scenarios that benchmark well on one ORM.
There is nothing to learn from trying to pass such a thing, except bad ways of building a benchmark.
@ fanny, NHibernate and LLBGen are two proven ORM framework. They both perform well in real world application and that is the most important test there is. While it's certain that Oren and Frans can tweak the test to suit their framework why bother to get caught up in this little game? In the end they are probably comparable to each other with some edge case.
Like most have already pointed out, this is a petty marketing game that being play by the benchmark owner.
"This would show how to apply an ORM in particular domain and how easy (or not) all the ORM in the .Net space solve it the same situation."
The question is how would we define this particular domain to suit the general audience? Then again does it really matter? Unless the ORM, or any frameworks, is crawling by its knee a few millisecond won't matter in most case. While performance matter, premature optimization tend to be a waste of time. Oren already touched on the subject of premature optimization else where on this site and I happen to agree with it.
Since you mentioned that you are a NHibernate user. Are you satisfy with its performance? If not then is NHibernate the bottleneck? Did NProf helped your project in anyway? Is there something else that you wish NProf would do for you? As a current NHibernate user, those are they kind of real world story that I am much more interested in rather than some useless benchmark.
@fanny:
"Reading the comments seems to me that this site scares...
It has touched one pain points we have when using ORM."
Are you joking? It's a misleading website posted by someone who has a clear bias and everything to gain by misleading people. It has nothing to do with pain points, it has to do with dishonesty and false advertising.
"And Frans: your reaction with this threatening "legal department stuff" removal doesn't help you either. At least for the public image your tool has gained."
So you would be ok with building a brand and then have a competing company use your brand (dishonestly) to make its product look better? I don't think many companies would agree with your point of view.
"And Frans: your reaction with this threatening "legal department stuff" removal doesn't help you either. At least for the public image your tool has gained."
I'm not threatening anything. Our management just asked them to remove us from their site, KINDLY. That you think we threatened them is your imagination. We didn't threat anyone with anything, why would we do that? It only costs money and time.
What doesn't help us is that this misleading FUD site keeps spreading lies about what o/r mapper is the fastest. We don't want to be in this challenge.
I'll give you a couple of examples why we don't want to be in this challenge:
1) our linq provider doesn't support Skip() without Take(). This is because our framework doesn't support skipping, only paging (in various ways, but that's another story). Adding Skip() to all linq queries make ALL fail and we therefore would score a 0. That must signal our linq provider is really bad, doesn't it?
2) what's the difference between:
a)
for(int i=0;i<30000;i++)
{
}
vs.
b)
// pseudocode, you get the drill
using(var s = new Session())
{
}
?
a) mimics 30000 users saving 1 entity. A real life (albeit a bit lame implemented) scenario. b) mimics a bulk insert, only used rarely in importtools which can't use the native import system of the database at hand.
The 'battle' uses b) because it can show how fast DO is compared to other systems which are optimized for a). You know how you can see that? because it's faster than plain ADO.NET. How can it be faster than 30000 times a plain SqlCommand.ExecuteNonQuery() call? I know how, because it uses an optimized path for b): a prepared query (it's described below the figures).
3) If you're asked to update the salary of all employees of department 3 with 10%, and you have just a sql prompt, would you do that with:
a) an insert in a temp table, a cursor over the temp table updating the rows and then a cursor over the temptable which updates the real rows in the main table?
or
b) you would use a single UPDATE statement?
The battle uses approach a) as it then again can optimize for bulk updates, however SQL offers more advanced techniques to do this, and some frameworks optimized therefore for b). But again these frameworks look like poor performers in this 'battle'.
Do you know all this when you just look at the numbers? no. The numbers show that there's a clear winner. That's not a surprise, this battle has no other winner but the product of the owner of the website. I.o.w.: there's nothing to win and absolutely nothing to gain from any other framework to be on that website.
That's why we asked kindly to get our name, the code of our framework and the results of our framework to get removed. After all, they claim to be 'honest', so I don't see why they won't do what we kindly asked them.
I looked again into the code, and saw that the SqlClient code was actually included.
Now, here's the deal: it uses Prepare() already. So the numbers of the sql client tests are already created using prepared statements. This means that you can't get faster than that in single query execution sequences, you can only get as quick as that. But as the sqlclient test uses a tight loop over a prepared SqlCommand, you really can't get quicker than that, unless you batch statements.
Batching can be done on sqlserver, by packing multiple queries together using ';' and eventually prepare them. It's more efficient if the query is equal every time, so the plan can be re-used for the packed query. You can only utilize this if you're doing batch inserts, if you're mimicing a lot of different actions on the DB you can't use batches, as they're not useful, there's nothing to batch. As I described above, if you move the loop outside the session/context creation, the batching isn't possible anymore and you will see very different numbers which will mimic real world scenarios instead.
There's another way of batching on sqlserver, but that way isn't public. It's the way SqlDataAdapter batches statements using internal code. (using SqlCommandSet).
Lots of databases don't support batching of multiple statements, like Oracle. It's however nowhere mentioned on this 'honest' site. ;)
Frans,
FYI, NHibernate supports batching for both SQL Server and Oracle using SqlCommandSet (and the equivalent for ORA).
You might want to take a look at how this is done and copy it.
Admittedly, [ThereBeDragons] was applied :-)
Heh :). Btw I didn't know odp.net had a batch pipeline under the hood.
I checked if we can implement it in the pipeline (after all, it processes elements in a queue anyway), but it's a little tricky as we have features which take place right after a single entity is saved, like validation after refresh, auditing and error recovery for a single entity failure during save is then out of the window (if packing 10 statements together makes statement 9 fail, how is this reported back so the calling code can recover and retry entity 9... ).
So only if the user doesn't want to use all that, AND bulk inserts are possible, it might be useful. Sounds like an edge case. (as, like you said in the blogpost, in OLTP you're more interested in a lot of features around the persistence, that's what makes using an o/r mapper useful, not bulk inserts, sqlBulkcopy is much faster than any o/r mapper.
hehe I see your comments on the class, my thoughts exactly :)
I've to look into whether this is worth the effort in our case (I'm not going to copy it, it's LGPL-ed code after all), considering the consequences for various features we have (which all have to be disabled or not used). Also reflection would destroy medium trust, some people like to have that due to shared hosting providers.
It's a trade off really, and I think in general users don't really want to give up the features for the batching just for a few extra cycles, as there are always other ways to make it even quicker (like bulk copies).
Frans,
NH uses the UoW model, so we don't have the concept of partial update, just a transaction failure.
With NH, batch updates disable some things, too, mostly related to update validation checking (if you are trying to update instead of insert, frex).
The class is actually taken from Rhino Tools, which is a BSD license, so you can use that instead of the one in NH.
But while this isn't a common scenario, it does show up fairly often.
For example, imagine a feature like copying an entity (deep copy), which may result in many statement. Batching that produces nice perf boost.
Ok guys, I see we've made some of you boil. First of all, let's stay cool ;)
Secondly, now I'm writing the answers to all the critics I got from you. They'll will be published in FAQ @ ormbattle.net in few hours. That's just because I dislike the idea to answer all the same questions for N times.
Good point. And thanks for the hint about the different version :).
I'll look into it later this year.
Yep...Time machine please...6 years ago ..
https://www.hibernate.org/157.html
I do totally agree with Frans, and the same should apply with NHibernate :
It is a shame to publish on a website this kind of information.
Who is stupid enough today to use an 'out of nowhere' commercial Framework, and use it, just because the provider said it is the best using its own benchmarks. Do you really think that .Net developpers are that stupid ?
At least, now .net has its "storm" benchmark...
Managers are...
The first answer:
ormbattle.net/.../...ur-tests-are-unrealistic.html
Others are upcoming.
Boil? Hardly. Disgusted? Definitely. That not the kind of feeling you want from a could-have-been potential customer.
Petty little marketing game disgust me. Programmers, or any other sensible being out there, are sensitively to thing like that :) . So I am glad that both Oren and Frans didn't fall for this dirty little trap.
I am a NHibernate user. While I haven't used LLBGen I would have no problem to recommend to customer that ask me about it. Why? Because both Oren and Frans impress me with their real knowledge and honest conversation. They showed me how much they know about their own product. If a future is missing it's missing for a reason. So if I ever need support from them I know I can expect a clear and reasonable explanation of why abc was done xyz way. It's something that I can appreciate. They won't give me the run around because they don't need to.
"The first answer:
ormbattle.net/.../...ur-tests-are-unrealistic.html"
You aren't seriously believing that, do you Alex?
Thanks Fred for digging up that old post. While we're at it:
ayende.com/.../PissedOffByVanatecOpenAccess.aspx
and my personal favorite:
blog.hibernate.org/.../11#versant
Two examples of other attempts to 'show' how 'crap' the competition is of product X, which actually didn't work that well...
That's one more pitfall you're falling into. Can I show you they have a good knowledge, but as anyone, make some serious mistakes? Guys, sorry about this :)
1) Frans written an excellent sequence of articles about implementing LINQ provider. I've been reading many of them, but frankly speaking, they didn't help us a lot. Mainly, because we've chosen a different way. The proposed one have lots of lacks you simply don't know because you didn't try. And I, as the that guy also passed the same path, can just show this is true: see their results on LINQ implementation tests. Just take a bit more complex case (subqueries, etc.), and it fails. Oren, actually I hardly even can imagine why you released such a "LINQ".
2) Read weblogs.asp.net/.../...rst-doing-is-for-later.aspx - Frans, I'm sorry again for being unkind there. Anyway, look on the facts, not the tone.
3) Oren, let's discuss your replies:
First case:
Me: Btw, NHibernate really drops query rate with the amount of already fetched instances. From ~ 1K/sec for 100 instances and to ~ 100/sec for 30K instances. I feel there is definitely some bug.
Ayende: Again, this is fully expected. NHibernate uses the Unit of Work model.
Can someone explain me, why:
NH's UoW model leads to dramatic performance decrease on read-only operations?
What kind of internals are there? I.e. I perfectly know how to implement UoW without such drametic wastes. Read e.g. about STS implementation in .NET 4.0 - obviously, it's provides much harder warranties, but performance decrease there is ~ constant (4...8 times). So it's simply hardly to imagine how to implement it in such a way!
Second case:
Ayende about our fetch test: Fetch is invalid usage, it is not something that you will ever there except as a bug.
Me: Why does it exist than? ;)
Ayende: For single use scenario. Calling it in a loop is a bug, you use a query for that.
I agree we test this on unusual scenario, but I explained why this is ok in this test. See ormbattle.net/.../...ur-tests-are-unrealistic.html
But let's return to the numbers: Oren, you're behaving like you don't see an obvious fact:
a) No one ever profiled this in NH
b) Ayende is not aware about this flaw
But he recommends this as panacea to the people. Imagine, even if I'd rewrite this test for NH to use just a single query, it will anyway FAIL to SqlClient! (you'd get materialization performance: 16399 op/sec, but SqlClient's fetch performance is 21129 op/sec).
So frankly speaking, I think NH users should simply say thanks to us for these tests. They show there are at least 2 huge flaws. I'd bet they'll be fixed in 6 months, if Ayende won't be so disappointed ;)
Finally, I can't mention the following: you guys have applied so many efforts here... What is your goal? DEPRECIATION OF RESULTS SHOWN AT ORMBattle.NET.
Ok, it's obvious at least NH simply fails there. And actually there are two paths to fix this:
Complex: spend few months on its performance profiling (instead of NHProf , yep ;) ) and make it work at least ~ as EF. I know it can, because they're very similar from the point of architecture & proposed approach - POCO, always "disconnected" graphs (sessions), AFAIK, data can be stored inside entity fields, etc.
Simple: to say it does not reflects the reality to depreciate it and hope this pretty false statement will be accepted by the people just because of your good reputation,
Frans, about Versant's advertisement and comparisons: you just mention FEATURE-BASED COMPARISONS. Have you read http://ormbattle.net/index.php/summary.html ?
Here I wrote why fully feature-based comparisons are initially false, and why we test just very very basic stuff.
@Alex, where have I insulted you? So why are you insulting me? ->
"Can I show you they have a good knowledge, but as anyone, make some serious mistakes? Guys, sorry about this :)"
Which mistakes did I make? I didn't implement silly code paths for some 'benchmark' of some competitor's fud site?
I'm really starting to get fed up by this crap. Remove our name and the results of LLBLGen Pro from your FUD spreading marketing site _NOW_. Ok? Not tomorrow, NOW. I've kindly asked you that yesterday, and also today a couple of times. We don't like being used as a pawn in a competitors ego boosting and self-promotion.
It looks like .NET developers & managers are stupid enough to use something that shows worst performance on very basic tests in comparison to other frameworks.
Concerning DO: yes, it performs very well on these tests now. But I think it easily can loose its positions there in near future:
All you need to compete with leaders on CUD tests is batching. Have I shown you here it's nearly as usual as future queries? Why just we and OpenAccess developers noticed it can be used? Btw, they use even faster option: they automatically build DELETE FROM ... WHERE Id IN (...) queries for removals, but we (currently) use just batches consisting of a set of regular DELETE FROM ... WHERE Id==@Id statements. So OpenAccess is #1 on entity removal test. It does removals almost 1.5 times faster than DO, and 6+ times faster than anyone else. Is this information useless?
You can be #1 on materialization test. We compete well with EF there (although it is definitely #1 on this test) just because our materialization pipeline is likely the most optimized one on the market, but in general it's really hard for DO to compete with EF-like frameworks on this test: if everything is optimized well, all depends on per-entity memory consumption here. So POCO entities must rule on this test. DO entities aren't POCO entities. That's it.
It's difficult to reach good result on LINQ tests. E.g. EF looses some points there mainly because they don't support equality comparisons for references (fields of Entity type; or, better to say, comparisons of non-primitive types) and few other strange limitations like impossibility to use First/Single in subqueries. But in general 6 months are enough to get 100 out of 100 score here; at least our team spend 6 months on LINQ provider for DO.
Finally, you must notice DO isn't explicitly shown as #1 here; moreover, it's easy to prove we didn't specially develop tests for it. If anyone will find something contradictory to this, please write us, we'll remove the test or part almost immediately and update the results.
Frans, you precisely know the mistake you've made. It's explained in my first post there.
So I must ask you here again (since you asked to not quote you), don't you find yourselve hiding important information from your customers? In fact, you prefer to hide the truth to sell you product better. Do you know how does this called?
@ Frans, concerning removal: it will be done tomorrow, if you want. I just need some help with this.
Alex,
Side note: Are you aware that your tone is becoming more and more offensive?
As for the results, as I mentioned, they reflect fake scenarios that has nothing to do with the real world.
As such, the benchmark results are worse than useless, they are misleading. I am willing to give you the benefit of doubt and assume that they are not intentionally misleading, but frankly, your behavior is... not encouraging that assumption.
I am not even going to try to answer your points, your are iterating things that were already answered. Please read the thread and take a look at the STORM reply from Hibernate. Pretty much everything else applies here as well.
As for the FAQ page, when I gave you permission to use my reply to you, I assume you would do it in an informal context, such as a comment or a blog post.
I don't think that the quotes in the FAQ page are appropriate and I am asking that you would remove them. The quotes, as shown, are out of context, both in terms of the actual discussion and the tone of discussion. Please remove them from the FAQ page, as I withdraw my permission to use them.
Well,
I know LLBL for being a customer. I know Nhibernate and using it since the first release. None is perfect.
But if your product is the best product (and it is not only related to performances), then it will be the market reality.
I don't care that Linq NH doesn't pass 99,9% of your tests : they do not represent real life examples.
When I look at your result, I only doubt of one thing : that you have understood the other products.
Let be honnest : You have just promoted your solution using a low moral solution. Not only is it stupid because it is too big and obvious to work, but it is also insulting for many people (commercial or oss) because you use their work as a leverage.
As far as I am concerned, I will probably use both LLBL and Nhibernate, with a preference for the second, but I will never work or spent a cent with your company or your product.
"So I must ask you here again (since you asked to not quote you), don't you find yourselve hiding important information from your customers? In fact, you prefer to hide the truth to sell you product better. Do you know how does this called?"
Important information? Contrary to your product, our framework comes with a 600+ pages manual and the knowledge that many thousands of projects (for example critical applications in banks and oil companies) use the runtime successfully every day, for many years already.
So, I don't think they'll lose anything when they can't read up how a competitor deliberately made our framework look really bad.
About the serious mistake I made... sorry but I don't follow you. Anyway, enough...
"concerning removal: it will be done tomorrow, if you want. I just need some help with this."
Yes, we want that, how many times do I have to repeat that, Alex? Thanks for removing us.
You must know that this whole mess really ruined my weekend and I still feel crap. I know I shouldn't be, after all it's marketing and who cares about marketing. But let me assure Alex, I won't forget this. When LLBLGen Pro v3 comes out with solid model first/ schema management etc. support for NHibernate, EF, L2S, LLBLGen Pro RT, Euss, genom-e and likely some others too, I will not spend a single second writing support code for your framework. You deliberately misused my life's work, and although I shouldn't, it feels like a personal attack. That's perhaps me, but so be it.
@Alex
While your tests may prove that your product is more performant on those exact tests, I think this was a mistake in a few ways:
From what I've seen, the major ORM developers are generally very cooperative and respectful of eachother's efforts.Doing a hand-picked speed comparison on a (misleading) marketing site probably just got you removed from the community as far as future cooperation is concerned.
That site appears to be set up as a impartial test results site. When I stumbled across that site, I saw no obvious implication that this was a site run by the creators of one of the products being testing. ORMBattle.Net? If you didn't intend to deceive, why not put this under your normal DataObjects.Net domain? This approaches astroturfing territory...bad.
There's nothing wrong with contributing ideas by debating techniques which may lead to better performance. Doing this in a public forum (such as this one) does everyone a lot of good. Skipping the debate and just showing raw benchmark scores does no good at all except from a marketing perspective... it is meant to impress managers who like to make major decisions based on charts.
ORM performance is about a lot more than how long it takes that to do a single entity update, fetch, delete, etc...for example, how does it make sure that only what needs to get updated, gets updated? How does it minimize database chattiness? These are issues that I would imagine are difficult to benchmark as there are so many different possible scenarios, but they are as important, if not more important than raw "scores" on simple CRUD operations.
I don't know if your test were meant to be an honest comparison between the different ORM frameworks out there, maybe they were...? But the way you go about it definitely shows, to borrow a phrase from the American legal system, the appearance of impropriety...which is bad in and of itself.
Side note: Do you see the same happens from your side? From my point of view, you say "everything is wrong", and now I must defend my position. Particularly, on such examples.
Don't you see such statements are just FUD as well?
Ok, have you seen my above post about at least two flaws in NH? Is this useless?
FUD again.
You mean this: https://www.hibernate.org/157.html ?
Yes, I've read it. And, frankly speaking, didn't find much similarity except the fact it was made by another one ORM vendor (it's not the flaw: let's think this is acceptable, if benchmark is honest).
There is a large part about fixing bugs related to NH usage in STORM test. I ASKED YOU PERSONALLY TO STUDY OUR TEST FOR NH TO AVOID ANY MISUSE OF NH THERE! I understand we aren't experts in NH, so I asked expert to do this. What did I got? Mainly, common answers like "your test is wrong" - I already mentioned this once here. And the whole this blog post.
The other left part there is this one: "With such small datasets, accessed repeatedly, the database is able to completely cache results in memory; the benchmark never actually involves any real disk access (watch your hard drive while STORM runs). We never get to see what happens once the dataset is too large to fit in memory, or is being updated by another transaction. We never get to see what happens with the database is under load. In fact, this benchmark involves no concurrency at all! We never get to see any joins or any of those other things that happen in realistic use cases. Furthermore, these kinds of benchmarks are often run against a local database, which gives results that are absolutely meaningless once the database is installed on a physically separate machine."
I can explain (and I already explained this in test suite description, but you don't see this) why MINIMIZING database load is GOOD for ORM test rather then BAD: any ORM adds some overhead to all the database operations; to measure it well (i.e. to expose JUST IT), you must MINIMIZE the database load. That's what we do.
In fact, since we measure the efficiency of intermediate layer, we must make all the layers behind it operating as fast as possible - to measure just the efficiency of that intermediate layer. Is this clear, or I must provide an example like with CPUs, add and multiply?
Ok, so I understood you allowed me to use your quotes here? :)
My first intention was to simply do this to show there is nothing like "out of context". But that's too long for a single post. So actually I don't see anything like "out of context" here. That FAQ page reflects exactly the parts we were talking about.
Anyway, since this isn't really important, I'll remove them all. I really tired to argue about stuff like"please remove my ..., I don't permit you to use it!".
We explicitly declared this almost everywhere. The site has separate domain name by two reasons:
It wasn't made to show DO4 is the best. And there is a long true story about this in "About" page. Yes, currently it markets DO4 well. But that's just currently. I don't fear id DO will be an average tool there in future. We'll simply try to keep its results god from this point.
Btw, we even didn't set up any explicit placing there. Again, by the same reason. I'd like such a benchmark site to run with our without DO4 results on it. And making this web site running on X-tensive domain would depreciate it further.
We seriously though (and still think) about making such benchmarks fully community based. E.g. shortly the code will be shared @ Google Code SVN (currently there is just .RAR with it). We didn't do this just because of lack of time.
So honestly, we are even ready to remove DO4 results from it at all (but I leave the permission to mention them on our own web site + make a public vote of adding DO4 back there).
If you have any ideas on how this (i.e. fully public & credible benchmark could be achieved), just share them here. I fully believe our test suite is done well - again, with or without DO4.
Hmm... More than two reasons, but anyway, it's a good explanation.
Shortly, I believe such a web site \ benchmark must exists. We are ready to gave it under the community control. And any ideas on how to achieve this are quite appreciated.
I almost immediately published link to this post there: ormbattle.net/.../...st-suite-and-its-results.html
I'm also for a complete openess of such things. And (actually) that's why I try to pull out any conversation here. Asking Ayende and Frans to run all of them them were was looking as simply impossible.
As you see, our CUD tests partially show how to minimize database chattiness. Btw, we could do the same with query test (there are future queries in some frameworks), but found this isn't fully honest, since this requires utilization of a special API, that leads to side effects. So likely, there will be a separate test for this shortly.
And I understand it's quite difficult to make a test that will be accepted by all of us.
If it will be too complex, people will say "it's just a particular case", Vendors will try to optimize their framework just for the scenario tested there (all of us know what graphics card vendors do to lead in complex benchmarks - i.e. they fake the results there by all possible ways)
If it will be too crude... Well, you see whats happens right now. Although the operations are really simplest from the simplest ones, and anyone can check this, I must answer why it isn't especially designed for us.
@Alex
"We explicitly declared this almost everywhere."
I look at the main page and the summary results page and see this mentioned nowhere. Again, I'm not suggesting impropriety, but this has very obvious appearances of impropriety, so much so that I wouldn't blame readers from completely distrusting the results.
I think its awesome to have a community based benchmark suite for the simple reason that it could improve the science of benchmarking, and may lead to benchmarks that are actually useful in a real-world context. However, I would expect this to be run and maintained by a completely impartial 3rd party, not by someone with such a vested interest in the outcome of the benchmarks!
To go down the "Bouma" road on this, I would suggest that the benchmarks themselves are not what is important, but rather the science that backs up the benchmarks. This thread seemed to have a promising start insofar as a discussion of what a "useful" benchmark are was introduced. Unfortunately, because of the whole appearance of impropriety issue, things turned to venom rather quickly which ended any useful discussion. A site dedicated to discussion of what a useful ORM benchmark is would be absolutely wonderful. Even some benchmark implementations would be nice, which you have obviously put quite a bit of effort into already.
Errr... neither of those are any of fear, uncertainty or death. This conversation has unfortunately degraded to uselessness, though to be fair to both Oren and Frans with the attitude you came in here with I'm not sure much else could have been hoped for.
I suspect some of that has been some miscommunication but much of it has been caused by your agressive attitude towards them and their work. A fact which can be directly traced back to you creating a site literally called "ORMBattle". I think it is pretty clear someone was looking for a fight.
One of the reasons benchmarks aren't that useful is that typically performance is only important if your software isn't meeting a necessary SLA/requirement. Thousands of people use NH and LLBL and it clearly meets this need for them. Largely because they use it correctly and for the types of real-world scenario's that matter.
Even if meaningful benchmarks exist they probably won't matter much to customers. They will be concerned with other issues like the frameworks flexibility and extensibility, or its community and support. You need to win on builidng a community, depth of knowledge, and proven reliability as Frans, Oren and others do. You should want to be seen as a go-to for insight into ORM issues. Issues in ORM are often subtle, tricky and complex as much as they can be simple. Proving you can be a bit faster is just such a small part of it.
You will not win customers by taking an aggressive approach like this. Trying to prove your product "beats" out others will only label you as biased and misleading and if that is FUD so be it, you brought it on yourself.
Even if, as you say, your benchmarks are not misleading, they will still be interpreted as such by most of the communities who now feel as if you are attacking. Remember developers care more about community, knowledge and support than they do about benchmarks which can say one thing on one day and something totally different on another.
@Alex
If you are serious you should open that site to public comments as this, and others sites, are open.
Allows users to leave theirs opinions about your tests, about the product they are using and about your product and, obviously, maintain each opinion public without any kind of moderation.
Note: public mean without a login in your site.
Your site has only your words as "the owner of the true".
+1 for never, ever, ever using this product, and recommending switching to something else if any of my customers are stupid enough to be already using it.
Way to go Alex, in addition to making your company look shady from a marketing perspective, you have made your product itself look bad by making a fool of yourself arguing with two people (Oren and Frans) who are much, much, much, much, much smarter than you are.
"it's quite difficult to make a test that will be accepted by all of us.
If it will be too complex, people will say "it's just a particular case", Vendors will try to optimize their framework just for the scenario tested there (all of us know what graphics card vendors do to lead in complex benchmarks - i.e. they fake the results there by all possible ways)
If it will be too crude... Well, you see whats happens right now. Although the operations are really simplest from the simplest ones, and anyone can check this, I must answer why it isn't especially designed for us."
You've said it yourself. Benchmark like this is useless because it's either too complex to judge or to general to have any value. In engineering there is always a design trade off. Each vendor choose to optimize their product differently for different scenario.
Let take the auto business for example. To show that one car and do 200 m/h vs another can only do 110 m/h is meaningless to the customer. What matter more is how far each company support their product in the long run. How much will it cost me for an oil change? How reliable is the car? In another word, critical stuff that any benchmark will fail to show.
Unless you are so confident that you have achieve an absolute breakthrough that your product will beat out all the other smack down. If that is the case why don't you put some money down if people like Frans and Oren can prove that you are wrong? Since I am sure they don't want to work for free :)))
If in the end with the necessary benchmark all product seem to be comparable to each other then what does that mean to me and the community? A big duh :) tell us something we didn't know.
@Alex:
Frans said:
"When LLBLGen Pro v3 comes out with solid model first/ schema management etc. support for NHibernate, EF, L2S, LLBLGen Pro RT, Euss, genom-e and likely some others too, I will not spend a single second writing support code for your framework."
Pwned.
We'll do this. Frankly speaking, initially there were comments. But we found two modules for Joomla! we used for this more worse than good. Basically, there were serious issues with appearance we couldn't accept.
But shortly they'll be back again. For now there only are public forums.
I won't even comment this ;)
Do you see they are comparable now? I don't see this. I see a huge disbalance there, which only proves that many of them were not profiled at all.
Did you know:
NH is 25...30 times slower on materialization that EF? So reading a large object sequence in it will take 30 times more time.
NH runs a simple query fetching 1 instance 100+ times slower when ~ 10K instances are already fetched into the Session? No one else exposes similar behavior.
Be honest, you didn't. Yep, this ugly truths. If you'd like to hide it, the simplest way is to say "there is nothing new here, don't pay attention to this". Just depreciate this.
And I'm not speaking about LINQ tests.
I suspect they'll continue doing this anyway - there is no need for bets. Moreover, you didn't say what exactly do you mean saying "you're wrong". Please state something particular, that will be easy to proof, not everything ;)
I won't discuss 600+ pages manual - I believe it's good, and actually I don't fully understand why you're trying to mix this here. ORMBattle is about performance and LINQ. You're writing about your manual and marketing results - exactly what ORMBattle isn't about.
In fact, you're saying "there are evidences this is HQ product" - have I ever mentioned it isn't? Yes, but is isn't the best on our test suite. Actually if your product is so good, I'd simply accept this being on your place. Explained, why others are better, and what are advantages of architecture you have developed.
Obviously, it isn't. Did I say something bad about your product here? I said you was wrong and shown few examples of this. Why? You're (as well as Ayende) trying to depreciate my own work by all possible means here; moreover, people believe you, and you're relying on this trying to use the tone "believe me, that's true". What did you expect?
Certainly, I don't dislike you personally. It's all about the opinion here.
Btw, it's even a bit funny you answer to me in such a tone. Actually, I don't expect this. I like to argue - it's a kind of sport for me. And I say "I was wrong" with ease, when I feel this. But currently I feel myself completely differently - and that's only because of your arguments.
Anyway, I don't like the idea of maintaining such a fight here further. I'll provide links to particular test discussions here tomorrow, as well as place to discuss the ideas on any other tests. I think this will be much more productive.
Saying "everything is wrong" is not an argument.
What's "real life" is actually very difficult to define.
We'll enumerate the test (and name the cases) failing for NH. We didn't do this for other products just because it is depreciative. But since NH is fully open source, I hope it is ok to publish this info. So anyone will be able judge by his own, if this is a "real life", or not. Currently I can only assure you lots of tests there are pretty simple.
That's why we contacted their vendors or experts. But I doubt we understood them so wrong.
I just mentioned we're ready to remove DO4 out of results there at all. If this will solve the problem with trustworthy of results, it's definitely ok. I believe such test simply must exists.
On contrary, what you see is what a particular vendor shows to you. So do you really think this is good? There are TPC-C and many other tests (btw, frequently testing very similar operations) for databases, but no any credible tests for the next layer. And you're, as customer, taking the position of your favorite vendor. You're so impressed that protect the idea that is attractive for just few vendors, nor customers.
Ok, forget about DO4 positions there. Let's think we already removed it. Do you still think such a benchmark is useless?
Btw, it's interesting to know what kind of framework would win in our performance tests, yes? It must be:
POCO-based. Its RAM overconsumption must be close to zero. This is needed to win materialization test.
As simple as it's possible, but complex enough to support reference fields and simplest collections. Again, that's about materialization test.
It must communicate with DB quite efficiency. No any chattiness. That's about CUD test.
Provide very good LINQ support.
Be highly optimized. Briefly, if you never profiled your framework, you have almost zero chances to be even #2...#3 there. We know leader of each test seems to be highly optimized at least for this particular scenario.
Looks like all this suits well for NH, yes? But it isn't winner there. EF wins materialization test - because their materialization is written pretty well + they use POCO. OpenAccess wins removal test because they optimized this almost ideally. And the product I can't name here either wins or is #2...#3 on other tests - I hope you understand this couldn't be a result of some particular optimization for these tests, or "just specific tests". Otherwise it would win everything.
It's mentioned at about page: http://ormbattle.net/index.php/about.html
I fully agree with you here.
Btw, we tried to find some standard benchmarks for ORM tools, and actually found nothing credible. E.g. I remember one testing paging implementation, but they didn't check even query plans there. Obviously, if they're too different (= one is just completely wrong), there is no reason to test paging at all.
So if there will be a test for paging (Take\Skip), we'll definitely take this into account. And there must be a separate test for paging without Take/Skip - that's what frequently used as well.
Anyway, we'll try to make more credible out of this - I believe this is more important what we could get for DO out of this. Obviously, we must remove DO for some period there (3...6 months?), and forward full control over the test suite code to the community. We'll think how to achieve this.
Sorry, I mistyped: "Anyway, we'll try to make more credible site out of this".
"I suspect they'll continue doing this anyway - there is no need for bets. Moreover, you didn't say what exactly do you mean saying "you're wrong". Please state something particular, that will be easy to proof, not everything ;)"
To prove that you are wrong for nothing meaning they are being dragged into your little game which is a waste of time.
"Do you see they are comparable now? I don't see this. I see a huge disbalance there, which only proves that many of them were not profiled at all."
Currently the test is being tweaked toward your framework, whether it's intentional or not. Hence the imbalance result. Like I said unless there is a breakthrough in your framework with the necessary optimization toward each framework the end result will be similar for the benchmark. So if you are confident why don't you put a price on the fact that after the necessary tweaking your product will still beat the other hand down? If not I would rather not see the time of the community being wasted.
I haven't read the comments here, but here's what I did...
read this article's topic
clicked the ormbattle link
didn't see any version number in contesting ORMs (like subsonic 3 has LINQ provider already)
saw the result, one (unknown to me to this day, I knew all the others) ORM stood out completely as a big winner
clicked "about us"
read:
__ We are experts in ORM tools for .NET, relational databases and related technologies. Moreover, our company has a product participating there (DataObjects.Net).
laughed, and closed the browser
Aren't comparisons like these useless? I mean, whatever ORM you use, it can be tweaked and finetuned. With the end result that the time spent in SQL server is much larger than the time spent in the ORM itself, and that ORM time is negligible. I think this is true for every good ORM out there.
Every ORM can reduce roundtrips (I love NH futures), and has options for optimizing the SQL code being generated.
People are better off looking at features/ease of use/support/price of the ORM. Don't worry about performance. After a bit of configuring and optimizing your ORM of choice, you can get almost the same SQL output, and performance, as all the other ORM's.
So the purpose of this was just having such a benchmark available? It's pure altruism regarding ORM? I think that's really noble of you.
Just so it's crystal clear: you would have created this separate website and posted the results of all these tests if your product had been the clear loser across the board?
That's very, very believable! I'll take 100 copies of your software! I can't wait to see the tripling of performance of my code!!! I'm using lots of exclamation points so it's clear that I'm being 100% sincere!!!!!
@Alex
__I can explain why MINIMIZING database load is GOOD for ORM test rather then BAD...
In fact, since we measure the efficiency of intermediate layer, we must make all the layers behind it operating as fast as possible - to measure just the efficiency of that intermediate layer.
well I'm having fun here (very interesting discussion) but you're a bit wrong here
2 different ORM will issue 2 different SQL statements to a database (trying to solve the same real-life problem), say ORM-A will issue 3 statements whereas ORM-B can do it in a one statement, because its LINQ provider (join support etc), or CRUD handling is more advanced (say will only issue UPDATE statement on particular changed collumns, not on ALL like Lightspeed is doing for example (amazing ORM btw, v3 will improve this))...
therefore DB engine (say SQL 2008) will spent say 85ms (big numbers just for the record) in the first case, 15ms in the second case...and multiply this.
That is what counts in the real world.
(so the formatted sql and the possibility to alter/change that, thanks to NHProfiler you know which side I'm in, so you should messure efficiency of DB engine as well, because it might be stressed by some dumb ORM these days;-)
And, as I replied to you when you asked, the main problem isn't with the code (there are severe problems there, but that isn't the issue).
The problem is that the permise of the benchmark is flawed. You are trying to testing _bulk data manipulation_, a scenario which is just not something that NHibernate (or most OR/M) are designed to solve.
As such, there really isn't a point in trying to fix code, it is the idea of the test itself that you should fix.
If you want to create a more realistic benchmark, you should try one of the suggestions above. Write an application and test that using things like web load test.
In other words, you benchmark do something like this:
C++
int[] arr = new int[100];
for(int i=0;i<100;i++)
arr[5] = i;
C#
int[] arr = new int[100];
for(int i=0;i<100;i++)
arr[5] = i;
And then pointing the perf difference related to array bound checking in a contrived scenario.
There isn't a point in trying to fix anything, the scenario that you are using is broken.
That is just nonesence. Trying to minimize database load means that you are removing real world concerns, that completely change the behavior of your applications.
It is like trying to show fuel consumption in city driving for a Formula 1 car. The test itself is meaningless for the designed goals.
Just so I'll make the example clearer. No one car about the fuel consumption in 20KPH, all they care about is fuel consumption in 200KPH. And trading off one for the other is a good design goal.
Thank you.
No, it isn't. You test is loading 20,000 objects per loop iteration, where you aware of that?
Didn't I cover that already?
NH uses the UoW model, which means that we manage all loaded objects internally.
That is expected and acceptable, for the simple reason that NH ISession isn't supposed to be used for bulk data manipulation, that is why we have IStatelessSession.
Oh, and as an aside, you might want to track down what SQL is being called for each test, the results should be quite educating.
And that is why your method is flawed. You mention nothing at all about things that are important for real world concerns like lazy loading or eager loading options, caching semantics, transactional behavior, dirty checking, automatic persistence, and more.
Those aren't just bullet points, they truly change the way you write software.
Wrong. There are results for 1K (first page), 5K and 30K (in .XLSX). NH slower at least 25 times.
Do you know that almost any other framework supports UoW and "manages loaded objects internally" as well?
Yes, normally there is a state associated with each loaded entity. But:
1) It is stored in dictionary-like object (or, better, an object internally relying on hash table). Thus state search time is constant - i.e. it does not depend on count of already loaded objects.
2) I don't understand what must happen there to make each query 100 times slower on large set of instances. Is NH iterating all the instances before each query (e.g. looping up if any of them is changed). Actually it looks like exactly this happens.
If you aren't sure there is a REAL problem, I'll show you a sequence for NH LINQ queries separately:
N (number of performed queries): Performance (op/sec)
100: 925
1000: 646
5000: 301
10000: 180
30000: 68
So performance has degraded ~ 15 times. You still don't see the problem? There is no any other framework exposing the same behavior.
Yes, I do not. But:
1) DO supports lazy & eager loading for all the fields by default. The entity tested on this test could be an entity with LL field with just one attribute added. Do you know we maintain "layered" field state? I.e. in fact, we know what was loaded initially, and what was changed? How its presence affects on results. Almost nohow.
Anyway, I believe there must be a separate test for this (LL); moreover, LL must not lead to much more costly while it's off. So in general, I think results must be ~ the same independently of LL support in a particular framework. If this isn't true, likely, there is a bug.
2) Caching was intentionally turned off for any framework, and we wrote about this. There must be a separate test for this. If it won't be off for some frameowork, we'd compare teleportation with regular transportition (i.e. two completely different technologies).
3) Transactional behavior, dirty checking: the same. E.g. DO relates any acquired entity state to a particular transaction, and can automatically re-load it on demand, if current transaction has changed. There are transactional methods - this ensures any operation you run on entities or persistent services will lead to a transaction, if necessary.
And what? All this stuff doesn't make it to operate slower than others.
4) Automatic persistence: oh, that's what exist in many of tested frameworks. Presence of this feature must not significantly affect on results.
Yes, I agree they must be implemented, but this test does not rely on these features. Moreover, I'm sure presence of these features must not significantly affect on test results. And actually test shows this: there are some frameworks implement all of them (I'm 100% sure DO does this), and its test results aren't bad at all.
This shows they can be implemented in a way that does not lead to serious performance degrade.
Any other "that's why your method is flawed"?
Sorry, corrections:
"How its presence affects on results?"
"Moreover, LL must not lead to much more costly operations while it's off."
I'm ready to beat for this: i.e. I'm 100% sure some of tested frameworks can beat DO after necessary tweaking on CUD, query & materialization tests (not speaking about LINQ tests - this is obviously true, but likely, I'll be waiting for too long ;) - it's complex as well).
So who wants? I can personally explain what must be done to achieve this.
Tested version of any framework is mentioned in its own page.
See e.g. http://ormbattle.net/index.php/subsonic.html
What a surprise - DataObjects is a leader. Seems like another ad for me.
Of course it is ad. All record of all product is ad :)
Well i know what i wont be recommending now. DataObjects.
@Alex, Instead of rehashing your own opinion here, you also could have invested 2 minutes to remove us from your website already. It's still up there.
The main problem with your way you look at things, Alex, is that you deliberately want to run these tight loops as 'tests' and use them to create other information like the CUD numbers.
I can prove to you that that route is not correct: because the updates are all done in bulk, you can also use bulk update statements. You ignore them because you then can't run the tight loops which show how great you optimized the batching pipeline. However, for the purpose of 'updating massive entities with the same expression' (as the tight loop implements, as you use 1 context for 30K updates!), using the bulk update statements in some o/r mapper frameworks, the update statements are done much faster: 1 query and they're done. Extrapolating that to CUD numbers therefore would make them the clear 'winners'. Therefore what you use as 'CUD' numbers isn't going to fly.
You don't use bulk update statements (which are just another way of optimizing what you optimized yourself inside the pipeline!) because you find that a 'hack' as you told me. That's your opinion of course, but as you and you alone call the shots on that website, your opinion is the only one that counts. So you can manipulate the tests to get the results you want. And you do that.
Another signal that you compare apples with oranges is the 'Query' test. You have absolutely no clue how our framework works, looking at the code. For starters you fetch inside a transaction. But we don't use a central session object, Alex. That's a deliberate choice. Secondly, your code starts fetch queries on a linq metadata object in a loop. But that simply creates a new expression tree, executes it and returns the object. By contrast, the code for DO creates a single transaction on the already live Session and fetches all queries on that same object. This isn't the same thing as you now can re-use cached objects, be it a query, an expression tree evaluation, anything.
Is this mentioned on your website, that we have a very fast pipeline and therefore don't suffer from problems where queries have to be re-evaluated every time, you know like in 'Real Life'. The website, ironically, mentions that you have setup the tests the way they are to make sure the overhead in the o/r mapper is maximized. However, by re-using session/context objects in a loop, you precisely don't do that, as you can re-use resources, query parsing results etc. which is not giving a good picture. Flawed, one would say.
(part 2)
On to the linq queries. Your '100' number suggests a percentage. The numbers show that 60% of the linq queries thrown at our linq provider failed. However, does that say that 60% of all queries one would write normally would fail? Your website suggests this, (as the linq tests are carefully created, am I right?) but this is utter nonsense. For starters, you exploit lack of support of some features deliberately. For example the linq queries never show a single FK-PK comparison. That might be because you don't support FK fields. Adding 25 FK-PK comparison queries would give you a 25% failure rate. We do support FK fields, in fact our linq provider doesn't support entity instance comparisons, as we don't think it's something that will be used much (and our linq provider is out for more than a year now and we never had that request), as we have fk-pk comparisons, which are closer to what people would do in sql anyway.
Yet, you keep your set of entity comparison queries, which makes sure a lot of the queries fail. You also have queries with multiple Skip() calls. Skipping is something one would rarely use in real life, as one would normally use paging. Did you want to test paging? If so, why didn't you write a couple of them? Oh, Skip(10).Take(50) isn't paging as it is used in real life.
I can go on and on, but enough time wasted. As a final remark, I wanted to add that what I wrote about the personal attack was to make you feel how you ruined my weekend. That you don't seem to understand a single bit of it, that's clear to me. C'est la vie.
You still have something to remove, btw.
We're working on this now. We'll make all fixes in tests suggested by other and remove LLBLGen with today's update.
I'm writing FAQ post about this. Briefly,
We'll add tests for single entity updates
Tests for multiple entity updates will be there.
Obviously, "reality" is somewhere in the middle: some transactions run just 1 update (btw, a rare case), others run many of them. So on general, true CUD average is ~ SingleUpdateTimeSingleUpdateProbability + MultipleUpdateTime MultipleUpdateProbability. E.g. 20%SingleUpdateTime + 80% MultipleUpdateTime.
Obviously, this is a number between SingleUpdateTime and MultipleUpdateTime. Moreover, I'm 100% sure SingleUpdateTime will be approximately the same for all ORM tools, ~ 5..10K op/sec. So this number won't dramatically change real life results (CUD average).
It reminds me one day, some people claimed that using dataset was better than using an ORM because some "benchmark" showed it. In fact they even claimed that it was faster than datareader...
And those, one should use Dataset over datareader or ORM tools. It was lovely and dumb.
I have to say Alex, it's clear that you're in love with your software...and as we all know, love is blind.
Stop wasting our time with your drivel, clean up your site, put a GIANT disclaimer on the main page saying you have a conflict of interest (i.e. you built D.O. and the tests in its favor) and let these two gentlemen (Frans/Ayende) get back to work.
Note, that authors of this web site don't claim that you must use ORM with the best performance. It is just a source of performance-related information, that can be useful in some cases.
OK tests are not perfect, but they are not completely wrong. One can suggest another test. How would good tests on ORM performance look like?
To be fixed. We support this, but likely, in many tests it was written as order.Customer==customer, that isn't supported by you.
You are using the following code:
while (i < count)
foreach (var o in session.Linq <simplest())
There is no paging, and NHibernate doesn't do lazy materialization. You are loading every single row (and there are 20,000 of them) for each while loop iteration.
The code make me think that you believe that it will do lazy materalization, though.
Alex,
I am not believing those numbers for several reasons.
a) Your code and statements about the code shows a big disconnect.
b) It is _not important_, NH isn't trying to optimize loading of a large set of queries. It doesn't matter because any optimization that we may try will be swallowed by the DB time.
c) NH ISession can most certainly handle large number of object in a query, but even then, it is not something that you are supposed to do.
d) State search is O(1), but NHibernate manages query persistence transperantly. What this means is that the following code will work with NHibernate.
tx = s.BeginTransaction();
foo = (Foo)s.Get(typeof(Foo), 1);
foo.Name = "ayende";
var list = s.CreateQuery("from Foo f where f.Name = 'ayende').List();
Assert.Contains(list, foo);
tx.Commit();
When you perform a query, NHibernate will check if it need to synchronize the state of loaded objects with the database (automatic flush) based on a set of rules that aren't really important right now.
Your test happen to force NHibernate to make a check of every object loaded to memory, that is a O(N) operation, by definition.
Now you get it?
Doesn't matter the perf implication, I mention those because you brought up selection criteria for OR/M that were nonsense.
so many words about "dirty marketing games"
so many words about "useless of this comparison"
so many words about "DO got anti-ad for their own"
looks funny...
Frans, cowgaR, GiorgioG, firefly, Fred, Ayende, and others... if discussed site and its idea is useless, if this is silly, if this only leads to depreciation of DO as product - why all of you spend so many time and words here? I would just have a one-minute fun looking at ormbattle.net and then return to my work, if like you've said - "it is not real life" - then we have nothing to worry about, because we are in real life :)
But such hysteria which i can see here, makes me think, that nothing hurts like the truth... Otherwise you would not spent here so many time.
That is extremely inaccurate. Care to name a single real world scenario where you need to update/insert 20K of records that isn't an import task?
Agree. Ayende, all you just wrote about materialization is lie. Or you don't study the test code well. 1 min, writing reply.
Alex,
Let me get this straight, are you telling me how NHibernate works?
I am willing to admit that I may not have the best knowledge about it, but I do believe that 6 years of working extensively with it does carry with them the capacity to understand how NHibernate operates.
So about materialization test for NH:
Look up the last method here: code.ormbattle.net/ (I paste full code, not just a peace of it).
As you see, it may really stop "in the middle". On the other hand, if you take a look on its invocation: code.ormbattle.net/ , you'll find it's always invoked for the same count of entities as this query must return.
So it doesn't matter if there is lazy materialization or not. DOES NOT MATTER AT ALL.
To prove this, I paste a single query from SQL profiler it runs per each measurement:
SELECT this_.Id as Id6_0_, this_.Value as Value6_0_ FROM Simplest this_
Any more ideas why it's so slow?
Oh, I like this phrase :) LOL, but yes.
As you see, we've just especially checked this in SQL Profiler.
Can you stop talking about 20K entities? We tested everything for many cases: code.ormbattle.net/
Scoreboard page reflects results for 1K entities.
Moreover, I precisely explained why big numbers are good here: ormbattle.net/.../...ur-tests-are-unrealistic.html
And concerning the batching: DO4 batches CUD updates using 25 statement buckets. Is 25 statements a good number for you? Results for such sequences will be nearly the same. I tell you once more, forget about 20-30K.
This was about "But such hysteria which i can see here, makes me think, that nothing hurts like the truth... Otherwise you would not spent here so many time."
Yep. They're so "nonsense" that immediately exposed at least two issues related to NH?
Nice. DO supports this stiff from v1.0. I'm curious, did you suspect DO4 or EF doesn't implement this?
Do they expose the same 100 times slowdown as NH on big numbers?
Ok, but now it's clear why it's so slooow. NH traverses the whole object graph before each query to find out what's changed, yep?
If so, is there any way to disable this for this test? Actually I'd prefer to get the results without this "feature", if this is possible.
Houl: "Otherwise you would not spent here so many time."
You don't get it. The site pretends it shows actual real life measurements, or better: the measurements can be used as real life figures. If not, the site is useless to begin with.
However the 'real life' measurements on the website are not real life measurements, but artificial tight loops.
Furthermore, if Alex would have removed us saturday, and had simply corrected the nhibernate code as he should, things wouldn't have gotten so dramatic.
Alex: (FK-PK compares) "To be fixed. We support this, but likely, in many tests it was written as order.Customer==customer, that isn't supported by you."
And by EF.
Linq tests should focus on what statements are supported and what not, should use normal C# keyword syntaxis and should come with a huge disclaimer: the linq scope is so huge that anyone can write 1000 queries which fail on any provider, also yours (as yours fails on 2, which means if I re-use that failing aspect in 100 queries you won't succeed in 1 single query. But, gee, you didn't do that. Just lucky? (while we were unlucky because you didn't use a single fk-pk compare?), I doubt it.
Houl, that's what this is about. You don't have to agree, I just hope you understand now.
To all, I also hope we all realize the more attention this marketing campagne is given, the more it proves to be effective: giving attention to DO. /ignore is a better alternative.
I admire much both Ayende and Frans but is not for this that I will not consider others opinion if they have valid arguments (just if they are exposed in polite and fair manner).
It's also true that I respect more arguments coming from the people that really invested all their careers on the ORM field from long time and already answered long time ago some questions users like me are could have now.
So in some way the I trust their choice.
So let me recap to people that read down to here (as we do at the 102 episode of a soap opera).
1) Alex and others from an ORM vendor started a site called "ormbattle.net" that as the name implies was supposed to compare the performance of different ORM vendors in the .Net space.
Staying in "war" terms this was perceived more like an unfair invasion of others territories without previous war declaration.
2) The tests where unfair and did not compare in the right way.
Both Ayende and Frans seems to have proved that.
The fact that the site was coming from one of the vendor let many of us suppose that was a marketing site more than a fair battle.
This statement was contested by Alex that said was willing to arrive to the removal of their own product to prove that and to give the site to the community.
First suggestion: do it immediately to prove your fairness.
3) On the site the author's connection with one of the contestant (the winner) was not that self evident.
Second suggestion: until you do not release to the community put the DataObject author's origin more evident on the home page,like in the first lines (I mean just after "ORM Comparison Goals").
4) Admittedly by Alex, he likes to argue in a bit aggressive manner ("I like to argue - it's a kind of sport for me") and the comments degenerated pretty quickly from both parts and all ended up "I'm smarter than you..." stuff.
5) Indeed there is an ongoing technical discussion between the lines that is interesting and it is worth to continue debating on that in a polite and cooperative manner.
6) It seems very hard to do a real comparison of different products but a more "real-life" test application would be doable.
The quality of a product cannot be measured only on batch statements but also on features they provide in real word.
7) A site that is supposed to compare products in fair mode would also let users to post comments and have their opinion showed up.
This fairness would be proved also from the immediate removal of Ayende statements from the FAQ (they are still there) as the LLBLGen mentions (still there too).
I, as a user, would much like a site where Orm are compared by their features and progress together in a cooperative manner.
And in honor to Frans I must also say that there is something that is hardly measurable with unit test but it is really important for choosing a product and this is the quality the support a product has.
From my experience, if we could measure this, LLBLGen would beat all the others hands down.
Alex,
You don't understand how NHibernate is working, that is the issue.
On the first iteration on a Linq query we are loading all the results into memory, then give you a List of them.
It doesn't matter if you iterate only on a few, NHibernate doesn't do lazy materialization.
Your code is loading ALL rows into memory, convert them to objects, but only iterate over a few of them.
The iteration part is _free_.
You are making assumptions that are simply invalid for NHibernate, and then trying to draw conclusions from those.
GIGO
That doesn't matter. No one is ideal. You're just thinking about how to be not worse than EF? Than it's your choice.
As I've mentioned, we think it's good to test this because:
If you compare keys, you're "binding" your queries to underlying entity & key structure, that's obviously bad. If you'll change this some day, you must change all the queries.
It's simply more readable.
And finally, LINQ to objects allows this. So we think it's resonable to test this.
Alex,
A quick peek at the NHibernate documentation, Flush, will show you the answer.
session.FlushMode = FlushMode.CommitOnly;
Wrong again. Our code iterates _all of them_, thus it doesn't matter if iteration is free or not.
That's why we included query execution time into this test: if we won't do this, just frameworks as NH would get unreasonably high score on it, because they did all the dirty job (materialization) on execution, and further just iterated the list.
Do you see assumptions aren't wrong?
Concerning session.FlushMode = FlushMode.CommitOnly - we'll add this to test. But as you see, this isn't obvious (other frameworks don't expose the same behavior on queries, yep?). Thanks a lot for the help here.
Frans: "You don't get it. The site pretends it shows actual real life measurements ... "
So what? :) i know that there are a lot of sites, that pretend that real men have at least 25 cm. If i would try to bring them the truth about norma=1x cm, it would take all my life and i suppose without any success :)
How many there were comparisons of intel and amd - with a lot of discussions about stream calculations "good" for intel and game-like calculations, suitable for amd? About NVidia and Radeon - almost the same story, just about shaders calculations precision...
When we speak about Internet, i prefer position "Do not like? Just ignore" and recommend it to anybody :)
Alex,
I really don't know how to try to answer that.
The "test" that you have is loading all objects to memory. You pay for reading the entire table for each loop iteration.
I am not seeing how it is related to the iteration numbers, they are not meaningful.
What is meaningful that you load the entire table each and every time.
Alex,
NHibernate contains a lot of stuff that can be used. Expecting to pick them up without reading the documentation is... unrealistic.
Let's look on it closer again:
Instance count: performance (op/sec)
100: 14267
1000: 16399
5K: 16884
10K: 16650
30K: 16437
So as you see, peak performance is @ 5K entities. The whole table.
Do you see it is lower for 100 entities? Do you understand it will be much lower for 10 entities (remember, we include query exec. time)?
Ok, so what I want to say: since we're showing maximal possible performance (i.e. upper limit), the test we run here is definitely ok. I agree there can be some mistakes, and I think that's what we must discuss.
But your attempt yo say "everything is wrong" is nothing more that LOL - especially, with _such results_.
I'd seriously think about this, if guys from EF team said the same (because they lead e.g. on materialization test, + show good LINQ support). But telling us "the test is wrong" and simultaneously being the worst player there... Do you see this is a bit ironical?
Alex,
You numbers are close enough together to be statistically meaningless.
Sorry, of course, not the worst one... But anyway.
ARE YOU HUMAN AT ALL?
5ALMOST IDENTICAL NUMBERS ARE "statistically meaningless"???
Ok, sorry, I just a bit emotional now ;)
Alex, your a tool.
Well, not "just lucky", of course. We have a large set of LINQ tests (~ 700), and most of these queries were taken from these tests with adoption to Northwind model. So our LINQ tests aren't ideal from this point, but on the other hand, we didn't want to use standard tests like "101 LINQ Samples" - most of players would succeed there.
Anyway, we're ready to add any tests you'd like. Just say what exactly must be added.
P.S. We just added PK-FK comparison tests. LLBLGen got few more scores. These tests pass for DO4 & EF as well.
@Alex:
__
Tested version of any framework is mentioned in its own page.
See e.g. http://ormbattle.net/index.php/subsonic.html
let me educate you a bit. Subsonic 3.0 (current public version is 3.0.0.3) has a LINQ provider.
It might not be the best there is, but it certailnly is better than few of the contestants in the your "test" (both NH and Lightspeed will have its much better LINQ provider soon).
So how it is possible Subsonic scored 0 in LINQ test?
Thank you for honest answer.
__Subsonic got zero score on this sequence because it does not support references (thus we were unable to compile the test for it).
oh I see, it is on the page (I thought it will be somewhere down written with very small (or even smaller) letters, would be more appropriate ;-) but "customers" are lured by numbers in the table, where you should at least put an asterix!
at least! not to mention this statement is a joke...
Well, it depends. Yep, I've spent lots of time here yesterday and today. But have you seen how many people are @ ORMBattle.NET? "We have 27 guests and 4 members online" right now.
Anyway, I always prefer to spend additional time but make everything fully clear.
Yes, we've been testing 3.0.0.3 - there is a mistake on its page.
The problem we've faced with it is that it doesn't support references. I.e. I can't declare order.Category field. I can declare only order.CategoryID. But we expect support of reference properties is a "default" feature.
That's why we couldn't get them compiled. We thought about implementing such reference properties as:
public Category Category { get { throw ...; } }
Could you recommend something to fix this?
Here is our current LINQ test code: code.ormbattle.net/ (T4 template that renders the test for any ORM).
I just explained, why. We're fully ready to fix this ASAP (e.g. right now or tomorrow).
"so many words about..." Most of the words come from one shameless person that don't know when to stop. Clearly he is unfit in the development world. I think he would do well in Washington :). That might be a better fit for him.
I, and probably many other silent observers, know from the outset the outcome of this narrowed discussion. Benchmark like this are useless like Oren said in the beginning of this post. Discussion like this are a waste of time because it get people to focus on a few irrelevance points. Still it was necessary for Oren and Frans to step up and point out what was wrong since Alex don't know when to stop.
"But as you see, this isn't obvious (other frameworks don't expose the same behavior on queries, yep?)." See how he play dumb? Ain't that cute? :) Apparently what obvious to him is his own framework.
Anyway I rest my case. All that needed to be said have been said in the first few posts. All the other post are just there to point out the obvious that obviously wasn't so obvious to one person.
"> And by EF.
That doesn't matter. No one is ideal. You're just thinking about how to be not worse than EF? Than it's your choice."
No, I was referring to the fact that it is convenient for you to use only entity comparisons as you knew they would all fail on EF.
So the tests didn't do:
but you did:
That's not testing real life stuff, that's just making deliberate choices so others would suck. Of course you don't agree, that seems to be in your DNA (remember years ago you argued my framework was crap because using a zip tool would compress the framework much more than yours? :D)
for the people who have no idea what I mean:
// entity comparison:
var q = from o in ctx.Order where o.Customer==c select o;
// vs. fk-pk comparisons
var q = from o in ctx.Order where o.CustomerId ==c.CustomerId select o;
"- And finally, LINQ to objects allows this. So we think it's resonable to test this."
So, did you include some tests using Reverse(), SequenceEqual, Aggregate, ElementAt( Oh, we actually support that one), etc. ? No? why not? Linq to objects seem to do just fine.
"That's why we included query execution time into this test: if we won't do this, just frameworks as NH would get unreasonably high score on it, because they did all the dirty job (materialization) on execution, and further just iterated the list."
Are you saying that your framework keeps open the datareader during enumeration? I find that hard to believe.
@Houl: "So what? :) i know that there are a lot of sites, that pretend that real men have at least 25 cm. If i would try to bring them the truth about norma=1x cm, it would take all my life and i suppose without any success :)"
haha :D Ok, good points.
"P.S. We just added PK-FK comparison tests. LLBLGen got few more scores. These tests pass for DO4 & EF as well."
I doubt EF will succeed, they don't support FK fields. I also don't see how testing our code is still relevant, you are in the process of removing our name/code/results, remember?
""We have 27 guests and 4 members online" right now."
You want to educate a lot of people how the state of O/R mapping is today on .NET? I then would not be satisfied with 27 guests. Let me put it this way: one post on this blog or for example on my blog will reach thousands of readers immediately and many thousands of readers in the following days through google searches.
Even though the numbers are low, I still would like to see you putting less effort in rehashing the same statements but to remove us from your marketing campagne. That's this site, and also 'LLBLGen' as keyword in adsense on google. Look at it this way, Alex: Oren and I don't need competitors' names, their benchmark results or other stuff to explain to users what we have to offer. You apparently have to. That might suggest you are doing something erm... wrong. (and do remember, you're on the market for many years already. Lack of time isn't really the case)
Let me remember you the whole conversation:
We noticed NHibernate query performance degrades with count of fetched entities on our test.
Ayende said "it's OK that default NHibernate query cost has O(CountOfFetchedEntities) factor, because it traverses the whole graph"
I said "Ok, can I do something to get rid of this? Note that this isn't obvious."
He answered "yes", and published a solution.
I said we'll fix this then.
So what do you mean saying I'm playing a dumb? If you mean this is obvious, can you name any framework from tested above that expose the same behavior?
If you're saying this is an "expected feature", but don't name where else it can be found, well... That's what I'd call shameless.
Oren, we're trying to implement IStatelessSession pattern within our performance tests and faced an issue: it's impossible to use LINQ queries with it. There is no overload for Linq <t(this ISession) method that accepts IStatelessSession; IStatelessSession does not implement ISession.
Is there any workaround?
No, we didn't. Our goal was to create ~ the same LINQ test for all ORM tools. And if some ORM (e.g. Subsonic) requires a specific model & test, we couldn't handle this so far. Now we're trying to fit the needs of everyone (i.e. there will more complex T4 template).
Anyway, LLBLGen got almost nothing because of it.
Btw, crying about "they frequently use extension methods instead of C# syntax, and that's why nothing works there " on you web side just proves the fact that 1 step aside from very basic queries lead to a failure.
EF supports everytihng. We too. I'd expect you're, as author of long cycle of articles related to LINQ provider, simply MUST provide an HQ implementation of it.
Yes, DO4 does this for MSSQL (i.e. utilizes MARS), although this is provider dependent.
No, we didn't. If some LINQ feature requires an implementation with unexpectedly high execution cost, obviously it is better to say "not supported here".
As you see, this isn't related to reference properties.
Btw, we don't support ElementAt ;) Will be added to tests.
I think that's ok to have such amount of readers during first working day. Frans, I aware your blog is very popular, as well as Oren's ;)
So I think this is a good result for us ;)
30 simultaneous visitors = ~ 2K visitors per day (just extrapolation). So that's ok.
Do you know it's ok to use competitor's name in many many countries? Check out their trade name policy.
Comparison-based advertisement is legal in many countries as well (likely, the main reason for its legalization was that it is good for customers). Just FYI.
"Btw, crying about "they frequently use extension methods instead of C# syntax, and that's why nothing works there " on you web side just proves the fact that 1 step aside from very basic queries lead to a failure. "
based on what does that prove anything besides your rich imagination? I have hundreds of groupby queries which all work in many complex situations, yet none of your queries work. Apparently some oversight, as groupby() is very tricky to implement and a code path you use isn't optimal, HOWEVER, in real life who writes the group by queries in linq with extension methods? Almost no-one.
If you are saying we only support very basic queries, you are flat out lying. e.g.
var q = from o in metaData.Order
runs fine.
How many queries do you run with this:
var q = from e in metaData.Employee
2?
var q = from c in metaData.Customer
runs fine.
"EF supports everytihng. We too. I'd expect you're, as author of long cycle of articles related to LINQ provider, simply MUST provide an HQ implementation of it."
EF supports everything? No, it doesn't support contains and some other extension methods. Writing 100 of these queries will score them a 0.
As I've said earlier, your queries showed some little bugs (3) which likely tripped up a lot of the tests to fail, mostly due to code paths which aren't used that much if at all. We'll fix them this week. Not that it matters, your '100' number is subjective and misleading anyway.
"Do you know it's ok to use competitor's name in many many countries? Check out their trade name policy.
Comparison-based advertisement is legal in many countries as well (likely, the main reason for its legalization was that it is good for customers). Just FYI."
yes, I know. I was just asking a question. That you are not willing to remove our brandname as word in adsense, says everything about you and your moral standards. The only other company who had similar tactics and moral standards was Vanatec.
Using other brand names should be used with cautious.
If I recall well, that website is not and far from it.
About google ads, they remove them on request if I recall well else it'd put google open to litigation.
Alexis,
No, Linq queries doesn't work with IStatelessSession at the moment.
Of the top of my head, I can't think of a reason that they cannot be made to work.
Writing an equivalent for IStatelessSession would probably just copy the Linq(this ISession) for IStatelessSession)
That is... an interesting debating tactic.
Please be aware that continuing in this vein will get you blocked, I am willing to have a debate, I am NOT willing to have mud slinging in my blog.
You don't get the meaning of statistically meaningless?
Let me put it this way, you have two numbers, N and R.
N is the number of items you work on.
R is the number of operations per second.
If for wildly varying values of N the value of R is pretty much the same, it means that there is no correlation between N and R.
Thanks fanny for the soap recap :)
Another product trying to work up it's sales via attacking well named brands :). We had this with nCover a few days ago and now this.
Looks like nHibernate is becoming new target like what Microsoft used to be :)
Alex, do you comment on blogs for living ? Get a life. Anyone with SANE intelligence is not going to believe your broken english :)
Ayende, I think it's best to close this entry for comments, this will put guys like Alex out of job :)
That's not about moral standards. That's just about advertisement. It's legal, and I really think it's ok. I admit the same activity for any of our own brands.
Exactly ;) Ovciously, I didn't mean number there won't be significantly different from 14K-16K, moreover, likely it will be nearly the same (= nearly as bad as it was for 1K, 5K and so on). So "no correlation" is quite meaningful here.
Frans, I agree it's very subjective. We've made really complex test suite.
But isn't it good? Just think about this. E.g. from the point of customers.
P.S. I'm removing DO from this web site. LLBLGen is also leaving it.
For those who are interested: there is one more thread related to NH benchmarks: ayende.com/.../benchmark-cheating-a-how-to.aspx
Alex,
do you understand why there is no correlation?
And do you understand the difference between moral and legal?
I mean if such test exists, any player there may easily score its own LINQ implementation.
Yes, it's complex to get 100 of of 100 there. Moreover, i think we must modify test in such a way no one will be able to get it. Like with 3D benchmarks: getting topmost results must be almost impossible.
Finally, we're ready to change these tests in the way community will suggest.
Yes. I think using competing brands in ads is moral and legal. Because it's good for competition => good for customers.
I just agreed with you there is _no correlation_, but mentioned exactly this is meaningful (= you have ~ constant operation ratio).
Err... I mean "must be legal worldwide". Just IMHO, of course.
I understand this depreciates the brands, but on the other hand, this allows new ones to grow up faster => more intensive competition.
I have nearly the same opinion on many other points, e.g. patents (know-how is ok, but patents are not, at least in their current state). So generally if something is good for the competition & morally ok, it must be legal.
But that's another story...
"I understand this depreciates the brands, but on the other hand, this allows new ones to grow up faster => more intensive competition."
You're on the market since 2003? or somewhat around that. Since that day, you've gained some marketshare but eventually lost it.
If I may, you mainly lost it due to the lack of progression in your own library (the 3.x branch) in favor of some super-duper framework you've been working on for a long time (v4).
You won't get it back. Not by a long shot. Let me explain: in the past 5-6 years, numerous frameworks have seen the light and have died off, just a few remained. That's logical market progression: several big players (2-3) fight for the top spot, the rest has marginal market share (<10%). Before MS released their frameworks, it was open, but you left more or less the market. MS released their frameworks, and no matter what you think of it, in the end there only will be the following frameworks:
Entity framework
Some open source offering which does things differently. Best cards are in the hands of nhibernate, but their linq provider has to be better.
The rest will die or will have marginal market share. The simple reason for this is that the EF has millions of funding and a very large team of developers, and in the end they will get their act together and fix all the problems in the framework. It's not a competition really, they're included in .NET, you don't have to buy a framework from a 3rd party. If you don't agree with how the EF works, you can use an oss offering, also free, e.g. nhibernate. NHibernate will too in the end implement every single feature there is to implement, simply because all other problems are already solved.
that's also why we didn't decide to compete on framework features, but on the designer level. MS will never catch up at that level and we already have a big advantage in that area.
So where does that leave you? You might have a big framework with lots of features, and perhaps on paper you look better than whatever is out there, but frankly, it doesn't really matter: the people will choose what everybody else is using, what the known brands are. Some small group will pick a small player, but that's it. And you see that happening already today.
I'm not saying this to make you feel bad, I'm not that kind of person, trust me. The reason I explain this to you is that you might, even a little, get a bit of insight in what is happening today in the O/R mapping/data-access market and how the situation is and where you made your serious mistake. No low-level marketing tactic with competitor brandname advertising, rigged benchmarks and other crap will help you with that, markets don't work that way.
Alex, believe in what you can do yourself, instead of basing your success on the downfall of others. The only way to succeed is to take your own destiny in your own hands. Making others look bad in some 'test'/battle/benchmark isn't helping you one bit: it won't get you the marketshare you need to survive against Redmond, as you compete on the framework level, which is a lost race.
Frans, I clearly understand that's quite difficult to get any significant share now. And all the things you wrote are absolutely correct.
But, as you might find, we're looking a bit further than other ORM vendors: in fact, we ship a database integrated with ORM tool. Not sure of you know this - DO4 is capable of compiling queries against its own storage (with open API) & execute them there. Yes, this part isn't fully ready yet - e.g. this works just in RAM for now. But it works as it must work within fully featured database almost everywhere (i.e. there is index statistics, query optimizer relying on it & so on).
Why I think it's important?
MS will never do the same. They sell SQL Server very well, so there are no any reasons to work on such a tight integration there.
Can you imagine how they'll deliver it to e.g. Silverlight? We can, because our engine is fully managed. Do you know any database working @ Silverlight?
Our final AIM is to provide a transparent .NET API for any relational storage. Our own or not - it does not matter. SQL or not - again, this does not matter. E.g. technically we can support BerkeleyDB (no queries except index operations) or Azure Table Services.
So why we focus just on ORM part now? I clearly see we can't market a solution with integrated database right now. We must prove everything works with regular, well-known databases.
Further this must allow us to propose the people using DO4 to use our integrated database (e.g. instead of slow SQLite). And certainly, we'll work on support of such platforms that others simply can't support so easily: Silverlight, Azure, Amazon Simple DB & so on.
That's why we care so much about full transparency: you can do absolutely the same things on our own storage as on any other. Schema upgrade, queries - everything is the same. Sync feature must be a last chain binding all this stuff together.
But as you see, for now we're showing just basic stuff. And I bet it's visible it works well ;)
"AIM" -> aim, LOL :) I don't use AIM ;)
Many mistakes, but I hope I illustrated our position.
Yes, you're right. Frankly speaking, I was more than happy this summer, because a HUGE machine, on which we've spend about 2 years, has finally started to move. Of course it's clear the most complex part is what happens now. But until you don't see the results, it's really difficult to motivate the people & be motivated as well...
So I wish you getting the same point with LLBLGen 3 faster. For me that was really a kind of dream ;) May be that's why I'm a bit nervous protecting it ;)
Today's results: http://ormbattle.net/index.php/blog.html
Interesting to see that 'openaccess' of all frameworks gets the crown handed over to them... who would have thought...
Well, they're really good from the point of CUD operations - their bulk deletes are simply ideal. We didn't knew about this, so this was a surprise even for us :)
But materialization currently is their painful place: they consume RAM much faster than e.g. EF & Lightspeed. Their results for 10K / 30K entities are ~ 2 times worse on "big" tests. Likely, just because they're already "wasted" the whole L2 cache, although competitors are still there.
I hope this explains something :)
I think Alex's site states it best over here: ormbattle.net/.../...ur-tests-are-unrealistic.html
He mentions that it's not fair to test a CPUs ADD operation by using MULTIPLY because that would work better on CPUs that support MULTIPLY.
I think that sums up this debate. Alex wants to compare everything by what he believes to be the lowest common denominator. However, I think you have to agree, Alex, that it's not really fair to compare based on lowest common denominators because that's not how things are used. Going back to your example of CPUs, when choosing a CPU you would typically do it based on the application. So for a for a device that will do a lot of multiplication, would you want to consult a test that specifically excludes that option?
My point is just that this isn't how things work in real life! I know you want to show lots of numbers and in a certain sense your rationale makes sense, Alex, but I think you have to rethink exactly who might read this and what these numbers might mean to them.
Especially with software, which is an art as much as a science, how you use a tool is just as important as which tool you use. I've seen hacks that trick out really crappy frameworks to do the right thing fast, and I've seen great frameworks used in the stupidest of ways.
I think that there is no way to really test 2 frameworks the same way since the way they are used is different. Oren has mentioned this multiple times, basically telling Alex that "You don't do it that way in NHibernate". Even the application route that Oren suggested (build an application and test that) isn't completely fair because different apps should use different frameworks.
Anyway, that's my 2c.
Let's comment EF results as well:
Perfect materialization. I bet they can make it ~ 1.5 times faster, if they'd decide. Just because we could (although we have a bit more "fat" entities).
No CUD batches = chatty communication. No good marks.
Query compilation time is simply huge. Likely, because they compile everything to eSQL. May be I'm wrong, but I feel this is just rudiment they got from ObjectSpaces.
On the other hand, they provide compiled query API. It saves them on queries ;)
"So what do you mean saying I'm playing a dumb? If you mean this is obvious, can you name any framework from tested above that expose the same behavior?
If you're saying this is an "expected feature", but don't name where else it can be found, well... That's what I'd call shameless."
Do you know what irrelevance mean?
You started with an irrelevance but also erroneous argument to begin with. Then you get people caught up in trying to correct your erroneous statement. By focusing on trying to correct the erroneous statement you divert people attention from dismissing your argument, which is bogus to begin with.
Let say that I drive a car with gray interior you'll say that my car is hotter in the summer because I have a black interior. Well the whole argument is baseless to begin with. My car interior color have little to do with whether it's hot or not. Your benchmark number have little to do with the real performance of the frameworks involved.
Yet you get people caught up in arguing with the fact that the car interior is in fact gray. After that said and done you'll act all innocent and play the nice guy and say well I didn't know I'll correct that. If you didn't know that NHibernate work a certain way maybe you should have investigate it further before doing your benchmark. Instead now you said it's not a common future? That's the same as saying gray is not a common color for a car interior. In short it's just another diversion tactic of your.
Fact is correcting your mistake make you look like a nice guy so you have time to circle back to your baseless argument. It's a diversion tactic that politician love to use. They get people to caught up in the details that they forget the big pictures. So please quit trying to get people to focus on this and that while the main issue here is for you to abandon the whole "useless benchmark" all together.
I agree with this. And since we can't make any assumptions on how the things are used, we just provide plain output for each basic operation. No places. I think this is at least more honest than some average points.
I.e. we just tell if each basic feature can be fast or not. Places would break everything.
Fully agree.
On the other hand, I'm almost fully sure pretty simple tool done by few newcomers have zero chances winning on these tests:
Getting some scores in LINQ tests means you must be an expert. I hope Frans & Oren can prove this :) It's simply can't be achieved right on the first attempt. Man-years must pass till you get LINQ passing most of these tests.
To get high score CUD tests, you must solve tons of problems. I can enumerate just some of them: full query parameterization, running only queries with caching plans, cache SQL for these queries or be able to build it fast with new parameters (there batches!), batches themselves, IN optimization (OpenAccess results for removals are impressive, yes?), and usual topological sorting, version checks, etc... Can you believe someone can "simply write" all this stuff?
To get high score on materialization test, you must profile it for relatively long time... Clearly knowing what can and must be optimized. Here is all about RAM consumption & count of allocations. You must know all about .NET application performance to achieve good results. Ok, just one example: we use our own IntDictionary(Of T). It is a dictionary resolving Int32 keys to T. Up to 4 times faster than regular dictionary (no GetHashCode(), fast equals, special hash table structure). Internally we use Int32 TypeIds, and this helps to resolve them faster. Gives us additional 5-10% on such tests.
So... That's why we tested just leading tools. At least as we thought. Other players... Frankly speaking, have almost zero chances.
I agreed we've made a mistake there (i.e. used NH in wrong way). As well mentioned the mistake isn't obvious: no one except NH exposes this behavior, so we really couldn't assume this. Of course, if you don't assume we read the whole NH documentation ;)
Moreover, it is written @ ORMBattle.NET (see "About") that _we are humans_, thus we can make some mistakes, so if you're an expert, please help us to fix them.
Be sure, if they exist, they aren't intentional.
Anyway, now all we could fix (yes, there there some wrong proposals even from Oren - do you know he's human as well?) is fixed. NH got ~ 2 times better score. That's it.
If there is something else that can be done for it, just say this ;)
Btw, so can you name any other tool exposing such an "expected" behavior? ;) You just almost called me "politician", so I'd like to know if you aren't ;)
Seriously, did __any of you guys get any real work done today? ;-P
Funny. Have you noted I was talking right about "useless benchmark", but you've just popped up an absolutely irrelevant part of discussion. Moreover, you said that I'm trying to get people focused on it.
Ok, I like the guys loving to argue :) But not this way. Let's close this topic?
I wasted all the day here ;) But I'm a chief, you know ;)
Worse that most part of DO4 team was reading this stuff today ;) And, as I suspect, mainly to ensure I'm polite enough ;)
Alex, just go back and read my last post again and all the other posts then think real long and hard, you'll figure out an answer on your own.
firefly, you're giving the best advices here. Short and clear. At least one thing is pretty clear: no answers from you.
Let me remember some of your phrases here:
Do you see I said a completely different thing, but you rephrased it as you wanted?
Have you seen I accepted this? Would you still like to bet?
No answer.
So one of conclusions I made: guys like you must be ignored. No proofs, no answers, no responsibility, just so important "IMHO".
hihiks :)
so many argues about speed... from people who are engaged in developing of most draggy type of IT products :)
Assembler is fast. All of other - is compromise :)
Sorry for a bit of offtopic, but really, this discussion is so funny :) if only some of participants would cease heat a bit... :)
Just remembered an obvious example of why e.g. materialization performance is important:
SqlDataReader is able to fetch about 2M rows per second on this test
PosgreSQL and some other readers normally fetch just 500K rows per second.
So 4 times difference. If you'd look at SqlDataReader code, it's quite optimized. There is simply nothing to make faster. Others are not (lots of boxing, etc.).
So it's obvious the team developing SqlClient & SQL Server protocol spent noticeable time on such optimization.
Fast materialization in ORM is the same as fast IDataReader in ADO.NET. So I thunk there are tons of reasons for optimizing materialization.
So numbers to keep:
SqlDataReader is able to fetch about 2M rows per second on this test
EF materializes up to 600K entities per second relying on SqlDataReader (so it is ~ 3 times slower)
NH materializes just ~ 40K entities per second, at least via LINQ. I understand many people find this acceptable. But that's what pushes them & others to use stored procs & server side logic.
Alex,
You seem unable to follow that there are different needs for an ORM vs. IDataReader.
IDataReader is used for _bulk data manipulation_, ORM is used for OLTP
Who said this? If you, I fully understand, why:
EF can be used for bulk data manipulation). LINQ 2 SQL - as well (we know it must be faster than EF on materialization)
NH can not.
Alex,
OR/M is for OTLP.
OTLP is very rarely about bulk data manipulation.
Oren, thank you for the comment about IStatelessSession. It was implemented in our current tests.
Ok, Oren, I just compared incomparables: teleportation with transportition, just for you.
Entity:
[HierarchyRoot]
[Index("Value")]
public class Simplest : Entity
{
}
As you see, there are 2 indexes now - I made this intentionally, to make the test at least a bit complex for RDBMS. I think most of real life cases related to executable DML are even more complex from this point.
Teleportation: I tried SqlClient query + commit of this one:
"UPDATE [dbo].[Simplest]
SET [Simplest].[Value] = -[Simplest].[Value] " +
WHERE [Simplest].[Value] >= 0"
All rows pass WHERE condition.
Transportition: this code on DO4:
<simplest.All);
Result:
Teleportation: 48K rows (updates) / second
Transportition: 15K rows / second
Total rows updated in this test: 1M.
I think if you find 10+ times difference on materialization "acceptable" for OLTP (as well as 2...4 times difference on CUD tests), this 3 times difference is much more than acceptable. Moreover, I suspect real life results will be even closer (take a look at that tiny entity; + index was fully fitting in RAM).
Alex,
Are you truly arguing cursor based vs. set based efficiencies here?
Oh, and RPC vs. Local Work?
Btw, the larger set we take, the smaller is difference. On 10M set teleportation will be ~ 2+ times slower, but transportition will run nearly the same.
The reason is obvious: the larger are indexes, the more costly their updates are. But SQL parsing & communication time is ~ constant.
So about importance of our tests:
CUD tests: important, if you deal with pretty limited sets of data. You'll still fill the difference on millions of records. But definitely not on billions.
Fetch, query: nearly the same.
Materialization: important in any case. If query is returning a part of the index, low-level RDBMS performance here is constant and very high (~ up to 20M of index entries per second by our tests - it's easy to measure this, just calculate an aggregate). So generally materialization is the main limiting factor in bulk reads.
Oren, look at the numbers. It's pretty easy to prove they must be ~ of same efficiency on large data arrays.
If you don't believe this is true, think about distributed databases. There is no "local" at all.
On the other hand, in all fairness, I don't think the point of the test is to intentionally make it OLAP. It merely magnifies OLTP operations in magnitute of hudrends-of-thousands times to observe how NH performs in high load, compared to other ORM products. Maybe the approach is not designed quite properly for NH, but probably someone could advice how to do it (high-load OLTP test) to provide more accurate result for NH.
Hendry, Oren, we are going to implement TPC-C or TPC-E performance tests where all framework features will be allowed because it's business logic bench with huge DB workload.
First stage is to make infrastructure for these tests (different implementations for particular ORMs should be rather similar). What do you think about it? Please reply.
Alex,
Your "test" is flawed once again, remove the idea, and try again. Try something that resembles real world practices vs. a specially crafted thingie to show how it behaves.
Alex,
I suggest reading about the fallacies of distributed computing.
You code would result in 1M+1 remote calls. Put it on a reasonable network, and assume a 2 ms ping time, and your timing is going to so far off it isn't even funny.
Hendry,
If that is the case, then the test should be using _different sessions_, not a single session
Alex,
Persistence by reachability isn't costly at all.
I think you are confusing dirty checking with that.
As for implementing a different dirty checking method, the hooks are there, and it is quite easy to do.
Alexis,
That sounds like a more reasonable approach for this, but see my comments about the PetShop benchmark stink when 1.0 came out
Oren, I see you don't want to answer just "Yes" or "No" ;)
We've already discussed this. I described this here: ormbattle.net/.../...ur-tests-are-unrealistic.html
Ok, I see you're intentionally trying to hide pure NH performance by other numbers (I agree they might appear in high-stress conditions, but not everywhere), constantly saying "this is real, not what you show to us".
I'll repeat myself: "So about importance of our tests:
CUD tests: important, if you deal with pretty limited sets of data. You'll still fill the difference on millions of records. But definitely not on billions.
Fetch, query: nearly the same.
Materialization: important in any case. If query is returning a part of the index, low-level RDBMS performance here is constant and very high (~ up to 20M of index entries per second by our tests - it's easy to measure this, just calculate an aggregate). So generally materialization is the main limiting factor in bulk reads."
Do you agree at least with these statements?
It requires the same ~ O(N) scan time. So it's of the same complexity.
True - I imagined just graph traversal & forgot about this. So this must be much more computationally complex, but with the same ~ O(N) dependency.
That's not hiding it - that's putting the numbers into exactly the kind of context that you replied to me saying you agreed with was needed and you were working to fix.
CUD tests: important, if you deal with pretty limited sets of data. You'll still fill the difference on millions of records. But definitely not on billions.
Fetch, query: nearly the same.
Materialization: important in any case. If query is returning a part of the index, low-level RDBMS performance here is constant and very high (~ up to 20M of index entries per second by our tests - it's easy to measure this, just calculate an aggregate). So generally materialization is the main limiting factor in bulk reads."<<br />
And I will also repeat myself: So what if the room for overall application performance improvment is, say, 5%? Then the points you mention above can convincingly demonstrate in tests that your framework is 100 times faster than mine, yet still you do ot manage to improve overall application performance by more than 1%.
You do agree with my logic, no?
/Mats
Hehe, again my comment was too long - well good thing I have my new blog then ;-)
Oren, I know you are every bit as familiar as I am with the potential for spectacle surrounding "PetShop" benchmarks, but please hear me out...I think you may possibly agree with me that there's a kind of Pet Shop benchmark that could be very useful.
Please let me know what you think:
matshelander.blogspot.com/.../...ti-benchmark.html
Yes, I agree with this. Let's think about other expenses we have now:
Index operations. I wrote Oren any high-performance server is equipped well enough to reply most of queries w/o HDD seeks. Either the whole database or its working set is cached in RAM. So almost no delay here: such operations can be performed with at least 100K/sec rate per second.
Network latencies. If you're caring about performance, 10Gbit channels are right for you. Latencies start from 2.5 microseconds there. Don't paste any links, since they're easy to find. So again, not an issue.
Network stack, or protocol latency. ~ 50 microseconds is latency reachable for WCF, so I suspect specialized protocols, such as VIA for SQL Server, provide much better one. 10-20 microseconds is fully ok for us here.
So what else is left? I don't know. It seems I mentioned all show stoppers.
So querying an enterprise storage with 10K op/sec in a single thread is more than possible (likely, even 20-30K op/sec isn't a limit). But NH can utilize only 1/10 of it on queries. I'm not speaking about CUD: ~ 10K commands per second * batch size (25 for DO) = 250K updates per second. So here we're easily reaching the limits of our framework. The same is true for NH.
So... Have I shown all I shown must be visible even on very large storages (e.g. 0.1...1 TB)? Think about the smaller ones. I admit, 90-95% of databases that are currently used are <100Gb.
Finally, think about the future. Network latency isn't a big problem. RAM isn't as well. But CPU speed is - i.e. they grow up only because of parallelism. But AFAIK neither NH nor any other ORM currently utilizes this (we plan).
"at least 100K op/sec rate" :)
Alex,
All the numbers you throw around are maybe accurate for lab situations.
In practice, the numbers are tens of milliseconds.
Well, the database does work, even if it is in RAM - faster than touching disk, of course, but still work. Incidentally, this work can be considered part of the actual "application logic", which if used as an overreaching term (as contrasted to just running a loop) I would say can be used to cover the bulk of the showstoppery you left out.
But you do raise an interesting point with regards to the network throughput and latency (and an interesting fact - I have to say "wow" to that 2.5 microsec latency!! :-O)
"If you're caring about performance, 10Gbit channels are right for you"
That is just the point - do you care that much about performance? Do you care forever about improving performance or will there be some point in a given application where performance is good enough and other aspects become more important?
What I propose is that for a lot of people, the following applies: 1) The performance of the application is good enough with just a 1Gig Network (anything less and I agree performance is apparently so unimportant that we probably don't need to discuss it at all!) and 2) that if they were to measure it, the actual persistence framework code steals a fairly small bit of the overall application performance, making performance optimizations in the persistence framework a complete non-concern in their case.
Or it could be that, given the requirements of an application, the CPU speed isn't a problem either. This is what I suggest is already the case for many apps. Yes, I realize that improving performance results in economic benefits - but the whole point here is that other efforts (such as improving maintainability, etc) may yield even more positive economic benefits after a point when sufficient performance has been acquired.
(Cont)
You DO have a good and important point in your observation that as network latency and RAM sizes improve (and disc access times improve as well, for that matter), then we will see less resources drained here, and as a consequence the percentage swallowed by the mapper will be larger. But what exactly are those percentages, now and in the future? Without tests, we just don't know. With all due respect, just doing the maths doesn't tell enough - perhaps you have done real stress testing on big iron stuff, and then you already know this, but if not - prepared to be surprised when comparing to your math based projections ;-)
In fact, it does seem to me that some of your arguments/math examples are slightly bitowards inflating the importance of the mapper (when testing only data access, increasing the data volumes may seem like you are making the test more realistic, but it also serves to cement an application abstraction where the application apparently only does data access, which is very misleading)
So I still say actual tests - Pet Shop style, preferrably (see my new blog post at matshelander.blogspot.com/.../...ti-benchmark.html ) are needed to put any of these numbers into any kind of useful perspective.
So, finally:
Let me just make completely sure that you do agree: you do agree, then, that until you actually complement with a test that shows the mapper in context of a real application doing something real, it is impossible to know the actual importance of the numbers from your benchmark? Since, again, if the overall overhead is low enough (say 5%) then any performance benefit is likely pointless (until future breakthroughs in physics that will for some reason be strangely constrained to applying to t´he network and RAM but not to CPU speed (?) which may completely change the equations). I also wonder if you agree that math-based estimates and extrapolations (such as you have provided so far), even if the math itself is decent, is hardly a substitute for a real test? I realize that not yet having had the time to do the tests I ask, you try to provide the next best thing, which is a reasonable estimate. My problem with this is that the whole point of the benchmark you have created (and currently are using in a marketing drive it seems) may in fact stand or fall with the tests I suggest. So the best thing would perhaps have been to not publish at all until you had the tests to put the numbers into context, but imo the next best thing is then NOT to come with estimates, but to confess that currently the relevance of your numbers is entirely unknown and you will have to return when you have actually run some tests.
I hope I don't sound harsh, but you did mention arguing is a sport for you - well, it is for me too ;-)
/Mats
should be: slightly biased towards
@Alex,I'm sorry but you are solving the wrong problem...
Materialization and Query Performance is NOT the problem with O/R mapping.
The problems are:
It's hard to create mappings (e.g. in XML).
Alot of people mess up with Lazy Load and casue ripple load havoc in their systems.
Error messages are often hard to understand since the inner workings of a mapper is quite complex.
The community around NH solves some of those problems.
e.g. Fluent NH or Castle AR (which by the way has the best and most descriptive errormessages of all frameworks that I have ever seen)
IMO, you are just throwing development money away when you optimize for materialization / query performance since the mapper performance is pretty much dwarfed by all other factors in a big app (as already stated by others).
its a quite neat PR ploy tough..
//Roger
Alex,
All the numbers you throw around are maybe accurate for lab situations.
In practice, the numbers are tens of milliseconds.
Yes, I understand we can prove this only by other tests. Theory is just theory. Ok, any ideas on relatively simple, but "real life" tests except TPC-X (likely, we must implement them anyway) are welcome.
Agree with all. Frankly speaking, misusing of a particular ORM is something like normal... And I agree, NH is good from this point: there is a huge community.
Let's see... I like to develop efficient things. As well as show them :)
Heh, leave it to the puzzle boys to nail the point home :D
@Alex, What I wondered today was: now that DO isn't mentioned in the results, what's the incentive for the site now? It was a marketing campaign, but as DO isn't in the list of results anymore (and visitors will only peek at the green/red squares), what's the point of throwing time at this (== money) at all? Or are you re-adding DO in 2 weeks from now? I mean, I can't imagine you're doing all this to give telerik or the EF team more attention...
@Mats, glad you reanimated your blog ;) Interesting read indeed.
Frans: "what's the point of throwing time at this"
i can suppose at least three variants:
1) it is just interesting and has educational value - if something in other ORM is faster - then there is obvious possibility for improvements in this "something" - so speed comparisons allow to reveal points of interests in improvements of algorithms/solutions
2) DO team can "kill three rabbits with one shot" - create optimal benchmark suite for ORMs, have permanent public attention on them, and always know current situation with ORMs speed comparing them to DO - so if there any niche of speed-critical projects for ORM will appear in future - DO will become monopolistic in a moment in that niche without any concurents - if all of you will continue think in the way you've demonstrated here about no need of high-speed materialization etc :)
3) imho speed optimizations are directly connected to the whole code perfection. "Fully optimized" and "Perfect" - are almost synonims - including code style, transparency for understanding, bugs absence, etc. So the company that have fastest solution can be almost 100% sure that it really has best product - if features list is the same of course. Now look at #2 - about "compare others with their own and permanently maintain it to be fastest.
Frans: "I mean, I can't imagine you're doing all this to give telerik or the EF team more attention... "
sometimes russians can be absolutely unpredictable in their generosity :-D
Ayende: "All the numbers you throw around are maybe accurate for lab situations."
Atmosphere in server rooms sometimes much closer to labs, not to real life ;)
And by the way - the more speed reserve you have - the more complicated/useful features you can implement.
I think everybody will agree that for example real-time full speech recognition would be rather easy task if we had 10^12 times more calculation power and RAM space in usual PC :)
So it is also a good strategy - gain the same solution which works twice faster, and then "spend" its surplus speed for additional features, which will make this product more attractive than any other :)
That's what we were talking about on the first page of ORMBattle.NET. Not absolutely true, but in wide majority of cases - yes.
Or think about LINQ at least.
And, guys... I just thought it is simply can't be true you're saying ORM performance isn't important. And asked the Google.
"Ayende NHibernate performance" - first link: ayende.com/.../NHibernatePerformanceConcerns.aspx
Few quotes:
"The first problem that Darrel has is with the memory consumtion. I will start right off by saying that I have been a devot user of NHibernate*for over two years now, and I'm building big, complex systems using it. Memory consumtion was never an issue with NHibernate." - think about our materialization test, which results, as I wrote, are mainly related to memory consumption.
"Optimizing NHibernate's performance is almost solely focused on reducing the amount of queries ..." CUD batches are a good example of reducing amount of queries, yes?
I'm almost sure, Oren, I can find at least 10 more articles in your blog related to NH performance. I just don't want to spend more time on this.
And in the end of all: "nhibernate performance" = 0.5M results in Google. That's definitely not important.
Alex,
Your materialization tests has as much relation to the real world as a swimming competition in the Sahara.
And I think you are missing the point with reducing the # of queries. You are forcing thousands of queries and then try to optimize that.
Don't start with thousands of queries, your life will be better
Oren, I answered to this many, many times. Say something new, please ;) You provided zero facts & evidences. Comparisons like "swimming competition in the Sahara" may work on public conversation, but never works @ forums & blogs.
Ok, my remark about "real life performance" in your "swimming in Sahara" style, if you want:
If I buy tomatoes, I'm interested in their calorie content, but not in calorie content of some soup with tomatoes. But you're suggesting all tomato producers must publish calorie content of some soup with tomatoes. Why? Just because this will make difference between your fat and our slim ones less visible.
Alex,
Because what you are publishing is not tomatoes calories value, you are publishing the nutrient values in the water used to grow a different set tomatoes.
@All,
This thread is getting repetitive, and I don't see any value in continuing the same discussion over & over again.
Comments are closed.
Comment preview