Linq for NHibernate Adventures
So for the last few hours I have been getting back into the Linq for NHibernate project, after having left it for far too long. I am beginning to think that building a Linq provider might not have been the best way to learn C# 3.0, but never mind that now.
It is with great pride that I can tell you that the following query now works:
from e in db.Employees
select new { Name = e.FirstName + " " + e.LastName, Phone = e.HomePhone }
Why is this trivial query important? Well, it is important because about a year ago I said that there is only a single feature in Linq to SQL that NHibernate doesn't have. I also promised to fix it the moment someone told me what it is. No one did, so it waited until today.
The main issue had to do with the Criteria API and handling parameters; no one ever needed to do that, it seems. When they did, they generally used HQL, which did have this feature. Since I have based Linq for NHibernate on the Criteria API*, that was a problem.
Now that ReSharper works on C# 3.0, I can actually get things done there, so I sat down and implemented it. It is a surprisingly difficult issue, but I threw enough code at it until it gave up (I wonder if there is a name for nested visitors...).
At any rate, I strongly recommend that you take a look at the project. Bug (fixes) and other patches are welcome**.
* This decision was important for two reasons: the first was that it is generally easier to use the Criteria API programmatically, and the second was that I wanted to ensure that the Criteria API (which I favor) was as full featured as the HQL route.
** Yes, I know that I'll regret it.
Comments
Good job mate!
"nested visitors"...
I haven't seen the code yet, but "nested visitor" gives me a vision of some sort of intermediary expression structure between Linq and NHibernate expressions.
Ayende, I am very pleased to hear about the revival of this project. My last version of the trunk gave me failing tests. I presumed that everyone lost interest in this project, but I never figured out why.
Maybe now that ReSharper finally works with Linq, we will see a "spike" of new interest in Linq and the other C# 3.0 features.
Is there support for detached criteria, so that we can use it with Castle Active Record as well? Or is there any other way to integrate it with ActiveRecord rather than using ActiveRecordMediator?
Nested visitors refers to a visitor that creates another visitor (of the same type) to handle additional information.
But yes, it is definitely mapping from Linq concepts to criteria concepts.
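As a rough illustration of "a visitor that creates another visitor of the same type", here is a minimal sketch. It is not the Linq for NHibernate code: `MemberPathVisitor` and `Paths` are hypothetical names, and it assumes the public `System.Linq.Expressions.ExpressionVisitor` base class that ships with later framework versions (.NET 4.0 onward) instead of a hand-rolled one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical sketch: collects a readable description of member accesses
// and method calls found in an expression tree.
public sealed class MemberPathVisitor : ExpressionVisitor
{
    public List<string> Paths { get; } = new List<string>();

    protected override Expression VisitMember(MemberExpression node)
    {
        Paths.Add(node.Member.Name);
        return base.VisitMember(node);
    }

    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        // The "nested visitor": a fresh instance of the same type walks the
        // call's target and arguments, so what it finds stays separate from
        // what the outer visitor has collected so far.
        var nested = new MemberPathVisitor();
        if (node.Object != null)
        {
            nested.Visit(node.Object);
        }
        foreach (Expression argument in node.Arguments)
        {
            nested.Visit(argument);
        }

        // Fold the nested result back into the outer visitor as one entry.
        Paths.Add(node.Method.Name + "(" + string.Join(", ", nested.Paths) + ")");
        return node;
    }
}
```

Visiting the body of `u => u.Host.ToLower().StartsWith("a")` (with `u` a `System.Uri`) leaves the outer visitor with the single entry `StartsWith(ToLower(Host))`; each call level was handled by its own nested instance.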
Onur,
I'll ensure that it will work with Active Record, have no fear in this regard :-)
Will there be LINQ for ActiveRecord?
Not likely.
There might be a bridge between Active Record and Linq for NHibernate, but I'll try to ensure that even this is not necessary.
Thanks for your great work. I'm just learning Linq, NHibernate and ActiveRecord, and doing some small tests.
With Ken Egozi's blog post http://www.kenegozi.com/Blog/2007/11/18/activerecord-dot-linq-naive-but-working.aspx
I was able to use NHibernate.Linq with ActiveRecord.
I have a small problem, not sure if this should work:
I'm always getting the following error:
Index was out of range. Must be non-negative and less than the size of the collection.
The error is occurring in QueryUtil.cs at line 174.
expr is d.Name.ToLower().
Am I doing something wrong, or is this type of query not supposed to work?
Also, is there a way to use a query more than once? Like:
var q = from d in ac.Session.Linq<Country>() select d;
int count = q.Count();
List<Country> countries = q.ToList();
I was trying this but I got an error.
Zoltan
Zoltan,
I don't think that we support "d.Name.ToLower().StartsWith("a")" at the moment.
This is a bug.
What is the error you got there?
The error that i'm getting when using d.Name.ToLower().StartsWith("a") ?
[ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index]
System.ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument argument, ExceptionResource resource) +62
System.ThrowHelper.ThrowArgumentOutOfRangeException() +12
System.SZArrayHelper.get_Item(Int32 index) +2653952
System.Collections.ObjectModel.ReadOnlyCollection`1.get_Item(Int32 index) +50
NHibernate.Linq.QueryUtil.GetMemberNames(Expression expr) in H:\temp\svn\NHibernate.Linq\NHibernate.Linq\QueryUtil.cs:174
NHibernate.Linq.QueryUtil.GetMemberName(Expression expr) in H:\temp\svn\NHibernate.Linq\NHibernate.Linq\QueryUtil.cs:16
NHibernate.Linq.WhereArgumentsVisitor.GetLikeCriteria(MethodCallExpression expr, MatchMode matchMode) in H:\temp\svn\NHibernate.Linq\NHibernate.Linq\WhereArgumentsVisitor.cs:226
NHibernate.Linq.WhereArgumentsVisitor.VisitCallExpression(MethodCallExpression expr) in H:\temp\svn\NHibernate.Linq\NHibernate.Linq\WhereArgumentsVisitor.cs:61
There is more, not sure if I should post here
Let us take this discussion to the rhino tools dev mailing list, ok?
Yep, np
Mapping String.Concat calls to a db function isn't that hard. The fun begins when people start mixing in-memory calls with db calls in the projection. :) (Yes this is doable, try:
var q = from o in nw.Order
This is nasty for a couple of reasons:
1) you have multiple fields in the query which are used to produce a single end value
2) you have to create in-memory delegates which are called to produce the end result.
This is only solvable with an all-purpose approach, otherwise you'll run into problems when someone instantiates an in-memory object, passes values from the projection to the ctor and calls a method on it to obtain the Real value etc.
Why would you need a nested visitor? The only reason I needed a new visitor inside a pass (I use 6 passes over the tree, rewriting it in every pass) is to look up things at that level by traversing a subtree, but only on some occasions. (I think 'handler' or 'crawler' is more appropriate. 'Visitor' suggests the visitor pattern is implemented by MS, which isn't the case: Expression doesn't have a Visit virtual method. :( )
Frans,
The complexity was that NH didn't have the concept of doing this through the criteria API, not the Linq stuff itself.
Hibernate is really biased toward using HQL, which means that the Criteria API side has been functional, but not as full featured.
Broadly, Linq for NHibernate is using two major parts of NH, ICriterion and IProjection.
ICriterion is used for booleans, IProjection for selects.
The problem is that Linq mixes them fairly freely, and that was never in the plan for Hibernate.
I added support for IProjection to consume ICriterion and I am working on making ICriterion consume IProjection now.
Fun stuff, and it significantly enhances NH's criteria query abilities.
Of course, I still think that whoever designed Linq was mad.
The example that you gave is a good example.
There is no way to know where it is going to run, and that is bad mojo.
What happens if it was:
var q = from o in nw.Order
where DateTime.DaysInMonth(o.OrderDate.Value.Year, o.OrderDate.Value.Month) == 30
select o;
This means that you have to load the ENTIRE TABLE to memory to do this.
Crazy, crazy, crazy.
I don't want something as trivial as that causing that much trouble.
And I built my own primitive visitor for that.
I think that the fact that they didn't provide an expression visitor is a flat out shame.
It is not like anyone else needs that, right?
Frans,
You mean an abstract (not virtual) Visit method, right ? ;-)
But I'm not sure I understand this:
"otherwise you'll run into problems when someone instantiates an in-memory object, passes values from the projection to the ctor and calls a method on it to obtain the Real value etc. "
Could you elaborate?
/Mats
"Broadly, Linq for NHibernate is using two major parts of NH, ICriterion and IProjection. ICriterion is used for booleans, IProjection for selects.
The problem is that Linq mixes them fairly freely, and that was never in the plan for Hibernate."
Yes, you need projections which represent 'derived tables' (SQL term) a LOT. I had to add it to LLBLGen Pro as well to make things work.
"Of course, I still think that whoever designed Linq was mad."
haha :D I agree. Some stuff is OK, but other things are flat-out stupid. I mean: who came up with the lame idea of deferred execution of linq queries, but ONLY a part of the queries is deferred executed!
This one is:
var q = from c in nw.Customer select c;
but this one isn't:
var q = (from c in nw.Customer select c.Country).Contains("USA");
That last one is executed immediately...
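The split between the two forms can be observed with a small sketch. This uses in-memory LINQ to Objects as a stand-in, since `nw` and its provider are not shown here; `DeferredExecutionDemo`, `Touch` and `Evaluations` are hypothetical names. The point it illustrates is only that a query expression builds a query object, while an operator such as `Contains` returns a plain `bool` and therefore has to run the query on the spot.

```csharp
using System;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Counts how many times the selector has actually run.
    public static int Evaluations;

    private static string Touch(string country)
    {
        Evaluations++;
        return country;
    }

    // Returns the counter after: defining a query, calling Contains,
    // and finally enumerating the first query.
    public static int[] Run()
    {
        Evaluations = 0;
        string[] countries = { "USA", "Germany", "France" };

        // Deferred: this only builds the query, the selector has not run yet.
        var deferred = from c in countries select Touch(c);
        int afterDefinition = Evaluations;

        // Immediate: Contains returns a bool, so the query runs right here.
        bool hasUsa = (from c in countries select Touch(c)).Contains("USA");
        int afterContains = Evaluations;

        // Enumerating the deferred query finally runs its selector,
        // once per element.
        var list = deferred.ToList();
        int afterEnumeration = Evaluations;

        return new[] { afterDefinition, afterContains, afterEnumeration };
    }
}
```

The first counter is still zero after the query is defined, it has moved by the time `Contains` returns, and it moves again by three when the first query is enumerated with `ToList()`.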
"The example that you gave is a good example.
There is no way to know where it is going to run, and that is bad mojo.
What happens if it was:
var q = from o in nw.Order
where DateTime.DaysInMonth(o.OrderDate.Value.Year, o.OrderDate.Value.Month) == 30
select o;
This means that you have to load the ENTIRE TABLE to memory to do this.
Crazy, crazy, crazy."
True, that's the problem. However IGNORING this isn't helping as people will want to execute it in the projection.
The trick is that you should have a generic way to map a call onto a database function. If such a mapping isn't found, you keep the call around. When handling the projection, you have support for Call and MemberAccess; in all other places you don't. It then ends in tears with an exception, which is what you want in this case. This still isn't foolproof though: a nested select in a join branch, for example, with a method call in the projection will cause problems.
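A minimal sketch of that idea, under stated assumptions: `DbFunctionMap`, `Resolve` and the two registered entries are hypothetical and are not taken from LLBLGen Pro or Linq for NHibernate. A mapped call resolves to a SQL function name, an unmapped call inside a projection is kept for in-memory execution (signalled here by `null`), and an unmapped call anywhere else throws.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;

public static class DbFunctionMap
{
    // Hypothetical registry: CLR method -> SQL function name.
    private static readonly Dictionary<MethodInfo, string> Map =
        new Dictionary<MethodInfo, string>
        {
            { typeof(string).GetMethod("ToLower", Type.EmptyTypes), "LOWER" },
            { typeof(string).GetMethod("ToUpper", Type.EmptyTypes), "UPPER" },
        };

    // Returns the SQL function name for a mapped call, or null when the call
    // is unmapped but sits in a projection and can be kept for in-memory use.
    public static string Resolve(MethodCallExpression call, bool inProjection)
    {
        string sqlFunction;
        if (Map.TryGetValue(call.Method, out sqlFunction))
        {
            return sqlFunction;
        }

        if (inProjection)
        {
            return null; // keep the call around; run it on the fetched rows
        }

        // Outside the projection there is nowhere to run it: fail loudly.
        throw new NotSupportedException(
            "No database function mapped for " + call.Method.Name);
    }
}
```

With this shape, `DateTime.DaysInMonth(...)` in a where clause fails with an exception until someone registers a mapping for it, while the same call in a select is left in place.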
"And I built my own primitive visitor for that.
I think that the fact that they didn't provide an expression visitor is a flat out shame. It is not like anyone else needs that, right?"
Matt Warren made one available on his blog: http://blogs.msdn.com/mattwar
It's the same as the one inside the .NET framework (which is internal. Joy..)
It has some flaky routines though, so you better write your own (it's fairly straightforward). I peeked into your code this morning and I saw you use a different approach than I do: you try to handle everything at once instead of re-writing the tree element by element. This is cumbersome, as with joins for example (groupjoin etc.) you need to refer to parts of the tree already processed, so calling out into different handlers isn't going to cut it: you need one big handler to merge everything together (which handles a tree which is pre-processed a couple of times by rewriting elements.)
@Mats: I meant a method which the visitor calls by passing itself to it :). Yes, abstract is fine; virtual doesn't make sense, as you have to override it in all cases.
"Could you elaborate"
var q = from c in nw.Customers
I'm not completely done with this scenario, my 'new' handler finds this a projection to new Foo instances, which isn't the case: it's a list of resultvalues from GetSomeValue(). I'm not sure if this is doable though, as it's pretty tough to distinguish if it's a list of Foo's, or a list of resultvalues from GetSomeValue().
I don't think it's a common scenario, but it illustrates the point. ;) (I haven't tried whether linq to sql can handle this, though ;))
Frans,
I just tried the following example using LINQ to SQL:
var q = from c in db.Customers select new Foo(c.Country, c.City).GetSomeValues();
with the following implementation of Foo:
public class Foo
{
}
It resulted in the following SQL:
SELECT [t0].[Country] AS [country], [t0].[City] AS [city]
FROM [dbo].[Customers] AS [t0]
And the following output (I had one customer):
Mr_Doki
I'm still not sure what the problem is, exactly?
Yes, in this case the operations inside GetSomeValues() could in theory have been transformed to SQL and executed by the database, so that all records in the table wouldn't have to be loaded into memory... is that what you are referring to? Because while it happens to be true in this particular case that the operation could be turned into SQL, it wouldn't be true in the general case.
/Mats
@Ayende
"Of course, I still think that whoever designed Linq was mad."
LOL - you say that like it was a BAD thing! :-P
@Frans (more),
"This one is:
var q = from c in nw.Customer select c;
but this one isn't:
var q = (from c in nw.Customer select c.Country).Contains("USA");
That last one is executed immediately..."
Well, your query is a l2o (linq to objects) query using the results of a l2s query as its source of objects. Since l2o queries are executed directly, the observed behavior seems to make sense? The inner l2s query will be deferred until someone executes it, but since that someone is the outer l2o query, it will be executed directly.
"It then ends in tears with an exception, which is what you want in this case."
I don't agree. If I write:
var q = from o in nw.Order
where DateTime.DaysInMonth(o.OrderDate.Value.Year, o.OrderDate.Value.Month) == 30
select o;
Then that (what I just wrote, in code) is what I want to happen. It will come as no surprise to me if I write that statement that the whole table will be loaded, as that is what I have fairly explicitly asked for. Throwing an exception to inform me that I'm doing what I know I am doing and then refusing to do it doesn't seem helpful?
/Mats
Mats,
"Then that (what I just wrote, in code) is what I want to happen. It will come as no surprise to me if I write that statement that the whole table will be loaded, as that is what I have fairly explicitly asked for. Throwing an exception to inform me that I'm doing what I know I am doing and then refusing to do it doesn't seem helpful?"
There are two things going on there. The first is the technical feasibility of this. The second is the gross violation of the principle of least surprise.
You really can't look at this statement and tell me what it does:
var q = from o in nw.Order
where DateTime.DaysInMonth(o.OrderDate.Value.Year, o.OrderDate.Value.Month) == 30
select o;
For Linq to NHibernate, we have decided that all DB methods are extension methods of the IDbMethods interface, which makes it easier to distinguish between in-memory and in-DB methods.
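To make the idea concrete, here is a hedged sketch of how such a convention could look. Only the `IDbMethods` name comes from the text above; the `DaysInMonth` extension, `DbMethodExtensions` and `DbMethodDetector` are hypothetical and are not the actual Linq for NHibernate API. The extension method never runs in memory; it exists so the call shows up in the expression tree, where a provider can recognise it by its first parameter type.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Marker interface: anything hanging off it is meant to run in the database.
public interface IDbMethods { }

public static class DbMethodExtensions
{
    // Hypothetical DB-side method. It is only a placeholder for translation,
    // so calling it directly in memory is an error.
    public static int DaysInMonth(this IDbMethods db, DateTime date)
    {
        throw new NotSupportedException(
            "DaysInMonth is a database method and can only be used inside a query.");
    }
}

public static class DbMethodDetector
{
    // A call counts as a DB method when it is a static (extension) method
    // whose first parameter is an IDbMethods.
    public static bool IsDbMethod(MethodCallExpression call)
    {
        ParameterInfo[] parameters = call.Method.GetParameters();
        return call.Method.IsStatic
            && parameters.Length > 0
            && typeof(IDbMethods).IsAssignableFrom(parameters[0].ParameterType);
    }
}
```

A reader of the query then sees something like `db.DaysInMonth(o.OrderDate.Value)` and knows it is intended for the database, whereas a bare `DateTime.DaysInMonth(...)` is visibly an in-memory call.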
This is important because you really want this kind of thing to be clearly visible for you.
@Mats:
"I'm still not sure what the problem is, exactly?
Yes, in this case the operations inside GetSomeValues() could in theory have been transformed to SQL and executed by the database, so that all records in the table wouldn't have to be loaded into memory... is that what you are referring to? Because while it happens to be true in this particular case that the operation could be turned into SQL, it wouldn't be true in the general case."
The problem is that the query feeds data to a delegate which is executed on the raw resultset coming from the db and the RESULT of that delegate is the result value for each row.
EVERY linq provider has to implement this scenario, otherwise the query you tested won't work at all: you'll get a crash somewhere, as the method call to GetSomeValues is inside the expression tree. You can't ignore it; you have to implement code to execute it.
So it's:
generate SQL to produce the input values for the in-memory delegate you're going to execute in the projection engine
projection engine applies delegate onto input to produce the projection results.
"Well, your query is a l2o (linq to objects) query using the results of a l2s query as its source of objects. Since l2o queries are executed directly, the observed behavior seems to make sense? The inner l2s query will be deferred until someone executes it, but since that someone is the outer l2o query, it will be executed directly."
No it's not! It's a DB query! :) It results in something like:
SELECT CASE WHEN NOT EXISTS (.... ) THEN 1 ELSE 0 END FROM <...>
NONE of the queries I posted executes ANY linq to objects code. None. That's the hard part of writing a linq provider: you get an expression tree, you have to convert EVERY bit to sql, otherwise the WHOLE query will fail.
An exception is the stuff which can be converted to in-memory code, like:
var q = from c in nw.Customer
Here, the new string[] { "USA", "Germany"}.Where(x=>x.StartsWith("U")) part is an in-memory construct. You can find these with a funcletizer (do a google search, you'll find the 3 entries about it and the code) and compile it into a delegate.
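For readers who have not seen the technique, a very small sketch of the core of it follows. It is not the funcletizer code referred to above: `LocalEvaluator` and its members are hypothetical names, it assumes the public `ExpressionVisitor` base class from later framework versions, and it handles only the simplest case (a subtree that uses no parameters of the surrounding query).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class LocalEvaluator
{
    // Looks for parameters that are not declared by a lambda inside the
    // subtree itself; those belong to the outer query and block evaluation.
    private sealed class FreeParameterFinder : ExpressionVisitor
    {
        private readonly HashSet<ParameterExpression> bound =
            new HashSet<ParameterExpression>();

        public bool FoundFree { get; private set; }

        protected override Expression VisitLambda<T>(Expression<T> node)
        {
            foreach (ParameterExpression p in node.Parameters) bound.Add(p);
            return base.VisitLambda(node);
        }

        protected override Expression VisitParameter(ParameterExpression node)
        {
            if (!bound.Contains(node)) FoundFree = true;
            return node;
        }
    }

    public static bool CanEvaluateLocally(Expression expression)
    {
        var finder = new FreeParameterFinder();
        finder.Visit(expression);
        return !finder.FoundFree;
    }

    // Wraps the subtree in a parameterless lambda, compiles it to a
    // delegate and runs it in memory.
    public static object Evaluate(Expression expression)
    {
        return Expression.Lambda(expression).Compile().DynamicInvoke();
    }

    // Demo on the in-memory construct discussed above.
    public static List<string> Demo()
    {
        Expression<Func<string, bool>> predicate =
            c => new[] { "USA", "Germany" }.Where(x => x.StartsWith("U")).Contains(c);

        var contains = (MethodCallExpression)predicate.Body;
        Expression source = contains.Arguments[0]; // the Where(...) call
        Expression item = contains.Arguments[1];   // the query parameter c

        if (!CanEvaluateLocally(source) || CanEvaluateLocally(item))
        {
            throw new InvalidOperationException("Unexpected tree shape.");
        }
        return ((IEnumerable<string>)Evaluate(source)).ToList();
    }
}
```

In the demo, the `Where(...)` subtree only uses its own lambda parameter `x`, so it can be compiled and run up front, while `c` belongs to the query and has to stay in the tree for the provider to translate.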
"
I don't agree. If I write:
var q = from o in nw.Order
where DateTime.DaysInMonth(o.OrderDate.Value.Year, o.OrderDate.Value.Month) == 30
select o;
Then that (what I just wrote, in code) is what I want to happen. It will come as no surprise to me if I write that statement that the whole table will be loaded, as that is what I have fairly explicitly asked for. Throwing an exception to inform me that I'm doing what I know I am doing and then refusing to do it doesn't seem helpful?"
Good luck with that. It can't be done. The problem is: you need results of the in-memory query INSIDE the db! Check:
var q = from o in nw.Order
where DateTime.DaysInMonth(o.OrderDate.Value.Year, o.OrderDate.Value.Month) == 30
join c in nw.Customer on o.CustomerID equals c.CustomerID
select c;
You need the in-memory query result back in the db. You can imagine that it's possible to create a query where you need to pass resultsets back and forth multiple times to be able to produce the results (if applicable at all). This is not doable.
This is the weak side of Linq: the developer can tie things together which actually can't be tied together. In Linq to objects I can group on boolean expression results, in the DB I can't. So the same linq query can't run on the DB. For a developer it's not obvious why this is.
I'm new to .NET, but all this reminds me (déjà vu?) of MS Access and Access SQL with linked tables (and pass-through SQL).
I solved this btw:
var q = from o in nw.Order
where DateTime.DaysInMonth(o.OrderDate.Value.Year, o.OrderDate.Value.Month) == 30
select o;
with a custom mapping. See my latest blogpost. The SQL expression is horrible to look at, but who cares :)
Just took a look at the source - looks like the solution file needs to be changed for VS 2008 instead of the Orcas beta.