Comments on DNR #226: Entity Framework
I have listened with interest to the DNR episode about the Entity Framework, and I am still not convinced by what they are doing. Daniel says that after explaining the grand vision to the code better guys, they agreed that this is what they wanted, but he doesn't disclose this grand vision. Update: I went back and listened again; Daniel is talking about Jeffrey Palermo specifically, not the rest of the code better guys. Sorry for the confusion.
A unified logical model for entities is nice in theory, but I already talked about why this is a hard problem to solve. Furthermore, it looks like they are focusing on the grand vision too much, and leaving aside the real users right now.
There are going to be a lot of users that would buy into the Entity Framework, lock, stock & barrel. But if it is hard to use for the common scenarios, it is going to generate a lot of ill will toward the framework. As far as I understand, they are nearly at feature-freeze for Orcas, and there seem to be a lot of rough edges all over the place.
That aside, here are a few more comments from the episode:
-
The Entity Framework is cheating. They are moving the burden of the actual query generation from the ORM layer to the database provider layer. Basically, it is sending the query AST to the database provider. The problem with that is that this would make it much harder to build an EF-compatible database provider. You would need to handle not only the wire protocol, but also an optimizing provider for this AST for your database. I am pretty sure that SQL Server will have a good one, and Oracle would have a working one, but that leaves a lot of other databases out of the loop.
-
Something that I haven't heard about so far is optimizations. There are scenarios where I have to write my own SQL to get the data back; is this something that the EF supports?
-
Code generation - I have seen the amount of code that is required to make the EF happy. It is not pretty. You need to implement a lot of interfaces, handle property changes, etc. This is a lot of code that really shouldn't be there.
I just spiked what it would take for NHibernate to handle the INotifyPropertyChanged implementation for the entities, and that one is less than 100 lines of code. And that is for doing it across the board!
-
Automatic association wiring. By that I mean that if I do customer.Orders.Add(new Order()), the Order's customer is set to the customer we just gave it. This is typically a thorny issue with regard to the mismatch between the relational model and the OO model: there are no one-way associations in the relational model, and there are only one-way associations in the OO model. The Entity Framework supports this with special collections and lambda expressions. NHibernate had the same about a year ago in NHibernate Generics.
I stopped supporting this functionality a while ago, because I do not believe that this is a good way to handle associations in the model. There are several reasons for that.
-
There are various scenarios where I want a one-way association, usually for transient instances that I want to use for business logic calculations.
-
There is business logic associated with associations :-).
If we take the customer.Orders example, adding a new order should verify that the customer has the credit to pay for it. Where does this logic exist now?
The best practice for NHibernate has always been to have an AddOrder(Order o) method, which would handle the business logic and the wiring of the associations.
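A minimal sketch of what such an AddOrder method could look like. The Customer and Order classes, the CreditLimit property, and the credit rule itself are all hypothetical, made up for illustration; nothing here is NHibernate API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entities for illustration only; the credit rule is an assumption.
public class Order
{
    public decimal Total { get; }

    // Set only by Customer.AddOrder, so the back-reference cannot drift.
    public Customer Customer { get; internal set; }

    public Order(decimal total) { Total = total; }
}

public class Customer
{
    private readonly List<Order> orders = new List<Order>();

    public decimal CreditLimit { get; }

    // Read-only view: callers cannot bypass AddOrder by adding to the list.
    public IReadOnlyCollection<Order> Orders => orders.AsReadOnly();

    public Customer(decimal creditLimit) { CreditLimit = creditLimit; }

    public void AddOrder(Order order)
    {
        if (order == null) throw new ArgumentNullException(nameof(order));

        // The business rule lives right next to the association it guards.
        decimal outstanding = 0m;
        foreach (Order existing in orders) outstanding += existing.Total;
        if (outstanding + order.Total > CreditLimit)
            throw new InvalidOperationException("Customer does not have enough credit for this order.");

        // Wire both sides of the association in one place.
        order.Customer = this;
        orders.Add(order);
    }
}
```

With this shape, an order that passes the check ends up in customer.Orders with its Customer property set, and an order that would exceed the limit is rejected before either side of the association is touched.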
-
Innovation: using both Linq and eSql to get dynamic queries.
This one really annoys me. There is zero innovation in the ability to have several ways to query the same source, using the best way for the scenario. As far as I have seen, the dynamic querying capabilities of the EF are fairly weak, relying on either string concatenation or building the expression tree manually. Neither of which is very conducive to maintainability.
-
Lazy Loading - it is possible that I got it wrong, but it seems like Daniel said that lazy loading is not something that is supported in EF. (31:40)
"We never make a query unless you know it is going to happen, very explicit."
On the surface, this is very good, but what exactly does that mean? If it means that I cannot do this:
Customer customer = EntityFramework.GetCustomer(15); // not the real way of doing it
foreach (Order order in customer.Orders)
    Console.WriteLine(order);
Then this puts the burden of knowing what to bring in the hands of the developer, and that is a real PITA to handle explicitly. This is not a good place to be in. This is one of the basic features of an OR/M; is it really missing?
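To make the term concrete, here is a rough sketch of what lazy loading of customer.Orders amounts to. It is not how NHibernate or the EF implement it; the Customer and Order classes and the loadOrders delegate are hypothetical stand-ins, with the delegate playing the role of the query an OR/M would issue.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entities for illustration only.
public class Order
{
    public int Id { get; }
    public Order(int id) { Id = id; }
    public override string ToString() => "Order #" + Id;
}

public class Customer
{
    private readonly Lazy<IList<Order>> orders;

    public int Id { get; }

    // loadOrders stands in for the query an OR/M would send to the database.
    public Customer(int id, Func<int, IList<Order>> loadOrders)
    {
        Id = id;
        orders = new Lazy<IList<Order>>(() => loadOrders(id));
    }

    // The first access runs the loader; later accesses reuse the loaded list.
    public IList<Order> Orders => orders.Value;
}
```

Constructing the customer runs no query; the first foreach over customer.Orders runs the loader once, and further accesses do not run it again. Without something like this, the caller has to say up front which associations to fetch.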
-
Daniel also talks about dropping the relational model and storing the entity model directly in SQL Server. This smells a lot like an OODB to me, and I wonder if this is a good idea. The major issue with OODBs is that the tools to work with them were limited. Just about anything can work with relational data; very little can work against some random entity model. Microsoft certainly has the resources to do this, of course, and Daniel mentions that several teams are working on this, but I don't think that we will see this any time soon.
Comments
"Daniel says that after explaining the grand vision to the code better guys, they agreed that this is what they wanted"
No, no, no! That is not what I said or heard out of that meeting he was referring to on DNR. You may not pin that one on us collectively. One CodeBetter guy came away all bright and shiny maybe, but definitely not all of us. I spent the whole time trying to convince them to go towards a true PI approach with a Unit of Work strategy for change tracking. We also spent a lot of time trying to talk them into a different configuration concept to make mapping smoother and hide the gory details. The Entity team did not get a free pass from the CodeBetter guys.
I went back and listened, and you are right, 14:45
"
There is this impression that this group of people is that we completely don't get their style of working.
We don't get what they do with NHibernate, we are trying to build this other product.
The more we talk about it, at one point I stopped Jeffrey Palermo and I sat down and I talked for an hour straight, here, let me a vision for a few releases out, this is what it could be like.
And I sort of like describe it, and he is "Yeah, perfect, you got it"
"
Right off the bat, the "CodeBetter" guys did not give any sort of collective "seal of
About automatic association wiring:
We added that 2 years ago, I think. It was a feature which was requested a lot. I agree that there are restrictions, namely 1-way relations and validation/authorization/auditing.
1-way associations are easy to implement and you can take that info into account when syncing the opposite association. Validation/authorization/auditing on these association sets: that's easy as well, just implement events for this or virtual methods or call into a validator object you inject with whatever DI framework you use (we use that last option). This is flexible, as it gives the developer the opportunity to utilize this feature if he needs to and also gives the developer the freedom to indeed add validation etc. on that association without the necessity to add methods like AddOrder.
The downside of 'AddOrder' is that it doesn't work in databinding scenarios: you have to write plumbing code to call AddOrder yourself. Ok, you can get away with it by simply declaring databinding something you should never use, but most business app developers simply utilize it to no end (and beyond, trust me, you don't want to know what people try to be able to use databinding instead of handwritten plumbing). It doesn't work in databinding scenarios in such a way that when you add a new row in the orders grid of a customer, you simply want everything set up automatically and validated when required, and you can set this up in the framework so everyone can use it without writing plumbing.
Why didn't you opt for it in the dynamic proxy NHibernate creates? I mean: you could add it / keep it for the people who want it and simply leave it as is for the people who don't want it.
About lazy loading: I can understand why they don't implement it. In general lazy loading is more of a burden than a blessing. The reason for this is that it leaks persistent storage access to different tiers via the lazy loadable associations. If you want to prevent your UI developers from utilizing lazy loading, or are sending entities across the wire to a service, how are you preventing lazy loading from being called under the hood? We support 2 models, one has lazy loading, the other one doesn't (and is more geared towards disconnected environments). You don't really miss lazy loading in the second model, as long as you have prefetch paths to specify prefetching what you want (also into an existing graph). The thing is that the model then forces you to write more service oriented software: make the call to the data producer and tell the data producer (or repository, whatever you want to call it) what to get and you get the data and work with it. There's no leaky lazy loading under the hood bypassing the repository; you need to call the data producer to get the data, period.
About INotifyPropertyChanged -
I looked briefly at implementing this with DP2 using a Mixin. I thought it would be fairly straightforward until I realized that Mixins hadn't been implemented yet. I wasn't up to the task to implement Mixin support, but once there I think INotifyPropertyChanged should be pretty easy?
I did it in DP1 for NHibernate, it was less than 100 lines of code, and very slick.
I attended Code Camp 7 in Waltham, MA, USA a few weeks ago and went to a presentation by Julie Lerman, where she said that the Entity Framework will do lazy loading, by default. According to her, it can be turned off.
I still don't really see the model behind EF. A very simplistic view is that it allows one to assign projections to classes, which seems, ahem, a bit basic. Then again, I am not aware of any comprehensive list of the problems faced by ORM frameworks and how different ORM tools/frameworks solve said problems. I think I like data normalization a bit too much. :-D
Luke: the basic idea of the EF is this:
There's a mismatch between an OO model and a relational model, for example when you use inheritance in the OO model. The EF creates a new kind of relational model which does implement this, so effectively there's no mismatch anymore. Basically you map a class 1:1 to an EF entity, and the EF then takes care of inheritance etc. for you.
Is this really different? No, I don't think it is really different, except that the model internally in an o/r mapper is now on the outside of the mapper, namely the EDM.
What do you mean by DP1 and DP2?
How do you implement INotifyPropertyChanged without sticking it into every entity?
@Ayende,
Please correct the spelling: "Jeffrey Palermo".
I did spend the entire time talking with Daniel while others worked on others on the team. Daniel did explain the vision in high level terms, and it sounds very good, but the vision is a LONG way out and not even close to Orcas, so it's nothing but a vision at this point. Visions always sound good, and part of the vision was persistence-ignorant domain objects mapped to a relational database. Great, but Orcas doesn't have this. We split the customer base up into 3 pieces with DDD and ORM being 1/3 of the customer base (we're estimating). They are seeking to service the other 2/3 or so with Orcas by being able to work in a relational and very data-driven way. DDD folks are out of luck for Orcas. I stressed the importance of domain objects mapped to the database without having to implement a certain interface and without having to fire any events. In fact, I argued for the domain model project to not even have a reference to any Entity Framework assemblies.
We won't have anything great in Orcas, and I've accepted that, but, hey, I'm loving NHibernate 1.2!!! The vision sounds great, but my kid (who isn't born yet) will be in Pre-K before the vision is realized.
I wish them all the luck, and I'll try to help Microsoft provide a product that serves companies I work with.
@Jeffrey,
Sorry about that, fixed.
@Frans: I still don't see how the EF avoids the problems mentioned in the Vietnam of Computer Science: http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx . People act like the EF magically avoids problems typically experienced by O/RMs, but I'm not seeing it. Perhaps I'm also not impressed because I've built a SQL generation engine that does what the EF seems to do and more...
I'm two weeks into a new project in midtown Manhattan, and I'm using the train rides to do side project