Comments on DNR #226: Entity Framework

I have listened with interest to the DNR episode about the Entity Framework, and I am still not convinced by what they are doing. Daniel says that after he explained the grand vision to the CodeBetter guys, they agreed that this is what they wanted, but he doesn't disclose this grand vision. Update: I went back and listened again; Daniel is talking about Jeffrey Palermo specifically, not the rest of the CodeBetter guys. Sorry for the confusion.

A unified logical model for entities is nice in theory, but I already talked about why this is a hard problem to solve. Furthermore, it looks like they are focusing too much on the grand vision and neglecting the real users right now.

There are going to be a lot of users who buy into the Entity Framework lock, stock and barrel. But if it is hard to use for the common scenarios, it is going to generate a lot of ill will toward the framework. As far as I understand, they are nearly at feature freeze for Orcas, and there seem to be a lot of rough edges all over the place.

That aside, here are a few more comments from the episode:

  • The Entity Framework is cheating. They are moving the burden of the actual query generation from the ORM layer to the database provider layer. Basically, it sends the query AST to the database provider. The problem is that this makes it much harder to build an EF-compatible database provider. You would need to handle not only the wire protocol, but also an optimizing translation of this AST for your database. I am pretty sure that SQL Server will have a good one, and Oracle will have a working one, but that leaves a lot of other databases out of the loop.

  • Something that I haven't heard about so far is optimizations. There are scenarios where I have to write my own SQL to get the data back. Is this something that the EF supports?

  • Code generation - I have seen the amount of code that is required to make the EF happy. It is not pretty. You need to implement a lot of interfaces, handle property changes, etc. This is a lot of code that really shouldn't be there.
    I just spiked what it would take for NHibernate to handle the INotifyPropertyChanged implementation for the entities, and it came to less than 100 lines of code. And that is for doing it across the board!
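
    For a sense of the per-property boilerplate in question, here is a minimal hand-written sketch of INotifyPropertyChanged on a single entity. The Customer class and its Name property are hypothetical; this is neither what the EF generates nor the NHibernate spike mentioned above, only the standard .NET interface implemented by hand.

    ```csharp
    using System;
    using System.ComponentModel;

    // Hypothetical entity, for illustration only.
    public class Customer : INotifyPropertyChanged
    {
        private string name;

        public event PropertyChangedEventHandler PropertyChanged;

        public string Name
        {
            get { return name; }
            set
            {
                if (name == value) return;   // no change, no notification
                name = value;
                OnPropertyChanged("Name");   // repeated in every setter of every entity
            }
        }

        protected virtual void OnPropertyChanged(string propertyName)
        {
            // Copy the delegate so a concurrent unsubscribe cannot null it mid-call.
            PropertyChangedEventHandler handler = PropertyChanged;
            if (handler != null)
                handler(this, new PropertyChangedEventArgs(propertyName));
        }
    }
    ```

    Multiply the setter by every property on every entity and it is easy to see why this belongs in the infrastructure rather than in the model.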

  • Automatic association wiring. By that I mean that if I do customer.Orders.Add(new Order()), the Order's customer is set to the customer we just added it to. This is typically a thorny issue because of the mismatch between the relational model and the OO model: there are no one-way associations in the relational model, and there are only one-way associations in the OO model. The Entity Framework supports this with special collections and lambda expressions. NHibernate had the same thing about a year ago in NHibernate Generics.
    I stopped supporting this functionality a while ago, because I do not believe that this is a good way to handle associations in the model. There are several reasons for that.

    • There are various scenarios where I want a one-way association, usually for transient instances that I want to use for business logic calculations.

    • There is business logic associated with associations :-).
      If we take the customer.Orders example, adding a new order should verify that the customer has the credit to pay for it. Where does this logic exist now?
      The best practice for NHibernate has always been to have an AddOrder(Order o) method, which handles both the business logic and the wiring of the associations.
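
      A minimal sketch of what such an AddOrder method might look like. The Customer and Order classes, the CreditLimit and Total fields, and the credit rule itself are all assumptions made up for this example; nothing here is NHibernate-specific.

      ```csharp
      using System;
      using System.Collections.Generic;

      // Hypothetical domain classes, for illustration only.
      public class Order
      {
          public decimal Total;
          public Customer Customer;
      }

      public class Customer
      {
          private readonly List<Order> orders = new List<Order>();
          public decimal CreditLimit;

          // Exposed as IEnumerable so that callers go through AddOrder to add.
          public IEnumerable<Order> Orders { get { return orders; } }

          public void AddOrder(Order order)
          {
              if (order == null) throw new ArgumentNullException("order");

              // Assumed business rule: outstanding orders may not exceed the credit limit.
              decimal outstanding = 0;
              foreach (Order existing in orders)
                  outstanding += existing.Total;
              if (outstanding + order.Total > CreditLimit)
                  throw new InvalidOperationException("Customer does not have enough credit.");

              // Wire both sides of the association in one place.
              orders.Add(order);
              order.Customer = this;
          }
      }
      ```

      The rule and the wiring live in the same method, so there is no way to add an order that skips either of them.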

  • Innovation: using both Linq and eSql to get dynamic queries.
    This one really annoys me. There is zero innovation in the ability to have several ways to query the same source, using the best way for the scenario. As far as I have seen, the dynamic querying capabilities of the EF are fairly weak, relying on either string concatenation or building the expression tree manually. Neither of which is very conducive to maintainability.
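
    To give a feel for what building the expression tree manually involves, here is a sketch using the standard System.Linq.Expressions API. The Customer class, the DynamicFilter helper, and the property names are hypothetical, and this runs as plain LINQ over an in-memory list, not through the EF; the point is only how much ceremony a single dynamic equality filter takes.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Linq.Expressions;

    // Hypothetical entity, for illustration only.
    public class Customer
    {
        public string Name { get; set; }
        public string City { get; set; }
    }

    // Hypothetical helper, not part of any framework.
    public static class DynamicFilter
    {
        // Builds the equivalent of: c => c.<propertyName> == value
        public static Expression<Func<Customer, bool>> PropertyEquals(string propertyName, string value)
        {
            ParameterExpression c = Expression.Parameter(typeof(Customer), "c");
            MemberExpression property = Expression.Property(c, propertyName);
            ConstantExpression constant = Expression.Constant(value, typeof(string));
            BinaryExpression body = Expression.Equal(property, constant);
            return Expression.Lambda<Func<Customer, bool>>(body, c);
        }
    }
    ```

    Something like customers.AsQueryable().Where(DynamicFilter.PropertyEquals("City", "London")) would then apply it. That is four lines of plumbing for one comparison, and it only gets worse once you need to combine several conditions.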

  • Lazy Loading - it is possible that I got it wrong, but it seems like Daniel said that lazy loading is not something that is supported in the EF. (31:40)

    "We never make a query unless you know it is going to happen, very explicit."

    On the surface, this is very good, but what exactly does that mean? If it means that I cannot do this:

    Customer customer = EntityFramework.GetCustomer(15); //not the real way of doing it
    foreach(Order order in customer.Orders)
       Console.WriteLine(order);

    Then this puts the burden of knowing what to load in the hands of the developer, and that is a real PITA to handle explicitly. This is not a good place to be in. Lazy loading is one of the basic features of an OR/M. Is it really missing?
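
    For anyone who has not looked under the hood, here is a rough sketch of what lazy loading amounts to: the collection holds a loader and only runs it on first access. LazyList and its loader delegate are hypothetical names made up for this example; real OR/Ms, NHibernate included, do this with proxies and persistent collection classes that are considerably more involved.

    ```csharp
    using System;
    using System.Collections;
    using System.Collections.Generic;

    // Hypothetical lazy collection, for illustration only.
    public class LazyList<T> : IEnumerable<T>
    {
        private readonly Func<IList<T>> loader;
        private IList<T> items;   // stays null until the first enumeration

        public LazyList(Func<IList<T>> loader)
        {
            this.loader = loader;
        }

        public bool IsLoaded { get { return items != null; } }

        public IEnumerator<T> GetEnumerator()
        {
            if (items == null)
                items = loader();   // in a real OR/M, the query would run here
            return items.GetEnumerator();
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }
    ```

    With something like this behind customer.Orders, the foreach above just works, and the query is only paid for when the orders are actually touched.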

  • Daniel also talks about dropping the relational model and storing the entity model directly in SQL Server. This smells a lot like an OODB to me, and I wonder if this is a good idea. The major issue with OODBs is that the tools to work with them were limited. Just about anything can work with relational data; very little can work against some random entity model. Microsoft certainly has the resources to do this, of course, and Daniel mentions that several teams are working on this, but I don't think that we will see this any time soon.