Persistence Ignorance in the Entity Framework

Dan Simmons is talking about persistence ignorance in the Entity Framework. It looks like Microsoft is going to make an effort to have the EF support at least some level of PI in V1, and pure PI in later versions. This is a Good Thing.

I would like to talk about some of the details in his post:

There's no question, though, that complete persistence ignorance comes at a price--both in the performance of applications built with "pure POCO", persistence ignorant domain models and as a result in the complexity of the entity framework which enables them.

There is a question, actually. I can't speak to the complexity of the EF, but I can speak about the framework requirements of an OR/M solution that wants to support persistence ignorance. From the rest of the post, it looks like doing PI in the EF is going to run into some issues with the existing infrastructure.

Dan is bringing up two issues:

  • They must store a copy of an EntityKey.
    Proposed solution: Choosing not to store EntityKeys on the entities, for instance, means that navigating from an entity to the ObjectStateEntry which matches it either requires a brute-force search of the ObjectStateManager or for the ObjectStateManager to maintain a dictionary mapping from CLR reference to ObjectStateEntry which is a significant expense.
    My reaction: You are going to need to keep track of loaded objects anyway, because you need to do identity resolution. That almost guarantees that you are already tracking them somewhere, so why not use the same structure for this purpose?
    Also, what is the significant expense in keeping a dictionary of <EntityKey, Entity>?
  • They must provide change tracking notifications through a prescriptive interface.
    Proposed solution: Not supporting the change tracking mechanism means that the ObjectStateManager must cache a copy of the original values for each entity (all original values and they must be cached whether or not the entity is modified).
    My reaction: The way you do it now is basically the same, only the original values live in the entity itself rather than in the state manager. What is the big difference? There is a case to be made here because of value type copy semantics (entities are usually composed of value types and strings), but I don't think that it is a big issue. In most cases, the lifetime of both the entities and the session cache is short.
    If it bothers you, you can implement change tracking by interception, and add only the modified values to the ObjectStateManager.
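
To make both reactions a bit more concrete, here is a minimal sketch in Python (not EF code, and not how the EF does it). Customer, Session, attach, key_of and changes are all hypothetical names I made up for illustration. The point is only that the structure you already need for identity resolution can also answer "which key belongs to this reference?", and that a snapshot of the original values is enough to detect changes on an entity that knows nothing about persistence.

```python
class Customer:
    """A plain entity: no base class, no stored key, no change notifications."""

    def __init__(self, name, city):
        self.name = name
        self.city = city


class Session:
    """Hypothetical stand-in for a state manager (not the EF ObjectStateManager API)."""

    def __init__(self):
        # key -> entity, needed anyway for identity resolution
        self._by_key = {}
        # id(entity) -> (key, entity, original values); the entry holds a
        # strong reference to the entity, so its id() cannot be reused
        self._entries = {}

    def attach(self, key, entity):
        # Identity resolution: the same key always yields the same instance.
        existing = self._by_key.get(key)
        if existing is not None:
            return existing
        self._by_key[key] = entity
        # Shallow copy of the current field values, taken at load time.
        self._entries[id(entity)] = (key, entity, dict(vars(entity)))
        return entity

    def key_of(self, entity):
        # Reference -> key lookup, without the entity storing its own key.
        return self._entries[id(entity)][0]

    def changes(self, entity):
        # Compare current values with the snapshot to find modified fields.
        original = self._entries[id(entity)][2]
        return {
            field: (original.get(field), value)
            for field, value in vars(entity).items()
            if original.get(field) != value
        }


session = Session()
alice = session.attach(("Customer", 1), Customer("Alice", "Haifa"))
again = session.attach(("Customer", 1), Customer("Alice", "Haifa"))
print(again is alice)          # True: second load resolves to the first instance
alice.city = "London"
print(session.key_of(alice))   # ('Customer', 1)
print(session.changes(alice))  # {'city': ('Haifa', 'London')}
```

The snapshot here is a shallow copy, which is good enough for entities made of value types and strings. The cost is one extra dictionary entry and one copy of the values per loaded entity, for as long as the session lives.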

One thing that is really encouraging is the suggestion that users of the EF will be able to use custom collections. They are extremely helpful in a number of scenarios. I have already mentioned the temporal-aware collections that hooked into NHibernate in order to give us a much better domain model. Last I heard, custom collections were not supported in the EF, so it is good to hear that they are thinking about it.

One thing that I haven't heard anything about is which extensibility mechanisms will be exposed: specifically, custom types, custom persistence approaches, and so on.