Psychic Debugging from Blogs
Well, I've got my psychic hat on today, and I want to refer to Frans' post and try to answer the question that Frans is asking:
Roger:
9. Provide read-only access to foreign key values.
Mike:
This is actually a feature I’m fighting to get into our final milestone for V1. Can you describe the scenarios where this is used? Do you need the ability to query on the foreign key value, or simply expose it on the domain object?
If I have an Order entity object in memory and I want to obtain its CustomerID, but the only way to get that ID is by fetching the related Customer entity into memory instead of reading Order.CustomerID, I lose performance unnecessarily. It's simple data-access stuff and it puzzles me a bit why Mike has to put up a fight to get this into their framework in the first place, especially since the underlying object context knows Order.CustomerID, as it has to save it to the DB if you change the Customer related to that particular Order instance.
I should mention that I know very little about the way the Entity Framework internals work, but I would like to make an educated guess about the reasons for this issue, and why it is apparently hard to do for V1.
The problem is basically this:
Order order = GetOrderFromDatabase();
Console.WriteLine("Order's Customer ID: {0}", order.Customer.CustomerID);
Now, the Entity Framework is apparently not able to handle this case without loading the entire customer entity, which isn't really needed, since you have already loaded the CustomerID column when you loaded the Order entity.
The reason that this is likely hard (again, I am making guesses here) is that the EF uses code generation where other frameworks use interception, and I assume that there isn't really a good way to detect when the identifier property is accessed and when it is a normal property (one that does require loading the object). This likely means that trying to access the Customer entity when it is not loaded would throw, and trying to deal with that would force the EF to handle lazy loading, which is something that they seem to want to push to the client, rather than the framework.
Comments
I think your guesses are good! ;-)
/Mats
It's a bit different. The issue is this:
Order order = GetOrderFromDatabase();
Console.WriteLine("CustomerID: {0}", order.CustomerID);
This isn't possible if the CustomerID is hidden in Order, which it might be, as it's an FK field and thus redundant info IF you also have order.Customer at your disposal (for example, it is illustrated this way in the latest video of the EF designer, the one they won't ship with Orcas).
Now, it's hidden, but it's also there. The EF knows this value. So there must be a way to obtain this CustomerID without referring to the referenced customer entity. This is apparently hard for them, which I find odd.
I don't see what code generation has to do with it though :)
Frans,
If you have order.Customer then I consider order.CustomerID a smell when you are talking about OO OR/M. See here:
http://www.ayende.com/Blog/archive/2006/06/13/7488.aspx
It is the responsibility of the tools to handle this transparently:
order.Customer.CustomerID
And in order to handle this scenario effectively, you really need to have interception.
OR/Ms that are more DB-focused can have order.CustomerID, because that is the expected usage.
When your framework is capable of dealing with Ghost Objects (objects where only the identity properties have been filled out, accessing any other property will cause the object to lazy load) you don't need to (redundantly) store the CustomerID inside the order - you store it in the ID property of the customer ghost object.
If the framework doesn't support ghost objects (which, as Ayende notes above, requires interception) then you're out of luck and would have to store CustomerID inside the order. But that is nowhere near as nice imo, and exposing that CustomerID is just even worse imo. If I want the ID of the customer that the order belongs to, I should be able to do:
order.Customer.ID
Without incurring any additional load. If you can't do that, it's almost back to the drawing board time :-P
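To make the ghost object idea concrete, here is a minimal sketch of what an interception-based mapper might generate. It is an illustration under stated assumptions, not any framework's actual code: `CustomerGhost`, `Order.FromRow`, `LoadCount` and the `fetchName` delegate (a stand-in for a database query) are all hypothetical names.

```csharp
using System;

// Hypothetical entity; members are virtual so a subclass proxy can intercept them.
public class Customer
{
    public virtual string ID { get; set; }
    public virtual string Name { get; set; }
}

// Hand-written stand-in for the proxy an interception-based mapper would generate.
public class CustomerGhost : Customer
{
    private readonly Func<string, string> _fetchName; // stands in for a database query
    private bool _loaded;

    // Counts how often the "database" was hit, for illustration only.
    public int LoadCount { get; private set; }

    public CustomerGhost(string id, Func<string, string> fetchName)
    {
        base.ID = id;
        _fetchName = fetchName;
    }

    // ID is deliberately not overridden: reading it never triggers a load.

    public override string Name
    {
        get
        {
            if (!_loaded)
            {
                // First access to a non-identity property: lazy load.
                base.Name = _fetchName(ID);
                _loaded = true;
                LoadCount++;
            }
            return base.Name;
        }
        set
        {
            base.Name = value;
        }
    }
}

public class Order
{
    public int OrderID { get; set; }
    public Customer Customer { get; set; }

    // Materializes an order row: the FK column becomes the identity of a ghost
    // customer instead of a separate CustomerID field on the order.
    public static Order FromRow(int orderId, string customerIdColumn, Func<string, string> fetchName)
    {
        return new Order
        {
            OrderID = orderId,
            Customer = new CustomerGhost(customerIdColumn, fetchName)
        };
    }
}
```

With this in place, `order.Customer.ID` returns the FK value straight from memory, and only touching `order.Customer.Name` runs the fetch delegate.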
Mats,
I like the term Ghost Objects.
I first read the term in Fowler's PoEAA but I think it has been around for a while. I also like it! :-)
Ok, I wasn't familiar with the ghost object pattern, but indeed in that case, 'customerid' isn't part of the order entity per-se, as it's indeed an FK.
The thing is, though: most developers will assume and expect customerid to be part of order. The reason is simple: they want to be able to set the customerid on an order entity without referring to customer. It makes the intention clearer. This code:
myOrder.Customer.Id = "CHOPS";
might tell the highly-skilled developer 'Oh, 'Customer' isn't loaded because this o/r mapper uses interception under the hood and because of the implementation of the pattern, it won't fetch customer'. However, a random developer picked up from the street will say "This code requires an instance of Customer, which likely will trigger lazy loading".
This code:
myOrder.CustomerID = "CHOPS";
is very clear. Every developer will understand: this sets the customerid field.
order.Customer.Id = "CHOPS";
with ghost objects has another drawback in the clarity department: what if I create a new order and a new customer? Order of the statements then can become important if you're not careful.
But indeed, code generation won't bring you this instantly, though it's not as if it's not possible with a code-generated system IMHO, however I don't think it's that useful, pragmatically speaking: IMHO it makes things less clear for the average developer.
Sure, 'Order' entity in a NIAM diagram doesn't have the attribute 'CustomerID', however usage patterns of how data is used in applications will likely dictate the necessity of having customerid as a property on the order entity.
I've to add:
myOrder.CustomerID = "CHOPS";
brings responsibilities to the o/r mapper as well if fk fields are placed on the fk side of a relation:
if the FK field changes, myOrder.Customer has to be dereferenced. For example, if
myOrder.Customer points to the "BLONP" customer, which for example also means (if available) that myOrder is in the customer.Orders collection, and I then do:
myOrder.CustomerID = "CHOPS";
it means that myOrder.Customer has to be dereferenced, and also that myOrder has to be removed from the Orders collection of the referenced customer.
Similar:
Setting myOrder.Customer to a different customer reference should reset CustomerID in myOrder to the PK value of the new customer instance and should also dereference the old customer reference.
So, it's not as if placing the FK field inside the FK side entity is 'easier', it's actually a large pile of complex code underneath, but it makes things IMHO more natural for a lot of people (hence the reason I for example took that route, I never considered having ghost objects :)).
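A rough sketch of that sync code, assuming a simple two-class model, could look like the following. The property setters are a hypothetical illustration of the rules described above, not how any particular O/R mapper implements them.

```csharp
using System;
using System.Collections.Generic;

public class Customer
{
    public string CustomerID { get; }
    public List<Order> Orders { get; } = new List<Order>();

    public Customer(string id)
    {
        CustomerID = id;
    }
}

public class Order
{
    private string _customerID;
    private Customer _customer;

    // FK field exposed on the FK side of the relation.
    public string CustomerID
    {
        get { return _customerID; }
        set
        {
            if (_customerID == value) return;

            // The FK changed, so the currently referenced customer is stale:
            // dereference it and remove this order from its Orders collection.
            if (_customer != null && _customer.CustomerID != value)
            {
                _customer.Orders.Remove(this);
                _customer = null;
            }
            _customerID = value;
        }
    }

    public Customer Customer
    {
        get { return _customer; }
        set
        {
            if (ReferenceEquals(_customer, value)) return;

            // Dereference the old customer, then sync the FK field
            // to the PK value of the new one.
            if (_customer != null) _customer.Orders.Remove(this);
            _customer = value;
            _customerID = value == null ? null : value.CustomerID;
            if (value != null) value.Orders.Add(this);
        }
    }
}
```

Setting `myOrder.CustomerID = "CHOPS"` while `myOrder.Customer` points to "BLONP" leaves `myOrder.Customer` null and takes the order out of the old customer's `Orders` collection, which is the behavior described above.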
I agree with Mats, if the EF doesn't have this already in it, they HAVE to have it in v1 and they have to go back to the drawing board to design it in UP FRONT. After first release, if this isn't in the framework, you can't add it anymore, as it will break existing code as the behavior of existing code changes.
I have to admit, after more than 5 years writing O/R mapper engine code full time, I've never had the realization that indeed, if you have a class Order, which looks like:
public class Order
{
private int _orderID;
private string _customerID;
private Customer _customer;
//...
}
you'll run into the potential inconsistency where _customerID != _customer.CustomerID and that therefore you should always rely on _customer.CustomerID;
However, looking at it after an hour of thought or so, I do realize that if you don't have the sync stuff in place like I described in my previous reply, it is indeed a smell. If you do have the sync stuff in place, it's not that bad, as the ghost object is also a workaround to have a container to store the ID somewhere till it's either materialized into a real new instance or loaded from the db.
And then we have of course option 3 ;) RelationObjects! ;).
@Frans,
I would consider code like this weird, since it is using the order to set something on the customer:
order.Customer.Id = "CHOPS";
The way that NH does it is:
order.Customer = Session.Load("CHOPS");
Where it is understood that Load() will return a ghost object without hitting the DB, and Get() will hit the DB and return null if the object does not exist.
It is a matter of convention in the API, really.
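The Load()/Get() convention can be sketched with a toy session over an in-memory table. This is only an illustration of the convention, not NHibernate's implementation: `MiniSession`, `IsGhost`, `DatabaseHits` and the dictionary standing in for the Customers table are all made up for the example.

```csharp
using System;
using System.Collections.Generic;

public class Customer
{
    public string Id { get; set; }
    public string Name { get; set; }
    public bool IsGhost { get; set; }
}

// Hypothetical in-memory stand-in for a session with an identity map.
public class MiniSession
{
    private readonly Dictionary<string, string> _table; // fake Customers table: id -> name
    private readonly Dictionary<string, Customer> _identityMap =
        new Dictionary<string, Customer>();

    public int DatabaseHits { get; private set; }

    public MiniSession(Dictionary<string, string> table)
    {
        _table = table;
    }

    // Returns the cached instance if there is one, otherwise a ghost.
    // Never touches the "database".
    public Customer Load(string id)
    {
        Customer cached;
        if (_identityMap.TryGetValue(id, out cached)) return cached;

        var ghost = new Customer { Id = id, IsGhost = true };
        _identityMap[id] = ghost;
        return ghost;
    }

    // Returns a fully loaded instance, hitting the "database" only when needed;
    // returns null if no such row exists.
    public Customer Get(string id)
    {
        Customer cached;
        if (_identityMap.TryGetValue(id, out cached) && !cached.IsGhost) return cached;

        DatabaseHits++;
        string name;
        if (!_table.TryGetValue(id, out name)) return null;

        if (cached == null)
        {
            cached = new Customer { Id = id };
            _identityMap[id] = cached;
        }
        // Fill the existing instance in place: the ghost becomes the real object.
        cached.Name = name;
        cached.IsGhost = false;
        return cached;
    }
}
```

In this sketch `order.Customer = session.Load("CHOPS")` costs nothing, and a later `session.Get("CHOPS")` fills in the very same instance rather than creating a second one.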
Ah, thanks for the clarification :). I indeed agree it's a matter of how the API works and is built; either way the 'CustomerID' has to be stored somewhere. If it's on the order as well, inconsistency has to be prevented, and if it's stored in a related object, ghost objects have to be created.
It's now interesting if we could determine which of the two the EF will do. Mats?
Frans,
That's the nature of the O/R poison, isn't it - after 5 years one can still learn new stuff in this field! I love it! :-)
I agree with Ayende that the natural (I'd almost say "correct") way to do it is:
order.Customer = session.Load("CHOPS");
NPersist uses the same approach and also gives you the option of getting a ghost object back or hitting the database. In the first case (getting a ghost object), if the object is already loaded (it is in the cache/id map) you get the loaded object. In the second case, if the object is already loaded, the database is not hit. I think the normal way is to request a ghost object, assuming you feel certain enough that the customer with ID CHOPS is going to exist that you're willing to risk finding out via an FK exception that it didn't.
I strongly agree with you Frans that there is a real risk for inconsistency when storing the CustomerID inside the order object. Code can certainly be put in place to mitigate the problem, as you point out. But I do not really agree that it is a workaround to use ghost objects, that it is just a container for storing the id. The ghost object is the real object, just in a certain state. After it has been loaded with data there is no difference between that object and any other - it is no longer a ghost object. It is not a question of switching instances or anything, a ghost object is just the object before its fields have been loaded with data (except the identity properties).
So I think storing the id value in the id property of the object it belongs to is actually storing that id value in its right place. Whether you A) use ghost objects or B) force the related object (the customer) to be loaded (yuk :-P) is really a separate issue.
However, to be really, really strict, ghost objects do not actually presuppose interception - only automatic lazy loading of the ghost objects presupposes interception. You could have a ghost object solution and still ask the users to manually send their ghost objects to be loaded whenever the need arises. I am not sure as to what will be the final take of EF on aop-features such as interception, but even without such features, they could have ghost objects, just not automatically lazy loading ones.
/Mats
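A ghost object without interception, as described in the comment above, might be sketched like this. The names `ManualLoader` and `IsLoaded` are hypothetical, and the dictionary stands in for a database table; nothing here reflects what the EF will actually ship.

```csharp
using System;
using System.Collections.Generic;

// Plain class, no virtual members and no proxy: the ghost is the real object,
// just in a state where only its identity has been filled in.
public class Customer
{
    public string Id { get; }
    public string Name { get; internal set; }
    public bool IsLoaded { get; internal set; }

    public Customer(string id)
    {
        Id = id;
    }
}

// Hypothetical loader the user calls explicitly when the data is needed.
public class ManualLoader
{
    private readonly Dictionary<string, string> _rows; // fake Customers table: id -> name

    public ManualLoader(Dictionary<string, string> rows)
    {
        _rows = rows;
    }

    public void Load(Customer ghost)
    {
        if (ghost.IsLoaded) return;

        string name;
        if (!_rows.TryGetValue(ghost.Id, out name))
            throw new InvalidOperationException("No customer with ID " + ghost.Id);

        // Same instance, now fully populated: no switching of instances.
        ghost.Name = name;
        ghost.IsLoaded = true;
    }
}
```

The trade-off is visible in the calling code: reading `customer.Id` is always safe, but the caller has to remember to call `Load(customer)` before reading `Name`.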