More on change tracking...


Frans Bouma has an excellent post about Why change-tracking has to be part of an entity object. I like the level of detail he goes into, but I don't agree with his conclusion.

The main issue is the disconnected scenario, where your object went somewhere and then came back, and you don't know what happened to it in the middle. Frans gives an excellent description of how this works at the infrastructure level, using the ASP.NET data source as his example, and I will keep using that example here.

The main problem here is not change tracking per se, it is state management. Many of the things that we do are inherently stateless. The web is stateless, web services are stateless, etc. The entire idea of change tracking rests on the assumption that I am tracking the state in some manner.

As Frans points out, there are a few options for handling this state.

  • You can put it in the view state, which is what the default GridView over DataSet does, and watch the page size climb into the megabytes.
  • You can put it in the user session object, in which case you suffer from increased memory usage and have a lot more headaches if you want to scale up.
  • You can serialize the original state somewhere, to a file or to the database, and load it afterward.
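
Whichever storage you pick, the mechanics are the same: keep a copy of the original values, then compare it with what comes back. Here is a minimal sketch of that idea in Python. The entity is a plain dictionary, and the `snapshot` and `changed_fields` helpers are hypothetical names made up for this illustration; this is not how any particular O/R mapper does it.

```python
import json


def snapshot(entity: dict) -> str:
    """Serialize the original state so it can be stored outside the entity."""
    return json.dumps(entity, sort_keys=True)


def changed_fields(original_json: str, current: dict) -> dict:
    """Compare a stored snapshot with the returned entity.

    Returns a mapping of field name -> (old value, new value).
    """
    original = json.loads(original_json)
    return {
        key: (original.get(key), current.get(key))
        for key in sorted(original.keys() | current.keys())
        if original.get(key) != current.get(key)
    }


# Request 1: load the entity and stash its original state
# (in view state, in the session, in a file, in the database...).
customer = {"id": 7, "name": "Ann", "city": "Oslo"}
stored = snapshot(customer)

# Request 2: the edited entity comes back. Without the snapshot we
# would have no way to tell which values the user actually changed.
posted = {"id": 7, "name": "Ann", "city": "Bergen"}
changes = changed_fields(stored, posted)
print(changes)
```

The cost Frans describes is hiding in `stored`: something has to carry that string between the two requests, and every option in the list above is just a different place to put it.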

At any rate, this is not a trivial matter, and there are plenty of design decisions that have wide-ranging effects on your project. In Frans' words:

The 'solution' Microsoft provides for this, which is also a problem of Linq to Sql btw, is the following: you first have to attach to the context/session object an entity object with the original values, then pass the entity object with the new values and then it can perform change tracking. Yes, of course, but where do you get that entity object with the original values from? From the DB? oh no sir, you can't do that, as it might be that the entity was changed in between, and if the user would have seen THOSE values, the user might not have altered the entity at all.

And right now we have moved from change tracking to concurrency tracking. There are solutions for that. Optimistic concurrency is probably the best idea here, and it requires some sort of a flag to notify us whether the row has changed or not. In SQL Server, this is usually handled by a TIMESTAMP column, which is guaranteed to change every time you modify the row.

The process now is to:

  • Get the objects from the database
  • Set the new values, including the old TIMESTAMP value that the user originally saw
  • Save to DB
  • Handle the concurrency violations

This is still not trivial, but it has moved from having to handle infrastructure issues to having to handle business logic issues (what to do when there is a concurrency violation is a business decision).
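
To make the process above concrete, here is a rough sketch of optimistic concurrency in Python with an in-memory SQLite database. SQLite has no equivalent of SQL Server's TIMESTAMP column, so an integer `version` column that the UPDATE increments stands in for it. The `customers` table, the `save_customer` function and the `ConcurrencyViolation` exception are all assumptions invented for this example.

```python
import sqlite3


class ConcurrencyViolation(Exception):
    """Raised when the row changed since the caller read it."""


def save_customer(conn, customer_id, new_name, seen_version):
    # Update only if the row still carries the version the user saw.
    # The version column is a stand-in for a SQL Server TIMESTAMP column.
    cursor = conn.execute(
        "UPDATE customers SET name = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_name, customer_id, seen_version),
    )
    if cursor.rowcount == 0:
        # No row matched: it was modified (or deleted) in the meantime.
        raise ConcurrencyViolation(f"customer {customer_id} was modified")
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, name TEXT, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO customers VALUES (1, 'Ann', 1)")
conn.commit()

# Two users read the same row, and both see version 1.
row = conn.execute("SELECT version FROM customers WHERE id = 1").fetchone()
seen_by_alice = seen_by_bob = row[0]

# Alice saves first; the version moves on to 2.
save_customer(conn, 1, "Ann Smith", seen_by_alice)

# Bob saves with the stale version and gets a violation.
try:
    save_customer(conn, 1, "Anne", seen_by_bob)
    outcome = "saved"
except ConcurrencyViolation:
    outcome = "conflict"  # what to do next is a business decision

final_row = conn.execute("SELECT name, version FROM customers WHERE id = 1").fetchone()
```

Note that nothing here needs the original values of the entity, only the version the user saw. The `except` branch is where the business decision goes: reload and retry, merge, or tell the user that someone else got there first.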