Sometimes it's better to be stupid
If you recall, I posted yesterday about a possible way to
implement a rule in a context-bound way. I also said that the example I gave
was simple enough to easily run in the database itself. Today I encountered a
situation where it's actually much better
to run those rules in memory. The issue is fairly simple; it has to do
with running several rules together.
Let's think for a moment about our payroll sample. We have
a whole slew of rules there: payment by hour, global, overtime, etc. Some of
the rules are attached to an employee; some are attached to all employees with
the same type of contract, etc. In a real business, you will probably have several
rules per employee, whether they come from the contract, the contract
type (all class 12 employees must …), or the local laws. Now, it's
reasonable to assume that each of these rules can be expressed in terms of
a database query, which would return whether the rule is valid or not.
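To make that concrete, here is a minimal sketch of one rule phrased as a query. The schema, the sqlite3 backend, and the 40-hour cap are my own assumptions for illustration, not anything from the payroll sample itself:

```python
import sqlite3

# Hypothetical schema: employees plus their weekly timesheets.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees  (id INTEGER PRIMARY KEY, contract_type TEXT);
    CREATE TABLE timesheets (employee_id INTEGER, week INTEGER, hours REAL);
""")

def overtime_rule_holds(conn, employee_id, max_weekly_hours=40):
    """An overtime rule expressed as a query: the rule is valid when the
    employee never exceeds the weekly cap. Each check is one database hit."""
    (worst_week,) = conn.execute(
        "SELECT MAX(hours) FROM timesheets WHERE employee_id = ?",
        (employee_id,),
    ).fetchone()
    return worst_week is None or worst_week <= max_weekly_hours
```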
Please flip over your envelopes, we're going to do some
calculations. Let's assume that we have 100 employees, and that each
employee has a dozen or so rules (mostly inherited ones). We'll also
assume that each rule means a database hit to look up the information. How
many hits are we going to make? 1,200 queries, which is not
good, I would say.
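Spelling the envelope math out (the 50 ms round-trip figure is my assumption; real numbers depend on the network and the database):

```python
employees = 100
rules_per_employee = 12
queries = employees * rules_per_employee      # 1,200 database hits

round_trip_ms = 50                            # assumed cost per query
print(queries * round_trip_ms / 1000, "s")    # 60.0 s of pure waiting
```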
One word of caution before I proceed: I don't have
real life experience in these matters; I'm busily acquiring it as you
read this, but so far I'm speaking from theory.
If we look at the problem, we can see that the data set we
are working on is small. 100 employees and a dozen rules each is very small, actually. So what is the
problem? The problem is that we keep asking the database a lot of small questions. It
can answer each one very quickly, but each question pays a full round trip, and
1,200 round trips add up; the cost is in the chatter, not in any single query.
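Continuing the earlier sketch, this is the shape of that chatty version: one round trip per (employee, rule) pair, so latency, not query complexity, dominates:

```python
def validate_chatty(conn, employee_ids, rules):
    """The naive approach: 100 employees x 12 rules = 1,200 round trips.
    `rules` is a list of functions like overtime_rule_holds above."""
    results = {}
    for emp_id in employee_ids:
        for rule in rules:
            # Every iteration pays the full network + parse + execute cost.
            results[(emp_id, rule.__name__)] = rule(conn, emp_id)
    return results
```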
So what is the solution? One solution is to try to do all
the rule validation in the database. I think that you can guess what I think
about that. SQL is a very good language for queries; it's not a good
language for writing business rules. Oh, it's possible, and it has certainly been done before. But doing it
this way lands you in a lonely place, without any of the usual comforts of
OO. If you're using any of the newest versions of the major databases
(Oracle, DB2, SQL Server), you can
run your code inside the database, combining the expressiveness of OO with the speed
of running close to the data. I don't think that this is a good way to go,
but there isn't enough data yet to say for certain, I'm afraid.
Myself, I'm going to take advantage of a feature most
ORMs provide, and that is local data caching. Considering that the data set
we're talking about is small, an ORM will turn the potential 1,200
queries into ~5 at most, and may reduce this even further with joins. After
loading the data once, the ORM will keep it in memory, so any further queries
will be served immediately. Then we can run the 1,200 rules in memory, and be
done with it in a couple of seconds (including data access) instead of several
minutes (best case?).
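Here is a minimal sketch of that load-once-then-evaluate approach, again against my hypothetical schema. A real ORM's first-level cache would do the grouping for you; this just shows the shape:

```python
def validate_in_memory(conn, max_weekly_hours=40):
    """Pull the whole working set in one joined query, then run the
    rule checks as plain in-memory arithmetic: no further round trips."""
    rows = conn.execute("""
        SELECT e.id, t.hours
        FROM employees e
        LEFT JOIN timesheets t ON t.employee_id = e.id
    """).fetchall()

    # Group rows by employee, roughly what an ORM cache holds for us.
    hours_by_employee = {}
    for emp_id, hours in rows:
        hours_by_employee.setdefault(emp_id, []).append(hours or 0)

    # The 1,200 rule evaluations all happen here, in memory -- this is
    # the cheap part once the data has been loaded.
    return {
        emp_id: max(weekly_hours) <= max_weekly_hours
        for emp_id, weekly_hours in hours_by_employee.items()
    }
```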
This is a case where the simplest thing (run the rules in
memory) is also the smartest.