Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

time to read 3 min | 516 words

There is a level of tension between what a developer wants and what the client wants. As a developer, I want to use the coolest technologies, the freshest methodologies, to be so far on the bleeding edge that they have an ER dedicated just for my team.

As a client? I want to be as conservative as possible. For the most part, I don’t care for new technologies unless they have some significant feature that I can’t get anywhere else. I value speed of development over having the devs geek out.

I have spoken before about the tendency of developers to start building castles in the sky instead of getting things done. I call it Stealing From Your Client, because you are introducing additional things that aren’t necessary, won’t bring additional value or just plain make things hard.

For example, in just about any project that I have seen, putting an IRepository in front of an OR/M or putting an IRepository in front of RavenDB was a mistake. It created no value and actually made things harder to use.

Another pet peeve of mine was raised in a discussion about architecture in a recent lecture that I gave. One of the participants asked me for my opinion about CQRS. But when I asked him for what sort of an application, his answer was somewhere along the lines of: “I want to build my next application using CQRS and I’d like to hear your opinion.”

This sort of thinking drives me bananas. (Editor note: That sounds like a very painful thing, and I am not sure why I would do that, but it appears that this is an English expression that fits this location in the post).

Dictating the architecture of an application, before you even have an application? Hell, making any decisions about the application (including what technologies to use) before you actually have a good idea about what is going on is a Big No! No!

Here is the most important piece of this post. For the most part, all those cool technologies that you hear about? They aren’t relevant for your scenario. And even if you are convinced that they are a perfect fit, there is a cost associated with them, which you have to consider before attempting them. Especially if this is the first time that you are using something.

The really annoying part? Most people who come up with these exciting new technologies spend an inordinate amount of their time saying not how to build apps using them, but when you shouldn’t. Evans’ DDD book quite explicitly states that DDD shouldn’t be your default architecture, that it has a cost and should be carefully considered. The industry as a whole (me included) just ignored him and started to write DDD applications.

Greg Young just lectured at Oredev about why you shouldn’t use CQRS, for much the same reason.

In the end, we are there to provide value to the customer. Yes, it is cool to work on the newest thing on the block, but that is why you have hobby projects.

time to read 3 min | 540 words

With regards to my recommendation to avoid the repository, Stan asks:

You returned to explain why repository pattern is evil. Just interesting to know what are you doing when you need in your model to access another aggregate. Do you reference NH from your model? I prefer to leave my model POCOed and wrap DB calls by repository pattern. Sleeping good with it.

This is an interesting question. There are several ways to answer that. To start with, assuming that we are using a DDD model (which the usage of a repository would imply), you don’t have references between aggregates.

But let us assume that we somehow need that. Stan seems to suggest something like this proposed solution:

public class Person 
{
    public static readonly DateTime ImportantDate;
    public BirthPlace BirthPlace { get; set; } 

    public DateTime BirthDate 
    { 
        get; private set;
    } 

    public void CorrectBirthDate(IRepository<BirthPlace> birthPlaces, DateTime date)
    {
        if (BirthPlace != null && date < ImportantDate && BirthPlace.IsSpecial) 
        { 
            BirthPlace = birthPlaces.GetForDate(date); 
        }
    }
}

Here we have a business rule that states that this is required.

But do we actually need a repository here? What if we just said that whoever calls us needs to provide us with a way to get the birth place by date?

public void CorrectBirthDate(
        Func<DateTime, BirthPlace> getBirthPlaceFordate, 
        DateTime date)
{
    if (BirthPlace != null && date < ImportantDate && BirthPlace.IsSpecial) 
    { 
        BirthPlace = getBirthPlaceFordate(date); 
    }
}

This can be done with a simple delegate, no need to introduce a heavyweight abstraction. This is a local solution for a local problem. It keeps the database out of your entities and, more importantly, it allows you to actually craft the appropriate response to this at the time of the call.
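As a rough sketch of what the calling side might look like: the trimmed-down BirthPlace class, the cutoff value given to ImportantDate and the birthPlacesByYear dictionary are all assumptions made up for this illustration. In a real application the lambda would more likely wrap a query on whatever session you already have at hand.

```csharp
using System;
using System.Collections.Generic;

// Trimmed-down stand-in for the real entity (hypothetical).
public class BirthPlace
{
    public string Name { get; set; }
    public bool IsSpecial { get; set; }
}

public class Person
{
    // Hypothetical cutoff date, chosen only for this sketch.
    public static readonly DateTime ImportantDate = new DateTime(1950, 1, 1);

    public BirthPlace BirthPlace { get; set; }

    public void CorrectBirthDate(
        Func<DateTime, BirthPlace> getBirthPlaceFordate,
        DateTime date)
    {
        if (BirthPlace != null && date < ImportantDate && BirthPlace.IsSpecial)
        {
            BirthPlace = getBirthPlaceFordate(date);
        }
    }
}

public static class CorrectBirthDateCaller
{
    // The caller decides how the lookup happens. Here it is an in-memory
    // dictionary, but the entity neither knows nor cares.
    public static Person Run()
    {
        var birthPlacesByYear = new Dictionary<int, BirthPlace>
        {
            { 1940, new BirthPlace { Name = "Old Town", IsSpecial = false } }
        };

        var person = new Person
        {
            BirthPlace = new BirthPlace { Name = "Unknown", IsSpecial = true }
        };

        person.CorrectBirthDate(
            date => birthPlacesByYear[date.Year],
            new DateTime(1940, 5, 1));

        return person;
    }
}
```

The entity stays free of any data access type, and a test can pass a one-line lambda instead of mocking an interface.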

time to read 2 min | 302 words

…when all through the house. Not a creature was stirring, not even a mouse. Because everyone was away on a Stag/Hen weekend.

Phil Jones has just notified me that a site he has been working on for a while went live. Escape Trips is a site that provides Stag and Hen trips in the UK. A Stag/Hen weekend is apparently a bigger version of a stag party. I assume that this is the origin for movies like this (well, not really).

At any rate, this site is actually powered by RavenDB throughout. We actually have several videos in the pipeline of me and Phil hashing out some details about the site when it was built.

The site is fast, and Phil was kind enough to give me some interesting stats.

Some interesting performance stats compared to SQL Express which was running on the VPS. SQL Express was eating ~500MB of RAM (limited), RavenDB has been sat at ~80MB since launch last night! I think EF was eating CPU as well, CPU usage is way down as well.

Performance comment wise, I don’t know if EF was to blame but IIS process CPU usage is down even though traffic has doubled since the launch (mainly crawlers and a new adwords campaign). After running since the launch on Thursday night, the RAM usage has increased to 100MB, still a really great number though as I plan to scale down the VPS’s RAM saving money, RavenDB will actually be paying for itself!

The website looks quite simple from the public side but most of the development has gone into the private administrative website for dealing with sales, customer support and content editing. Performance wise, the system is more responsive and users are very pleased!

Pretty cool!

time to read 1 min | 199 words

In my previous post, I showed the database schema and the UI and asked what was wrong with that. Before we move on, here is what I showed.

image

image

If you looked carefully, you might have noticed that there are duplicate PO# and Tracking# in the UI. More than that, we somehow double charged the customer for shipping.

What is going on?

It is actually fairly obvious, when you think about it. Look at the schema: there isn’t actually any association between the Tracking # and the PO #. In most orders, we have only 1 PO #, so it was easy to add this information by just pulling it from the DB and adding a few columns. But when we got an order that has multiple POs… that is when all hell breaks loose.

This is a classic Cartesian Product problem.
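To make the effect concrete, here is a small sketch using made-up data. The PurchaseOrder, Shipment and UiRow classes and the sample values are assumptions for illustration only, since the real schema is only shown as an image. Both child tables hang off the order, and joining them through the order id pairs every PO with every shipment.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rows: each points at the order, nothing links them to each other.
public class PurchaseOrder
{
    public int OrderId { get; set; }
    public string PoNumber { get; set; }
}

public class Shipment
{
    public int OrderId { get; set; }
    public string TrackingNumber { get; set; }
    public decimal ShippingCharge { get; set; }
}

public class UiRow
{
    public string Po { get; set; }
    public string Tracking { get; set; }
    public decimal Charge { get; set; }
}

public static class CartesianProductDemo
{
    // Joining both child collections on OrderId yields (POs x shipments) rows.
    public static List<UiRow> JoinForUi(List<PurchaseOrder> pos, List<Shipment> shipments)
    {
        return (from po in pos
                join s in shipments on po.OrderId equals s.OrderId
                select new UiRow
                {
                    Po = po.PoNumber,
                    Tracking = s.TrackingNumber,
                    Charge = s.ShippingCharge
                }).ToList();
    }

    public static List<UiRow> Sample()
    {
        var pos = new List<PurchaseOrder>
        {
            new PurchaseOrder { OrderId = 1, PoNumber = "PO-1" },
            new PurchaseOrder { OrderId = 1, PoNumber = "PO-2" }
        };
        var shipments = new List<Shipment>
        {
            new Shipment { OrderId = 1, TrackingNumber = "TRK-A", ShippingCharge = 10m },
            new Shipment { OrderId = 1, TrackingNumber = "TRK-B", ShippingCharge = 5m }
        };
        return JoinForUi(pos, shipments);
    }
}
```

With two POs and two shipments the join produces four rows, so every PO# and Tracking# shows up twice, and summing the Charge column gives 30 instead of the real 15.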

The solution?  Actually model the UI to avoid suggesting that there is a relationship between the tracking and purchase orders, like this:

image

time to read 1 min | 125 words

In this case, it isn’t code that I am going to show. Rather, I am going to show the final UI and the database structure, and let you figure out:

  • What is wrong.
  • How to fix this.

Here is the database schema:

image

And here is the problem:

image

Hint: this used to work for a long time, and suddenly it doesn’t, and the customer is pissed, annoyed and threatening to sue.

time to read 2 min | 206 words

You are probably aware that you need to monitor your production systems for errors, and to add health monitoring for your servers.

But are you monitoring negative events? What is a negative event? Stuff that should have happened and didn’t.

For example, every week you have a process that runs to update the tax rates that apply to your customers. This is implemented as a scheduled process, but for some reason (the computer was just being rebooted, the user’s password expired, etc.) that process didn’t run. There isn’t an error, per se. You won’t get an error because nothing had a chance to actually happen.

Another example would be getting a callback confirmation that an order payment has been correctly processed. That usually happens within 1 – 5 minutes, and you get an OK/Fail notification. But what happens if that notification just never comes?

This is a much more dangerous scenario, because you have to not only be prepared for handling errors, you have to be prepared for… nothing to happen.

What it means is that you have to have some way to set up expectations in the system, and act on them when you don’t get a confirmation (negative or positive) within a given time frame.
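One possible shape for this, as a minimal in-memory sketch. ExpectationTracker and its method names are hypothetical, not taken from any library, and a real system would persist the expectations so that they survive a restart.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: records what we expect to happen and by when,
// so that silence can be detected later.
public class ExpectationTracker
{
    private readonly Dictionary<string, DateTime> deadlines =
        new Dictionary<string, DateTime>();

    // Register that something identified by key must be confirmed by deadline.
    public void Expect(string key, DateTime deadline)
    {
        deadlines[key] = deadline;
    }

    // Any confirmation, OK or Fail, satisfies the expectation.
    public void Confirm(string key)
    {
        deadlines.Remove(key);
    }

    // Meant to be called periodically to find what never arrived.
    public List<string> FindOverdue(DateTime now)
    {
        return deadlines
            .Where(pair => pair.Value < now)
            .Select(pair => pair.Key)
            .OrderBy(key => key)
            .ToList();
    }
}
```

For the payment example, you would call Expect("payment/1", now + 5 minutes) when sending the request, Confirm("payment/1") when the callback arrives, and alert on whatever FindOverdue returns. The periodic check is itself a scheduled process, so it needs the same kind of watching.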

time to read 1 min | 140 words

We are planning on doing a lot of training on RavenDB, and I wanted to make sure that everyone is aware of it.

Next week, we have a public event in London, where Itamar will explain all about indexing:

RavenDB Indexes explained (public event) - Feb 28, London, UK

Then there is the full two-day workshop about RavenDB. Here is the current schedule for the next few months.

There is also going to be a course in the States around August, probably New York again, and maybe one in Berlin around July.

time to read 7 min | 1202 words

In my previous post, I explained the value of pushing as much as possible to the infrastructure, and then showed some code that does so. First, let us look at the business level code:

[AcceptVerbs(HttpVerbs.Post)]
public ActionResult Register(string originUnlocode, string destinationUnlocode, DateTime arrivalDeadline)
{
    var trackingId = ExecuteCommand(new RegisterCargo
    {
        OriginCode = originUnlocode,
        DestinationCode = destinationUnlocode,
        ArrivalDeadline = arrivalDeadline
    });

    return RedirectToAction(ShowActionName, new RouteValueDictionary(new { trackingId }));
}

public class RegisterCargo : Command<string>
{
    public override void Execute()
    {
        var origin = Session.Load<Location>(OriginCode);
        var destination = Session.Load<Location>(DestinationCode);

        var trackingId = Query(new NextTrackingIdQuery());

        var routeSpecification = new RouteSpecification(origin, destination, ArrivalDeadline);
        var cargo = new Cargo(trackingId, routeSpecification);
        Session.Store(cargo);

        Result = trackingId;
    }

    public string OriginCode { get; set; }

    public string DestinationCode { get; set; }

    public DateTime ArrivalDeadline { get; set; }
}

And the infrastructure code, now:

protected void Default_ExecuteCommand(Command cmd)
{
    cmd.Session = Session;
    cmd.Execute();
}

protected TResult Default_ExecuteCommand<TResult>(Command<TResult> cmd)
{
    ExecuteCommand((Command) cmd);
    return cmd.Result;
}

You might have noticed a problem in the way we named things: the names in the action and in the infrastructure code do not match. What is going on?

Well, the answer is quite simple. Let us look at what our controller looks like (at least, the important parts):

public class AbstractController : Controller
{
    public IDocumentSession Session;

    public Action<Command> AlternativeExecuteCommand { get; set; }
    public Func<Command, object> AlternativeExecuteCommandWithResult { get; set; }

    public void ExecuteCommand(Command cmd)
    {
        if (AlternativeExecuteCommand != null)
            AlternativeExecuteCommand(cmd);
        else
            Default_ExecuteCommand(cmd);
    }

    public TResult ExecuteCommand<TResult>(Command<TResult> cmd)
    {
        if (AlternativeExecuteCommandWithResult != null)
            return (TResult)AlternativeExecuteCommandWithResult(cmd);
        return Default_ExecuteCommand(cmd);
    }

    protected void Default_ExecuteCommand(Command cmd)
    {
        cmd.Session = Session;
        cmd.Execute();
    }

    protected TResult Default_ExecuteCommand<TResult>(Command<TResult> cmd)
    {
        ExecuteCommand((Command)cmd);
        return cmd.Result;
    }
}

What?! You write mocks by hand and inject them like that? That is horrible! It is much easier to use a mocking framework and ….

Yes, it would be, if I was trying to mock different things all the time. But given that I have very few abstractions, it makes sense to not only build this sort of infrastructure, but to also build infrastructure for those things _in the tests_.

For example, let us write the test for the action:

[Fact]
public void WillRegisterCargo()
{
  ExecuteAction<CargoAdminController>( c=> c.Register("US", "UK", DateTime.Today) );
  
  Assert.IsType<RegisterCargo>( this.ExecutedCommands[0] );
}

The ExecuteAction method belongs to the test infrastructure, and it sets up the controller to be run under the test scenario. That allows me to not execute the command, but to actually get it.
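The test infrastructure itself is not shown in this post, so the following is only a guess at how ExecuteAction could be built on top of the AlternativeExecuteCommand hooks. ControllerTestBase is a hypothetical name, and the controller, command and Register action here are trimmed-down stand-ins with no MVC or session dependency so that the sketch is self-contained.

```csharp
using System;
using System.Collections.Generic;

public abstract class Command
{
    public abstract void Execute();
}

public abstract class Command<T> : Command
{
    public T Result { get; protected set; }
}

// Stand-in for the controller shown above, without the MVC base class.
public class AbstractController
{
    public Action<Command> AlternativeExecuteCommand { get; set; }
    public Func<Command, object> AlternativeExecuteCommandWithResult { get; set; }

    public void ExecuteCommand(Command cmd)
    {
        if (AlternativeExecuteCommand != null)
            AlternativeExecuteCommand(cmd);
        else
            cmd.Execute();
    }

    public TResult ExecuteCommand<TResult>(Command<TResult> cmd)
    {
        if (AlternativeExecuteCommandWithResult != null)
            return (TResult)AlternativeExecuteCommandWithResult(cmd);
        cmd.Execute();
        return cmd.Result;
    }
}

// Simplified command and controller, just enough to drive the sketch.
public class RegisterCargo : Command<string>
{
    public string OriginCode { get; set; }
    public string DestinationCode { get; set; }

    public override void Execute()
    {
        Result = "executed-for-real";
    }
}

public class CargoAdminController : AbstractController
{
    public string Register(string origin, string destination)
    {
        return ExecuteCommand(new RegisterCargo
        {
            OriginCode = origin,
            DestinationCode = destination
        });
    }
}

// Hypothetical test base class: records commands instead of running them.
public class ControllerTestBase
{
    public readonly List<Command> ExecutedCommands = new List<Command>();

    public void ExecuteAction<TController>(Action<TController> action)
        where TController : AbstractController, new()
    {
        var controller = new TController
        {
            AlternativeExecuteCommand = cmd => ExecutedCommands.Add(cmd),
            AlternativeExecuteCommandWithResult = cmd =>
            {
                ExecutedCommands.Add(cmd);
                // No real result. This only works for reference-type
                // results; a fuller version would return canned values.
                return null;
            }
        };
        action(controller);
    }
}
```

Because every controller funnels its work through the same two methods, this one hook is enough to intercept whatever any action tries to run.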

From there, it is very easy to get to things like:

[Fact]
public void WillCreateNewCargoWithNewTrackingId()
{
  SetupQueryResponse<NextTrackingIdQuery>("abc");
  ExecuteCommand<RegisterCargo>( new RegisterCargo
  {
    OriginCode = "US",
    DestinationCode = "UK",
    ArrivalDeadline = DateTime.Today
  });
  
  var cargo = Session.Load<Cargo>("cargos/1");
  Assert.Equal("abc", cargo.TrackingId);
}

This is important, because now what you are testing is the actual interaction. You don’t care about any of the actual dependencies, we just abstracted them out, but without creating a ton of interfaces, abstractions on top of abstractions or any of that.

In fact, we kept the number of abstractions to a minimum, and we can change pretty much every part of the system with very little fear of cascading change.

We have similar lego pieces, all of them move together and interact with one another with complete freedom, and we don’t have to have an Abstract Factory Factory Façade Factory.

time to read 3 min | 547 words

In my previous post, I discussed actual refactoring to reduce abstraction, and I showed two very interesting methods, Query() and ExecuteCommand(). Here is the code in question:

[AcceptVerbs(HttpVerbs.Post)]
public ActionResult Register(string originUnlocode, string destinationUnlocode, DateTime arrivalDeadline)
{
    var trackingId = ExecuteCommand(new RegisterCargo
    {
        OriginCode = originUnlocode,
        DestinationCode = destinationUnlocode,
        ArrivalDeadline = arrivalDeadline
    });

    return RedirectToAction(ShowActionName, new RouteValueDictionary(new { trackingId }));
}

public class RegisterCargo : Command<string>
{
    public override void Execute()
    {
        var origin = Session.Load<Location>(OriginCode);
        var destination = Session.Load<Location>(DestinationCode);

        var trackingId = Query(new NextTrackingIdQuery());

        var routeSpecification = new RouteSpecification(origin, destination, ArrivalDeadline);
        var cargo = new Cargo(trackingId, routeSpecification);
        Session.Store(cargo);

        Result = trackingId;
    }

    public string OriginCode { get; set; }

    public string DestinationCode { get; set; }

    public DateTime ArrivalDeadline { get; set; }
}

Why are they so important? Mostly because those methods [and similar, like Raise(event) and ExecuteLater(task)] are actually the backbone of the application. They are the infrastructure on top of which everything rests.

Those methods basically accept an argument (and optionally return a value). Their responsibilities are:

  • Set up the given argument so it can run.
  • Execute it.
  • Return the result (if there is one).

Here is an example showing how to implement ExecuteCommand:

protected void Default_ExecuteCommand(Command cmd)
{
    cmd.Session = Session;
    cmd.Execute();
}

protected TResult Default_ExecuteCommand<TResult>(Command<TResult> cmd)
{
    ExecuteCommand((Command) cmd);
    return cmd.Result;
}

I have code very much like that in production, because I know that in this system, there are actually only one or two dependencies that a command may want.

There are very few other dependencies, because of the limited number of abstractions that we have. This makes things very simple to write and work with.

Because we abstract away any dependency management, and because we allow only a very small number of abstractions, this works very well. The amount of complexity that you have is way down, and code reviewing this is very easy, because there isn’t much to review and it all follows the same structure. The implementations of the rest are pretty much the same thing.

There is just one thing left to discuss, because it kept showing up in the comments for the other posts. How do you handle testing?
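As an illustration of “pretty much the same thing”, here is how Query() might look if it follows the same set up, execute, return shape. The Query base class shown here (a Session property plus an abstract Execute) is an assumption, since it is not shown in these posts, and FakeSession, QueryRunner and NextTrackingIdQueryStub are placeholders so that the sketch compiles on its own.

```csharp
using System;

// Placeholder for the real session type (assumption for this sketch).
public class FakeSession
{
}

// Assumed shape of the Query base class.
public abstract class Query<TResult>
{
    public FakeSession Session { get; set; }
    public abstract TResult Execute();
}

// Stub query that returns a fixed id instead of hitting a database.
public class NextTrackingIdQueryStub : Query<string>
{
    public override string Execute()
    {
        return "abc";
    }
}

// Stands in for the controller or command base class that hosts Query().
public class QueryRunner
{
    public FakeSession Session = new FakeSession();

    // Same three steps as Default_ExecuteCommand:
    // set up the argument, execute it, return the result.
    public TResult Query<TResult>(Query<TResult> query)
    {
        query.Session = Session;
        return query.Execute();
    }
}
```

Raise(event) and ExecuteLater(task) would presumably follow the same outline, differing only in what “execute” means for each.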

time to read 11 min | 2032 words

So in my previous post I spoke about this code and the complexity behind it:

public class CargoAdminController : BaseController
{
  [AcceptVerbs(HttpVerbs.Post)]
  public ActionResult Register(
      [ModelBinder(typeof (RegistrationCommandBinder))] RegistrationCommand registrationCommand)
  {
      DateTime arrivalDeadlineDateTime = DateTime.ParseExact(registrationCommand.ArrivalDeadline, RegisterDateFormat,
                                                             CultureInfo.InvariantCulture);

      string trackingId = BookingServiceFacade.BookNewCargo(
          registrationCommand.OriginUnlocode, registrationCommand.DestinationUnlocode, arrivalDeadlineDateTime
          );

      return RedirectToAction(ShowActionName, new RouteValueDictionary(new {trackingId}));
  }
}

In this post, I intend to show how we can refactor things. I am going to do that by flattening the architecture, removing useless abstractions and creating a simpler, easier to work with system.

The first thing to do is to refactor the method signature:

[AcceptVerbs(HttpVerbs.Post)]
public ActionResult Register(string originUnlocode, string destinationUnlocode, DateTime arrivalDeadline)

Those are the three parameters that we need. There is no need to create a model binder, a custom command, etc. just for this. For that matter, if you already have a model binder, why on earth do you store the date as a string, and not a DateTime? The framework is quite happy to do the conversion for me, and if it can’t, I can extend the infrastructure to do so. I don’t need to patch this action with date parsing code.

Next, we have this notion of booking a new cargo. Looking at the service, that looks like:

public string BookNewCargo(string origin, string destination, DateTime arrivalDeadline)
{
    try
    {
        TrackingId trackingId = BookingService.BookNewCargo(
            new UnLocode(origin),
            new UnLocode(destination),
            arrivalDeadline
            );
        return trackingId.IdString;
    }
    catch (Exception exception)
    {
        throw new NDDDRemoteBookingException(exception.Message);
    }
}

The error handling alone sets my teeth on edge. Also, note that we have a complex type for TrackingId, which contains just a string (there is a lot of code there for IValueObject<T>, comparison, etc), all of which basically go away if you use an actual string. The same is true for UnLocode (UN Location Code, I assume), but at least this one has some validation code in it.

Then there is the lovely forwarding call, which translates to:

public TrackingId BookNewCargo(UnLocode originUnLocode,
                               UnLocode destinationUnLocode,
                               DateTime arrivalDeadline)
{
    using (var transactionScope = new TransactionScope())
    {
        TrackingId trackingId = cargoRepository.NextTrackingId();
        Location origin = locationRepository.Find(originUnLocode);
        Location destination = locationRepository.Find(destinationUnLocode);

        Cargo cargo = CargoFactory.NewCargo(trackingId, origin, destination, arrivalDeadline);

        cargoRepository.Store(cargo);
        logger.Info("Booked new cargo with tracking id " + cargo.TrackingId);

        transactionScope.Complete();
        return cargo.TrackingId;
    }
}

And now we got somewhere: we have something there that is actually meaningful. I’ll skip going deeper, I am pretty sure that you can understand what is going on.

From my point of view, these are the common abstractions in an application:

  1. Controllers
  2. Views
  3. Entities
  4. Commands
  5. Tasks
  6. Events
  7. Queries

Controllers are at the boundaries of the system, they orchestrate the entire system behavior. Note that I have no place for services or repositories in this list. That is quite intentional.

Instead of going that route, take a look at the code that I ended up with:

 [AcceptVerbs(HttpVerbs.Post)]
 public ActionResult Register(string originUnlocode, string destinationUnlocode, DateTime arrivalDeadline)
 {
     var origin = Session.Load<Location>(originUnlocode);
     var destination = Session.Load<Location>(destinationUnlocode);

     var trackingId = Query(new NextTrackingIdQuery());

     var routeSpecification = new RouteSpecification(origin, destination, arrivalDeadline);
     var cargo = new Cargo(trackingId, routeSpecification);
     Session.Store(cargo);

     return RedirectToAction(ShowActionName, new RouteValueDictionary(new {trackingId}));
 }

As you can see, the entire architecture was collapsed into a single method.

And what kind of abstractions do we have here?

Well, we have the usual things from MVC, Controller, Action, parameter binding.

We have the session that we are using to load data by id, and to store the newly created cargo.

And we have the notion of a query. Generating a new TrackingID is a query that happens on the database (actually implemented as a hilo sequence). That is something that is definitely not the responsibility of the controller action, so we moved it into a query. Note that we have the Query() method there. It is defined as:

protected TResult Query<TResult>(Query<TResult> query)

And NextTrackingIdQuery is defined as:

public class NextTrackingIdQuery : Query<string>

Pretty simple, overall. And I can hear the nitpickers climb over the fences, waving the pitchforks and torches. “What happens when you need to reuse this logic? It is not in the UI and …”

There are a couple of things to note here.

First, there isn’t anywhere else that needs to book a cargo. And saying “and what happens when…” flies right into a wall of people shouting YAGNI.

Second, let us assume that there is such a need, to reuse the booking cargo scenario. How would we approach this?

Well, we can encapsulate the logic for the controller inside a Command. Which gives us:

[AcceptVerbs(HttpVerbs.Post)]
public ActionResult Register(string originUnlocode, string destinationUnlocode, DateTime arrivalDeadline)
{
    var trackingId = ExecuteCommand(new RegisterCargo
    {
        OriginCode = originUnlocode,
        DestinationCode = destinationUnlocode,
        ArrivalDeadline = arrivalDeadline
    });

    return RedirectToAction(ShowActionName, new RouteValueDictionary(new { trackingId }));
}

And then we have the actual RegisterCargo command:

public abstract class Command
{
    public IDocumentSession Session { get; set; }
    public abstract void Execute();

    protected TResult Query<TResult>(Query<TResult> query);
}
public abstract class Command<T> : Command
{
    public T Result { get; protected set; }
}

public class RegisterCargo : Command<string>
{
    public override void Execute()
    {
        var origin = Session.Load<Location>(OriginCode);
        var destination = Session.Load<Location>(DestinationCode);

        var trackingId = Query(new NextTrackingIdQuery());

        var routeSpecification = new RouteSpecification(origin, destination, ArrivalDeadline);
        var cargo = new Cargo(trackingId, routeSpecification);
        Session.Store(cargo);

        Result = trackingId;
    }

    public string OriginCode { get; set; }

    public string DestinationCode { get; set; }

    public DateTime ArrivalDeadline { get; set; }
}

Note that the Command class also has a way to execute queries. In fact, it is the exact same way that we used when we had the code in the controller. We just moved stuff around and did not really make any major change, but we can now easily start using the same functionality in another location.

I generally don’t like doing this because most functionality is not reused, it is specific for a particular place and scenario, but I wanted to show how you can lift some part of the code and move it to a different location, otherwise people would complain about the “lack of reuse opportunities”.

In my next post I am going to talk about the Query() and ExecuteCommand() methods, and why they are so important.
