Distributed transactions with remote queuing system

time to read 3 min | 529 words

I got into an interesting discussion about that two days ago, so I thought that I would put down some of my thoughts about the topic.

Just to put some context into operation here, I am NOT talking about solving the generic problem of DTC on top of remote queuing system. I am talking specifically about sharing a transaction between a remote queuing system and a database. This is important if we want to enable “pull msg from queue, update db” scenarios. Without those, you are running into danger zones of processing a message twice.

We will assume that the remote queuing system operations on the Dequeue/Timeout/Ack model. In which you dequeue a message from the queue, and then you have a timeout to acknowledge its processing before it is put back into the queue. We will also assume a database that supports transactions.

Basically, what we need to do is to keep a record of all the messages we processed. We do that by storing that record in the database, where it is subject to the same transactional rules as the rest of our data. We would need a table similar to this:

CREATE TABLE Infrastructure.ProcessedMessages
(
   MsgId uniqueidentifier primary key,
   Timestamp DateTime not null
)

Given that, we can handle messages using the following code:

using(var tx = con.BeginTransaction())
{
    var msg = queue.Dequeue();
    try
    {
        InsertToProcessedMessages(msg);
    }
    catch(DuplicatePrimaryKey)
    {
        queue.Ack(msg);
        tx.Rollback();
        return;
    }

    // process the msg using this transaction context
    tx.Commit();
queue.Ack(msg);
}

Because we can rely on the database to ensure transactional consistency, we can be sure that:

  • we will only process a message once
  • an error in processing a message will result in removing the process record from the database

We will probably need a daily job or something similar to clear out the table, but that would be about it.

Thoughts?