Soft Deletes aren’t an Append Only model

There seems to be some confusion regarding my post about soft deletes. In particular, people brought up the idea of append only models.

I have had the chance to work on both types of systems, and I can tell you that I would much rather work with an append only model than with soft deletes. In an append only model, you can only ever insert, never delete or update.

Think about the way your bank account works. If I had the clerk transfer money from one account to another, and he made a typo and sent ten times the amount that I wanted, the bank will not “delete” the transaction. What will happen is that there will be a separate transaction, canceling the first one.
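A minimal sketch of the bank example, assuming a toy in-memory ledger. The names `Transfer`, `Ledger`, and `cancel` are made up for illustration and do not come from any real banking system. The mistaken transfer stays in the ledger, and a second entry reverses it:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Transfer:
    id: int
    source: str
    target: str
    amount: int                     # in cents; a negative amount reverses an earlier transfer
    reverses: Optional[int] = None  # id of the transfer this entry cancels, if any


class Ledger:
    """Hypothetical append only ledger: entries are added, never changed or removed."""

    def __init__(self) -> None:
        self.entries: List[Transfer] = []

    def transfer(self, source: str, target: str, amount: int) -> Transfer:
        entry = Transfer(len(self.entries) + 1, source, target, amount)
        self.entries.append(entry)
        return entry

    def cancel(self, transfer_id: int) -> Transfer:
        # The original entry is left untouched; a compensating entry is appended.
        original = self.entries[transfer_id - 1]
        entry = Transfer(len(self.entries) + 1, original.source, original.target,
                         -original.amount, reverses=original.id)
        self.entries.append(entry)
        return entry

    def balance(self, account: str) -> int:
        total = 0
        for entry in self.entries:
            if entry.target == account:
                total += entry.amount
            if entry.source == account:
                total -= entry.amount
        return total


ledger = Ledger()
mistake = ledger.transfer("mine", "yours", 100_000)  # the typo: ten times too much
ledger.cancel(mistake.id)                            # cancels it without deleting it
ledger.transfer("mine", "yours", 10_000)             # the transfer that was intended
```

After these three calls the ledger holds three entries, and the balances reflect only the intended transfer.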

There are several reasons for going with this approach, which Jim has brought up in his post:

automatic audit logging, since nothing is ever UPDATE'd or DELETE'd, you've got a constant trail of changes

automatic support for infinite undo/roll-back support of data, as you simply load a prior version and then save as usual

automatic support for labeling of versions, much like in source/version control systems, at an individual record level, table level, "aggregate root level", or database level

automatic support for "back querying" a system, in search of what the situation looked like last month, last year, etc. (though raising this "aspect", as in AOP, to the ORM level would be crucial)
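The undo and "back querying" points can be sketched roughly like this. It is an illustration under assumptions, not Jim's design: `History`, `as_of`, and `roll_back` are hypothetical names, storage is an in-memory list, and versions are assumed to be saved in chronological order.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Version:
    entity_id: str
    version: int
    saved_at: datetime
    data: dict


class History:
    """Hypothetical append only store that keeps every version of every entity."""

    def __init__(self) -> None:
        self._rows: List[Version] = []

    def save(self, entity_id: str, data: dict, at: datetime) -> Version:
        # Next version number = number of versions already stored for this entity + 1.
        version = 1 + sum(1 for row in self._rows if row.entity_id == entity_id)
        row = Version(entity_id, version, at, dict(data))
        self._rows.append(row)
        return row

    def as_of(self, entity_id: str, moment: datetime) -> Optional[Version]:
        # "Back query": the newest version that already existed at `moment`.
        candidates = [row for row in self._rows
                      if row.entity_id == entity_id and row.saved_at <= moment]
        return max(candidates, key=lambda row: row.version, default=None)

    def roll_back(self, entity_id: str, to_version: int, at: datetime) -> Version:
        # Undo is just another insert: load the prior version and save it again.
        prior = next(row for row in self._rows
                     if row.entity_id == entity_id and row.version == to_version)
        return self.save(entity_id, prior.data, at)


history = History()
history.save("customer/1", {"name": "Acme"}, datetime(2009, 8, 1))
history.save("customer/1", {"name": "Acme Ltd"}, datetime(2009, 9, 1))
last_month = history.as_of("customer/1", datetime(2009, 8, 15))
restored = history.roll_back("customer/1", 1, datetime(2009, 9, 6))
```

Note that rolling back produces version 3 with the contents of version 1, so the trail still shows that version 2 existed.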

As I said, this makes things much simpler in a lot of respects. It does mean that you have a more complex data model (because all associations now go through Id + max(version)), but that is manageable.
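To make the Id + max(version) association concrete, here is a small sketch using SQLite from the Python standard library. The `customers` and `orders` tables and their columns are assumptions made up for this example. The order stores only the customer Id, and the join picks the highest version for that Id:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Every change to a customer is a new row with the same id and a higher version.
    CREATE TABLE customers (
        id      INTEGER,
        version INTEGER,
        name    TEXT,
        PRIMARY KEY (id, version)
    );
    -- The association holds only the customer id, not a specific version.
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER,
        total       INTEGER
    );
    INSERT INTO customers VALUES (1, 1, 'Acme');
    INSERT INTO customers VALUES (1, 2, 'Acme Ltd');
    INSERT INTO orders VALUES (10, 1, 500);
""")

# Resolve the association through Id + max(version).
LATEST_CUSTOMER_FOR_ORDER = """
    SELECT o.id, c.name, c.version
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    WHERE c.version = (SELECT MAX(version) FROM customers WHERE id = o.customer_id)
"""

rows = conn.execute(LATEST_CUSTOMER_FOR_ORDER).fetchall()
```

Every association in the model needs some form of that correlated subquery, which is where the extra complexity comes from.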

But, as I said, there is a distinct difference between that and soft deletes. Soft deletes, as I refer to them, are IsDeleted columns that perform a logical deletion in the database. I don’t really like those, and I explained my reasoning in my previous post.

Append Only models present some complexity with regards to managing things, but in general, they force you to think in a very different fashion than CUD models. For one thing, you are almost always going to have a different reporting model, instead of trying to query the append only model directly (which gets to be complicated).
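One way to picture a separate reporting model is as a small structure that is updated in place as entries are appended, and that is only ever read from for display. This is a sketch under assumptions: the `(account, amount)` entry shape and the `ReportingModel` class are invented for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Entry = Tuple[str, int]  # (account, signed amount in cents); hypothetical shape


class ReportingModel:
    """Hypothetical read model: unlike the append only log, it is updated in place."""

    def __init__(self) -> None:
        self._balances: Dict[str, int] = defaultdict(int)

    def apply(self, entry: Entry) -> None:
        account, amount = entry
        self._balances[account] += amount

    def rebuild(self, entries: Iterable[Entry]) -> None:
        # The report can always be thrown away and replayed from the full log.
        self._balances.clear()
        for entry in entries:
            self.apply(entry)

    def balances(self) -> Dict[str, int]:
        return dict(self._balances)


log = [("checking", 1000), ("checking", -1000), ("checking", 100), ("savings", 50)]
report = ReportingModel()
report.rebuild(log)
```

Queries then hit the flat `balances()` view instead of walking versions or compensating entries in the log.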

There is one thing that I want to emphasize: using an Append Only model should be reflected in your API. Trying to abstract that away is going to lead to a world of pain.
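As an illustration of what an API that reflects the model might look like (a sketch only, with `CustomerLog`, `append`, and `based_on` being hypothetical names): there is no update or delete method, versions are visible to the caller, and an append has to state which version it was based on.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class CustomerVersion:
    customer_id: int
    version: int
    name: str


class StaleVersionError(Exception):
    """Raised when an append is based on a version that is no longer the latest."""


class CustomerLog:
    """Hypothetical API: callers can append and read, but never update or delete."""

    def __init__(self) -> None:
        self._versions: Dict[int, List[CustomerVersion]] = {}

    def latest(self, customer_id: int) -> CustomerVersion:
        return self._versions[customer_id][-1]

    def history(self, customer_id: int) -> List[CustomerVersion]:
        return list(self._versions.get(customer_id, []))

    def append(self, customer_id: int, name: str, based_on: int = 0) -> CustomerVersion:
        current = len(self._versions.get(customer_id, []))
        if based_on != current:
            # Someone else already appended a newer version.
            raise StaleVersionError(f"expected version {current}, got {based_on}")
        row = CustomerVersion(customer_id, current + 1, name)
        self._versions.setdefault(customer_id, []).append(row)
        return row


customers = CustomerLog()
v1 = customers.append(1, "Acme")
v2 = customers.append(1, "Acme Ltd", based_on=v1.version)
```

Appending again with `based_on=v1.version` would raise `StaleVersionError`, because version 1 is no longer the latest.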

Comments

06 Sep 2009
10:14 AM

Nima

A great post as always :)

Sometimes an append only approach can't be avoided, especially if a financial solution is involved and each record in the database means real money. But, as always, DB size and performance are my concerns with this kind of approach. I'd like to know if there are any guidelines or best practices for designing this kind of database.

As far as I understand, Microsoft uses separate tables for processing instances and completed instances (in BizTalk, for example), although I'm not sure if they are using an append only approach.

Thanks again

06 Sep 2009
10:16 AM

Mark Nijhof

How about only doing mutations? So you would still only do inserts, but with only what got changed rather than the whole information.

-Mark

06 Sep 2009
15:21 PM

Giorgio Sironi

It's a paradox that a record is "more" deleted by editing out all its fields than by deleting it in an application which uses an isDeleted column.

06 Sep 2009
17:02 PM

Eyston

Greg Young has a good talk about this:

www.infoq.com/.../greg-young-unshackle-qcon08

It's very interesting stuff.

Taking CQS to the architecture level is something I find very interesting but also daunting. It would make code so much cleaner, but the level of infrastructure is rarely required for the simple things I do -- more than CRUD, less than DDD.

06 Sep 2009
17:35 PM

Michael L Perry

I've been working in this area for a while. I call this technique "historic modeling". I've written a set of rules, a walkthrough, and a library in support of this idea.

http://historicmodeling.com
http://correspondence.codeplex.com

06 Sep 2009
20:53 PM

Harry

Since you brought it up, Oren, can you shed some light, from your experience, on how you approach the 'Append Only' model? And how do you use NHibernate to tackle that (if you use NHibernate at all)? You mentioned the reporting should be done differently. How should we do that without all the labor of ADO.NET / Stored Procedures / SQL ...

07 Sep 2009
04:50 AM

Eyston

Harry:

That Greg Young talk above talks about it some. Udi Dahan has a talk that also addresses this:

www.infoq.com/.../Making-Roles-Explicit-Udi-Dahan

(I watched the NDC 2009 video, not sure if this is all the same stuff).

Anyways, the root of most of this is working in Domain Events. Moving to events allows you to have multiple handlers with each (potentially) having their own tailor designed model (domain, reporting, logging, etc).

Udi Dahan is a good source on this topic: www.udidahan.com/.../domain-events-salvation/

08 Sep 2009
04:49 AM

Bertrand Le Roy

Immutable rows... I'm wondering what it would be like if such constraints could be modeled at the database level, and what kind of optimization could be applied to the database engine to favor last version reading for example, or efficient and transparent storage. Also wondering about whether the transaction log has aspects of a different implementation of something similar. Vague questions, I know, thinking out loud and way out of my area of expertise...

08 Sep 2009
14:02 PM

Ayende Rahien

Bertrand,

In the DB level, you can't really use constraints.

But something that would be easy to do is to have a separate reporting model (physical one) where you do updates.

You only read from it to show stuff.

08 Sep 2009
16:13 PM

Ayende Rahien

Harry,

I'll have a post about it.

08 Sep 2009
21:57 PM

Neil Kerkin

Append Only also allows different systems to collect information about the same entity.

This is the approach taken by openEHR (openehr.org), which allows any number of systems to collect information about a person without fear of conflicts when consolidating records. Not a standard SQL Data Model though.

09 Sep 2009
04:16 AM

Steve Py

Ah yes, Append-Only with NHibernate.. This is something I'm tinkering with but have been running into a few hurdles. I'm trying to avoid having to do stuff like Evicts. So far meaningless PKs are a must, but this raises issues on how to tie in locking & ensure you aren't appending to an already appended object, and how to ensure cached references reflect the latest entities reliably. I'm looking forward to that post.

Building audit trails after the fact with NHibernate around a CRUD system is a really, really messy operation. (Been there, debugged that.)

09 Sep 2009
11:01 AM

Ayende Rahien

Steve,

I have a post about that which will show up on the 16th.

11 Sep 2009
18:51 PM

Jim Sally

Been a busy week... anyway, thanks for posting this Oren, as obviously I couldn't agree more. And I'm definitely looking forward to the post on the 16th that describes how you would tackle this problem via NHibernate.

Here's to hoping that you spending some time writing about this subject will actually bring it closer to the forefront, much as I'd love to see it!

Take care...

Oren Eini

CEO of RavenDB