time to read 2 min | 246 words

One of the primary reasons businesses choose to use workflow engines is that they get pretty pictures that explain what is going on and look like they are easy to deal with. The truth is anything but that, but pretty sells.

My recommended solution for workflow has a lot going for it, if you are a developer. But if you try to show a business analyst this code, they are likely to just throw their hands up in the air and give up. Where are the pretty pictures?

One of the main advantages of this kind of approach is that it is very rigid. You are handling things in the event handlers, registering the next step in the workflow, etc. All of which is very regimented, and for a reason. First, it makes it very easy to look at the code and understand what is going on. Second, it allows us to process the code in additional ways.
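To make that concrete, here is a rough sketch of the kind of structure I mean. The handler names and the registerNextStep helper are illustrative placeholders, not an actual API:

```javascript
// Sketch only: a workflow script written in the regimented, event-handler
// style described above. registerNextStep and the handler names are
// placeholders for whatever the workflow infrastructure actually exposes.
function onApplicationSubmitted(state, input) {
    state.applicantId = input.applicantId;
    state.status = "WaitingForDocuments";
    registerNextStep("onDocumentsReceived");
}

function onDocumentsReceived(state, input) {
    state.documents = input.documents;
    if (input.requiresDoctorReview) {
        state.status = "WaitingForDoctorReview";
        registerNextStep("onDoctorReview");
    } else {
        state.status = "QuoteIssued";
        registerNextStep("onQuoteAccepted");
    }
}
```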

Consider the following AST visitor, which operates over the same code.

This took me about twenty minutes to write, mostly to figure out the Graphviz notation. It takes advantage of the fact that the structure of the code is predictable to generate the actual flow of actions from it.
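The idea is roughly the following sketch, which assumes the illustrative registerNextStep convention from the sketch above and uses the acorn and acorn-walk packages (an arbitrary choice on my part):

```javascript
// Sketch only: walk a workflow script and emit a Graphviz digraph with an
// edge from each handler function to every step it registers.
const acorn = require("acorn");
const walk = require("acorn-walk");

function workflowToDot(source) {
    const ast = acorn.parse(source, { ecmaVersion: 2020 });
    const edges = [];
    walk.simple(ast, {
        FunctionDeclaration(fn) {
            walk.simple(fn.body, {
                CallExpression(call) {
                    if (call.callee.type === "Identifier" &&
                        call.callee.name === "registerNextStep" &&
                        call.arguments[0] &&
                        call.arguments[0].type === "Literal") {
                        edges.push(`  "${fn.id.name}" -> "${call.arguments[0].value}";`);
                    }
                }
            });
        }
    });
    return `digraph workflow {\n${edges.join("\n")}\n}`;
}
```

Pipe the output through dot -Tpng and you get the flow diagram for free.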

You get to use readable code and maintainable practices and show pretty pictures to the business people.

time to read 3 min | 407 words

In my previous post, I talked about the driving forces toward a scripting solution to workflow behavior, and I presented some code as an example of such a solution. In this post, I want to focus on the non-obvious aspects of such a design.

The first thing to note about this code is that it is very structured. You are working in an event-based system, and as such, the inputs and outputs of the system are highly visible. It also means that we have straightforward ways to deal with complexity. We can break parts of the behavior out into a different file, or even into a different workflow that we'll call into.

The second thing to note is that workflows tend to be long-running processes. In the code above, we have a pretty obvious way to handle state. We get passed a state object, which we can freely modify, and changes to the state object are persisted between event invocations. That is actually a pretty important point, because if we store that state inside RavenDB, we also get the ability to do a bunch of other really interesting things:

  • You can query ongoing workflows and check their state.
  • You can use the revisions feature inside of RavenDB to track the state changes between invocations.

The input to the events is also an object, which means that you can store it natively as well, giving you full tracing capabilities.
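Here is a rough sketch of how the hosting side could tie this together, using the RavenDB Node.js client. The document shapes, IDs, and the way handlers are loaded are all illustrative assumptions:

```javascript
// Sketch only: persist workflow state and event inputs in RavenDB between
// invocations. Document shapes, IDs and the handler loading are assumptions.
const { DocumentStore } = require("ravendb");

const store = new DocumentStore("http://localhost:8080", "workflow-demo");
store.initialize();

// Hypothetical workflow script that exports its event handlers.
const handlers = require("./loan-workflow.js");

async function dispatchEvent(workflowId, eventName, input) {
    const session = store.openSession();

    // Load the persisted state for this workflow instance.
    const workflow = await session.load(workflowId);

    // Store the raw event input as its own document, which gives us tracing.
    await session.store({ workflowId, eventName, input, at: new Date() },
        "workflowEvents/");

    // The script's handler mutates workflow.state and decides what happens next.
    handlers[eventName](workflow.state, input);

    // Persist the modified state. With revisions enabled on the collection,
    // every invocation leaves a version of the state behind.
    await session.saveChanges();
}

// Querying ongoing workflows by their state is just a document query
// (assuming the workflow documents live in a Workflows collection).
async function waitingForDocuments() {
    const session = store.openSession();
    return await session.query({ collection: "Workflows" })
        .whereEquals("state.status", "WaitingForDocuments")
        .all();
}
```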

The third important thing to note is that the script is just code, and even in complex cases, it is going to be pretty small. That means that you can run version-resistant workflows. What do I mean by that?

Once a workflow process has started, you want to keep it on the same workflow script that it started with. This makes versioning decisions much nicer, and it is very easy to deal with changes over time. On the other hand, sometimes you need to fix the script itself (say, a bug that allowed a negative APR), in which case you can change it for just the ongoing workflows.
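One way to pin a workflow to its script could look like the sketch below. The document shapes and the ID convention for script versions are made up for illustration:

```javascript
// Sketch only: each workflow instance records the script version it started
// with, and the engine loads exactly that version on every event. The
// "scripts/<name>/<version>" ID convention is an illustrative assumption.
async function startWorkflow(session, scriptId, initialState) {
    const script = await session.load(scriptId); // e.g. "scripts/mortgage"
    const workflow = {
        scriptId,
        scriptVersion: script.version, // frozen for the lifetime of the instance
        state: initialState,
        startedAt: new Date()
    };
    await session.store(workflow, "workflows/");
    await session.saveChanges();
    return workflow;
}

async function loadScriptFor(session, workflow) {
    // A bug fix for ongoing workflows means bumping scriptVersion on the
    // affected workflow documents, nothing more.
    return await session.load(`${workflow.scriptId}/${workflow.scriptVersion}`);
}
```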

Actual storage of the script can be in Git, or as a separate document inside the database. Alternatively, you may want to include the script itself in every workflow document. That is usually reserved for industries where you have to be able to reproduce exactly what happened, and I wouldn't recommend doing it in general.

time to read 4 min | 760 words

I talked about some of the requirements for proper workflow design in my previous post. As a reminder, the top ones are:

  • Cater for developers, not the business analysts. (More on this later).
  • Source control isn’t optional, meaning:
    • Multiple branches
    • Can diff & review changes
    • Merging
    • Multiple people can work at the same time
  • Encapsulate complexity

This may seem like a pretty poor list, because if you are a developer, you might be taking all of these for granted. Because of that, I wanted to show a small taste of what used to be Microsoft's primary workflow engine.

[Image: a sample of Microsoft's former primary workflow engine]

A small hint… this kind of system is not going to be useful for anything relating to source control, change management, collaborative work, understanding what is going on, etc.

A better solution for this would be to use a tool that can work with source control, that developers are familiar with and can handle the required complexity.

That tool is called… code.

It checks all the boxes required, naturally. But it does have a distinct disadvantage. One of the primary reasons you want to use a workflow engine of some kind is to decouple the implementation of your business from the policies of the business. Coming back to the mortgage example, how you calculate a late fee payment is fixed (in the contract itself, but usually also by law and many regulations), but figuring out whether late fees should be waived, on the other hand, is subject to the whims of the business.

That is a pretty simple example, but in most businesses, these kinds of workflows add up. You can easily end up with dozens to hundreds of different workflows without the business being too big or complex.

There is another issue, though. Code is pretty good when you need to handle straightforward tasks. A set of if statements (which is pretty much all that most workflows are) is trivial to handle. But workflows have another property: they tend to be long. Not long on a computer scale (seconds), but long on a people scale (months and years).

The typical process of getting a life insurance policy may involve an initial submission, review by a doctor, asking for follow-up documentation (rinse and repeat a few times), getting the doctor's appraisal and only then being able to generate a quote for the customer. Then we have a period of time in which the customer can accept, a qualifying period, etc. That can last for a good long while.

Trying to code long-running processes like that requires a very unnatural approach to coding, especially since you are likely to need to handle software updates while the workflows are running.

In short, we are in a strange position: we want to use code, because it is clear, supports the software development practices that are essential, and can scale up in complexity as needed. On the other hand, we don't want to use our usual codebase for that, because we'll have very different deployment strategies, the manner of working is very different, and there is a lot more involvement of the business in what is going on there.

The way to handle that is to create a proper boundary between parts of the system. We'll have the workflow behavior, defined in scripts, which describes the policy of the system. These tend to be fairly high-level concepts and are designed explicitly for the role of expressing business policy. The infrastructure for that, on the other hand, is just a standard application using normal software practices, driven by the workflow scripts.

And by a script, I meant literally a script. As in, JavaScript.

I want to give you a sneak peek into how I envision this kind of system, but I'll defer full discussion of what is involved to my next post.
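Something along these lines (a sketch only; the helper functions such as startWorkflow and requestDocuments stand in for whatever the infrastructure would actually expose):

```javascript
// Sketch only: a policy script for a mortgage application. The state object
// is persisted between invocations by the infrastructure; startWorkflow and
// requestDocuments are placeholders for the host application's API.
function onApplicationReceived(state, input) {
    state.applicantId = input.applicantId;
    state.amount = input.amount;
    if (input.missingDocuments.length > 0) {
        state.status = "WaitingForDocuments";
        requestDocuments(state.applicantId, input.missingDocuments);
    } else {
        state.status = "WaitingForAppraisal";
    }
}

function onAppraisalCompleted(state, input) {
    state.appraisedValue = input.value;
    state.status = input.value >= state.amount ? "OfferSent" : "Rejected";
}

function onOfferAccepted(state, input) {
    state.status = "Active";
    // Life insurance setup is a completely different workflow that we invoke.
    startWorkflow("lifeInsuranceSetup", { applicantId: state.applicantId });
}
```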



The idea is that we use the script to define our policy, and then we use that to make decisions and invoke the next stage in the process. You might notice that we have the state variable, which is persisted between invocations. That allows us to use a programming model that is fairly common and obvious to developers. We can usually also show this, as-is, to a business analyst and get them to understand what is going on easily enough. All the actual actions are abstracted. For example, life insurance setup is a completely different workflow that we invoke.

In my next post, I'm going to drill down a bit into the details of this approach and what kind of features we need there.

time to read 5 min | 815 words

One of the most common themes I run into when talking to customers, users and sundry people in tech is the repeated desire to fire developers.

Actually, that is probably too loaded a statement. It actually comes in two parts:

  • Developers usually want to focus on the interesting bits, and the business logic portions aren’t that much fun.
  • The business analysts usually want to get things done and having to get developers to do that is considered inefficient.

If only there was a tool, or a pattern, or a framework, or something that would allow the business analysts to directly specify the behavior of the system… Why, we could cut the developers from the process entirely! And speaking as a developer, that would be a huge relief.

I think the original name for that was CASE tools, and that flopped. In fact, literally every single one of the attempts to replace developers with a tool has flopped. They got such a bad rap that people keep trying to implement them using different names. Some stuff can be done fairly easily, though. WYSIWYG for GUIs is well established, and Wordpress and WIX, to name the two examples that come to mind immediately, show that you can have a non-techie build a proper website. In fact, you can even plug in some pretty sophisticated functionality without burdening the user with too much.

But all of that only takes you to a point, and past that point, the drop-off is harsh. Let's take another common tool that is used to reduce the dependency on developers: SharePoint.

You pay close to double for actual developer time on SharePoint, mostly because it is so painful to work with it.

At a recent conference, I got into a conversation about business workflows and how best to implement them.

To make things real, I want to take a "simple" example: accepting a life insurance policy. Here is what the (extremely simplified) workflow for issuing one looks like:

[Image: simplified workflow diagram for issuing a life insurance policy]

This looks good, and it certainly should make sense to a business analyst. However, even after I reduced the process pretty much to its bare bones, and even those have been filed down, this is still pretty complex. The process of actually getting a policy is a lot more involved. Some questions don't require doctor evaluation (for example, smoking) and some require supplemental documentation (oh, you were hospitalized? Gimme all these records). The doctor may recommend different rates, rejecting entirely, some exceptions in the policy, etc. All of which needs to be in the workflow. Actuarial tables need to be consulted for each of those cases, etc, etc, etc.

But something like the diagram above isn’t going to be able to handle this level of complexity. You are going to get lost very quickly if you try to put so many boxes on the screen.

So you need encapsulation.

And you’ll probably want to have a way to develop these business workflows, which means that they aren’t static.

So you need source control.

And if you have a complex business process, you likely have different people working on it.

So you need to be able to review changes, and merge them.

Note that this is explicitly distinct from being able to store the data in source control. Being able to meaningfully diff two versions of such a process is anything but trivial. Usually you are left with diffing the raw XML / JSON that stores the structure. Good luck with that.

If the workflow is complex, you need to be able to understand what is going on under various conditions.

So you need a debugger.

In fact, pretty soon you'll realize that you'll need quite a lot of the things that developers do. Except that your tool of choice doesn't do that, or if it does, it does it poorly.

Another issue is that even if you somehow manage to bypass all of those details, you are going to face the same drop-off that you see elsewhere with tools that attempt to get rid of developers. At some point, the complexity grows too large, and you'll call your development team and hand the system off to them. At which point they will be stuck with a very clunky tool that attempts to be quite clever and easy to use. It is also horribly limiting for a developer, mostly because all of the "complexity" involved is in the business process itself, not in the actual complexity of what is going on.

There are better ways of handling that, and the easiest among them is to just use code. That can be… surprisingly versatile.
