Concurrency Solutions
I am not sure why, but recently I have become interested in concurrency again. I blogged about both the pipeline and the task scheduling approaches recently, and I was repeatedly pointed to Retlang, Parallel FX and the Concurrency and Coordination Runtime.
I thought that it would be interesting to compare those approaches.
The pipeline
The pipeline can be easily parallelized, because there is an explicit handoff of objects as they pass from one stage of the pipeline to another. The only synchronization needed is around handing objects from one stage to the next.
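A minimal sketch of this idea in Python (the stage function and sentinel convention are mine, purely illustrative): each stage runs on its own thread, and the only shared state is the queue between adjacent stages.

```python
import queue
import threading

# Each stage owns its input queue; a None sentinel shuts the pipeline down.
def stage(in_q, out_q, transform):
    while True:
        item = in_q.get()
        if item is None:          # sentinel: stop and propagate downstream
            out_q.put(None)
            break
        out_q.put(transform(item))

q1, q2, results = queue.Queue(), queue.Queue(), queue.Queue()
threads = [
    threading.Thread(target=stage, args=(q1, q2, lambda x: x * 2)),
    threading.Thread(target=stage, args=(q2, results, lambda x: x + 1)),
]
for t in threads:
    t.start()
for n in range(3):
    q1.put(n)
q1.put(None)
for t in threads:
    t.join()

out = []
while (item := results.get()) is not None:
    out.append(item)
print(out)  # [1, 3, 5]
```

Note that because each queue is FIFO and each stage is a single thread, ordering falls out for free here.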
Task Scheduling
Task scheduling is interesting, mainly because the way it operates fits the concurrent model so nicely. You execute something, you wait for it, and you don't take up any resources while you wait. To me, it looks like quite a natural way of programming, especially with the ability to wait on other tasks.
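A toy sketch of the general idea in Python (this is not the actual library, just an illustration): a task is a generator that yields whenever it waits, so a waiting task holds no thread.

```python
from collections import deque

trace = []

# A "task" is a generator; yielding means "I am waiting, run someone else".
def task(name, steps):
    for i in range(steps):
        trace.append((name, i))
        yield

def run(tasks):
    ready = deque(tasks)
    while ready:
        t = ready.popleft()
        try:
            next(t)              # run until the task yields
            ready.append(t)      # not finished: schedule it again
        except StopIteration:
            pass                 # task completed

run([task("a", 2), task("b", 2)])
print(trace)  # [('a', 0), ('b', 0), ('a', 1), ('b', 1)]
```

The point of the sketch is that the scheduler interleaves the tasks on a single thread, and a blocked task costs nothing but an entry in the ready queue.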
Retlang
Retlang's concurrency model is based on message passing. As such, it is ideally suited to supporting cross thread communication, event driven systems, pipeline processing, etc. Bind Retlang with the Task Scheduling solution, and you are nearly at Erlang's capabilities.
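A rough Python sketch of the message passing style (the names are mine, not Retlang's API): a "fiber" is a single consumer thread draining a queue, so all messages for that fiber are processed serially and in order.

```python
import queue
import threading

# A fiber: one thread draining one queue, so callbacks never run concurrently.
class Fiber:
    def __init__(self):
        self._q = queue.Queue()
        self._t = threading.Thread(target=self._run)
        self._t.start()

    def _run(self):
        while True:
            action = self._q.get()
            if action is None:   # sentinel: shut down
                break
            action()

    def enqueue(self, action):
        self._q.put(action)

    def stop(self):
        self._q.put(None)
        self._t.join()

received = []
fiber = Fiber()
for msg in ["hello", "world"]:
    fiber.enqueue(lambda m=msg: received.append(m))
fiber.stop()
print(received)  # ['hello', 'world']
```

Because only the fiber's own thread touches the state behind the callbacks, no further locking is needed.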
Parallel FX
I get to run things in parallel, and that is about it. The ability to scale a for loop to the number of processors on the machine is nice, but I see it as a low-level utility more than anything else. PLinq is much more interesting.
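For reference, the "scale a for loop to the machine" shape looks roughly like this sketch in Python (the parallel_for helper is mine; note that map here happens to return results in input order, which is the explicitly-ordered variant):

```python
from concurrent.futures import ThreadPoolExecutor
import os

# Run the loop body across a pool sized to the machine's processor count.
def parallel_for(items, body, workers=None):
    workers = workers or os.cpu_count()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(body, items))

squares = parallel_for(range(5), lambda x: x * x)
print(squares)  # [0, 1, 4, 9, 16]
```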
Then again, I am currently in search of higher level concurrency solutions than what Parallel FX supplies.
Concurrency and Coordination Runtime
This looks a lot more like what I had in mind. The concurrency primitives that we have here are high level enough to be really useful, and it looks like it has implemented the same ideas that I had, both message passing and task based. I haven't played with it, just read the docs, so I am not sure how useful this really is.
State
The real problem with concurrency is state. Erlang gets away with dealing with it because it has no mutable state. In most imperative languages, we have to deal with this somehow.
The usual answer for that is: don't share state. The pipeline example is a good one; you don't get two processes dealing with the same object at the same time.
In my task scheduling code, I am ensuring that a task can run on a single thread at a time (but not that it runs on the same thread all the time). This should take care of most state issues, since as long as you keep to your own instance, you are safe. If you start passing around mutable variables, then you need to tread more carefully.
Ordering
Another issue that you have with concurrency is that you usually have some requirements about in what order you want to deal with operations. You can't really approve an order before it was created, now can you?
Different approaches to concurrency have different guarantees with regard to ordering. Retlang will always preserve ordering, which makes some things extremely simple. My task scheduling library will not preserve ordering (it will schedule the first task that is ready to go), and Parallel FX doesn't preserve ordering unless you explicitly request it (I couldn't find out about the CCR).
An interesting observation about the task scheduling library: it doesn't really need ordering. Because of the way we yield in the method, we can get away with just waiting for the values to be completed.
I don't think that there is a right approach, but I strongly lean toward tasks and message passing as a default concurrency approach.
Comments
Nice
One thing to note about state: MS is clearly thinking quite hard about immutability and how to deliver it in both languages and libraries. This may well have been accelerated by F#.
Just as a few examples:
http://www.bluebytesoftware.com/blog/2007/11/11/ImmutableTypesForC.aspx
http://www.bluebytesoftware.com/blog/2007/11/17/ImmutableTypesCanCopyTheWorldSafely.aspx
http://blogs.msdn.com/lucabol/archive/2007/12/28/creating-an-immutable-value-object-in-c-part-iv-a-class-with-a-special-value.aspx
http://blogs.msdn.com/ericlippert/archive/tags/Immutability/default.aspx
Of course, now that you've looked at the CCR etc, you should really have a look at... nah, just kidding, I'm done ;)
Jon
Oren, you mentioned that PFX is "getting to run things in parallel, that's about it".
Last time I looked, PFX had the concepts of tasks and task scheduling, work stealing, some pretty advanced stuff.
Judah,
What the PFX provides is a (much) better thread pool alternative, in terms of the API given.
What I am thinking is concurrent tasks, which can pass around control among them without difficulties.
Judah,
Do you have a reference for "work stealing"? I couldn't find it.
http://en.wikipedia.org/wiki/Cilk#Work-stealing
I was thinking about work stealing for the PFX.
This is basically what I am doing in the task scheduling.
One interesting thing here is the ability to stop a task in the middle.
Cilk supports it, and so does my task scheduling, but from what I could find, PFX doesn't.
It definitely does - I only heard about it in CILK because of PFX :)
From
http://www.bluebytesoftware.com/blog/2007/11/30/ParallelExtensionsCTPIsAvailable.aspx
<quote>
Not only that, but we’ve expanded the scope of the original project significantly, from PLINQ to Parallel FX, to include new imperative data parallel APIs (for situations where expressing a problem in LINQ is unnatural), powerful task APIs that offer waiting and cancelation, all supported by a common work scheduler based on CILK-style work-stealing techniques developed in collaboration with Microsoft Research.
</quote>
(Side note: Joe Duffy rocks extremely hard. Can't wait for his concurrency book.)
I'd written the stuff below before your comment came in - included just for extra info:
Actually, it looks like that's not terribly helpful as a URL. If you Google for
"work stealing" CILK
you'll find plenty of hits.
I believe the basic idea is that each thread has a queue of work it intends to do - and it can access that work queue very efficiently without locking etc. If it runs out of work, it can steal work from another queue. This is less efficient, but hopefully happens relatively rarely.
I don't pretend to understand the details of it, but I think it's meant to be significantly more efficient (particularly for short items?) than a single pool of work which all threads steal from. I suspect it also makes PFX work nicely when the system is heavily loaded with other programs, although that's pretty much a guess!
Jon
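The per-thread deque Jon describes can be sketched like this in Python (real implementations use lock-free deques; this toy version uses a lock, and all the names are mine):

```python
import collections
import threading

# Work-stealing deque: the owner pushes and pops at the tail (good cache
# locality, newest work first); an idle thief steals from the head.
class WorkStealingDeque:
    def __init__(self):
        self._items = collections.deque()
        self._lock = threading.Lock()

    def push(self, task):
        with self._lock:
            self._items.append(task)

    def pop(self):                       # owner: take the newest task
        with self._lock:
            return self._items.pop() if self._items else None

    def steal(self):                     # thief: take the oldest task
        with self._lock:
            return self._items.popleft() if self._items else None

mine, yours = WorkStealingDeque(), WorkStealingDeque()
for n in range(3):
    yours.push(n)

# My own deque is empty, so I fall back to stealing the oldest item of yours.
stolen = mine.pop()
if stolen is None:
    stolen = yours.steal()
print(stolen)  # 0
```

The owner and the thief work at opposite ends of the deque, which is what lets the common (no-stealing) path avoid contention in the real lock-free versions.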
Jon,
Note the collaboration technique that I have for the tasks. They can yield and resume without blocking threads.
What you are talking about is usually used to keep the current processor's cache intact, as well as keeping locking down.
Indeed - all I was doing was making good on the work stealing references :)
"Another issue that you have with concurrency is that you usually have some requirements about in what order you want to deal with operations. You can't really approve an order before it was created, now can you?"
If you can undo order approval, maybe you can get started on approving an order before it's created?
If you have two processes, say Order Creation and Order Approval, you can start on the Order Approval before Order Creation is really done, by making reasonable assumptions on the missing data. If any of the assumptions turn out wrong you undo the order approval and start over.
It's called speculative execution and it's at the heart of modern CPUs, which are based around a pipeline design.
The trick here is with making reasonable assumptions. It's a game of statistics - if you're very rarely wrong, this is a good way to go. To give an idea of how good you need to be, one of the assumption-making components of a CPU is the branch predictor, which is right about 95% of the time.
As an example more suitable for high level software, consider the pipeline for an HTTP response. The final response depends on many different processes - user authentication, cache lookup, ad selection, etc. Instead of just waiting for every one of these, the pipeline proceeds based on an assumption (i.e. assume the user is an anonymous visitor), and in the rare case the assumption is wrong, the pipeline throws away the work done based on that assumption and starts over. But in the highly common case, the response is generated faster and there is greater resource utilization, because there's no need to wait for user authentication to finish before doing any useful work.
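The HTTP example above can be sketched in a few lines of Python (the function names are mine, purely illustrative): the speculative work runs concurrently with the slow dependency, and is discarded only on a misprediction.

```python
from concurrent.futures import ThreadPoolExecutor

def build_response(user):
    return f"page for {user}"

def handle_request(authenticate):
    with ThreadPoolExecutor() as pool:
        # Speculate: build the response assuming an anonymous visitor,
        # while the slow authentication check runs at the same time.
        speculative = pool.submit(build_response, "anonymous")
        user = authenticate()
    if user == "anonymous":
        return speculative.result()   # assumption held: reuse the work
    return build_response(user)       # mispredicted: discard and redo

print(handle_request(lambda: "anonymous"))  # page for anonymous
print(handle_request(lambda: "alice"))      # page for alice
```

In the common (anonymous) case the response is ready as soon as authentication confirms the assumption; only the rare authenticated case pays for a redo.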