The cost of counting: The silliest optimization…
For a lecture that one of the guys is giving, we started talking about how to explain a certain feature (distributed counters).
I suggested that we measure how fast a single thread can count, which resulted in the following code:
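Something along these lines, checking the Stopwatch on every iteration (a sketch; the exact code may have differed slightly):

```
using System;
using System.Diagnostics;

// Count as fast as possible for one second, consulting the
// Stopwatch on every single iteration.
var sw = Stopwatch.StartNew();
var count = 0;
while (sw.ElapsedMilliseconds < 1000)
{
    count++;
}
Console.WriteLine($"{count:N0} in {sw.Elapsed}");
```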
On my machine, this results in 25,943,321 in 00:00:01.0000060. That works, but the number seems very low. So we set out to see how fast we can count things, which is about the silliest thing ever. However, it is fun.
It is obvious that the cost here is the Stopwatch calls, so let's try reducing them. I changed the loop so it only checks the time every so often.
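A sketch of that change; the interval is an assumption, though the total reported below (205,840,000) is an exact multiple of 10,000, which fits a check every 10,000 iterations:

```
using System;
using System.Diagnostics;

// Same loop, but only consult the Stopwatch once every 10,000 iterations.
var sw = Stopwatch.StartNew();
var count = 0;
while (true)
{
    count++;
    if (count % 10_000 == 0 && sw.ElapsedMilliseconds >= 1000)
        break;
}
Console.WriteLine($"{count:N0} in {sw.Elapsed}");
```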
This one gives us 205,840,000 in 00:00:01.0010240, so that is much nicer. Can we do better? What if we changed how we check?
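The comments below refer to this third variant as thread-based, so presumably it looked something like this: a background task sleeps for a second and flips a flag, and the hot loop only does a volatile read instead of touching the Stopwatch (a sketch under that assumption):

```
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

var done = false;
var sw = Stopwatch.StartNew();
// A background task flips the flag after one second, so the
// counting loop never has to read the clock at all.
Task.Factory.StartNew(() =>
{
    Thread.Sleep(1000);
    Volatile.Write(ref done, true);
});
var count = 0;
while (Volatile.Read(ref done) == false)
{
    count++;
}
Console.WriteLine($"{count:N0} in {sw.Elapsed}");
```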
This gives me 644,081,297 in 00:00:01.0005354, and I don't think that we can do better than this very easily.
Comments
You can do better, e.g., by removing the division from the 2nd variant and replacing it with bitwise operations:
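Presumably something like the following; the mask size is an inference (the totals reported below are exact multiples of 8,192), and dmitry_vk's actual code may have differed:

```
using System;
using System.Diagnostics;

// Check the clock only when the low 13 bits of the counter are zero,
// i.e. once every 8,192 iterations. A bitwise AND against a
// power-of-two mask avoids the division/modulo of the 2nd variant.
var sw = Stopwatch.StartNew();
var count = 0;
while (true)
{
    count++;
    if ((count & 8191) == 0 && sw.ElapsedMilliseconds >= 1000)
        break;
}
Console.WriteLine($"{count:N0} in {sw.Elapsed}");
```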
On my machine this is faster than the threaded solution by a couple of percent and significantly faster than the 2nd variant (with division).
Oops, that was on a debug build. In release mode, my variant is 3x faster than the threaded variant:
562282964 in 00:00:01.0012022 for 3rd (thread-based) variant
1610678272 in 00:00:01.0010230 for my variant
Even faster on my computer is the combination of dmitry_vk's version and the threaded version (I also changed done to a byte, which seems to give better precision, but I'm not 100% sure):

dmitry_vk: 1526071296 in 00:00:01.0010249
combined: 1675231232 in 00:00:01.0011874

So I guess unrolling the loop, to leverage the fact that CPUs can do more than one operation per clock and are bad at branching, would be cheating?
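A sketch of what that combination might look like; the byte flag and the mask size are assumptions based on the description above:

```
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

byte done = 0;
var sw = Stopwatch.StartNew();
Task.Factory.StartNew(() =>
{
    Thread.Sleep(1000);
    Volatile.Write(ref done, (byte)1); // byte flag instead of bool
});
var count = 0;
while (true)
{
    count++;
    // Only pay for the volatile read once every 8,192 iterations.
    if ((count & 8191) == 0 && Volatile.Read(ref done) == 1)
        break;
}
Console.WriteLine($"{count:N0} in {sw.Elapsed}");
```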
@Philippe, not really... It is a very valid optimization. We do that a lot :) https://github.com/ravendb/ravendb/blob/v4.0/src/Sparrow/Memory.cs#L247
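For what it's worth, unrolling a pure counter mostly collapses into a bigger add per branch, which is arguably why it feels like cheating. A fragment to illustrate (assumes the bool done flag and int count from the thread-based sketch above):

```
// Unrolled by 4: one volatile read and one branch per four counts.
// For a pure counter, the JIT can fold the increments into one add.
while (Volatile.Read(ref done) == false)
{
    count++;
    count++;
    count++;
    count++;
}
```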
This has the most throughput per second on my machine:

```
var max = 1000000000;
var sw = Stopwatch.StartNew();
for (var i = 0; i < max; i++) { } // count to a fixed max (the original snippet was truncated; the loop is assumed)
Console.WriteLine($"{sw.ElapsedMilliseconds} ms for {max} items = {max / sw.Elapsed.TotalSeconds} /second");
```
In VS Code, in debug mode: 2073 ms for 1000000000 items = 482392667.631452 /second
@Harold, never run in debug mode when measuring for optimization (the generated assembly code is highly inefficient). We have processes here (because of highly tuned routines) that can show up to a 4x performance difference between debug and release builds. In fact, non-optimized code usually runs faster in debug mode than highly optimized code does.
For the release build I get these results: 338 ms for 1000000000 items = 2958579881.6568 /second
That's ~6x better. And it's about the same as my clock speed of 3 GHz (not sure if that's a coincidence).
OK, one last comment, then I'll stop spamming. When I change var i = 0; to long i = 0; it runs at about half the speed (~1,800,000,000 /second).
I'm confused. The first version counted 25,943,321 in 00:00:01.0000060. The second version counted 205,840,000 in 00:00:01.0010240. The first version is better than the second version. So how is that "much nicer"?
Diego G. Fritz, higher is better: the second version has 10x the performance on your computer.
@Harold, the point is to stop the loop after ~1 sec.
Sorry, I am a fool. I did not read the statement well.
@Thomas, no one says stopping after one second is a requirement. The original requirement is: "I suggested that we measure how fast a single thread can count."
But maybe I'm being a smartass ;-)
Why oh why do you check the time inside the loop?
Even a volatile read across threads is wasteful and wrong; CPU caches don't like it.
Just run it a fixed few billion times, measure the time at the start and at the end, and then you have your speed.
Effectively, what you're doing is giving Usain Bolt a stopwatch. How the hell is he meant to focus on running while looking at the watch face? Silly! Just let Usain run his 100m at 100% speed, and you do the measuring.
Mihailik, if you read the title of the post, you might understand why they have to check the time. A performance counter is worth little if you cannot read it continuously...
Have you tried with ThreadPool.QueueUserWorkItem instead of Task.Factory.StartNew?
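For reference, that swap would look something like this (a sketch, assuming the same volatile done flag as in the thread-based variant):

```
using System.Threading;

var done = false;
// Queue the one-second timer on the thread pool directly,
// instead of going through the Task machinery.
ThreadPool.QueueUserWorkItem(_ =>
{
    Thread.Sleep(1000);
    Volatile.Write(ref done, true);
});
```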
Another approach, though certainly not recommended for 'real' code; I was curious to see how it fared. (It is slower than the fastest method above by about 20%, but as the seconds increase, the difference should decrease.)