Digging into the CoreCLR: Some bashing on the cost of hashing
Note: This post was written by Federico.
Recently at CoreFX there has been a proposal to deal with the typical case of everyone writing their own hash combining logic. I really like that the framework pushes that kind of functionality, because hashing is one of those things that can backfire pretty badly unless you really know what you are doing (and even if you know, you can still step over the cliff). But this post is not about the issue itself; it showcases a few little details that happen in JIT land (and that we usually abuse, of course :)). This is the issue: https://github.com/dotnet/corefx/issues/8034#issuecomment-260733285
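To make the problem concrete, this is the kind of hand-rolled combining logic that shows up everywhere (a sketch; the Person type and the usual 17/31 constants are illustrative, not code from the issue):

```csharp
public class Person
{
    public string Name;
    public int Age;

    // The typical hand-rolled combiner: easy to write, easy to get
    // subtly wrong (poor mixing, forgotten null checks, overflow).
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (Name?.GetHashCode() ?? 0);
            hash = hash * 31 + Age.GetHashCode();
            return hash;
        }
    }
}
```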
Let me illustrate with some examples.
ValueTuple is a new type being introduced for some of the new tuples functionality in C# 7. Because hashing is important, of course they implemented the ability to combine hashes. Now, let's suppose that we take the actual hashing code that is used for ValueTuple and call it with constants.
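A minimal sketch of that setup (the constants and the method name are arbitrary):

```csharp
static int HashOfConstants()
{
    // Everything here is constant, so in principle an optimizing
    // compiler could fold the whole call into a precomputed value.
    return ValueTuple.Create(10, 20).GetHashCode();
}
```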
Under an optimizing compiler, chances are there shouldn't be any difference; but in reality, there is.
This is the actual machine code for ValueTuple:
So what can be seen here? First we are creating a struct on the stack, then we are calling the actual hash method.
Now compare it with the use of HashHelper.Combine, which for all intents and purposes could be the actual implementation of Hash.Combine:
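For reference, a plausible sketch of such a helper, along the lines of the rotate-and-xor combiner corefx used at the time (the HashHelper name and the exact mixing are assumptions here):

```csharp
public static class HashHelper
{
    public static int Combine(int h1, int h2)
    {
        unchecked
        {
            // Rotate h1 left by 5 bits, add it back, then mix in h2.
            // RyuJIT recognizes the shift pair and emits a single ROL.
            uint rol5 = ((uint)h1 << 5) | ((uint)h1 >> 27);
            return ((int)rol5 + h1) ^ h2;
        }
    }
}

// With constants, the JIT can fold this down to just loading the result.
static int HashOfConstantsDirect() => HashHelper.Combine(10, 20);
```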
I know!!! How cool is that???
But let’s not stop there... let’s use actual parameters:
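A sketch of that, reusing the illustrative helper from above:

```csharp
static int HashOfParameters()
{
    var rnd = new Random();
    int a = rnd.Next(), b = rnd.Next();

    // a and b are unknown at JIT time, so the combining code itself
    // must be emitted; but Combine is small enough to inline.
    return HashHelper.Combine(a, b);
}
```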
We use random values here to force the JIT to treat them as parameters, so it cannot eliminate the code and fold it yet again into a constant.
The good thing is that this is extremely stable. But let's compare it with the alternative:
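That is, in this sketch, the ValueTuple version of the same thing:

```csharp
static int HashOfParametersTuple()
{
    var rnd = new Random();
    int a = rnd.Next(), b = rnd.Next();

    // The tuple struct has to be constructed on the stack before
    // GetHashCode can be called on it.
    return ValueTuple.Create(a, b).GetHashCode();
}
```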
The question that you may be asking yourselves is: does that scale? So, let's go overboard...
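Something along these lines, nesting the combiner over several values (again, a sketch):

```csharp
static int HashOfMany()
{
    var rnd = new Random();
    int a = rnd.Next(), b = rnd.Next(), c = rnd.Next(),
        d = rnd.Next(), e = rnd.Next();

    // Five values combined pairwise; each Combine call is small
    // enough that the JIT can inline the whole chain.
    return HashHelper.Combine(
               HashHelper.Combine(
                   HashHelper.Combine(
                       HashHelper.Combine(a, b), c), d), e);
}
```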
And the result is pretty illustrative:
The takeaway of the analysis is simple: even if the holding type is a struct, it doesn't mean it is free :)
More posts in "Digging into the CoreCLR" series:
- (25 Nov 2016) Some bashing on the cost of hashing
- (12 Aug 2016) Exceptional costs, Part II
- (11 Aug 2016) Exceptional costs, Part I
- (10 Aug 2016) JIT Introduction
Comments
I've read this post twice and TBH I do not understand the point that the author is trying to make. Is the point that ValueTuple should not be used just for the purpose of generating hashes, because it is slower than a custom coded method? OK, I was not even expecting it to be faster than or the same as a custom coded hashing method.
I have also read this post twice, but I cannot understand it because I don't speak x86 assembly. But, you know what? I don't want to speak it either. I speak T-SQL, C#, C, C++, CIL, TypeScript, Pascal and many others... but I don't want to speak asm.
Dalibor, This relates to a common optimization pattern of using structs to allow the JIT to inline the code. The problem is that it isn't always able to do so, and in some cases there is a cost just in creating the struct and then making a method call.
The reason that the last image is awesome is that the JIT was able to optimize away multiple calls to this method so the whole thing is nice and tight without any method calls.
This is much faster than having to jump around, push things onto the stack, spill registers, etc.
Jesús, I agree that you don't want to speak it, but if you care about performance, you need to understand what is actually being executed, and what the impact of that is. In this case, the key takeaway from this post is to understand that while structs are cheap, they aren't free.
@ayende Oh! Is there anything free in computing? What do structs cost vs. classes? I would like to see some numbers instead of asm code.
Jesús, It is impossible to answer this.
For example, classes are allocated on the heap, structs on the stack. So passing structs around is expensive, but creating them is relatively cheap. Passing classes around is cheap, but they require garbage collection.
Generics with an interface constraint, when instantiated with structs, can have their methods inlined, while classes cannot, etc.
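A minimal sketch of that pattern (the IOperation and AddOne names are made up for illustration):

```csharp
public interface IOperation
{
    int Apply(int x);
}

public struct AddOne : IOperation
{
    public int Apply(int x) => x + 1;
}

public static class Runner
{
    // For a struct type argument the JIT compiles a specialized body,
    // so op.Apply can be devirtualized and inlined; with a class type
    // argument, the shared code keeps a virtual interface call.
    public static int Run<TOp>(TOp op, int value) where TOp : IOperation
    {
        return op.Apply(value);
    }
}
```

Calling Runner.Run(new AddOne(), 41) gives the JIT a specialized instantiation in which the interface call can disappear entirely.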
You learn something new every day. Can you provide a link or something regarding this pattern? This is the first time I've heard about it.
IMHO this is something that should be stated in the post. Just placing a bunch of MSIL and stating 'it is awesome' does not tell me much. I just don't know what is supposed to be awesome... the compiler inlining single-line methods... some other optimization which I miss because I'm not good at reading MSIL code?
Dalibor, The JIT inlining magic is complex and not really well documented, as far as I have seen. A lot of what I know is from observing its behavior, not formal docs. This is hard to do, because it is based on a lot of small optimizations, and they keep making it better.
In particular, see: https://blogs.msdn.microsoft.com/davidnotario/2004/11/01/jit-optimizations-inlining-ii/#comment-243 https://blogs.msdn.microsoft.com/vancem/2008/08/19/to-inline-or-not-to-inline-that-is-the-question/
And note that this isn't MSIL, but x86 ASM
I've looked at the links that you've provided but I do not see any reference to 'common optimization pattern of using structs to allow the JIT to inline the code'. Perhaps I'm missing something there.
Dalibor, It is about inlining, and it isn't something that you really need to know unless you are dealing with micro optimizations. When you do need to know that, you start figuring out what will trigger inlining so you can reap the benefits.
For more reference: https://en.wikipedia.org/wiki/Inline_function http://www.greenend.org.uk/rjk/tech/inline.html
I am aware of what inlining is. As far as I can tell, you are claiming that there is something special about using structure/value types which makes their methods somehow 'special' when it comes to inlining. From your post I assume that when instantiating a structure type, calling its method, and not using the instance again, the instance should be optimized away and just the inlined 'body' of the method should remain?
If this is so, then I'm simply asking what you are basing this assumption on. If this is something that you've personally observed, then I would encourage you to maybe make another post regarding this optimization technique so that people can get better acquainted with it. As it is, I've found no mention of this particular optimization after taking some time to google it.
@Dalibor The underlying problem is that the JIT is ever changing. Not long ago we had a piece of code that made a few copies of its parameters, worked with them, and then ended. See: https://github.com/dotnet/coreclr/issues/6014 . We could have written about it, but it was solved a few days later (it took a couple of months for the fix to be available, but that's fine). The more general point is that the opportunities for the JIT to optimize code exist, but you have to be aware of certain restrictions. The best way to know the current restrictions is to look over the area-CodeGen label at CoreCLR https://github.com/dotnet/coreclr/issues?utf8=%E2%9C%93&q=is%3Aissue%20label%3Aarea-CodeGen%20
@Jesus C# tuples are going to be a great addition, and they will be able to automatically optimize more than a few common patterns, but with great power comes great responsibility. Tuples are going to start stressing your stack allocations, and passing tuples around is going to be a major hog on your system. The point is that nothing is free; everything has a cost, and therefore you must know where those costs are. For those not wanting to go straight to the assembler, the root of those costs stays hidden. There are many sources of cost, which is probably why it is not a simple topic to put numbers on: you can easily build an example where structs are fast, then add a few methods and some logic, and your classes are way faster. Grow the size of your struct, and now those extra initialization instructions will hurt you in tight code. Those and many other behaviors make the problem of comparing them hard.
There are also other optimization opportunities that won't be accessible if you don't want to go deep down into the assembler; for example, in a commit yesterday we achieved a 40% improvement in LZ4 compression just by controlling the assembler we wrote. https://github.com/ravendb/ravendb/commit/3ef1d60a9f43b5daa38f7071d7819ce249a78f70 . Granted, this is not required for 99.9% of the projects out there (and that is just fine), but when you are doing high performance programming, you need to know how to deal with it. There are a lot of people who do want to speak C# fluently at this level, because they may end up needing to write this kind of code eventually.
Oren,
Thank you and very impressive.