Electric fenced memory results
A few days ago I posted about electric fence memory and its uses. Here is one problem that it found for us.
Do you see the bug? And can you imagine how hard it would be for us to figure this out if we didn’t hotwire the memory?
Comments
Quick guess. The allocation is in bytes, but the indexing is in char.
Encoding issue?
The exception indicates that you somehow overflow the destChars, hence the size is probably wrong. So the only issue I can think of is that there might be a problem with the LINQ Sum; if it's lazy it might give a wrong calculation, but I'm not sure.
GetNativeTempBuffer allocates less than Size bytes?
Assuming that query.FieldsToFetch is a series of IList<char>, then: you are assembling the total length of all characters to put them into a single buffer area to hash. The problem is that char is two bytes wide. You are counting the number of chars, but context.GetNativeTempBuffer() is expecting the number of bytes, which is twice the number of chars. This means you overrun your buffer exactly halfway through loading the string into the buffer.

This would definitely take a while to identify without your electric fence. Nice catch!
Stuart, Yes, that is indeed the issue. The actual problem is even worse, leaving aside the fact that the code reads correctly. The native buffer works in powers of 2, so most of the time this would be fine, and then we'd have silent heap corruption...
This ought to be a trivial matter for a static code analyzer to find. Aren't there any good ones available for unsafe C# code?
Dennis, I haven't seen any for C#, no.
Nice catch. I would love to see you go more into the BlittableJson stuff. How do you deal with updates to a large document, and will there be a BlittableJsonWriter?
Alois, There is :-)
https://github.com/ravendb/ravendb/blob/v4.0/src/Raven.NewClient/Json/BlittableJsonWriter.cs
Can you explain what you mean about large documents?
I have played around with the Blittable classes a bit. The managed heap is basically empty if I deserialize a 200MB json file. That is great. Even when I access the strings as LazyStrings it is still quite fast. But what is the story when you have deserialized the json into a BlittableJsonReaderObject and you want to keep it? You cannot add or change items in it since it is read only. Is this class meant as an intermediary only, to later materialize a normal CLR object, or is it possible to mutate the BlittableJsonReader..... objects? If you still need to copy things into normal objects you quickly lose the speed gained while reading. My current test looks like
which uses this definition
Alois,
Yes, blittable is meant to be immutable, which makes both the structure and working with it much simpler. You are not meant to really mutate a blittable; you would typically send it to the user (CLR class, network, etc.) and then get a whole new object back.
While we do have support for applying mutations, they require re-generating the blittable.
Note that stuff like searching inside the blittable like you do would typically be done the other way around: take the CLR string and turn that into a LazyString, then compare it directly to the blittable value.
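A rough sketch of that suggestion, based on the Sparrow.Json types as I understand them (JsonOperationContext.GetLazyString, BlittableJsonReaderObject.TryGet); treat the exact names and signatures as assumptions rather than a verified API:

```csharp
using Sparrow.Json;

static class BlittableCompareSketch
{
    public static bool HasExpectedName(JsonOperationContext context,
                                       BlittableJsonReaderObject doc,
                                       string expectedName)
    {
        // Convert the CLR string into a LazyStringValue once...
        LazyStringValue needle = context.GetLazyString(expectedName);

        // ...then compare it directly against the value stored in the blittable,
        // instead of materializing the blittable property as a managed string.
        return doc.TryGet("Name", out LazyStringValue stored)
               && needle.Equals(stored);
    }
}
```

For mutation, the pattern as I understand it is similar in spirit: record the changes separately (e.g. via the document's Modifications) and have the context produce a new blittable, rather than editing the existing one in place.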
Ahh ok. So this whole Blittable thing is only there to make queries over many small or some large documents cheap for the managed heap, to prevent long GC pauses when the next Gen2 (or a Gen1 triggered by the finalizer due to low memory) kicks in? I was hoping to get something faster than Protocol Buffers or JSON.NET which also allows mutation of the read objects. Of course it is possible to come up with hybrid objects which store the modifications in normal objects, but that is not something for the generic case. No matter how hard I try I seem to hit a wall at ca. 50-80 MB/s with managed code. The only thing left would be to deserialize at different locations from the stream in parallel. But that would impose severe limitations on the object design.
Alois, No, blittable is how we work with json in 4.0, server & client side. However, we don't mutate json objects directly; we either pass them around (from server to client, etc.) or we build them directly and send them over the network.
Alois, Let us go back a few steps. What is it that you are trying to do ?
I currently have a large (up to 200 MB) XML document serialized with DataContracts on disk which needs to be deserialized. I am searching for something significantly faster than plain DataContracts. If I redesign that stuff I want to use the fastest library out there. Blittable Json seems like a good idea if I can easily round trip the data with modifications. I care less about serialization and mostly about deserialization performance. So far it looks like JSON.NET would be significantly faster without requiring many changes to the current object model. While reading Blittable JSON is very fast, I would lose the gained speed when I need to convert the Blittable JSON back into the original object model. Since Blittable JSON is read only I cannot toss the original object model out of the window, since the whole thing needs to be modifiable as well.
Do you need to read & write, or just read? Note that we provide APIs for both read & write, but they are meant for streaming, not ongoing mutation.
Hard to tell, since there are quite a few types serialized into that container. It will be mostly read only, but for some state machine states I am not so sure that they will not change anymore after they have been deserialized back. I mainly edit other people's code, where I am seldom 100% sure what exactly it does ;-).