Ayende @ Rahien

Sep 09 2015

Buffer allocation strategies: Bad usage patterns

Tags:

design

In my previous post, I discussed a potential implementation for a buffer pool, and commented that it would cause issues if we had particular usage patterns.

As a reminder, here is the code:

    [ThreadStatic] private static Stack<byte[]>[] _buffersBySize;

    private static byte[] GetBuffer(int requestedSize)
    {
        if(_buffersBySize == null)
            _buffersBySize = new Stack<byte[]>[32];

        var actualSize = PowerOfTwo(requestedSize);
        var pos = MostSignificantBit(actualSize);

        if(_buffersBySize[pos] == null)
            _buffersBySize[pos] = new Stack<byte[]>();

        if(_buffersBySize[pos].Count == 0)
            return new byte[actualSize];

        return _buffersBySize[pos].Pop();
    }

    private static void ReturnBuffer(byte[] buffer)
    {
        var actualSize = PowerOfTwo(buffer.Length);
        if(actualSize != buffer.Length)
            return; // can't put a buffer of strange size here (probably an error)

        if(_buffersBySize == null)
            _buffersBySize = new Stack<byte[]>[32];

        var pos = MostSignificantBit(actualSize);

        if(_buffersBySize[pos] == null)
            _buffersBySize[pos] = new Stack<byte[]>();


        _buffersBySize[pos].Push(buffer);
    }

Now, consider a user who uses this buffer pool, but for some reason decided to move most of the disposal code to a dedicated thread. This can happen when using large files, where disposing of the file stream can take a very long time (flushing OS buffers, etc.). I have seen several places where people had a dedicated disposal thread.

What would happen in such a scenario?

Well, we would allocate buffers in threads #1 – #10, and only return them to the buffer pool on thread #12. Because the pool is marked [ThreadStatic], each thread has its own set of stacks. That would mean that we would keep allocating new buffers, but all the buffers that were returned to the pool would actually go and sit in the stacks of the thread that is just releasing them, where no one will ever ask for them. Welcome, memory leak :-)
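To make the scenario concrete, here is a minimal sketch of it. It is not the pool from the post: `LeakDemo`, `Run` and `PooledOnThisThread` are hypothetical names, and the pool is cut down to a single 4KB size so the thread affinity is easy to see. One thread takes buffers and a separate thread returns them.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class LeakDemo
{
    // Simplified stand-in for the pool: one stack per thread, one buffer size.
    [ThreadStatic] private static Stack<byte[]> _pool;

    private static byte[] GetBuffer()
    {
        if (_pool == null)
            _pool = new Stack<byte[]>();
        return _pool.Count == 0 ? new byte[4096] : _pool.Pop();
    }

    private static void ReturnBuffer(byte[] buffer)
    {
        if (_pool == null)
            _pool = new Stack<byte[]>();
        _pool.Push(buffer);
    }

    private static int PooledOnThisThread()
    {
        return _pool == null ? 0 : _pool.Count;
    }

    // Takes buffers on the calling thread and returns them on a dedicated thread.
    // Reports how many buffers each thread's pool holds afterwards.
    public static (int allocatorCount, int disposerCount) Run(int iterations)
    {
        var handedOut = new List<byte[]>();
        for (var i = 0; i < iterations; i++)
            handedOut.Add(GetBuffer()); // this thread's pool is empty, so every call allocates

        var disposerCount = 0;
        var disposer = new Thread(() =>
        {
            foreach (var buffer in handedOut)
                ReturnBuffer(buffer); // lands in the disposal thread's own stack
            disposerCount = PooledOnThisThread();
        });
        disposer.Start();
        disposer.Join();

        return (PooledOnThisThread(), disposerCount);
    }
}
```

Calling `LeakDemo.Run(10)` should report zero pooled buffers on the allocating thread and ten on the disposal thread. The buffers are technically "in the pool", but in a part of it that the allocating threads never look at.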

Buffer allocation strategies: Explaining the solution

Tags:

design

In my previous post, I threw a bunch of code at you, with no explanation, and asked you to discuss it.

Here is the code, with full discussion below.

    [ThreadStatic] private static Stack<byte[]>[] _buffersBySize;

    private static byte[] GetBuffer(int requestedSize)
    {
        if(_buffersBySize == null)
            _buffersBySize = new Stack<byte[]>[32];

        var actualSize = PowerOfTwo(requestedSize);
        var pos = MostSignificantBit(actualSize);

        if(_buffersBySize[pos] == null)
            _buffersBySize[pos] = new Stack<byte[]>();

        if(_buffersBySize[pos].Count == 0)
            return new byte[actualSize];

        return _buffersBySize[pos].Pop();
    }

    private static void ReturnBuffer(byte[] buffer)
    {
        var actualSize = PowerOfTwo(buffer.Length);
        if(actualSize != buffer.Length)
            return; // can't put a buffer of strange size here (probably an error)

        if(_buffersBySize == null)
            _buffersBySize = new Stack<byte[]>[32];

        var pos = MostSignificantBit(actualSize);

        if(_buffersBySize[pos] == null)
            _buffersBySize[pos] = new Stack<byte[]>();


        _buffersBySize[pos].Push(buffer);
    }

There are a couple of interesting things going on here. First, we round each allocation up to a power of two, which reduces the number of different sizes we have to deal with. We store the buffers in a small array of stacks, one stack per size, using the most significant bit of the rounded size as the index into the array.
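The post does not show `PowerOfTwo` and `MostSignificantBit`, so the following is only one plausible way to write them, assuming that `PowerOfTwo` rounds up and that `MostSignificantBit` returns a zero-based bit index. The `BufferSizing` class name and the argument checks are my additions.

```csharp
using System;

public static class BufferSizing
{
    // Rounds a positive size up to the nearest power of two (5000 becomes 8192).
    // Sizes above 2^30 are rejected because the next power of two overflows an int.
    public static int PowerOfTwo(int v)
    {
        if (v < 1 || v > (1 << 30))
            throw new ArgumentOutOfRangeException(nameof(v));

        v--;            // so that exact powers of two map to themselves
        v |= v >> 1;    // smear the highest set bit into every lower bit
        v |= v >> 2;
        v |= v >> 4;
        v |= v >> 8;
        v |= v >> 16;
        return v + 1;
    }

    // Returns the zero-based index of the highest set bit (8192 gives 13).
    public static int MostSignificantBit(int v)
    {
        if (v < 1)
            throw new ArgumentOutOfRangeException(nameof(v));

        var pos = 0;
        while ((v >>= 1) != 0)
            pos++;
        return pos;
    }
}
```

With these definitions a request for 5,000 bytes is served from the 8KB stack at index 13, and the largest index that can come back is 30, which fits in the 32-slot array.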

In practice, most of the time we’ll use a very small number of sizes, typically 4KB – 32KB. The basic idea is that you ask the pool for an array, and if there is a suitable one already waiting, we save an allocation. If not, we allocate a new one and return it to the user.

Once we give the user a buffer, we don’t keep track of it. If they return it to us, great. If not, the GC will clean it up. This is important, because otherwise forgetting to call ReturnBuffer creates what is effectively a memory leak.
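As an illustration of how calling code might look, here is a sketch of a stream copy that borrows a buffer and hands it back in a finally block. `PooledCopy`, `Copy` and `PooledCount` are hypothetical names, and the pool inside is a single-size stand-in rather than the full code above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class PooledCopy
{
    // Single-size stand-in for the pool, just enough to show the calling pattern.
    [ThreadStatic] private static Stack<byte[]> _pool;

    public static int PooledCount
    {
        get { return _pool == null ? 0 : _pool.Count; }
    }

    private static byte[] GetBuffer()
    {
        if (_pool == null)
            _pool = new Stack<byte[]>();
        return _pool.Count == 0 ? new byte[4096] : _pool.Pop();
    }

    private static void ReturnBuffer(byte[] buffer)
    {
        if (_pool == null)
            _pool = new Stack<byte[]>();
        _pool.Push(buffer);
    }

    // Copies source to destination through a pooled buffer and returns the byte count.
    public static long Copy(Stream source, Stream destination)
    {
        var buffer = GetBuffer();
        try
        {
            long total = 0;
            int read;
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                destination.Write(buffer, 0, read);
                total += read;
            }
            return total;
        }
        finally
        {
            // Hand the buffer back even if the copy throws. If a caller skipped
            // this step, the only cost would be that the GC collects the array.
            ReturnBuffer(buffer);
        }
    }
}
```

The finally block is a courtesy to the pool, not a correctness requirement: since the pool keeps no record of outstanding buffers, a missed return costs one future allocation and nothing more.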

Another thing to notice is that we don’t require the same thread to get and return a buffer. It is fine to get it on one thread and return it on another, which means that async code, with its thread hopping, will work well with this buffer pool. We also use a stack, so the most recently returned buffer is handed out first, to try to keep the busy buffer close to the actual CPU cache.

Note that this is notepad code, so there are probably issues with it.

In fact, there is a big issue here that will only show up in particular usage patterns. Can you see it? I’ll talk about it in my next post.

Buffer allocation strategies: A possible solution

Tags:

design

After my recent posts about allocations, I thought that I would present a possible solution for buffer management.

The idea is to have a good way to manage buffers for things like I/O operations, etc. Here is the code:

    [ThreadStatic] private static Stack<byte[]>[] _buffersBySize;

    private static byte[] GetBuffer(int requestedSize)
    {
        if(_buffersBySize == null)
            _buffersBySize = new Stack<byte[]>[32];

        var actualSize = PowerOfTwo(requestedSize);
        var pos = MostSignificantBit(actualSize);

        if(_buffersBySize[pos] == null)
            _buffersBySize[pos] = new Stack<byte[]>();

        if(_buffersBySize[pos].Count == 0)
            return new byte[actualSize];

        return _buffersBySize[pos].Pop();
    }

    private static void ReturnBuffer(byte[] buffer)
    {
        var actualSize = PowerOfTwo(buffer.Length);
        if(actualSize != buffer.Length)
            return; // can't put a buffer of strange size here (probably an error)

        if(_buffersBySize == null)
            _buffersBySize = new Stack<byte[]>[32];

        var pos = MostSignificantBit(actualSize);

        if(_buffersBySize[pos] == null)
            _buffersBySize[pos] = new Stack<byte[]>();


        _buffersBySize[pos].Push(buffer);
    }

I’m going to discuss it in my next post, but for now, can you figure out what it is doing, and what the implications are?

Oren Eini

CEO of RavenDB
