Tuesday, February 24, 2015

Stack allocated containers

No matter what fancy allocator you come up with, memory allocation will always be expensive. In order to reduce memory allocations I have been using stack-allocated containers for the past four-five years and I think it has worked really well, so I thought I'd write a post about them here. These methods have been used in all our games (Sprinkle, Granny Smith, Smash Hit as well as our new project).

Using containers is really convenient and often necessary. Take the array for example:

MyArray<Object> array;

Everybody's got one. The problem with these are that you need to allocate memory when putting stuff in them. In many cases you know on beforehand approximately how many objects you want to put in there, reducing it to a single allocation:

MyArray<Object> array;
array.reserve(50);

But if this code is in a function that is called frequently it would be much nicer if those first fifty objects were reserved on the stack instead of the heap, like this:

MyArray<Object, 50> array;

Now, the array object would have built-in storage for fifty objects and still be able to grow beyond that using regular heap allocations. Great! Or? What if you want to pass such an array by refence to a method? Say, a string splitting method:

void splitString(const String& del, MyArray<Object, 50>& result);

The obviously problem is that the output array now has to be stack allocated for exactly fifty objects. If we try to pass another array it won't compile, because C++ requires arguments with matching template arguments:

MyArray<Object, 8> result;
String str="this is a test";
str.split(" ", result);

Compiler error!

This effectively kills the entire charm of using template arguments for the array size. So how can we design a container class that can be passed around by reference while still offering stack allocation with a template argument?

What I've done is to create a base class, without stack allocation that can be used as a regular array:

template<class T>
class MyArray
{
    ...
    void* mData;
};

And then a subclass for the stack-allocated version:

template<class T, int size>
class MyArrayStack : public MyArray
{
    MyArrayStack()
    {
        mData = mStackStorage;
    }
    ...
    char mStackStorage[size*sizeof(T)];
};

The problem now becomes resizing the array at the base class, because the base class won't know wether the data storage is heap or stack allocated, and we don't want to have virtual methods and a vtable for such a lightweight class. Therefore, the base class needs to be aware of the subclass and avoid freeing memory when it is stack-allocated.

Fortunately this is not very complicated, since the stack allocation will always be placed immediately after the array object itself in memory, so we can just look at the pointer when resizing. If the storage pointer is right after the object itself in memory we have a stack allocation:

void resize(int newSize)
{
    ... allocate heap memory and copy over contents from mData
    if (mData != ((char*)this)+sizeof(MyArray))
        free(mData);  
}

Now we can modify the string splitting method to accept the base class:

void splitString(const String& delimiter, MyArray& result);

This will enable us to pass any stack allocated size array we want:

MyArrayStack<String, 8> result;
str.splitString(" ", result);

or

MyArrayStack<String, 128> result;
str.splitString(" ", result);

Compiler happy!

So the next big question would be – is this safe? There is still a small (well microscopic) chance that a non-stack allocated array object lines up right at the end of the last stack frame, and at the next memory adress is our heap-allocated storage data for it. In practice though this won't happen, because in standard memory allocators there is a header before the allocated data. Even if you use your own allocator without a header, this won't happen, because the memory area it is using would still need to be allocated by the OS, which will put a header in front of it.

I'm currently using this for arrays, sets, hash tables as well as memory streams, memory buffers and a few other things. It has been incredibly useful and probably one of the most important optimizations I have ever added to my framework. This effectively avoids memory allocations altogether without the hassle of using the typical "maxSize" for method arguments. I'm currently not using it for string objects (which all have a fixed stack allocated storage that depends on the project), but I'm kind of tempted to refactor that.

20 comments:

  1. How about move semantics? Do you deep copy in that case or disallow moving all together?

    You also need to be carfull with alignment for stack array. Do you use the new std::aligned_storage<> for this?

    ReplyDelete
    Replies
    1. Stack memory would be properly aligned if we change mStackStorage to be an array of type T rather than char. No need for STL.

      Delete
  2. Great article. Any chance for more complete example of the code ?

    ReplyDelete
    Replies
    1. Check out LLVM's SmallVector:

      http://llvm.org/docs/doxygen/html/SmallVector_8h_source.html

      Delete
  3. This comment has been removed by the author.

    ReplyDelete
  4. This post is so informative and makes a very nice image on the topic in my mind. It is the first time I visit your blog, but I was extremely impressed. Keep posting as I am gonna come to read it everyday!
    music technology news

    ReplyDelete
  5. Most of the queries are clear after reading this article. Meet and greet Luton airport parking

    ReplyDelete
  6. This information is clear and and resolve my queries. luton valet parking

    ReplyDelete
  7. Do you use unity for game development? I use it too. cheap luton airport parking

    ReplyDelete
  8. I'm in no doubt coming back again to read these articles and blogs.

    Click Here

    ReplyDelete
  9. Muchas gracias por tu aporte para conocer más sobre estos juegos de romper cristales en Android.

    ReplyDelete
  10. Ahh.... the effort you have to make for the smallest things...
    buy online tickets
    buy discount tickets

    ReplyDelete
  11. > ...chance that a non-stack allocated array object lines up right at the end of the last stack frame...

    Actually, I don't believe it's possible, ever. Stack grows backwards, from higher addresses to lower. The pointer in your case precedes the stack-allocated buffer, so the pointer would have to be out of stack space first, which would cause an immediate fault. And your object may not be the first thing allocated on the stack because there's at least the stack frame of the thread bootstrap (your thread function needs to return somewhere, right). So you're safe on that front.

    I'm personally a bit wary of allocating on stack directly though, I like to have a separate stack allocator. It gives you finer control over stack size (no need to change the main thread stack size, but annoying in multithreaded environments without fast TLS)

    ReplyDelete