Wednesday 7 September 2011

A sample STL Allocator


Where the hell did my memory go?

No leaks, not enough data to worry about RAM of server still my program sucked. So it all started while looking up for the lost memory. Even with relatively small data and no leaks my program was showing high memory usage. And I suspected something wrong cooking inside STL containers and first thing to check was STL standard allocator.

So, what is a STL Allocator?

A STL Allocator is the class that governs the memory allocation, de-allocation policies for a particular STL container. It is often neglected template parameter to STL container class. For example, in case of map class it is the fourth parameter.

template < class Key, class T, class Compare = less<Key>,
           class Allocator = allocator<pair<const Key,T> > > class map;

As can be seen in above definition of map the Allocator is by default set to the standard allocator that comes with STL implementation.

What does the standard Allocator do?

A STL implementation may provide a complex Allocator working on various strategies but the standard Allocator (as provided on HP Itanium systems, RougeWave) does a simple job. Allocator contains 4 important functions.

1.      Allocate
2.      Deallocate
3.      Construct
4.      Destroy

Function Allocate allocates the new memory for the container. Interestingly, this does not invoke the constructor of the class but only allocates memory for it. So this function in standard allocator is a simple wrapper over malloc.

Function Deallocate does the job of freeing the allocated memory, note that it does not call destructor of the object. In standard implementation, it is a thin wrapper over free.

The job of calling constructor and destructor is done by Construct and Destroy functions respectively.

What is the need of allocating and constructing separately?

Allocating a memory requires the calling process to call brk/sbrk (on Linux for larger blocks, this can be mmap too) system call which does a lot of checking and processing and finally giving the allocated memory. If we allocate memory for an item in container on the arrival of insert request only then a lot of time would be spent allocating memory.

The better way to do it was to allocate the memory beforehand and then creating the objects on arrival of insert request. The creation part is done using “placement new” operator. The same logic applies to Deallocate and Destroy functions.

This approach obviously leads to wastage of memory and that is the reason why application should not blindly use standard allocator. The application developer must recognize the memory usage pattern of the application and then devise strategies which lead to optimal results. But this is not in the scope of this article.

How do I write my own Allocator?

Writing a custom allocator may spawn over thousand lines of code but writing one for debugging is pretty easy. An Allocator must comply with the C++ standard definition for allocators. I will explain the definitions with the help of a sample allocator (similar to the standard one).

A sample STL Allocator code:

#include<memory>
#include<cstdlib>
#include<iostream>

namespace mystd {
 
template<typename T>
class MyAllocator{
public :
    //    typedefs

    typedef T value_type;
    typedef value_type* pointer;
    typedef const value_type* const_pointer;
    typedef value_type& reference;
    typedef const value_type& const_reference;
    typedef std::size_t size_type;
    typedef std::ptrdiff_t difference_type;

public :
    //    convert an allocator<T> to allocator<U>

    template<typename U>
    struct rebind {
        typedef MyAllocator<U> other;
    };

public :
    inline explicit MyAllocator() {}
    inline ~MyAllocator() {}
    inline explicit MyAllocator(MyAllocator const&) {}
    template<typename U>
    inline explicit MyAllocator(MyAllocator<U> const&) {}

    //    address

    inline pointer address(reference r) { return &r; }
    inline const_pointer address(const_reference r) { return &r; }

    //    memory allocation

    inline pointer allocate(size_type cnt,
       typename std::allocator<void>::const_pointer = 0) {
      std::cout<<"Trying to allocate "<<cnt<<" objects in memory"<<std::endl;
      pointer new_memory = reinterpret_cast<pointer>(::operator new(cnt * sizeof (T)));
      std::cout<<"Allocated "<<cnt<<" objects in memory at location:"<<new_memory<<std::endl;
      return new_memory;
    }
    inline void deallocate(pointer p, size_type n) {
        ::operator delete(p);
        std::cout<<"Deleted "<<n<<" objects from memory"<<std::endl;
    }
    //    size
    inline size_type max_size() const {
        return std::numeric_limits<size_type>::max() / sizeof(T);
 }

    //    construction/destruction

    inline void construct(pointer p, const T& t) {
      std::cout<<"Constructing at memory location:" <<p<<std::endl;
      new(p) T(t);
    }
    inline void destroy(pointer p) {
      std::cout<<"Destroying object at memory location:" <<p<<std::endl;
      p->~T();
    }

    inline bool operator==(MyAllocator const&) { return true; }
    inline bool operator!=(MyAllocator const& a) { return !operator==(a); }
};    //    end of class MyAllocator
} // end of namespace mystd


Let’s call file containing the above code as MyAllocator.h.

As said above the custom allocator must adhere to standard definitions.
The standards require the allocator to define types of pointer to T (pointer), pointer to constant T (const_pointer), reference to T (reference), reference to constant T (const_reference), type of T itself (value_type), an unsigned integral type that can represent the size of the largest object in the allocation model (size_type), as well as a signed integral type that can represent the difference between any two pointers in the allocation model (difference_type).

The standard also requires allocator to define a structure named rebind. This is required to change the template class passed to Allocator. Let’s say the STL container is a map of integers to integers. Now when we need to add a new entry to this map we call allocate function of Allocator. But calling allocate of Allocator<pair<int, int> > would get the space required to store a pair of integers whereas container map stores the data as Node. Node will contain pointers to right and left children, color of the node etc information whose memory requirement is not same as pair of integers. Then the rebind will do the magic as allocate will be called of Allocator<pair <int, int> >::rebind< Node<pair<int, int> > > ::other.

The other functions required in the definition are:

1.      Address (): This function returns the pointer to value_type when a reference is passed. Passing constant reference should result in constant pointer.
2.      Allocate () :  explained earlier. Please note the use of ‘operator new’ in code.
3.      Deallocate(): explained earlier. Please note the use of ‘operator delete’ in code.
4.      Construct(): Creates object of the value type in the allocated memory using placement new.
5.      Destroy(): Destroys the object. All objects should be destroyed before deallocation of the memory storing them.

Remaining are the constructor and destructor of Allocator class.

How do I use it?

You can use custom allocator by passing custom allocator class in template of STL container. Below code uses the allocator defined above.


#include "MyAllocator.h"
#include <iostream>
#include <map>
#include <string>
#include <cstdlib>

int main(){
        std::map<std::string, std::string, std::less<std::string>, mystd::MyAllocator<std::pair<const std::string, std::string> > > x;
        long i = 0;
        const std::string key_part = "abc";
        std::string value = "abc";
        for(; i < 10; i++){
                std::string key = key_part + ltoa(i);
                x.insert(std::make_pair(key,value));
        }
}

Using the prints from the custom allocator I could tell where memory went. It was held by allocator in the hope that there will be future inserts to fill in.

Conclusion

When you know beforehand how many records needs to be inserted in STL container use “reserve”. But STL map container does not have a "reserve" method. So if you have to cache something in map, better write your own allocator rather than wondering where the hell did my memory go.





4 comments:

  1. Well written..!! but your doesn't seem to compile. First of all ltoa is not there in C++, although we can use std::to_string in C++11...! Please check back, so that it is helpful to others as well..

    ReplyDelete
  2. Allocating some bulk memory for future use, addresses malloc mutex bootleneck in multithread enviroment. speed v.s. memory size balancing.

    ReplyDelete