1## Custom memory allocator 2 3LibSass comes with a custom memory allocator to improve performance. 4First included in LibSass 3.6 and currently disabled by default. 5Needs to be enabled by defining `SASS_CUSTOM_ALLOCATOR`. 6 7### Overview 8 9The allocator is a pool/arena allocator with a free-list on top. The 10memory usage pattern of LibSass fits this implementation very well. 11Every compilation tends to accumulate memory and only releasing some 12items from time to time, but the overall memory consumption will mostly 13grow until the compilation is finished. This helps us to keep the 14implementation as simple as possible, since we don't need to release 15much memory back to the system and can re-use it instead. 16 17### Arenas 18 19Each arena is allocated in a (compile time) fixed size. Every allocation 20request is served from the current arena. We basically slice up the 21arena into different sized chunks. Arenas are never returned to the 22system until the whole compilation is finished. 23 24### Slices 25 26A memory slice is a part of an arena. Once the system requests a sized 27memory chunk we check the current arena if there is enough space to 28hold it. If not a new arena is allocated. Then we return a pointer 29into that arena and mark the space as being used. Each slice also 30has a header which is invisible to the requester as it lies before 31the pointer address that we returned. This is used for book-keeping. 32 33### Free-lists (or buckets) 34 35Once a memory slice is returned to the allocator it will not be released. 36It will instead be put on the free list. We keep a fixed number of free lists, 37one for every possible chunk size. Since chunk sizes are memory aligned, we 38can get the free-list index (aka `bucket`) very quickly (`size/alignment`). 39For further readings see https://en.wikipedia.org/wiki/Free_list. 40 41### Chunk-sizes 42 43Since arenas are of fixed size we need to make sure that only small 44enough chunks get served from it. This also helps to keep implementation 45simple, as we can statically declare some structures for book-keeping. 46Allocations that are too big to be tracked on a free-list will be patched 47directly to malloc and free. This is the case when the bucket index would 48be bigger than `SassAllocatorBuckets`. 49 50### Thread-safety 51 52This implementation is not thread-safe by design. Making it thread-safe 53would certainly be possible, but it would come at a (performance) price. 54Also it is not needed given the memory usage pattern of LibSass. Instead 55we should make sure that memory pools are local to each thread. 56 57### Implementation obstacles 58 59Since memory allocation is a core part of C++ itself, we get into various 60difficult territories. This has specially proven true in regard of static 61variable initialization and destruction order. E.g when we have a static 62string with custom allocator. It might be that it is initialized before 63the thread local memory pool. On the other hand it's also possible that 64the memory pool is destroyed before another static string wants to give 65back its memory to the pool. I tried hard to work around those issues. 66Mainly by only using thead local POD (plain old data) objects. 67 68See https://isocpp.org/wiki/faq/ctors#static-init-order 69 70### Performance gains 71 72My tests indicate that the custom allocator brings around 15% performance 73enhancement for complex cases (I used the bolt-bench to measure it). Once 74more get optimized, the custom allocator can bring up to 30% improvement. 75This comes at a small cost of a few percent of overall memory usage. This 76can be tweaked, but the sweet spot for now seems to be: 77 78```c 79// How many buckets should we have for the free-list 80// Determines when allocations go directly to malloc/free 81// For maximum size of managed items multiply by alignment 82#define SassAllocatorBuckets 512 83 84// The size of the memory pool arenas in bytes. 85#define SassAllocatorArenaSize (1024 * 256) 86``` 87 88These can be found in `settings.hpp`. 89 90### Memory overhead 91 92Both settings `SassAllocatorBuckets` and `SassAllocatorArenaSize` need 93to be set in relation to each other. Assuming the memory alignment on 94the platform is 8 bytes, the maximum chunk size that can be handled 95is 4KB (512*8B). If the arena size is too close to this value, you 96may leave a lot of RAM unused. Once an arena can't fullfil the current 97request, it is put away and a new one is allocated. We don't keep track 98of unused space in previous arenas, as it bloats the code and costs 99precious cpu time. By setting the values carefully we can avoid the cost 100and still provide reasonable memory overhead. In the worst scenario we 101loose around 1.5% for the default settings (4K of 256K). 102 103### Further improvements 104 105It is worth to check if we can re-use the space of old arenas somehow without 106scarifying to much performance. Additionally we could check free-lists of 107bigger chunks sizes to satisfy an allocation request. But both would need 108to be checked for performance impact and their actual gain.