Described in this paper are two related memory allocators, each with novel properties: PALLOC1 and PALLOC2.
PALLOC1 contributes a unique strategy based on the traversal of a parallel tree data structure for allowing concurrent allocations and frees to proceed within a single thread's heap.
It also provides a novel, provable guarantee limiting the allocator's requests for more memory from the operating system to only those situations where no contiguous block is available to satisfy the allocation request, and a pure bitmap allocation strategy speed-competitive even for sequential codes with the boundary-tag / binning strategy used in dl- malloc.
We find that for larger allocation patterns, our implementation exhibits competitive base performance relative to other parallel allocators, superior scaling, and better resistance in practice to fragmentation.PALLOC2 contributes a second unique strategy for memory allocation based on bitmap allocation into variable-sized superpages.
Our system provides the runtime with the useful ability to, given an arbitrary heap address, find both the start of the heap allocation and the size of the object allocated.
Thus, PALLOC2 provide the capabilities of baggy bounds checking with no performance impact. In fact, we _find that for both sequential and parallel programs, PALLOC2's performance is superior to PALLOC1 and to other state-of-the-art allocators including Hoard, DLMalloc, and Streamflow for allocations of all sizes.
(To read this external content in full, download the paper from the author archives at University of Illinois at Urbana-Champaign. )