A different memory pool in Rust

Written on 2024-06-16

Some time back, while discussing memory pools with a colleague of mine who writes Rust full time, I was surprised to find that his approach to them was fairly different from mine. His natural approach was that the user would create objects and move them into the pool, rather than allocate objects from the pool itself. This design was intriguing to me, as it is not suitable for most of the use cases for which I would typically reach for a memory pool, which I will discuss later. It started to make a lot more sense after I looked for Rust memory pool crates using my preferred design to show him what I had in mind: they essentially didn’t exist[1]! Since he had never needed a memory pool himself, his expectations had been shaped by the APIs he had seen when looking into those crates out of curiosity. After that, I started to wonder why Rust memory pools all look like that: is my preferred design too unergonomic in Rust, or does it perhaps involve more unsafe than Rust programmers are comfortable with? To find out, I set out to write one for myself and see what would happen.
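To make the contrast concrete, here is a rough sketch of the two API shapes. The trait and method names are hypothetical, not taken from any existing crate:

trait MoveIntoPool<T> {
    // The design common in existing crates: the caller constructs objects
    // and moves them into the pool, which recycles their allocations and
    // hands them back out on request.
    fn put(&mut self, object: T);
    fn get(&mut self) -> Option<T>;
}

trait AllocateFromPool<T> {
    // The design I had in mind: the pool owns pre-constructed objects and
    // lends them out for the duration of their use.
    fn allocate(&mut self) -> Option<&mut T>;
    fn deallocate(&mut self, object: &mut T);
}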

Memory pool designs

First off, I should clarify that what I mean by memory pool is a data structure that allows for low-latency and low-jitter object allocation and deallocation. It is not a general-purpose memory allocator, and it supports independent per-object lifetimes, as long as those are shorter than the pool’s.

Here is what I want out of a memory pool:

- low-latency, low-jitter allocation and deallocation of objects;
- independent per-object lifetimes, shorter than the pool’s;
- deep reuse of objects, so that the heap allocations they own (e.g. a String’s buffer) are carried over from one use to the next.

Implementing my memory pool

The plan

The properties I wanted from my design were, in order:

- soundness: no undefined behavior reachable without violating a clearly documented unsafe contract;
- predictable, low latency for both allocation and deallocation;
- no heap allocation after the pool has been created;
- simplicity, so that the implementation can be adapted to new needs rather than grown into a framework.

I decided to go with two fixed-size arrays: one holding the objects, the other used as a stack holding the indices of the free objects. To allocate an object, one pops an index and returns a reference to the associated object. To free an object, one pushes the index back onto the stack and resets the object.

Naïve implementation

Here is a naïve implementation, sans the reset:

// We require `T` to be `Default` for convenience, but we could also ask the
// user to provide initial values instead.
struct Pool<T: Default> {
    // `Vec` is more powerful than what we need and slightly wasteful, but works
    // for our purpose.
    data: Vec<T>,
    free: Vec<usize>,
}

impl<T: Default> Pool<T> {
    pub fn new() -> Self {
        Self {
            data: (0..1024).map(|_| T::default()).collect(),
            free: (0..1024).collect(),
        }
    }

    pub fn allocate<'a>(&mut self) -> Option<&'a mut T> {
        self.free
            .pop()
            .map(|i| unsafe { self.data.as_mut_ptr().add(i).as_mut().unwrap() })
    }

    pub unsafe fn deallocate(&mut self, reference: &T) {
        let ptr = std::ptr::from_ref(reference);
        let element_index = unsafe { ptr.offset_from(self.data.as_ptr()) };
        self.free.push(element_index as usize);
    }
}

Note that deallocate is unsafe because it accepts any reference, even one that was not allocated from the pool. We could make it safe by adding a runtime check of the reference’s provenance.
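Such a check could look like the following sketch, a hypothetical index_of helper on Pool (not part of the final code): it recovers the element index from the reference’s address, rejecting addresses outside the data array or misaligned within it.

    // Hypothetical helper: returns the element's index if (and only if) the
    // reference actually points at an element of this pool's `data` array.
    fn index_of(&self, reference: &T) -> Option<usize> {
        let base = self.data.as_ptr() as usize;
        let addr = std::ptr::from_ref(reference) as usize;
        let size = std::mem::size_of::<T>();
        // Reject zero-sized types and addresses before the array's start.
        if size == 0 || addr < base {
            return None;
        }
        let offset = addr - base;
        // Reject addresses that do not fall on an element boundary.
        if offset % size != 0 {
            return None;
        }
        let index = offset / size;
        // Reject addresses past the array's end.
        (index < self.data.len()).then_some(index)
    }

A safe deallocate would then return an error (or panic) when index_of yields None, instead of pushing the burden onto the caller.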

OwnedRef

The code above is nice and short (if incomplete), except that it is also… unsound. Unsurprisingly, the issue has to do with our usage of unsafe. Perhaps more surprising is the fact that the bug does not really live within the unsafe block in allocate, but rather in deallocate, whose body does nothing obviously dangerous with the reference. What happens is that we hand out manually crafted references in allocate, but we do not take them back in deallocate; we merely use them. Rust’s ownership system rightfully lets many functions take the same reference, because references are not linear types. Note that this would not be different if deallocate took a mutable/exclusive reference: multiple exclusive borrows are fine as long as they don’t overlap in time.

More concretely, what is going on is that after we deallocate an object, the caller still holds a reference to it, which it may keep using. This is fine until the object gets reused and handed out again as another mutable reference, which violates Rust’s aliasing rules and introduces undefined behavior:

fn main() {
    let mut pool = Pool::<usize>::new();

    let x = pool.allocate().unwrap();
    *x = 1234;
    unsafe { pool.deallocate(x) };

    let y = pool.allocate().unwrap();
    // If the values are identical, we can reasonably assume that both `x` and `y` point to the
    // same object.
    assert_eq!(*y, 1234);

    // Use after free!
    *x = 2000;
    // This reference was exclusive!
    assert_eq!(*y, 2000);
}

So what do we do? We need to somehow add produce/consume semantics to the references we hand out, to make sure that the caller cannot use an object after freeing it. Since this is precisely what ownership semantics for structs give us in Rust, we can simply wrap our reference in a struct and let the compiler do the rest for us:

pub struct OwnedRef<'a, T> {
    reference: &'a mut T,
}

We just have to wrap the references we hand out…

    pub fn allocate<'a>(&mut self) -> Option<OwnedRef<'a, T>> {
        self.free.pop().map(|i| OwnedRef {
            reference: unsafe { self.data.as_mut_ptr().add(i).as_mut().unwrap() },
        })
    }

… and unwrap them when getting them back:

    pub unsafe fn deallocate(&mut self, owned_ref: OwnedRef<'_, T>) {
        let reference = owned_ref.reference;
        let ptr = std::ptr::from_ref(reference);
        let element_index = unsafe { ptr.offset_from(self.data.as_ptr()) };
        self.free.push(element_index as usize);
    }

Here is what the caller will look like:

fn main() {
    let mut pool = Pool::<usize>::new();

    let x = pool.allocate().unwrap();
    // We can access the underlying reference through the `OwnedRef`.
    *x.reference = 1234;
    unsafe { pool.deallocate(x) };

    // Does not compile anymore!
    *x.reference = 2000;
}

Note that taking a reference to that OwnedRef, whether shared or exclusive, adds overhead when accessing the underlying object, since there are two pointers to dereference to get to it. This is not a concern to me: the OwnedRef can usually be moved around and used directly rather than borrowed, and in the situations where I reach for a pool, the work done on the pooled objects dwarfs the cost of an extra pointer dereference.

If the extra reference is required but its cost is unacceptable for your use case, I’m not sure how best to handle it. If your requirements are simple and static enough, using the typestate pattern to convert an OwnedRef into multiple SharedRefs by consuming it might work for you, but I have not explored that and I can imagine it not being very simple or ergonomic. At this point, it may be fair to introduce more unsafe, though that would not be ideal.
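For what it is worth, here is a rough sketch of that typestate idea; SharedRef and into_shared are hypothetical names, and this conveniently ignores the hard part, namely collecting the shared references again in order to deallocate:

pub struct SharedRef<'a, T> {
    reference: &'a T,
}

impl<'a, T> OwnedRef<'a, T> {
    // Consume the exclusive reference and trade it for two shared ones. Each
    // `SharedRef` holds the inner reference directly, so accessing the
    // object costs a single dereference.
    pub fn into_shared(self) -> (SharedRef<'a, T>, SharedRef<'a, T>) {
        let reference: &'a T = self.reference;
        (SharedRef { reference }, SharedRef { reference })
    }
}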

Reset

Now that we have a sound memory pool, let’s make it support the feature which arguably started all this: deep reuse of objects. To do so, we can simply require the type that we pool to implement a trait that lets us reset objects, and call that when deallocating:

trait Reset {
    fn reset(&mut self);
}

After adding the Reset trait bound to the Pool struct, we call reset in deallocate:

    pub unsafe fn deallocate(&mut self, owned_ref: OwnedRef<'_, T>) {
        let reference = owned_ref.reference;
        reference.reset();
        let ptr = std::ptr::from_ref(reference);
        let element_index = unsafe { ptr.offset_from(self.data.as_ptr()) };
        self.free.push(element_index as usize);
    }

We favor running this logic during deallocation since it is typically less latency-sensitive than allocation. We also assume that calling reset is cheap; if it were not, we could buffer the work and defer it instead, adding a new API to be called during idle times, as sketched below.
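Here is a sketch of that deferred variant, under the assumption that Pool gains an extra dirty: Vec<usize> stack (all names here are hypothetical):

    // Like `deallocate`, but parks the object instead of resetting it.
    pub unsafe fn deallocate_deferred(&mut self, owned_ref: OwnedRef<'_, T>) {
        let reference = owned_ref.reference;
        let ptr = std::ptr::from_ref(reference);
        let element_index = unsafe { ptr.offset_from(self.data.as_ptr()) };
        self.dirty.push(element_index as usize);
    }

    // To be called during idle times: resets parked objects and makes them
    // available for allocation again.
    pub fn process_deferred_resets(&mut self) {
        while let Some(i) = self.dirty.pop() {
            self.data[i].reset();
            self.free.push(i);
        }
    }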

Since we own this trait, we can implement it for any type we want, including standard ones:

impl Reset for String {
    fn reset(&mut self) {
        self.clear();
    }
}

fn main() {
    let mut pool = Pool::<String>::new();

    let s = pool.allocate().unwrap();
    *s.reference = "Hello, World!".to_owned();
    unsafe { pool.deallocate(s) };

    let s = pool.allocate().unwrap();
    assert_eq!(s.reference.len(), 0);
    assert_eq!(s.reference.capacity(), 13);
}

Depending on your needs, an implementation of reset might be a no-op, a trivial reset (e.g. setting integers to 0), a clearing of a container like String or Vec, or something fancier. In particular, dynamically nested structures that require expensive construction beyond the top level (e.g. Vec<String>) are typically not trivial to handle.
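As an illustration of that last point, here is one plausible, and imperfect, Reset for Vec<String>. Calling clear on the outer Vec would drop every inner String along with its heap buffer, so this version clears each element in place instead, at the cost of the outer Vec keeping its length:

impl Reset for Vec<String> {
    fn reset(&mut self) {
        // Clearing each string in place keeps its heap buffer alive; the
        // outer `Vec` still reports its old length afterwards.
        for s in self.iter_mut() {
            s.clear();
        }
    }
}

Whether that trade-off is acceptable depends entirely on how the pooled objects are used.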

Final code listing

Here is what the toy pool looks like after all the modifications that this article went over:

trait Reset {
    fn reset(&mut self);
}

// We require `T` to be `Default` for convenience, but we could also ask the
// user to provide initial values instead.
struct Pool<T: Default + Reset> {
    // `Vec` is more powerful than what we need and slightly wasteful, but works
    // for our purpose.
    data: Vec<T>,
    free: Vec<usize>,
}

pub struct OwnedRef<'a, T> {
    reference: &'a mut T,
}

impl<T: Default + Reset> Pool<T> {
    pub fn new() -> Self {
        Self {
            data: (0..1024).map(|_| T::default()).collect(),
            free: (0..1024).collect(),
        }
    }

    pub fn allocate<'a>(&mut self) -> Option<OwnedRef<'a, T>> {
        self.free.pop().map(|i| OwnedRef {
            reference: unsafe { self.data.as_mut_ptr().add(i).as_mut().unwrap() },
        })
    }

    pub unsafe fn deallocate(&mut self, owned_ref: OwnedRef<'_, T>) {
        let reference = owned_ref.reference;
        reference.reset();
        let ptr = std::ptr::from_ref(reference);
        let element_index = unsafe { ptr.offset_from(self.data.as_ptr()) };
        self.free.push(element_index as usize);
    }
}

impl Reset for String {
    fn reset(&mut self) {
        self.clear();
    }
}

Note that this is simple enough that it is, in my opinion, not worth having a dedicated crate for. It is generic, so we don’t need to copy-paste it for each type that we want to pool, yet simple enough that we can adapt it to our needs by making a new type, rather than adding complexity to a single implementation.

Going further

The pool implemented for this article is a toy. Here are various improvements one could make to it, some of which I consider almost necessary:

- make the pool’s size configurable rather than hardcoded to 1024 elements;
- tie the lifetime returned by allocate to the borrow of the pool, so that an OwnedRef cannot outlive the pool’s storage;
- implement Deref and DerefMut for OwnedRef, so that callers can write *s instead of *s.reference (see the sketch below);
- implement Drop for OwnedRef, so that objects flow back to the pool automatically instead of through an unsafe deallocate call;
- make deallocate safe by checking the reference’s provenance at runtime, as discussed earlier.
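For instance, the Deref and DerefMut implementations are short enough to sketch here:

use std::ops::{Deref, DerefMut};

impl<T> Deref for OwnedRef<'_, T> {
    type Target = T;

    // Let `*owned_ref` reach the pooled object directly.
    fn deref(&self) -> &T {
        self.reference
    }
}

impl<T> DerefMut for OwnedRef<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        self.reference
    }
}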


[1] I have since had a closer look, and some crates actually come closer than I initially thought, but none are exactly what I have in mind, and I wanted to write this article regardless, so here we go.