
TIL: Testing Secure Zeroization in Zig with Custom Memory Allocators

or, How Zig’s explicit memory management makes it easy to test your sanitization guarantee

While spending the summer learning Zig I have come to love and appreciate one of the language’s core philosophies: what you see is what you get.

In Zig, there’s no operator overloading and memory allocations are always explicit. Simply reading the code should reveal exactly what your software is instructing the computer to do. If a library allocates memory, the idiomatic approach is to accept a memory allocator as an input, so callers can pick the memory management strategy that matches the end use-case.

There are exceptions,1 and other languages have reasons they allow such abstractions, but I’ve enjoyed working in this style. It aligns well with my current interests and experience. When I return to other, more “sugary”, languages, using Zig has improved my feel for where the “hidden” allocations and functionality are.

An idea begins to form

In particular, this explicitness is useful for testing that your custom memory management strategies actually work. My first “real” Zig project, the Zecrecy library, zeros out sensitive data before freeing its memory to mitigate the impact of Heartbleed-style vulnerabilities.

Because Zig is explicit and its docs are still maturing, you are reading the standard library’s source code by the time you build much more than “hello world”.2 I’d read the code for the std.mem.Allocator interface, along with several of the allocator implementations in the standard library, so I felt confident enough to try it when the idea popped into my head: a custom allocator that panics if asked to free non-zero memory.

Of course, this is not a novel idea. I learned it from the Zig standard library’s testing allocator, a nifty implementation that takes advantage of Zig’s memory management paradigm. In tests, pass std.testing.allocator wherever a call requests a memory allocator, and the test will fail if the allocator detects any memory leaks.3
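As a small illustration of that leak detection, here is a minimal sketch of a test that uses std.testing.allocator. The test name and buffer size are made up for the example; only the allocator itself comes from the standard library.

```zig
const std = @import("std");

test "std.testing.allocator flags memory that is never freed" {
    const allocator = std.testing.allocator;

    const buf = try allocator.alloc(u8, 16);
    // Deleting this defer leaks `buf`, and `zig test` reports the leak.
    defer allocator.free(buf);

    @memset(buf, 0);
    try std.testing.expectEqual(@as(usize, 16), buf.len);
}
```

Running it with zig test should pass as written; commenting out the defer line is a quick way to see the leak report for yourself.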

To improve Zecrecy’s testing quality, I wrote ZerosOnlyAllocator. It’s pretty simple: it wraps a child allocator that handles the actual allocations. This has a couple of benefits:

  1. I don’t have to write my own allocation logic just for a bit of testing, and
  2. If the child allocator is std.testing.allocator, then I get memory-leak detection in the same test run!

The core “functionality” lies in the free() function:

/// Panics if the memory is not zeroed before freeing.
fn free(
    ctx: *anyopaque,
    buf: []u8,
    alignment: mem.Alignment,
    ret_addr: usize,
) void {
    const self: *ZerosOnlyAllocator = @ptrCast(@alignCast(ctx));
    for (buf) |byte| {
        if (byte != 0) @panic("non-zero byte freed");
    }
    self.child_allocator.rawFree(buf, alignment, ret_addr);
}
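For context, here is a rough sketch of how the rest of such a wrapper might look. Only free() above is taken from the real ZerosOnlyAllocator. The struct name ZerosOnlySketch and the forwarding functions are assumptions, written against the std.mem.Allocator.VTable layout (alloc, resize, remap, free) from roughly Zig 0.14 and 0.15.

```zig
const std = @import("std");
const mem = std.mem;

/// Hypothetical stand-in for ZerosOnlyAllocator. Everything except `free`
/// is assumed and simply forwards to the child allocator.
const ZerosOnlySketch = struct {
    child_allocator: mem.Allocator,

    pub fn init(child_allocator: mem.Allocator) ZerosOnlySketch {
        return .{ .child_allocator = child_allocator };
    }

    /// Builds the `std.mem.Allocator` interface that tests hand to the library.
    pub fn allocator(self: *ZerosOnlySketch) mem.Allocator {
        return .{ .ptr = self, .vtable = &vtable };
    }

    const vtable: mem.Allocator.VTable = .{
        .alloc = alloc,
        .resize = resize,
        .remap = remap,
        .free = free,
    };

    fn alloc(ctx: *anyopaque, len: usize, alignment: mem.Alignment, ret_addr: usize) ?[*]u8 {
        const self: *ZerosOnlySketch = @ptrCast(@alignCast(ctx));
        return self.child_allocator.rawAlloc(len, alignment, ret_addr);
    }

    fn resize(ctx: *anyopaque, buf: []u8, alignment: mem.Alignment, new_len: usize, ret_addr: usize) bool {
        const self: *ZerosOnlySketch = @ptrCast(@alignCast(ctx));
        return self.child_allocator.rawResize(buf, alignment, new_len, ret_addr);
    }

    fn remap(ctx: *anyopaque, buf: []u8, alignment: mem.Alignment, new_len: usize, ret_addr: usize) ?[*]u8 {
        const self: *ZerosOnlySketch = @ptrCast(@alignCast(ctx));
        return self.child_allocator.rawRemap(buf, alignment, new_len, ret_addr);
    }

    /// Panics if the memory is not zeroed before freeing (as in the article).
    fn free(ctx: *anyopaque, buf: []u8, alignment: mem.Alignment, ret_addr: usize) void {
        const self: *ZerosOnlySketch = @ptrCast(@alignCast(ctx));
        for (buf) |byte| {
            if (byte != 0) @panic("non-zero byte freed");
        }
        self.child_allocator.rawFree(buf, alignment, ret_addr);
    }
};
```

Forwarding everything except free keeps the wrapper small, and it means whatever checks the child allocator performs (leak detection, for example) still run underneath.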

I love this kind of thing because it only gains value the more tests I add. Test setup is only three lines:

const ZerosOnlyAllocator = @import("testing/ZerosOnlyAllocator.zig");
var zeros_only_allocator: ZerosOnlyAllocator = .init(std.testing.allocator);
const allocator = zeros_only_allocator.allocator();

Now, every single test automatically verifies our core security guarantee with zero additional effort.

Debugging detour

Except! The first time I ran unit tests using the shiny, new, custom allocator, every single test panicked:

~/zecrecy> zig test src/secret.zig
thread 3381248 panic: non-zero byte freed
~/zecrecy/src/testing/ZerosOnlyAllocator.zig:59:24: 0x104dbc753 in free (test)
        if (byte != 0) @panic("non-zero byte freed");
                       ^
~/.cache/zig/p/N-V-__8AAPWKhxNMNK6YniIioDpRryBzI7DMp0hJB4rExlGU/lib/std/mem/Allocator.zig:147:25: 0x10236238f in free__anon_5332 (test)
    return a.vtable.free(a.ptr, memory, alignment, ret_addr);
                        ^
~/zecrecy/src/secret.zig:120:34: 0x10235e12b in deinit (test)
            secret.allocator.free(secret.data);

Turns out, my assumption that reading the standard library once makes one an expert was, unsurprisingly, wrong. The custom allocator’s logic was correct, but the standard library contained some unexpected functionality that broke my assumptions:

/// From `std.mem.Allocator`:
///
/// Free an array allocated with `alloc`.
/// If memory has length 0, free is a no-op.
/// To free a single item, see `destroy`.
pub fn free(self: Allocator, memory: anytype) void {
    const Slice = @typeInfo(@TypeOf(memory)).pointer;
    const bytes = mem.sliceAsBytes(memory);
    const bytes_len = bytes.len + 
                      if (Slice.sentinel() != null) @sizeOf(Slice.child) else 0;
    if (bytes_len == 0) return;
    const non_const_ptr = @constCast(bytes.ptr);
    @memset(non_const_ptr[0..bytes_len], undefined);
    self.rawFree(non_const_ptr[0..bytes_len], .fromByteUnits(Slice.alignment), @returnAddress());
}

All tests panicked because std.mem.Allocator.free writes undefined into freed memory before calling .rawFree.4 In Debug and ReleaseSafe modes, Zig fills undefined bytes with a debug pattern (0xAA) to catch use-after-free bugs,5 which meant my test allocator saw non-zero bytes.
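To see this behavior without a panic, here is a small sketch of a "spy" allocator that records the first byte it receives in its vtable free. FreeSpy and its field names are hypothetical and not part of Zecrecy; the sketch assumes a Zig 0.15-era std.mem.Allocator and a Debug build, which is the default for zig test.

```zig
const std = @import("std");
const mem = std.mem;

/// Hypothetical test helper: remembers the first byte handed to `free`
/// instead of panicking on it.
const FreeSpy = struct {
    child_allocator: mem.Allocator,
    first_byte_seen: ?u8 = null,

    fn allocator(self: *FreeSpy) mem.Allocator {
        return .{ .ptr = self, .vtable = &.{
            .alloc = alloc,
            // Resizing is irrelevant to the demonstration, so it is declined.
            .resize = mem.Allocator.noResize,
            .remap = mem.Allocator.noRemap,
            .free = free,
        } };
    }

    fn alloc(ctx: *anyopaque, len: usize, alignment: mem.Alignment, ret_addr: usize) ?[*]u8 {
        const self: *FreeSpy = @ptrCast(@alignCast(ctx));
        return self.child_allocator.rawAlloc(len, alignment, ret_addr);
    }

    fn free(ctx: *anyopaque, buf: []u8, alignment: mem.Alignment, ret_addr: usize) void {
        const self: *FreeSpy = @ptrCast(@alignCast(ctx));
        // Allocator.free returns early on empty slices, so buf[0] exists here.
        self.first_byte_seen = buf[0];
        self.child_allocator.rawFree(buf, alignment, ret_addr);
    }
};

test "Allocator.free overwrites the buffer before the vtable sees it" {
    var spy: FreeSpy = .{ .child_allocator = std.testing.allocator };
    const allocator = spy.allocator();

    const buf = try allocator.alloc(u8, 8);
    @memset(buf, 0); // the caller zeroes everything...
    allocator.free(buf); // ...then Allocator.free writes `undefined` over it

    // In safe build modes the fill should be the 0xAA pattern; the sketch
    // only asserts that the byte is no longer zero.
    try std.testing.expect(spy.first_byte_seen.? != 0);
}
```

The assertion is deliberately loose: it checks only what the panic already showed (the byte is non-zero) rather than pinning the exact fill value.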

The solution? Bypass the layer altogether and call .rawFree() directly.6 Now, we can verify zeroization at the layer closest to deallocation.

mem.Allocator.free does what it does for a reason, and I don’t skip it lightly. The name .rawFree is meant to make you nervous. For example, in Debug or ReleaseSafe mode, setting all bytes to undefined means you get some use-after-free protection that we could be losing by calling .rawFree instead.

However, in this case, I felt comfortable with this choice after familiarizing myself with what Allocator.free actually does.

The first three lines of Allocator.free exist because memory can be anytype, so the code verifies that memory is a slice, handles a potential sentinel, etc. The only function in which I call .rawFree is .deinit, which only accepts slices. No @typeInfo magic needed.

What about the @memset? Zecrecy uses the standard library’s crypto.secureZero function to wipe memory, which uses techniques intended to keep the optimizer from removing the wipe. However, those techniques have their own flawed history and corner cases. Test reinforcement is always valuable; never take security for granted.

Outside of actually calling .rawFree, the only change I made is to return early if the data length is 0. This has no security/correctness benefits but offers a small performance bump by avoiding the virtual function call when it’s not needed. Due to the nature of the library (how often will someone be initializing a secret of length 0?), I think the .rawFree call would be fine even if I had skipped this no-op opportunity.
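Putting those pieces together, a deinit along these lines might look roughly like the sketch below. SecretSketch and its init are hypothetical stand-ins rather than Zecrecy's real API; the allocator and data field names follow the stack trace earlier, and the alignment argument assumes the data was allocated as a plain []u8.

```zig
const std = @import("std");

/// Hypothetical stand-in for Zecrecy's secret type.
const SecretSketch = struct {
    allocator: std.mem.Allocator,
    data: []u8,

    fn init(allocator: std.mem.Allocator, len: usize) !SecretSketch {
        return .{ .allocator = allocator, .data = try allocator.alloc(u8, len) };
    }

    fn deinit(secret: *SecretSketch) void {
        // Mirror Allocator.free: nothing to wipe or free for empty data.
        if (secret.data.len == 0) return;

        // Wipe first; secureZero is meant to resist being optimized away.
        std.crypto.secureZero(u8, secret.data);

        // Skip Allocator.free so the zeroed bytes are not overwritten with
        // `undefined` before they reach the allocator's vtable.
        secret.allocator.rawFree(
            secret.data,
            .fromByteUnits(@alignOf(u8)),
            @returnAddress(),
        );
    }
};

test "deinit wipes and releases the buffer" {
    var secret = try SecretSketch.init(std.testing.allocator, 16);
    @memset(secret.data, 0x42); // pretend this is key material
    secret.deinit();
}
```

Swapping std.testing.allocator for the zeros-only wrapper in that test is what turns it from a leak check into a zeroization check.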

And boom! All tests pass:

~/zecrecy> zig test src/secret.zig
All 15 tests passed.

With each one, I can feel a tiny bit better knowing all secrets were actually zeroed before they were freed.

Conclusion

The effort required was low for the value gained, and I’ve barely scratched the surface of using custom memory allocators to improve testing.7 Being able to reduce the “black box” opaqueness that sometimes hangs like a fog over unit tests is a nice tool to have. I’m optimistic that Zig’s upcoming async/IO changes will provide similar opportunities to get surgical during testing.

And, of course, building test harnesses that enforce application invariants is far from unique to Zig. It’s a powerful technique with demonstrated value. Zig’s win, for me, is in making every decision explicit and shoving the reality of an implementation into your face at all times. You naturally think deeper about how to wield the verbosity in your favor.

Have any cool, weird, or otherwise nerdy uses you’ve gotten out of Zig’s memory management explicitness? Do you have a favorite method for testing your system’s security guarantees? Send me an email or a message!

TL;DR:

The Drake meme. On top, Drake rejecting the common refrain “RTFM” (Read The
F***ing Manual). On bottom, Drake approving the text “RTFSTD” (Read The
F***ing Standard Library).


  1. Well 🫵, actually ☝️, any language at all is an abstraction and, therefore 🤓, a lie 😌. ↩︎

  2. I recently heard Mitchell Hashimoto on a podcast say when he’s learning a new language, his first step is to read the standard library. After, effectively, being forced to do so to truly learn Zig, I’m ready to co-sign this approach (massive news, I know). I need to try with at least one more language to be sure - Zig, by design, is relatively simple. Reading the standard library too early could be unnecessarily overwhelming in some other languages. ↩︎

  3. So far, this has only caught defer .deinit() statements missing from the tests themselves. But that helped build the habit of using Zig’s defer keyword and memory management style. It’s because of this, and very nice feedback on Zig’s Discord (thank you @silversquirl), that I built the Zecrecy library around a defer .deinit() management pattern. ↩︎

  4. Not to be confused with std.mem.Allocator.vtable.free. ↩︎

  5. At other optimization levels, the compiler probably removes the memset operation completely. This is why libraries like Zecrecy are important: compilers have free rein to do unsightly things to your code in the name of optimization. Even when you try your hardest and follow best practices, it can be impossible to completely stop the compiler from doing things like this. But that’s the subject of a future rant :). ↩︎

  6. I considered modifying .deinit to check if it’s being called from a test, using either .rawFree or .free based on the result. But, I dislike the idea of creating branches designed to be untestable. I could be swayed, but a little responsibility is fine if it means tests follow the same paths as normal library use. ↩︎

  7. Do you test if your system gracefully and securely handles failing memory allocations? Well, in Zig you easily can with std.testing.FailingAllocator. ↩︎

Eli Grubb
Author
I am a privacy-oriented software engineer with a strong foundation in applied cryptography, reliable data systems, and secure system design.