Static Allocation with Zig

(nickmonad.blog)

67 points | by todsacerdoti 1 hour ago

9 comments

  • Zambyte 1 hour ago
    I'm doing pretty much this exact pattern with NATS right now instead of Redis. Cool to see other people following similar strategies.

    The fact that the Zig ecosystem follows the pattern set by the standard library to pass the Allocator interface around makes it super easy to write idiomatic code, and then decide on your allocation strategies at your call site. People have of course been doing this for decades in other languages, but it's not trivial to leverage existing ecosystems like libc while following this pattern, and your callees usually need to know something about the allocation strategy being used (even if only to avoid standard functions that do not follow that allocation strategy).
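A minimal sketch of the pattern described above (function and variable names are hypothetical): the callee takes only the generic `std.mem.Allocator` interface, and the call site decides the actual strategy.

```zig
const std = @import("std");

// The callee knows nothing about the allocation strategy;
// it only sees the generic Allocator interface.
fn joinWords(allocator: std.mem.Allocator, a: []const u8, b: []const u8) ![]u8 {
    return std.mem.concat(allocator, u8, &.{ a, " ", b });
}

pub fn main() !void {
    // The call site picks the strategy: here, an arena that
    // frees everything at once on scope exit.
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();

    const joined = try joinWords(arena.allocator(), "static", "allocation");
    std.debug.print("{s}\n", .{joined});
}
```

Swapping the arena for a `FixedBufferAllocator` or a testing allocator requires no change to `joinWords` itself.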

    • nickmonad 1 hour ago
      I have a few cases in this (proof of concept) codebase that require knowledge about allocation strategy, even in Zig, but that's on me and the design at this point. Something I wanted to touch on more in the post was the attempt to make the components of the system work with any kind of allocation strategy. I see a common thing in Zig projects today where something like `gpa: std.mem.Allocator` or even `arena: std.mem.Allocator` is used to signal intent, even though the allocator interface is generic.
  • ivanjermakov 1 hour ago
    Related recent post from a TigerBeetle developer: https://matklad.github.io/2025/12/23/static-allocation-compi...
  • d3ckard 1 hour ago
    Personally I believe static allocation has pretty huge consequences for theoretical computer science.

    It’s the only kind of program that can actually be reasoned about. Also, not exactly Turing complete in the classic sense.

    Makes my little finitist heart get warm and fuzzy.

    • mikepurvis 19 minutes ago
      I'm not an academic, but all those ByteArray linked lists have me feeling like this is less "static allocation" and more "I re-implemented a site-specific allocator and all that that implies".

      Also it's giving me flashbacks to LwIP, which was a nightmare to debug when it would exhaust its preallocated buffer structures.

      • d3ckard 3 minutes ago
        Personally, I see dynamic allocation more and more as a premature optimization and a historical wart.

        We used to have very little memory, so we developed many tricks to handle it.

        Now we have all the memory we need, but tricks remained. They are now more harmful than helpful.

        Interestingly, embedded programming has a reputation for stability and AFAIK game development is also more and more about avoiding dynamic allocation.

    • dnautics 57 minutes ago
      i think you mean "exactly not Turing complete"
      • d3ckard 32 minutes ago
        Nice correction :)

        It’s actually quite tricky though. The allocation still happens; it’s just bounded, so you could plausibly argue both ways.

        • chongli 28 minutes ago
          I’m confused. How is a program that uses static allocation not Turing complete?
          • skybrian 15 minutes ago
            A Turing machine has an unlimited tape. You can’t emulate it with a fixed amount of memory.

            It’s mostly a theoretical issue, though, because all real computer systems have limits. It’s just that in languages that assume unlimited memory, the limits aren’t written down. It’s not “part of the language.”

            • dnautics 2 minutes ago
              If we get REALLY nitpicky, zig currently (but not in the future) allows unbounded function recursion, which "theoretically" assumes unlimited stack size, so it's potentially "still technically theoretically turing complete". For now.
  • kristoff_it 11 minutes ago
    A coincidental counterpart to this project is my zero allocation Redis client. If kv supports RESPv3 then it should work without issue :^)

    https://github.com/kristoff-it/zig-okredis

  • leumassuehtam 1 hour ago
    > All memory must be statically allocated at startup. No memory may be dynamically allocated (or freed and reallocated) after initialization. This avoids unpredictable behavior that can significantly affect performance, and avoids use-after-free. As a second-order effect, it is our experience that this also makes for more efficient, simpler designs that are more performant and easier to maintain and reason about, compared to designs that do not consider all possible memory usage patterns upfront as part of the design.

    — TigerStyle
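A hedged sketch of the quoted rule in Zig (sizes and the `Connection` type are illustrative, not from the post): one buffer fixed at startup backs every allocation, so after init nothing can grow, only fail.

```zig
const std = @import("std");

// One fixed region backs all allocations; nothing grows after init.
var backing: [64 * 1024]u8 = undefined;

const Connection = struct { fd: i32 = -1, buf: [512]u8 = undefined };

pub fn main() !void {
    var fba = std.heap.FixedBufferAllocator.init(&backing);
    const allocator = fba.allocator();

    // Reserve everything the program will ever need, up front.
    const connections = try allocator.alloc(Connection, 64);
    _ = connections;

    // From here on, any further alloc() returns error.OutOfMemory once
    // the buffer is exhausted -- no hidden growth, no surprises.
}
```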

    It's baffling that a technique known for 30+ years in the industry has been repackaged into "tiger style" or whatever guru-esque thing this is.

    • mitchellh 39 minutes ago
      Snide and condescending (or at best: dismissive) comments like this help no one and can at the extremes stereotype an entire group in a bad light.

      I think the more constructive reality is discussing why techniques that are common in some industries such as gaming or embedded systems have had difficulty being adopted more broadly, and celebrating that this idea which is good in many contexts is now spreading more broadly! Or, sharing some others that other industries might be missing out on (and again, asking critically why they aren't present).

      Ideas in general require marketing to spread, that's literally what marketing is in the positive (in the negative it's all sorts of slime!). If a coding standard used by a company is the marketing this idea needs to live and grow, then hell yeah, "tiger style" it is! Such is humanity.

      • pjmlp 2 minutes ago
        It was a common practice in 8 and 16 bit home computing.
    • jandrewrogers 46 minutes ago
      Static allocation has been around for a long time but few people consider it even in contexts where it makes a lot of sense. I’ve designed a few database engines that used pure static allocation and developers often chafe at this model because it seems easier to delegate allocation (which really just obscures the complexity).

      Allocation aside, many optimizations require knowing precisely how close to instantaneous resource limits the software actually is, so it is good practice for performance engineering generally.

      Hardly anyone does it (look at most open source implementations) so promoting it can’t hurt.

    • codys 1 hour ago
      It seems it's just part of a doc on style in TigerBeetle, in a similar way to the various "Google Style Guide" documents for code. These rarely contain anything new, but document what a particular project or organization does with respect to code style.
    • addaon 1 hour ago
      Yep. Those of us who write embedded code for a living call this “writing code.”
      • nickmonad 1 hour ago
        Author here! That's totally fair. I did learn this is a common technique in the embedded world and I had a whole section in the original draft about how it's not a super well-known technique in the typical "backend web server" world, but I wanted to keep the length of the post down so I cut that out. I think there's a lot we can learn from embedded code, especially around performance.
    • testdelacc1 1 hour ago
      It is known, but not widely used outside of embedded programming. The fact that they’re using it while writing a database when they didn’t need to makes people sit up and notice. So why did they make this conscious choice?

      It’s tempting to cut people down to size, but I don’t think it’s warranted here. I think TigerBeetle have created something remarkable and their approach to programming is how they’ve created it.

  • dale-cooper 21 minutes ago
    Maybe I'm missing something, but two thoughts:

    1. Doesn't the overcommit feature lessen the benefits of this? Your initial allocation works but you can still run out of memory at runtime.

    2. For a KV store, you'd still be at risk of application level use-after-free bugs since you need to keep track of what of your statically allocated memory is in use or not?

    • nickmonad 15 minutes ago
      Author here! Overcommit is definitely a thing to watch out for. I believe TigerBeetle calls this out in their documentation. I think you'd have to explicitly disable it on Linux.

      For the second question, yes, we have to keep track of what's in use. The keys and values are allocated via a memory pool that uses a free-list to keep track of what's available. When a request to add a key/value pair comes in, we first check if we have space (i.e. available buffers) in both the key pool and value pool. Once those are marked as "reserved", the free-list kind of forgets about them until the buffer is released back into the pool. Hopefully that helps!
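A hypothetical sketch of the free-list pool described above (not the post's actual code; counts and sizes are made up): reserving pops a free index, releasing pushes it back, and an empty free-list means the request is rejected rather than falling back to dynamic allocation.

```zig
const std = @import("std");

// A fixed pool of value buffers plus a free-list of indices.
// No allocation happens after init.
const ValuePool = struct {
    const count = 16;
    const size = 256;

    buffers: [count][size]u8 = undefined,
    free: [count]usize = undefined, // stack of free buffer indices
    top: usize = count,

    fn init() ValuePool {
        var p = ValuePool{};
        for (&p.free, 0..) |*slot, i| slot.* = i;
        return p;
    }

    // Pop a free index; null means the pool is exhausted and the
    // request must be rejected.
    fn reserve(p: *ValuePool) ?usize {
        if (p.top == 0) return null;
        p.top -= 1;
        return p.free[p.top];
    }

    // Push the index back; the buffer becomes available again.
    fn release(p: *ValuePool, idx: usize) void {
        p.free[p.top] = idx;
        p.top += 1;
    }
};
```

While an index is reserved, the free-list "forgets" it, exactly as described: the buffer only reappears as a candidate once it is released back into the pool.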

    • justincormack 13 minutes ago
      You can work around overcommit by writing a byte to every allocated page at allocation time, so that it has to be actually allocated.
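A sketch of that workaround (the 4 KiB page size is hardcoded for illustration; real code should query the OS): writing one byte per page forces the kernel to commit physical memory at startup, instead of overcommitting and possibly OOM-killing the process on first use.

```zig
const std = @import("std");

// Touch one byte in every page of the buffer so each page is
// actually backed by physical memory.
fn prefault(buf: []u8, page_size: usize) void {
    var i: usize = 0;
    while (i < buf.len) : (i += page_size) {
        buf[i] = 0; // forces a page fault; the page is now committed
    }
}

pub fn main() !void {
    const mem = try std.heap.page_allocator.alloc(u8, 1 << 20);
    defer std.heap.page_allocator.free(mem);
    prefault(mem, 4096); // 4 KiB pages assumed for this sketch
}
```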
  • loeg 1 hour ago
    We do a lazier version of this with a service at work. All of the large buffers and caches are statically (runtime-configured) sized, but various internal data structures assumed to be approximately de minimis can use the standard allocator to add items without worrying about it.
  • nromiun 47 minutes ago
    > All memory must be statically allocated at startup.

    But why? If you do that you are just taking memory away from other processes. Is there any significant speed improvement over just dynamic allocation?

    • AnimalMuppet 41 minutes ago
      1. On modern OSes, you probably aren't "taking it away from other processes" until you actually use it. Statically allocated but untouched memory is probably just an entry in a page table somewhere.

      2. Speed improvement? No. The improvement is in your ability to reason about memory usage, and about time usage. Dynamic allocations add a very much non-deterministic amount of time to whatever you're doing.

      • Ericson2314 34 minutes ago
        If you use it and stop using it, the OS cannot reclaim the pages, because it doesn't know that you've stopped. At best, it can offload the memory to disk, but this wastes disk space, and also time for pointless writes.
        • norir 9 minutes ago
          This is true; whether it matters is context dependent. In an embedded program it may be irrelevant, since your program is the only thing running, so there is no resource contention or need to swap. In a multi-tenant setting, you could use arenas in an identical way to single static allocation and release the arena upon completion. I agree that allocating a huge amount of memory for a long-running program on a multi-tenant OS is a bad idea in general, but it could be OK if, for example, you are running a single application like a database on the server, in which case you are back to embedded programming, only the "embedded" environment is a database on a beefy general-purpose computer.