Karl Seguin
@openmymind.net.web.brid.gy
Programming blog exploring Zig, Elixir, Go, Testing, Design and Performance

[bridged from https://openmymind.net/ on the web: https://fed.brid.gy/web/openmymind.net ]
## Comparing Strings as Integers with @bitCast
In the last blog post, we looked at different ways to compare strings in Zig. A few posts back, we introduced Zig's `@bitCast`. As a quick recap, `@bitCast` lets us force a specific type onto a value. For example, the following prints 1067282596:

```zig
const std = @import("std");

pub fn main() !void {
    const f: f32 = 1.23;
    const n: u32 = @bitCast(f);
    std.debug.print("{d}\n", .{n});
}
```

What's happening here is that Zig represents the 32-bit float value of `1.23` as `[4]u8{164, 112, 157, 63}`. This is also how Zig represents the 32-bit unsigned integer value of `1067282596`. Data is just bytes; it's the type system - the compiler's knowledge of what data is what type - that controls what and how that data is manipulated.

It might seem like there's something special about bitcasting from a float to an integer; they're both numbers after all. But you can `@bitCast` between any two equivalently sized types. Can you guess what this prints?

```zig
const std = @import("std");

pub fn main() !void {
    const data = [_]u8{3, 0, 0, 0};
    const x: i32 = @bitCast(data);
    std.debug.print("{d}\n", .{x});
}
```

The answer is `3`. Think about the above snippet a bit more. We're taking an array of bytes and telling the compiler to treat it like an integer. If we made `data` equal to `[_]u8{'b', 'l', 'u', 'e'}`, it would still work (and print `1702194274`). We're slowly heading towards being able to compare strings as-if they were integers. If you're wondering why 3 is encoded as `[4]u8{3, 0, 0, 0}` and not `[4]u8{0, 0, 0, 3}`, I talked about binary encoding in my Learning TCP series.

From the last post, we could use multiple `std.mem.eql` or, more simply, `std.meta.stringToEnum` to complete the following method:

```zig
fn parseMethod(value: []const u8) ?Method {
    // ...
}

const Method = enum {
    get,
    put,
    post,
    head,
};
```

We can also use `@bitCast`. Let's take it step-by-step. The first thing we'll need to do is switch on `value.len`. This is necessary because the three-byte "GET" will need to be `@bitCast` to a `u24`, whereas the four-byte "POST" needs to be `@bitCast` to a `u32`:

```zig
fn parseMethod(value: []const u8) ?Method {
    switch (value.len) {
        3 => switch (@as(u24, @bitCast(value[0..3]))) {
            // TODO
            else => {},
        },
        4 => switch (@as(u32, @bitCast(value[0..4]))) {
            // TODO
            else => {},
        },
        else => {},
    }
    return null;
}
```

If you try to run this code, you'll get a compilation error: _cannot @bitCast from '*const [3]u8'_. `@bitCast` works on actual bits, but when we slice our `[]const u8` with a compile-time known range (`[0..3]`), we get a pointer to an array. We can't `@bitCast` a pointer; we can only `@bitCast` actual bits of data. For this to work, we need to dereference the pointer, i.e. use `value[0..3].*`. This will turn our `*const [3]u8` into a `const [3]u8`.

```zig
fn parseMethod(value: []const u8) ?Method {
    switch (value.len) {
        // changed: we now dereference the value (.*)
        3 => switch (@as(u24, @bitCast(value[0..3].*))) {
            // TODO
            else => {},
        },
        // changed: we now dereference the value (.*)
        4 => switch (@as(u32, @bitCast(value[0..4].*))) {
            // TODO
            else => {},
        },
        else => {},
    }
    return null;
}
```

Also, you might have noticed the `@as(u24, ...)` and `@as(u32, ...)`. `@bitCast`, like most of Zig's builtin functions, infers its return type. When we're assigning the result of a `@bitCast` to a variable of a known type, i.e. `const x: i32 = @bitCast(data);`, the return type of `i32` is inferred. In the above `switch`, we aren't assigning the result to a variable. We have to use `@as(u24, ...)` in order for `@bitCast` to know what it should be casting to (i.e. what its return type should be).
The last thing we need to do is fill our switch blocks. Hopefully it's obvious that we can't just do:

```zig
3 => switch (@as(u24, @bitCast(value[0..3].*))) {
    "GET" => return .get,
    "PUT" => return .put,
    else => {},
},
```

But you might be thinking that, while ugly, something like this might work:

```zig
3 => switch (@as(u24, @bitCast(value[0..3].*))) {
    @as(u24, @bitCast("GET".*)) => return .get,
    @as(u24, @bitCast("PUT".*)) => return .put,
    else => {},
},
```

Because `"GET"` and `"PUT"` are string literals, they're null terminated and of type `*const [3:0]u8`. When we dereference them, we get a `const [3:0]u8`. It's close, but it means that the value is 4 bytes (`[4]u8{'G', 'E', 'T', 0}`) and thus cannot be `@bitCast` into a `u24`. This is ugly, but it works:

```zig
fn parseMethod(value: []const u8) ?Method {
    switch (value.len) {
        3 => switch (@as(u24, @bitCast(value[0..3].*))) {
            @as(u24, @bitCast(@as([]const u8, "GET")[0..3].*)) => return .get,
            @as(u24, @bitCast(@as([]const u8, "PUT")[0..3].*)) => return .put,
            else => {},
        },
        4 => switch (@as(u32, @bitCast(value[0..4].*))) {
            @as(u32, @bitCast(@as([]const u8, "HEAD")[0..4].*)) => return .head,
            @as(u32, @bitCast(@as([]const u8, "POST")[0..4].*)) => return .post,
            else => {},
        },
        else => {},
    }
    return null;
}
```

That's a mouthful, so we can add a small function to help:

```zig
fn parseMethod(value: []const u8) ?Method {
    switch (value.len) {
        3 => switch (@as(u24, @bitCast(value[0..3].*))) {
            asUint(u24, "GET") => return .get,
            asUint(u24, "PUT") => return .put,
            else => {},
        },
        4 => switch (@as(u32, @bitCast(value[0..4].*))) {
            asUint(u32, "HEAD") => return .head,
            asUint(u32, "POST") => return .post,
            else => {},
        },
        else => {},
    }
    return null;
}

pub fn asUint(comptime T: type, comptime string: []const u8) T {
    return @bitCast(string[0..string.len].*);
}
```

Like the verbose version, the trick is to cast our null-terminated string literal into a string slice, `[]const u8`. By passing it through the `asUint` function, we get this without needing to add the explicit `@as([]const u8, ...)`.

There is a more advanced version of `asUint` which doesn't take the uint type parameter (`T`). If you think about it, the uint type can be inferred from the string's length:

```zig
pub fn asUint(comptime string: []const u8) @Type(.{ .int = .{
    // bits, not bytes, hence * 8
    .bits = string.len * 8,
    .signedness = .unsigned,
} }) {
    return @bitCast(string[0..string.len].*);
}
```

Which allows us to call it with a single parameter: `asUint("GET")`. This might be your first time seeing such a return type. The `@Type` builtin is the opposite of `@typeInfo`. The latter takes a type and returns information on it in the shape of a `std.builtin.Type` union, whereas `@Type` takes a `std.builtin.Type` and returns an actual usable type. One of these days I'll find the courage to blog about `std.builtin.Type`!

As a final note, some people dislike the look of this sort of return type and would rather encapsulate the logic in its own function. This is the same:

```zig
pub fn asUint(comptime string: []const u8) AsUintReturn(string) {
    return @bitCast(string[0..string.len].*);
}

// Remember that, in Zig, by convention, a function should be
// PascalCase if it returns a type (because types are PascalCase).
fn AsUintReturn(comptime string: []const u8) type {
    return @Type(.{ .int = .{
        // bits, not bytes, hence * 8
        .bits = string.len * 8,
        .signedness = .unsigned,
    } });
}
```
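If you want to convince yourself that `asUint` behaves as described, a test like this (my own sketch, not from the original post) compares it against a plain byte-level `@bitCast`:

```zig
const std = @import("std");

pub fn asUint(comptime string: []const u8) @Type(.{ .int = .{
    .bits = string.len * 8,
    .signedness = .unsigned,
} }) {
    return @bitCast(string[0..string.len].*);
}

test "asUint matches a byte-level @bitCast" {
    // on a little-endian machine, 'G' (71) ends up as the least significant byte
    const expected: u24 = @bitCast([3]u8{ 'G', 'E', 'T' });
    try std.testing.expectEqual(expected, asUint("GET"));
    try std.testing.expectEqual(@as(u32, @bitCast([4]u8{ 'P', 'O', 'S', 'T' })), asUint("POST"));
}
```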
### Conclusion

Of the three approaches, this is the least readable and least approachable. Is it worth it? It depends on your input and the values you're comparing against. In my benchmarks, using `@bitCast` performs roughly the same as `std.meta.stringToEnum`. But there are some cases where `@bitCast` can outperform `std.meta.stringToEnum` by as much as 50%. Perhaps that's the real value of this approach: the performance is less dependent on the input or the values being matched against.
## GetOrPut With String Keys
I've previously blogged about how much I like Zig's `getOrPut` hashmap method. As a brief recap, we can visualize Zig's hashmap as two arrays:

```
keys:        values:
--------     --------
| Paul |     | 1234 |    @mod(hash("Paul"), 5) == 0
--------     --------
|      |     |      |
--------     --------
|      |     |      |
--------     --------
| Goku |     | 9001 |    @mod(hash("Goku"), 5) == 3
--------     --------
|      |     |      |
--------     --------
```

When we call `get("Paul")`, we could think of this simplified implementation:

```zig
fn get(map: *Self, key: K) ?V {
    const index = map.getIndexOf(key) orelse return null;
    return map.values[index];
}
```

And, when we call `getPtr("Paul")`, we'd have this implementation:

```zig
fn getPtr(map: *Self, key: K) ?*V {
    const index = map.getIndexOf(key) orelse return null;
    // notice the added '&'
    // we're taking the address of the array index
    return &map.values[index];
}
```

By taking the address of the value directly from the hashmap's array, we avoid copying the entire value. That can have performance implications (though not for the integer value we're using here). It also allows us to directly manipulate that slot of the array:

```zig
const value = map.getPtr("Paul") orelse return;
value.* = 10;
```

This is a powerful feature, but a dangerous one. If the underlying array changes, as can happen when items are added to the map, `value` would become invalid. So, while `getPtr` is useful, it requires mindfulness: try to minimize the scope of such references. Currently, Zig's HashMap doesn't shrink when items are removed, so, for now, removing items doesn't invalidate any pointers into the hashmap. But expect that to change at some point.

### GetOrPut

`getOrPut` builds on the above concept. It returns a pointer to the value **and** the key, as well as creating the entry in the hashmap if necessary. For example, given that we already have an entry for "Paul", if we call `map.getOrPut("Paul")`, we'd get back a `value_ptr` that points to a slot in the hashmap's `values` array, as well as a `key_ptr` that points to a slot in the hashmap's `keys` array. If the requested key _doesn't_ exist, we get back the same two pointers, and it's our responsibility to set the value.

If I asked you to increment counters inside of a hashmap, without `getOrPut`, you'd end up with two hash lookups:

```go
// Go
count, exists := counters["hits"]
if !exists {
    counters["hits"] = 1
} else {
    counters["hits"] = count + 1
}
```

With `getOrPut`, it's a single hash lookup:

```zig
const gop = try counters.getOrPut("hits");
if (gop.found_existing) {
    gop.value_ptr.* += 1;
} else {
    gop.value_ptr.* = 1;
}
```

### getOrPut With String Keys

It seems trivial, but the most important thing to understand about `getOrPut` is that it will set the key for you if the entry has to be created. In our last example, notice that even when `gop.found_existing == false`, we never set `key_ptr` - `getOrPut` automatically sets it to the key we pass in, i.e. `"hits"`. If we were to put a breakpoint after `getOrPut` returns but before we set the value, we'd see that our two arrays look something like:

```
keys:        values:
--------     --------
|      |     |      |
--------     --------
| hits |     | ???? |
--------     --------
|      |     |      |
--------     --------
```

Where the entry in the `keys` array is set, but the corresponding entry in `values` is left undefined. You'll note that `getOrPut` doesn't take a value. I assume this is because, in some cases, the value might be expensive to derive, so the current API lets us avoid calculating it when `gop.found_existing == true`. (A complete, runnable version of this counter is sketched below.)
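To see `getOrPut` in action end-to-end, here's a minimal runnable sketch of mine (not from the post). It uses string literals as keys, which sidesteps the ownership issue discussed next, since literals live for the life of the program:

```zig
const std = @import("std");

pub fn main() !void {
    var gpa: std.heap.GeneralPurposeAllocator(.{}) = .init;
    const allocator = gpa.allocator();
    defer _ = gpa.detectLeaks();

    var counters: std.StringHashMapUnmanaged(u32) = .{};
    defer counters.deinit(allocator);

    for ([_][]const u8{ "hits", "misses", "hits" }) |name| {
        // one hash lookup per iteration, whether or not the entry exists
        const gop = try counters.getOrPut(allocator, name);
        if (gop.found_existing) {
            gop.value_ptr.* += 1;
        } else {
            gop.value_ptr.* = 1;
        }
    }

    std.debug.print("hits: {d}\n", .{counters.get("hits").?}); // hits: 2
}
```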
This is important for keys that need to be owned by the hashmap. Most commonly that means strings, but it applies to any other key which we'll "manage". Taking a step back: if we wanted to track hits in a hashmap and, most likely, wanted the lifetime of the keys to be tied to the hashmap, we'd do something like:

```zig
fn register(allocator: Allocator, map: *std.StringHashMap(u32), name: []const u8) !void {
    const owned = try allocator.dupe(u8, name);
    try map.put(owned, 0);
}
```

Creating our "owned" copy of `name` frees the caller from having to maintain `name` beyond the call to `register`. Now, if this key is removed, or the entire map is cleaned up, we need to free the keys. That's why I like the name "owned": it means the hashmap "owns" the key (i.e. is responsible for freeing it):

```zig
var it = map.keyIterator();
while (it.next()) |key_ptr| {
    allocator.free(key_ptr.*);
}
map.deinit(allocator);
```

The interaction between key ownership and `getOrPut` is worth thinking about. If we try to merge this ownership idea with our incrementing counter code, we might try:

```zig
fn hit(allocator: Allocator, map: *std.StringHashMap(u32), name: []const u8) !void {
    const owned = try allocator.dupe(u8, name);
    const gop = try map.getOrPut(owned);
    if (gop.found_existing) {
        gop.value_ptr.* += 1;
    } else {
        gop.value_ptr.* = 1;
    }
}
```

But this code has a potential memory leak - can you spot it? (See Appendix A for a complete runnable example.) When `gop.found_existing == true`, `owned` is never used and never freed. One bad option would be to free `owned` when the entry already exists:

```zig
fn hit(allocator: Allocator, map: *std.StringHashMap(u32), name: []const u8) !void {
    const owned = try allocator.dupe(u8, name);
    const gop = try map.getOrPut(owned);
    if (gop.found_existing) {
        // This line was added. But this is a bad solution
        allocator.free(owned);
        gop.value_ptr.* += 1;
    } else {
        gop.value_ptr.* = 1;
    }
}
```

It works, but we needlessly `dupe` `name` if the entry already exists. Rather than prematurely duping the key in case the entry doesn't exist, we want to delay our `dupe` until we know it's needed. Here's a better option:

```zig
fn hit(allocator: Allocator, map: *std.StringHashMap(u32), name: []const u8) !void {
    // we use `name` for the lookup
    const gop = try map.getOrPut(name);
    if (gop.found_existing) {
        gop.value_ptr.* += 1;
    } else {
        // this line was added
        gop.key_ptr.* = try allocator.dupe(u8, name);
        gop.value_ptr.* = 1;
    }
}
```

It might seem reckless to pass `name` into `getOrPut`. We need the key to remain valid as long as the map entry exists. Aren't we undermining that requirement? Let's walk through the code. When `hit` is called for a new `name`, `gop.found_existing` will be false. `getOrPut` will insert `name` in our `keys` array. This is bad because we have no guarantee that `name` will be valid for as long as we need it to be. But the problem is immediately remedied when we overwrite `key_ptr.*`. On subsequent calls for an existing `name`, when `gop.found_existing == true`, the `name` is only used as a lookup. It's no different than doing a `getPtr`; `name` only has to be valid for the call to `getOrPut` because `getOrPut` doesn't keep a reference to it when an existing entry is found.

### Conclusion

This post was a long way to say: don't be afraid to write to `key_ptr.*`. Of course you can screw up your map this way. Consider this example:

```zig
fn hit(allocator: Allocator, map: *std.StringHashMap(u32), name: []const u8) !void {
    // we use `name` for the lookup
    const gop = try map.getOrPut(name);
    if (gop.found_existing) {
        gop.value_ptr.* += 1;
    } else {
        // what's this?
gop.key_ptr.* = "HELLO"; gop.value_ptr.* = 1; } } Because the key is used to organize the map - find where items go and where they are, jamming random keys where they don't belong is going to cause issues. But it can also be used correctly and safely, as long as you understand the details. ### Appendix A - Memory Leak This code `should` report a memory leak. const std = @import("std"); const Allocator = std.mem.Allocator; pub fn main() !void { var gpa = std.heap.GeneralPurposeAllocator(.{}){}; const allocator = gpa.allocator(); defer _ = gpa.detectLeaks(); // I'm using the Unmanaged variant because the Managed ones are likely to // be removed (which I think is a mistake). Using Unmanaged makes this // snippet more future-proof. I explain unmanaged here: // https://www.openmymind.net/Zigs-HashMap-Part-1/#Unmanaged var map: std.StringHashMapUnmanaged(u32) = .{}; try hit(allocator, ↦, "teg"); try hit(allocator, ↦, "teg"); var it = map.keyIterator(); while (it.next()) |key_ptr| { allocator.free(key_ptr.*); } map.deinit(allocator); } fn hit(allocator: Allocator, map: *std.StringHashMapUnmanaged(u32), name: []const u8) !void { const owned = try allocator.dupe(u8, name); const gop = try map.getOrPut(allocator, owned); if (gop.found_existing) { gop.value_ptr.* += 1; } else { gop.value_ptr.* = 1; } } Leave a comment
## Zig's dot star syntax (value.*)
Maybe I'm the only one, but it always takes my little brain a split second to understand what's happening whenever I see, or have to write, something like `value.* = .{...}`. If we take a step back, a variable is just a convenient name for an address on the stack. When this function executes:

```zig
fn isOver9000(power: i64) bool {
    return power > 9000;
}
```

Say, with a `power` of 593, we could visualize its stack as:

```
power ->  -------------
          |    593    |
          -------------
```

If we changed our function to take a pointer to an integer:

```zig
// i64 changed to *i64
fn isOver9000(power: *i64) bool {
    return power > 9000;
}
```

Our `power` argument would still be a label for a stack address, but instead of directly containing a number, the stack value would itself be an address. That's the _indirection_ of pointers:

```
power ->  -------------
          | 1182145c0 |------------------------
          -------------                        |
          .............   empty space          |
          .............   or other data        |
                                               |
          -------------                        |
          |    593    | <----------------------
          -------------
```

But this code doesn't work: it's trying to compare a `comptime_int` (`9000`) with an `*i64`. We need to make another change to the function:

```zig
fn isOver9000(power: *i64) bool {
    // power changed to power.*
    return power.* > 9000;
}
```

`power.*` is how we dereference a pointer. Dereferencing means to get the value pointed to by a pointer. From our above visualization, you could say that the `.*` follows the arrow to get the value, `593`. This same syntax works for writing as well. The following is valid:

```zig
fn isOver9000(power: *i64) bool {
    power.* = 9001;
    return true;
}
```

Like before, the dereferencing operator (`.*`) "follows" the pointer, but now that it's on the receiving end of an assignment, we write the value into the pointed-at memory.
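Before moving on to structs, here's a tiny, self-contained sketch of both directions of `.*` - reading and writing through a pointer. The `swap` helper is my own illustration, not something from the post:

```zig
const std = @import("std");

fn swap(a: *i64, b: *i64) void {
    const tmp = a.*; // dereference: copy the value `a` points at
    a.* = b.*; // write the value `b` points at into `a`'s target
    b.* = tmp;
}

pub fn main() !void {
    var x: i64 = 1;
    var y: i64 = 2;
    swap(&x, &y);
    std.debug.print("x={d} y={d}\n", .{ x, y }); // x=2 y=1
}
```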
This is all true for more complex types. Let's say we have a `User` struct with an `id` and a `name`:

```zig
const User = struct {
    id: i32,
    name: []const u8,
};

var user = User{ .id = 900, .name = "Teg" };
```

The `user` variable is a label for the location of the start of the user:

```
user ->  -------------
         |    900    |
         -------------
         |     3     |
         -------------
         | 3c9414e99 | -----------------------
         -------------                        |
         .............   empty space          |
         .............   or other data        |
                                              |
         -------------                        |
         |     T     | <----------------------
         -------------
         |     e     |
         -------------
         |     g     |
         -------------
```

A slice in Zig, like our `[]const u8`, is a length (`3`) and a pointer to the values. Now, if we were to take the address of `user`, via `&user`, we introduce a level of indirection. For example, imagine this code:

```zig
const std = @import("std");

const User = struct {
    id: i32,
    name: []const u8,
};

pub fn main() !void {
    var user = User{ .id = 900, .name = "Teg" };
    updateUser(&user);
    std.debug.print("{d}\n", .{user.id});
}

fn updateUser(user: *User) void {
    user.id += 100000;
}
```

The `user` parameter of our `updateUser` function is pointing to the `user` on `main`'s stack:

```
updateUser
user ->  -------------
         | 83abcc30  |------------------------
         -------------                        |
         .............   empty space          |
         .............   or other data        |
                                              |
main     -------------                        |
user ->  |    900    | <----------------------
         -------------
         |     3     |
         -------------
         | 3c9414e99 | -----------------------
         -------------                        |
         .............   empty space          |
         .............   or other data        |
                                              |
         -------------                        |
         |     T     | <----------------------
         -------------
         |     e     |
         -------------
         |     g     |
         -------------
```

Because we're referencing `main`'s `user` (rather than a copy), any changes we make will be reflected in `main`. But we aren't limited to operating on fields of `user`; we can operate on its entire memory. Of course, we can create a copy of just the id field (assignments are always copies; it's just a matter of knowing _what_ we're copying):

```zig
fn updateUser(user: *User) void {
    const id = user.id;
    // ....
}
```

And now the stack for our function looks like:

```
user ->  -------------
         | 83abcc30  |
id ->    -------------
         |    900    |
         -------------
```

But we can also copy the entire user:

```zig
fn updateUser(user: *User) void {
    const copy = user.*;
    // ....
}
```

Which gives us something like:

```
updateUser
user ->  -------------
         | 83abcc30  |----------------------
copy ->  -------------                      |
         |    900    |                      |
         -------------                      |
         |     3     |                      |
         -------------                      |
         | 3c9414e99 | ---------------------|--
         -------------                      |  |
         .............   empty space        |  |
         .............   or other data      |  |
                                            |  |
main     -------------                      |  |
user ->  |    900    | <--------------------   |
         -------------                         |
         |     3     |                         |
         -------------                         |
         | 3c9414e99 | ------------------------|
         -------------                         |
         .............   empty space           |
         .............   or other data         |
                                               |
         -------------                         |
         |     T     | <-----------------------
         -------------
         |     e     |
         -------------
         |     g     |
         -------------
```

Notice that it didn't create a copy of the value 'Teg'. You could call this copying "shallow": it copied the `900`, the `3` (name length) and the `3c9414e99` (address of the name pointer). Just like our simpler example above, we can also assign using the dereferencing operator:

```zig
fn updateUser(user: *User) void {
    // using type inference
    // could be more explicit and do
    // user.* = User{....}
    user.* = .{
        .id = 5,
        .name = "Paul",
    };
}
```

This doesn't copy anything; it writes into the address that we were given, the address of `main`'s `user`:

```
updateUser
user ->  -------------
         | 83abcc30  |------------------------
         -------------                        |
         .............   empty space          |
         .............   or other data        |
                                              |
main     -------------                        |
user ->  |     5     | <----------------------
         -------------
         |     4     |
         -------------
         | 9bf4a990  | -----------------------
         -------------                        |
         .............   empty space          |
         .............   or other data        |
                                              |
         -------------                        |
         |     P     | <----------------------
         -------------
         |     a     |
         -------------
         |     u     |
         -------------
         |     l     |
         -------------
```

If you're still not fully comfortable with this, and if you haven't done so already, you might be interested in the pointers and stack memory parts of my Learning Zig series.
## ArenaAllocator.free and Nested Arenas
What happens when you `free` with an ArenaAllocator? You might be tempted to look at the documentation for `std.mem.Allocator.free`, which says "Free an array allocated with alloc". But this is the one thing we're sure it _won't_ do. In its current implementation, calling `free` usually does nothing: the freed memory isn't made available for subsequent allocations by the arena, and it certainly isn't released back to the operating system. However, under specific conditions, `free` will make the memory re-usable by the arena. The only way to really "free" the memory is to call `deinit`.

The only case when we're guaranteed that the memory will be reusable by the arena is when it was the last allocation made:

```zig
const str1 = try arena.dupe(u8, "Over 9000!!!");
arena.free(str1);
```

Above, whatever memory was allocated to duplicate our string will be available for subsequent allocations made with `arena`. In the following case, the two calls to `arena.free` do nothing:

```zig
const str1 = try arena.dupe(u8, "ab");
const str2 = try arena.dupe(u8, "12");
arena.free(str1);
arena.free(str2);
```

In order to "fix" this code, we'd need to reverse the order of the two frees:

```zig
const str1 = try arena.dupe(u8, "ab");
const str2 = try arena.dupe(u8, "12");
arena.free(str2); // swapped this line with the next
arena.free(str1);
```

Now, when we call `arena.free(str2)`, the memory allocated for `str2` will be available to subsequent allocations. But what happens when we call `arena.free(str1)`? The answer, again, is: _it depends_. It has to do with the internal state of the arena. Simplistically, an `ArenaAllocator` keeps a linked list of memory buffers. Imagine something like:

```
buffer_list.head ->  ------------
                     | next     | -> null
                     |   ----   |
                     |          |
                     |          |
                     |          |
                     |          |
                     |          |
                     ------------
```

Our linked list has a single node along with 5 bytes of available space. After we allocate `str1`, it looks like:

```
buffer_list.head ->  ------------
                     | next     | -> null
                     |   ----   |
            str1 ->  | a        |
                     | b        |
                     |          |
                     |          |
                     |          |
                     ------------
```

Then, when we allocate `str2`, it looks like:

```
buffer_list.head ->  ------------
                     | next     | -> null
                     |   ----   |
            str1 ->  | a        |
                     | b        |
            str2 ->  | 1        |
                     | 2        |
                     |          |
                     ------------
```

When we free `str2`, it goes back to how it was before:

```
buffer_list.head ->  ------------
                     | next     | -> null
                     |   ----   |
            str1 ->  | a        |
                     | b        |
                     |          |
                     |          |
                     |          |
                     ------------
```

Which means that when we `arena.free(str1)`, it **will** make that memory available again. However, if instead of allocating two strings, we allocate three:

```zig
const str1 = try arena.dupe(u8, "ab");
const str2 = try arena.dupe(u8, "12");
const str3 = try arena.dupe(u8, "()");
arena.free(str3);
arena.free(str2);
arena.free(str1);
```

Our first buffer doesn't have enough space for the new string, so a new node is prepended to our linked list:

```
buffer_list.head ->  ------------      ------------
                     | next     | ->   | next     | -> null
                     |   ----   |      |   ----   |
            str3 ->  | (        |      | a        | <- str1
                     | )        |      | b        |
                     |          |      | 1        | <- str2
                     |          |      | 2        |
                     |          |      |          |
                     ------------      ------------
```

When we call `arena.free(str3)`, the memory for that allocation will be made available, but subsequent frees, even if they're in the correct order (i.e. freeing `str2` then `str1`), will be noops. The ArenaAllocator can only act on the head of our linked list, even if the head is empty. In short, when we `free` the last allocation, that memory will _always_ be made available. But subsequent calls to `free` only behave this way if (a) they're also in order and (b) the allocations happen to live within the same internal memory node.
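We can watch the last-allocation case in action. This is my own sketch, not from the post, and whether the arena actually hands back the same address is an implementation detail - treat the printed result as informative rather than guaranteed:

```zig
const std = @import("std");

pub fn main() !void {
    var gpa: std.heap.GeneralPurposeAllocator(.{}) = .init;
    defer _ = gpa.deinit();

    var arena = std.heap.ArenaAllocator.init(gpa.allocator());
    defer arena.deinit();
    const allocator = arena.allocator();

    const str1 = try allocator.dupe(u8, "Over 9000!!!");
    // str1 is the arena's most recent allocation, so freeing it should
    // make those exact bytes available to the next same-sized allocation
    allocator.free(str1);

    const str2 = try allocator.dupe(u8, "Over 9000!!!");
    std.debug.print("reused: {}\n", .{str1.ptr == str2.ptr});
}
```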
### Nested Arenas

Zig's allocators are said to be composable. When we create an `ArenaAllocator`, we pass a single parameter: an allocator. That parent allocator (1) can be any other type of allocator. You can, for example, create an `ArenaAllocator` on top of a `FixedBufferAllocator`. You can also create an `ArenaAllocator` on top of another `ArenaAllocator`.

(1) Zig calls this the "child allocator", but that doesn't make any sense to me.

This kind of thing often happens within libraries, where an API takes an `std.mem.Allocator` and the library creates an `ArenaAllocator` on top of it. And what happens when the provided allocator was already an arena? Libraries aside, I mean something like:

```zig
var parent_arena = ArenaAllocator.init(gpa_allocator);
const parent_allocator = parent_arena.allocator();

var inner_arena = ArenaAllocator.init(parent_allocator);
const inner_allocator = inner_arena.allocator();

_ = try inner_allocator.dupe(u8, "Over ");
_ = try inner_allocator.dupe(u8, "9000!");
inner_arena.deinit();
```

It does work, but at best, when `deinit` is called, the memory will only be made available to be re-used by `parent_arena`. Except in simple cases, allocations made by `inner_arena` are likely to span multiple buffers of `parent_arena`, and of course you can still make allocations directly in `parent_arena`, which can generate its own new buffers or simply make the ordering requirement impossible to fulfill. For example, if we make an allocation in `parent_arena` before `inner_arena.deinit();` is called:

```zig
_ = try parent_allocator.dupe(u8, "!!!");
inner_arena.deinit();
```

Then the `deinit` does nothing.

So while nesting ArenaAllocators works, I don't think there's any advantage over using a single arena. And I think in many cases where you have an inner arena, like in a library, it's better if the caller provides a non-arena parent allocator so that all the memory is really freed when the library is done with it. Of course, there's a transparency issue here. Unless the library documents exactly how it's using your provided allocator, or unless you explore the code - and assuming the implementation doesn't change - it's hard to know what you should use.
## Allocator.resize
There are four important methods on Zig's `std.mem.Allocator` interface that Zig developers must be comfortable with:

* `alloc(T, n)` - which creates an array of `n` items of type `T`,
* `free(ptr)` - which frees memory allocated with `alloc` (although, as we've seen with arenas, this is implementation-specific),
* `create(T)` - which creates a single item of type `T`, and
* `destroy(ptr)` - which destroys an item created with `create`

While you might never need to use them, the `Allocator` interface has other methods which, if nothing else, can be useful to be aware of and informative to learn about. In particular, the `resize` method is used to try to resize an existing allocation to a larger (or smaller) size. The main promise of `resize` is that it's guaranteed _not_ to move the pointer. However, to satisfy that guarantee, resize is allowed to fail, in which case nothing changes. We can imagine a simple allocation:

```
// var buf = try allocator.alloc(u8, 5);
// buf[0] = 'h'

            0x102e00000
            -------------------------------
buf.ptr ->  | h |   |   |   |   |
            -------------------------------
```

Now, if we were to call `allocator.resize(buf, 7)`, there are two possible outcomes. The first is that the call returns `false`, indicating that the resize operation failed, and thus nothing changed:

```
            0x102e00000
            -------------------------------
buf.ptr ->  | h |   |   |   |   |
            -------------------------------
```

However, when `resize` succeeds and returns `true`, the allocated space has grown without having relocated (i.e. moved) the pointer:

```
            0x102e00000
            -------------------------------------------
buf.ptr ->  | h |   |   |   |   |   |   |
            -------------------------------------------
```

Under what circumstances `resize` succeeds or fails is a black box. It depends on a lot of factors and is going to be allocator-specific. For example, for me, this code prints `false`, indicating that the resize failed:

```zig
const std = @import("std");

pub fn main() !void {
    var gpa: std.heap.GeneralPurposeAllocator(.{}) = .init;
    const allocator = gpa.allocator();
    defer _ = gpa.detectLeaks();

    const buf = try allocator.alloc(usize, 10);
    std.debug.print("{any}\n", .{allocator.resize(buf, 20)});
    allocator.free(buf);
}
```

Because we're using a `GeneralPurposeAllocator` (that name is deprecated in Zig 0.14 in favor of `DebugAllocator`), we could dive into its internals and try to leverage knowledge of its implementation to force a resize to succeed, but a simpler option is to resize our buffer to `0`:

```zig
const std = @import("std");

pub fn main() !void {
    var gpa: std.heap.GeneralPurposeAllocator(.{}) = .init;
    const allocator = gpa.allocator();
    defer _ = gpa.detectLeaks();

    const buf = try allocator.alloc(usize, 10);
    // changed 20 -> 0
    std.debug.print("{any}\n", .{allocator.resize(buf, 0)});
    allocator.free(buf);
}
```

Success, the code now prints `true`, indicating that the resize succeeded. However, I also get a **segfault**. Can you guess what we're doing wrong? In our above visualization, we saw how a successful resize does not move our pointer. We can confirm this by looking at the address of `buf.ptr` before and after our resize.
This code still segfaults, but it prints out the information first:

```zig
pub fn main() !void {
    var gpa: std.heap.GeneralPurposeAllocator(.{}) = .init;
    const allocator = gpa.allocator();
    defer _ = gpa.detectLeaks();

    const buf = try allocator.alloc(usize, 10);
    std.debug.print("address before resize: {*}\n", .{buf.ptr});
    std.debug.print("resize succeeded: {any}\n", .{allocator.resize(buf, 0)});
    std.debug.print("address after resize: {*}\n", .{buf.ptr});
    allocator.free(buf);
}
```

So far, we've only considered the `ptr` of our slice, but, like the criminal justice system, a slice is represented by two separate yet equally important groups: a `ptr` and a `len`. If we change our code to also look at the `len` of `buf`, the issue might become more obvious:

```zig
// change the 1st and 3rd line to also print buf.len:
std.debug.print("address & len before resize: {*} {d}\n", .{buf.ptr, buf.len});
std.debug.print("resize succeeded: {any}\n", .{allocator.resize(buf, 0)});
std.debug.print("address & len after resize: {*} {d}\n", .{buf.ptr, buf.len});
```

This is what I get:

```
address & len before resize: usize@100280000 10
resize succeeded: true
address & len after resize: usize@100280000 10
Segmentation fault at address 0x100280000
```

While it isn't the cleanest output, notice that even after we successfully resize our ptr, the length remains unchanged (i.e. `10`). Herein lies our bug: `resize` updates the underlying memory, but it doesn't update the length of the slice. That's something we need to take care of ourselves. Here's a non-crashing version:

```zig
const std = @import("std");

pub fn main() !void {
    var gpa: std.heap.GeneralPurposeAllocator(.{}) = .init;
    const allocator = gpa.allocator();
    defer _ = gpa.detectLeaks();

    var buf = try allocator.alloc(usize, 10);
    if (allocator.resize(buf, 0)) {
        std.debug.print("resize succeeded!\n", .{});
        buf.len = 0;
    } else {
        // we need to handle the case where resize doesn't succeed
    }
    allocator.free(buf);
}
```

What's left out of the above code is handling the case where `resize` fails. That's application-specific. In most cases, where we're likely resizing to a larger size, we'll generally need to fall back to calling `alloc` to create the larger buffer and then, most likely, `@memcpy` to copy data from the existing (now too small) buffer to the newly created larger one.
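To make that last paragraph concrete, here's a sketch of the grow-or-fallback pattern (mine, not from the post): try `resize` first and, if the allocator refuses, `alloc` a bigger buffer, `@memcpy` the old data over, and free the old buffer:

```zig
const std = @import("std");

fn grow(allocator: std.mem.Allocator, buf: []usize, new_len: usize) ![]usize {
    if (allocator.resize(buf, new_len)) {
        // the memory grew in place; we still have to fix the slice's len
        var grown = buf;
        grown.len = new_len;
        return grown;
    }
    // in-place resize failed: allocate, copy, free the original
    const bigger = try allocator.alloc(usize, new_len);
    @memcpy(bigger[0..buf.len], buf);
    allocator.free(buf);
    return bigger;
}

test "grow" {
    const allocator = std.testing.allocator;
    var buf = try allocator.alloc(usize, 10);
    buf = try grow(allocator, buf, 20);
    defer allocator.free(buf);
    try std.testing.expectEqual(@as(usize, 20), buf.len);
}
```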
## Zig's new LinkedList API (it's time to learn @fieldParentPtr)
In a recent, post-Zig 0.14 commit, Zig's `SinglyLinkedList` and `DoublyLinkedList` saw significant changes. The previous version was generic and, with all the methods removed, looked like:

```zig
pub fn SinglyLinkedList(comptime T: type) type {
    return struct {
        first: ?*Node = null,

        pub const Node = struct {
            next: ?*Node = null,
            data: T,
        };
    };
}
```

The new version isn't generic. Rather, you embed the linked list node with your data. This is known as an intrusive linked list and tends to perform better and require fewer allocations. Except in trivial examples, the data that we store in a linked list is typically stored on the heap. Because an intrusive linked list has the linked list node embedded in the data, it doesn't need its own allocation. Before we jump into an example, this is what the new structure looks like, again with all methods removed:

```zig
pub const SinglyLinkedList = struct {
    first: ?*Node = null,

    pub const Node = struct {
        next: ?*Node = null,
    };
};
```

Much simpler, and notice that this has no link or reference to any of our data. Here's a working example that shows how you'd use it:

```zig
const std = @import("std");
const SinglyLinkedList = std.SinglyLinkedList;

pub fn main() !void {
    // GeneralPurposeAllocator is being renamed
    // to DebugAllocator. Let's get used to that name
    var gpa: std.heap.DebugAllocator(.{}) = .init;
    const allocator = gpa.allocator();

    var list: SinglyLinkedList = .{};

    const user1 = try allocator.create(User);
    defer allocator.destroy(user1);
    user1.* = .{
        .id = 1,
        .power = 9000,
        .node = .{},
    };
    list.prepend(&user1.node);

    const user2 = try allocator.create(User);
    defer allocator.destroy(user2);
    user2.* = .{
        .id = 2,
        .power = 9001,
        .node = .{},
    };
    list.prepend(&user2.node);

    var node = list.first;
    while (node) |n| {
        std.debug.print("{any}\n", .{n});
        node = n.next;
    }
}

const User = struct {
    id: i64,
    power: u32,
    node: SinglyLinkedList.Node,
};
```

To run this code, you'll need a nightly release from within the last week. What do you think the output will be? You should see something like:

```
SinglyLinkedList.Node{ .next = SinglyLinkedList.Node{ .next = null } }
SinglyLinkedList.Node{ .next = null }
```

We're only getting the nodes, and, as we can see here and from the above skeleton structure of the new `SinglyLinkedList`, there's nothing about our users. Users have nodes, but there's seemingly nothing that links a node back to its containing user. Or is there? In the past, we've described how the compiler uses type information to figure out how to access fields. For example, when we execute `user1.power`, the compiler knows that:

1. `id` is +0 bytes from the start of the structure,
2. `power` is +8 bytes from the start of the structure (because id is an i64), and
3. `power` is a u32

With this information, the compiler knows how to access `power` from `user1` (i.e. jump forward 8 bytes, read 4 bytes and treat it as a u32). But if you think about it, that logic is simple to reverse. If we know the address of `power`, then the address of `user` has to be `address_of_power - 8`.
We can prove this:

```zig
const std = @import("std");

pub fn main() !void {
    var user = User{
        .id = 1,
        .power = 9000,
    };
    std.debug.print("address of user: {*}\n", .{&user});

    const address_of_power = &user.power;
    std.debug.print("address of power: {*}\n", .{address_of_power});

    const power_offset = 8;
    const also_user: *User = @ptrFromInt(@intFromPtr(address_of_power) - power_offset);
    std.debug.print("address of also_user: {*}\n", .{also_user});
    std.debug.print("also_user: {}\n", .{also_user});
}

const User = struct {
    id: i64,
    power: u32,
};
```

The magic happens here:

```zig
const power_offset = 8;
const also_user: *User = @ptrFromInt(@intFromPtr(address_of_power) - power_offset);
```

We're turning the address of our user's power field, `&user.power`, into an integer, subtracting 8 (8 bytes, 64 bits), and telling the compiler that it should treat that memory as a `*User`. This code will _probably_ work for you, but it isn't safe. Specifically, unless we're using a packed or extern struct, Zig makes no guarantees about the layout of a structure. It could put `power` BEFORE `id`, in which case our `power_offset` should be 0. It could add padding after every field. It can do anything it wants. To make this code safer, we use the `@offsetOf` builtin to get the actual byte-offset of a field with respect to its struct:

```zig
const power_offset = @offsetOf(User, "power");
```

Back to our linked list: given that we have the address of a `node` and we know that it is part of the `User` structure, we _are_ able to get the `User` from a node. Rather than use the above code though, we'll use the _slightly_ friendlier `@fieldParentPtr` builtin. Our `while` loop changes to:

```zig
while (node) |n| {
    const user: *User = @fieldParentPtr("node", n);
    std.debug.print("{any}\n", .{user});
    node = n.next;
}
```

We give `@fieldParentPtr` the name of the field, a pointer to that field, and a return type (which is inferred above by the assignment to a `*User` variable), and it gives us back the instance that contains that field.

Performance aside, I have mixed feelings about the new API. My initial reaction is that I dislike exposing what I consider a complicated builtin like `@fieldParentPtr` for something as trivial as using a linked list. However, while `@fieldParentPtr` seems esoteric, it's quite useful, and developers should be familiar with it because it can help solve problems which are otherwise hard to solve.
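As a closing sanity check, here's a small test of mine (not from the post) showing that `@fieldParentPtr` and the manual `@offsetOf` arithmetic recover the same parent pointer:

```zig
const std = @import("std");

const User = struct {
    id: i64,
    power: u32,
};

test "recover parent from field pointer" {
    var user = User{ .id = 1, .power = 9000 };
    const power_ptr = &user.power;

    // the friendly builtin
    const via_builtin: *User = @fieldParentPtr("power", power_ptr);
    // the manual equivalent: subtract the field's byte offset
    const via_offset: *User = @ptrFromInt(@intFromPtr(power_ptr) - @offsetOf(User, "power"));

    try std.testing.expectEqual(&user, via_builtin);
    try std.testing.expectEqual(&user, via_offset);
}
```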
## Zig's new Writer
As you might have heard, Zig's `Io` namespace is being reworked. Eventually, this will mean the re-introduction of async. As a first step though, the Writer and Reader interfaces and some of the related code have been revamped.

> This post is written based on a mid-July 2025 development release of Zig. It doesn't apply to Zig 0.14.x (or any previous version) and is likely to be outdated as more of the Io namespace is reworked.

Not long ago, I wrote a blog post which tried to explain Zig's Writers. At best, I'd describe the current state as "confusing": two writer interfaces, while often dealing with `anytype`. And while `anytype` is convenient, it lacks developer ergonomics. Furthermore, the current design has significant performance issues for some common cases.

### Drain

The new `Writer` interface is `std.Io.Writer`. At a minimum, implementations have to provide a `drain` function. Its signature looks like:

```zig
fn drain(w: *Writer, data: []const []const u8, splat: usize) Error!usize
```

You might be surprised that this is the method a custom writer needs to implement. Not only does it take an array of strings, but what's that `splat` parameter? Like me, you might have expected a simpler `write` method:

```zig
fn write(w: *Writer, data: []const u8) Error!usize
```

It turns out that `std.Io.Writer` has buffering built-in. For example, if we want a `Writer` for an `std.fs.File`, we need to provide the buffer:

```zig
var buffer: [1024]u8 = undefined;
var writer = my_file.writer(&buffer);
```

Of course, if we don't want buffering, we can always pass an empty buffer:

```zig
var writer = my_file.writer(&.{});
```

This explains why custom writers need to implement a `drain` method, and not something simpler like `write`. The simplest way to implement `drain`, and what a lot of the Zig standard library has been upgraded to while this larger overhaul takes place, is:

```zig
fn drain(io_w: *std.Io.Writer, data: []const []const u8, splat: usize) !usize {
    _ = splat;
    const self: *@This() = @fieldParentPtr("interface", io_w);
    return self.writeAll(data[0]) catch return error.WriteFailed;
}
```

We ignore the `splat` parameter and just write the first value in `data` (`data.len > 0` is guaranteed to be true). This turns `drain` into what a simpler `write` method would look like. Because we return the length of bytes written, `std.Io.Writer` will know that we potentially didn't write all the data and call `drain` again, if necessary, with the rest of the data.

> If you're confused by the call to `@fieldParentPtr`, check out my post on the upcoming linked list changes.

The actual implementation of `drain` for `File` is a non-trivial ~150 lines of code. It has platform-specific code and leverages vectored I/O where possible. There's obviously flexibility to provide a simple implementation or a more optimized one.
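Here's what that minimal pattern can look like end-to-end. This is my own sketch, not from the post or the standard library: a tiny writer that discards bytes but counts them. It assumes the mid-2025 dev API, where `vtable` is a pointer to a `VTable` whose only required function is `drain`; details may have shifted since:

```zig
const std = @import("std");

const CountingWriter = struct {
    count: usize = 0,
    interface: std.Io.Writer = .{
        .buffer = &.{}, // unbuffered, so drain always sees the data directly
        .vtable = &.{ .drain = drain },
    },

    fn drain(io_w: *std.Io.Writer, data: []const []const u8, splat: usize) std.Io.Writer.Error!usize {
        _ = splat;
        // recover our CountingWriter from the embedded interface
        const self: *CountingWriter = @fieldParentPtr("interface", io_w);
        self.count += data[0].len;
        return data[0].len;
    }
};

test "CountingWriter counts" {
    var cw: CountingWriter = .{};
    try cw.interface.writeAll("over ");
    try cw.interface.print("{d}!", .{9000});
    try std.testing.expectEqual(@as(usize, 10), cw.count);
}
```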
### The Interface

Much like the current state, when you do `file.writer(&buffer)`, you don't get an `std.Io.Writer`. Instead, you get a `File.Writer`. To get an actual `std.Io.Writer`, you need to access the `interface` field. This is merely a convention, but expect it to be used throughout the standard, and third-party, libraries. Get ready to see a lot of `&xyz.interface` calls! This simplification of `File` shows the relationship between the three types:

```zig
pub const File = struct {
    pub fn writer(self: *File, buffer: []u8) Writer {
        return .{
            .file = self,
            .interface = std.Io.Writer{
                .buffer = buffer,
                .vtable = .{ .drain = Writer.drain },
            },
        };
    }

    pub const Writer = struct {
        file: *File,
        interface: std.Io.Writer,
        // this has a bunch of other fields

        fn drain(io_w: *std.Io.Writer, data: []const []const u8, splat: usize) !usize {
            const self: *Writer = @fieldParentPtr("interface", io_w);
            // ....
        }
    };
};
```

The instance of `File.Writer` needs to exist somewhere (e.g. on the stack) since that's where the `std.Io.Writer` interface exists. It's possible that `File` could directly have a `writer_interface: std.Io.Writer` field, but that would limit you to one writer per file and would bloat the `File` structure. We can see from the above that, while we call `Writer` an "interface", it's just a normal struct. It has a few fields beyond `buffer` and `vtable.drain`, but these are the only two with non-default values; we have to provide them.

The `Writer` interface implements a lot of typical "writer" behavior, such as `writeAll` and `print` (for formatted writing). It also has a number of methods which only a `Writer` implementation would likely care about. For example, `File.Writer.drain` has to call `consume` so that the writer's internal state can be updated. Having all of these functions listed side-by-side in the documentation confused me at first. Hopefully it's something the documentation generation will one day be able to help disentangle.

### Migrating

The new `Writer` has taken over a number of methods. For example, `std.fmt.formatIntBuf` no longer exists. The replacement is the `printInt` method of `Writer`. But this requires a `Writer` instance rather than the simple `[]u8` previously required. It's easy to miss, but the `Writer.fixed([]u8) Writer` function is what you're looking for. You'll use this for any function that was migrated to `Writer` and used to work on a `buffer: []u8`.

While migrating, you might run into the following error: _no field or member function named 'adaptToNewApi' in '...'_. You can see why this happens by looking at the updated implementation of `std.fmt.format`:

```zig
pub fn format(writer: anytype, comptime fmt: []const u8, args: anytype) !void {
    var adapter = writer.adaptToNewApi();
    return adapter.new_interface.print(fmt, args) catch |err| switch (err) {
        error.WriteFailed => return adapter.err.?,
    };
}
```

Because this functionality was moved to `std.Io.Writer`, any `writer` passed into `format` has to be able to upgrade itself to the new interface. This is done, again only by convention, by having the "old" writer expose an `adaptToNewApi` method which returns a type that exposes a `new_interface: std.Io.Writer` field. This is pretty easy to implement using the basic `drain` implementation, and you can find a handful of examples in the standard library, but it's of little help if you don't control the legacy writer.

### Conclusion

I'm hesitant to provide an opinion on this change. I don't understand language design. However, while I think this is an improvement over the current API, I keep thinking that adding buffering directly to the `Writer` isn't ideal. I believe that most languages deal with buffering via composition. You take a reader/writer and wrap it in a BufferedReader or BufferedWriter. This approach seems both simple to understand and implement while being powerful. It can be applied to things beyond buffering and IO.
Zig seems to struggle with this model. Rather than providing a cohesive and generic approach for such problems, one specific feature (buffering) for one specific API (IO) was baked into the standard library. Maybe I'm too dense to understand, or maybe future changes will address this more holistically.
## I'm too dumb for Zig's new IO interface
You might have heard that Zig 0.15 introduces a new IO interface, with the focus for this release being the new std.Io.Reader and std.Io.Writer types. The old "interfaces" had problems, like this performance issue that I opened. And they relied on a mix of types, which always confused me, and a lot of `anytype` - which is generally great, but a poor foundation to build an interface on.

I've been slowly upgrading my libraries, and I ran into changes to the `tls.Client` used by my smtp library. For the life of me, I just don't understand how it works. Zig has never been known for its documentation, but if we look at the documentation for `tls.Client.init`, we'll find:

```
pub fn init(input: *std.Io.Reader, output: *std.Io.Writer, options: Options) InitError!Client

Initiates a TLS handshake and establishes a TLSv1.2 or TLSv1.3 session.
```

So it takes one of these new Readers and a new Writer, along with some options (sneak peek: they aren't all optional). It doesn't look like you can just give it a `net.Stream`, but `net.Stream` does expose a `reader()` and `writer()` method, so that's probably a good place to start:

```zig
const stream = try std.net.tcpConnectToHost(allocator, "www.openmymind.net", 443);
defer stream.close();

var writer = stream.writer(&.{});
var reader = stream.reader(&.{});

var tls_client = try std.crypto.tls.Client.init(
    reader.interface(),
    &writer.interface,
    .{}, // options TODO
);
```

Note that `stream.writer()` returns a `Stream.Writer` and `stream.reader()` returns a `Stream.Reader` - those aren't the types our `tls.Client` expects. To convert the `Stream.Reader` to an `*std.Io.Reader`, we need to call its `interface()` method. To get a `*std.Io.Writer` from a `Stream.Writer`, we need the address of its `interface` field. This doesn't seem particularly consistent.

Don't forget that the `writer` and `reader` need a stable address. Because I'm trying to get the simplest example working, this isn't an issue - everything will live on the stack of `main`. In a real world example, I think it means that I'll always have to wrap the `tls.Client` in my own heap-allocated type, giving the writer and reader a cozy, stable home.

Speaking of allocations, you might have noticed that `stream.writer` and `stream.reader` take a parameter: the buffer they should use. Buffering is a first class citizen of the new Io interface - who needs composition? The documentation **does** tell me these need to be at least `std.crypto.tls.max_ciphertext_record_len` large, so we need to fix things a bit:

```zig
var write_buf: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined;
var writer = stream.writer(&write_buf);

var read_buf: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined;
var reader = stream.reader(&read_buf);
```

Here's where the code stands:

```zig
const std = @import("std");

pub fn main() !void {
    var gpa: std.heap.DebugAllocator(.{}) = .init;
    const allocator = gpa.allocator();

    const stream = try std.net.tcpConnectToHost(allocator, "www.openmymind.net", 443);
    defer stream.close();

    var write_buf: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined;
    var writer = stream.writer(&write_buf);

    var read_buf: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined;
    var reader = stream.reader(&read_buf);

    var tls_client = try std.crypto.tls.Client.init(
        reader.interface(),
        &writer.interface,
        .{},
    );
    defer tls_client.end() catch {};
}
```

But if you try to run it, you'll get a compilation error. It turns out we have to provide 4 options: the ca bundle, a host, a `write_buffer` and a `read_buffer`.
Normally I'd expect the options parameter to be for optional parameters. I don't understand why some parameters (input and output) are passed one way, while `write_buffer` and `read_buffer` are passed another. Let's give it what it wants AND send some data:

```zig
// existing setup...
var bundle = std.crypto.Certificate.Bundle{};
try bundle.rescan(allocator);
defer bundle.deinit(allocator);

var tls_client = try std.crypto.tls.Client.init(
    reader.interface(),
    &writer.interface,
    .{
        .ca = .{ .bundle = bundle },
        .host = .{ .explicit = "www.openmymind.net" },
        .read_buffer = &.{},
        .write_buffer = &.{},
    },
);
defer tls_client.end() catch {};

try tls_client.writer.writeAll("GET / HTTP/1.1\r\n\r\n");
```

Now, if I try to run it, the program just hangs. I don't know what `write_buffer` is, but I know Zig now loves buffers, so let's try to give it something:

```zig
// existing setup...

// I don't know what size this should/has to be??
var write_buf2: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined;

var tls_client = try std.crypto.tls.Client.init(
    reader.interface(),
    &writer.interface,
    .{
        .ca = .{ .bundle = bundle },
        .host = .{ .explicit = "www.openmymind.net" },
        .read_buffer = &.{},
        .write_buffer = &write_buf2,
    },
);
defer tls_client.end() catch {};

try tls_client.writer.writeAll("GET / HTTP/1.1\r\n\r\n");
```

Great, now the code doesn't hang; all we need to do is read the response. `tls.Client` exposes a `reader: *std.Io.Reader` field which is "Decrypted stream from the server to the client." That sounds like what we want, but believe it or not, `std.Io.Reader` doesn't have a `read` method. It has a `peek`, a `takeByteSigned`, a `readSliceShort` (which seems close, but it blocks until the provided buffer is full), a `peekArray` and a lot more, but nothing like the `read` I'd expect. The closest I can find, which I think does what I want, is to stream it to a writer:

```zig
var buf: [1024]u8 = undefined;
var w: std.Io.Writer = .fixed(&buf);
const n = try tls_client.reader.stream(&w, .limited(buf.len));
std.debug.print("read: {d} - {s}\n", .{n, buf[0..n]});
```

If we try to run the code now, it crashes. We've apparently failed an assertion regarding the length of a buffer. So it seems like we also _have_ to provide a `read_buffer`.
Here's my current version (it doesn't work, but it doesn't crash!):

```zig
const std = @import("std");

pub fn main() !void {
    var gpa: std.heap.DebugAllocator(.{}) = .init;
    const allocator = gpa.allocator();

    const stream = try std.net.tcpConnectToHost(allocator, "www.openmymind.net", 443);
    defer stream.close();

    var write_buf: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined;
    var writer = stream.writer(&write_buf);

    var read_buf: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined;
    var reader = stream.reader(&read_buf);

    var bundle = std.crypto.Certificate.Bundle{};
    try bundle.rescan(allocator);
    defer bundle.deinit(allocator);

    var write_buf2: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined;
    var read_buf2: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined;

    var tls_client = try std.crypto.tls.Client.init(
        reader.interface(),
        &writer.interface,
        .{
            .ca = .{ .bundle = bundle },
            .host = .{ .explicit = "www.openmymind.net" },
            .read_buffer = &read_buf2,
            .write_buffer = &write_buf2,
        },
    );
    defer tls_client.end() catch {};

    try tls_client.writer.writeAll("GET / HTTP/1.1\r\n\r\n");

    var buf: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined;
    var w: std.Io.Writer = .fixed(&buf);
    const n = try tls_client.reader.stream(&w, .limited(buf.len));
    std.debug.print("read: {d} - {s}\n", .{n, buf[0..n]});
}
```

When I looked through Zig's source code, there's only one place using `tls.Client`. It helped get me to where I am. I couldn't find any tests. I'll admit that during this migration, I've missed some basic things. For example, someone had to help me find `std.fmt.printInt` - the renamed version of `std.fmt.formatIntBuf`. Maybe there's a helper like `tls.Client.init(allocator, stream)` somewhere. And maybe it makes sense that we do `reader.interface()` but `&writer.interface` - I'm reminded of Go's `*http.Request` and `http.ResponseWriter`. And maybe Zig has some consistent rule for what parameters belong in options. And I know nothing about TLS, so maybe it makes complete sense to need 4 buffers. I feel a bit more confident about the weirdness of not having a `read(buf: []u8) !usize` function on `Reader`, but at this point I wouldn't bet on me.
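For what it's worth, the stream-to-a-fixed-Writer workaround above can be wrapped into the `read` I was looking for. This is my own sketch, using only the calls exercised in this post, and it inherits all of the same caveats:

```zig
const std = @import("std");

// approximates a classic `read(buf) !usize` on top of the new interface:
// fill a fixed Writer backed by `buf`, capped at `buf.len` bytes
fn readSome(reader: *std.Io.Reader, buf: []u8) !usize {
    var w: std.Io.Writer = .fixed(buf);
    return reader.stream(&w, .limited(buf.len));
}

// usage, in place of the last three lines of main above:
//   const n = try readSome(tls_client.reader, &buf);
//   std.debug.print("read: {d} - {s}\n", .{n, buf[0..n]});
```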
## Everything is a []u8
If you're coming to Zig from a more hand-holding language, one of the things worth exploring is the relationship between the compiler and memory. I think code is the best way to do that, but briefly put into words: the memory that your program uses is all just bytes; it is only the compile-time information (the type system) that gives meaning to and dictates how that memory is used and interpreted. This is meaningful in Zig and other similar languages because developers are allowed to override how the compiler interprets those bytes.

> This is something I've written about before; longtime readers might find this post repetitive.

Consider this code:

```zig
const std = @import("std");

pub fn main() !void {
    std.debug.print("{d}\n", .{@sizeOf(User)});
}

const User = struct {
    id: u32,
    name: []const u8,
};
```

It _should_ print 24. The point of this post isn't _why_ it prints 24. What's important here is that when we create a `User` - whether it's on the stack or the heap - it is represented by 24 bytes of memory. If you examine those 24 bytes, there's nothing "User" about them. The memory isn't self-describing - that would be inefficient. Rather, it's the compiler itself that maintains metadata about memory. Very naively, we could imagine that the compiler maintains a lookup where the key is the variable name and the value is the memory address (our 24 bytes) + the type (`User`).

The fun, and sometimes useful, thing about this is that we can alter the compiler's metadata. Here's a working but impractical example:

```zig
const std = @import("std");

pub fn main() !void {
    var user = User{.id = 9001, .name = "Goku"};
    const tea: *Tea = @ptrCast(&user);
    std.debug.print("{any}\n", .{tea});
}

const User = struct {
    id: u32,
    name: []const u8,
};

const Tea = struct {
    price: u32,
    type: TeaType,

    const TeaType = enum {
        black,
        white,
        green,
        herbal,
    };
};
```

First we create a `User` - nothing unusual about that. Next we use @ptrCast to tell the compiler to treat the memory referenced by `user` as a `*Tea`. `@ptrCast` works on addresses, which is why we give it the address of (`&`) `user` and get back a pointer (`*`) to `Tea`. Here the return type of `@ptrCast` is inferred by the type it's being assigned to.

You might have some questions, like: what does it print? Is it safe? Is this ever useful? We'll dig more into the safety of this in a bit. But briefly, the main concern is the size of our structures. If `@sizeOf(User)` is 24 bytes, we'll be able to re-interpret that memory as anything which is 24 bytes or less. The `@sizeOf(Tea)` is 8 bytes, so this is safe. I get different results on each run:

```
.{ .price = 39897726, .type = .white }
.{ .price = 75123326, .type = .white }
.{ .price = 6441598, .type = .white }
.{ .price = 77826686, .type = .white }
.{ .price = 4950654, .type = .white }
.{ .price = 69438078, .type = .white }
.{ .price = 78498430, .type = .white }
.{ .price = 79022718, .type = .white }
```

It's possible (but not likely) that you get a consistent result. I find these results surprising. If I had to imagine what the 24 bytes of `user` look like, I'd come up with:

```
41, 35, 0, 0, 0, 0, 0, 0,
 4,  0, 0, 0, 0, 0, 0, 0,
 x,  x, x, x, x, x, x, x
```

Why that? Well, I'd expect the first 8 bytes to be the id, 9001, which has a byte representation of `41, 35, 0, 0, 0, 0, 0, 0`. The next 8 bytes, I think, would be the string length, `4, 0, 0, 0, 0, 0, 0, 0`. The last 8 bytes would be the pointer to the actual string value - an address that I have no way of guessing, so I mark it with `x, x, x, x, x, x, x, x`.
> If you think the `id` should only take 4 bytes, given that it's a u32, good! But Zig will usually align struct fields, so it really will take 8 bytes. That isn't something we'll dive into in this post though. Since `Tea` is only 8 bytes, and since the first 8 bytes of `user` are always the same (only the pointer to the name value changes from instance to instance and from run to run), shouldn't we always get the same `Tea` value? Yes, but only if I'm correct about the contents of those 24 bytes for `user`. Unless we tell it otherwise, Zig makes no guarantees about how it lays out the fields of a struct. The fact that our `tea` keeps changing makes me believe that, for reasons I don't know, Zig decided to put the pointer to our name at the start. The reason you might get different results is that Zig might have organized the user's memory differently based on your platform or version of Zig (or any other factor, but those are the two most realistic reasons). So while this code might never crash, doesn't the lack of guarantee make it useless? No. At least not in three cases. ### Well-Defined In-Memory Layout While Zig usually doesn't make guarantees about how data will be organized, C **does**. In Zig, a structure declared as `extern` follows that specification. We can similarly declare a structure as `packed`, which also has a well-defined memory layout (just not necessarily the same as C's / `extern`'s). In order for a struct to have a well-known memory layout, all of its fields must have a well-known memory layout. They can't, for example, contain slices - which don't have a guaranteed layout. Still, here's a reliable and realistic example: const std = @import("std"); pub fn main() !void { var manager = Manager{.id = 4, .name = "Leto", .name_len = 4, .level = 99}; const user: *User = @ptrCast(&manager); std.debug.print("{d}: {s}\n", .{user.id, user.name[0..user.name_len]}); } const User = extern struct { id: u32, name: [*c]const u8, name_len: usize, }; const Manager = extern struct { id: u32, name: [*c]const u8, name_len: usize, level: u16, }; Part of the guarantee is that the fields are laid out in the order that they're declared. Above, when I guessed at the layout of `user`, I made that assumption - but it's only valid for `extern` structs. We can be sure that the above code will print `4: Leto` because `Manager` has the same fields as `User`, in the same order. We can, and should, make this more explicit: const Manager = extern struct { user: User, level: u16, }; Because the type information is only metadata of the compiler, both declarations of `Manager` are the same - they're the same size and have the same layout. There's no overhead to embedding the `User` into `Manager` this way. This type of memory-reinterpretation can be found in some C code, and is thus seen in Zig code that interacts with such C codebases. ### Leveraging Zig Builtins While we can't assume anything about the memory layout of non-`extern` (or non-`packed`) structs, we can leverage various built-in functions, such as `@sizeOf`, to programmatically figure things out. Probably the most useful is `@offsetOf`, which gives us the offset of a field in bytes.
const std = @import("std"); pub fn main() !void { std.debug.print("name offset: {d}\n", .{@offsetOf(User, "name")}); std.debug.print("id offset: {d}\n", .{@offsetOf(User, "id")}); } const User = struct { id: u32, name: []const u8, }; For me, this prints: name offset: 0 id offset: 16 This helps confirm that Zig did, in fact, put the `name` before the `id`. We saw the result of that when we treated the user's memory as an instance of `Tea`. If we wanted to create a `Tea` based on the address of `user.id` rather than `user`, we could do: const std = @import("std"); pub fn main() !void { var user = User{.id = 9001, .name = "Goku"}; // changed from &user to &user.id const tea: *Tea = @ptrCast(&user.id); std.debug.print("{any}\n", .{tea}); } This will now always output the same result. But how would we take `tea` and get a `user` out of it? Generally speaking, this wouldn't be safe since `@sizeOf(Tea) < @sizeOf(User)` - the memory created to hold an instance of `Tea`, 8 bytes, can't represent the 24 bytes needed for `User`. But for this instance of `Tea`, we know that there are 24 bytes available "around" `tea`. Where exactly those 24 bytes start depends on the relative position of `user.id` within `user` itself. If we don't adjust for that offset, we risk crashing unless the offset happens to be 0. Since we know the offset is 16, not 0, this should crash: const std = @import("std"); pub fn main() !void { var user = User{.id = 9001, .name = "Goku"}; const tea: *Tea = @ptrCast(&user.id); const user2: *User = @ptrCast(@alignCast(tea)); std.debug.print("{any}\n", .{user2}); } This is our `user`'s memory (as 24 contiguous bytes, broken up into its three 8-byte fields): name.ptr => x, x, x, x, x, x, x, x name.len => 4, 0, 0, 0, 0, 0, 0, 0 id => 41, 35, 0, 0, 0, 0, 0, 0 And when we make `tea` from `&user.id`: name.ptr => x, x, x, x, x, x, x, x name.len => 4, 0, 0, 0, 0, 0, 0, 0 tea => id => 41, 35, 0, 0, 0, 0, 0, 0 more memory, but not ours to play with If we try to cast `tea` back into a `*User`, we'll be 16 bytes off and end up reading 16 bytes of memory adjacent to `tea` which isn't ours. To make this work, we need to take the address `tea` points to and subtract `@offsetOf(User, "id")` from it: const std = @import("std"); pub fn main() !void { var user = User{.id = 9001, .name = "Goku"}; const tea: *Tea = @ptrCast(&user.id); const user2: *User = @ptrFromInt(@intFromPtr(tea) - @offsetOf(User, "id")); std.debug.print("{any}\n", .{user2}); } Because we use `@offsetOf`, it no longer matters how the structure is laid out. We're always able to find the starting address of `user` based on the address of `user.id` (which is where `tea` points) because we know `@offsetOf(User, "id")`. ### As Raw Memory The above example is convoluted. There's no relationship between the data of a `User` and of a `Tea`. What does it mean to create a `Tea` out of a user's `id`? Nothing. What if we forget about `user`'s data, the `id` and `name`, and treat those 24 bytes as usable space? const std = @import("std"); pub fn main() !void { var user = User{.id = 9001, .name = "Goku"}; const tea: *Tea = @ptrCast(&user); tea.* = .{.price = 2492, .type = .black}; std.debug.print("{any}\n", .{tea}); } `user` and `tea` still share the same memory. We cannot safely use `user` after writing to `tea.*` - that write might have stored data that cannot safely be interpreted as a `User`. Specifically in this case, the write to `tea` has probably made `name.ptr` point to invalid memory.
But if we're done with `user` and know it won't be used again, we just saved a few bytes of memory by re-using its space. This can go on forever. We can safely re-use the space to create another `User`, as long as we're 100% sure that we're done with `tea`: pub fn main() !void { var user = User{.id = 9001, .name = "Goku"}; const tea: *Tea = @ptrCast(&user); tea.* = .{.price = 2492, .type = .black}; std.debug.print("{any}\n", .{tea}); const user2: *User = @ptrCast(@alignCast(tea)); user2.* = .{.id = 32, .name = "Son Goku"}; std.debug.print("{any}\n", .{user2}); } We can re-use those 24 bytes to represent anything that takes 24 bytes of memory or less. The best practical example of this is `std.heap.MemoryPool(T)`. The `MemoryPool` is an allocator that can create a single type, `T`. That might not sound particularly useful, but using what we've learned so far, it can efficiently re-use the memory of discarded values. We'll build a simplified version to see how it works, starting with a basic API - one without any recycling ability. Further, rather than make it generic, we'll make a `UserPool` specific to `User`: pub const UserPool = struct { allocator: Allocator, pub fn init(allocator: Allocator) UserPool { return .{ .allocator = allocator, }; } pub fn create(self: *UserPool) !*User { return self.allocator.create(User); } pub fn destroy(self: *UserPool, user: *User) void { self.allocator.destroy(user); } }; As-is, this is just a wrapper that limits what the allocator is able to create. Not particularly useful. But what if, instead of destroying a `user`, we made it available to a subsequent `create`? One way to do that would be to hold a `std.SinglyLinkedList`. But for that to work, we'd need to make additional allocations - the linked list nodes have to exist somewhere. But why? The `@sizeOf(User)` is large enough to be used as-is, and whenever a `user` is destroyed, we're being told that memory is free to be used. If an application _did_ use a `user` after destroying it, it would be undefined behavior, just like it is with any other allocator. Let's add a bit of decoration to our `UserPool`: pub const UserPool = struct { allocator: Allocator, free_list: ?*FreeEntry = null, const FreeEntry = struct { next: ?*FreeEntry, }; // rest is unchanged . . . for now. }; We've added a linked list to our `UserPool`. Every `FreeEntry` points to another `*FreeEntry` or `null`, including the initial one referenced by `free_list`. Now we change `destroy`: pub const UserPool = struct { // ... pub fn destroy(self: *UserPool, user: *User) void { const entry: *FreeEntry = @ptrCast(user); entry.* = .{ .next = self.free_list }; self.free_list = entry; } }; We use the ideas we've explored above to create a simple linked list. All that's left is to change `create` to leverage it: pub const UserPool = struct { // ... pub fn create(self: *UserPool) !*User { if (self.free_list) |entry| { self.free_list = entry.next; return @ptrCast(entry); } return self.allocator.create(User); } }; If we have a `FreeEntry`, then we can turn that into a `*User`. We make sure to advance our `free_list` to the next entry, which might be `null`. If there isn't an available `FreeEntry`, we allocate a new one. As a final step, we should add a `deinit` to free the memory held by our `free_list`: pub const UserPool = struct { // ...
pub fn deinit(self: *UserPool) void { var entry = self.free_list; while (entry) |e| { entry = e.next; const user: *User = @ptrCast(e); self.allocator.destroy(user); } self.free_list = null; } }; That final `@ptrCast` from a `*FreeEntry` to a `*User` might seem unnecessary. If we're freeing the memory, why does the type matter? But allocators only know how much memory to free because the compiler tells them - based on the type. Freeing `e`, a `*FreeEntry`, would only work if `@sizeOf(FreeEntry) == @sizeOf(User)` (which it isn't). In addition to being generic, Zig's actual `MemoryPool` is a bit more sophisticated, handling different alignments and even handling the case where `@sizeOf(T) < @sizeOf(FreeEntry)`, but our `UserPool` is pretty close. ### Conclusion By altering the compiler's view of our program, we can do all types of things and get into all types of trouble. While these manipulations can be done safely, they rely on understanding the lack of guarantees Zig makes. If you're programming in Zig, this is the type of thing you should try to get comfortable with. Most of this is fundamental regardless of the programming language; it's just that some languages, like Zig, give you more control. I had initially planned on writing a version of `MemoryPool` which expanded on the standard library's. I wanted to create a pool for multiple types - for example, one that can be used for both `User` and `Tea` instances. The trick, of course, would be to always allocate memory for the largest supported type (`User` in this case). But this post is already long, so I leave it as an exercise for you.
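If you want a head start on that exercise, here's a rough sketch of one way it could look. To be clear, this is my sketch, not the standard library's `MemoryPool`; the `MultiPool` name and layout are invented for illustration:

const std = @import("std");
const Allocator = std.mem.Allocator;

const User = struct { id: u32, name: []const u8 };
const Tea = struct { price: u32, type: enum { black, white, green, herbal } };

// Every slot is big enough - and aligned enough - for the largest
// supported type, so a freed User's memory can back a future Tea.
pub const MultiPool = struct {
    allocator: Allocator,
    free_list: ?*FreeEntry = null,

    const size = @max(@sizeOf(User), @sizeOf(Tea));
    const alignment = @max(@alignOf(User), @alignOf(Tea), @alignOf(FreeEntry));
    const Slot = struct { bytes: [size]u8 align(alignment) };

    const FreeEntry = struct {
        next: ?*FreeEntry,
    };

    // the recycling trick requires a slot to be able to hold a FreeEntry
    comptime {
        std.debug.assert(size >= @sizeOf(FreeEntry));
    }

    pub fn create(self: *MultiPool, comptime T: type) !*T {
        if (self.free_list) |entry| {
            self.free_list = entry.next;
            return @ptrCast(@alignCast(entry));
        }
        const slot = try self.allocator.create(Slot);
        return @ptrCast(slot);
    }

    pub fn destroy(self: *MultiPool, ptr: anytype) void {
        // only valid for pointers created by this pool
        const entry: *FreeEntry = @ptrCast(@alignCast(ptr));
        entry.* = .{ .next = self.free_list };
        self.free_list = entry;
    }

    pub fn deinit(self: *MultiPool) void {
        var entry = self.free_list;
        while (entry) |e| {
            entry = e.next;
            const slot: *Slot = @ptrCast(@alignCast(e));
            self.allocator.destroy(slot);
        }
        self.free_list = null;
    }
};

pub fn main() !void {
    var gpa: std.heap.GeneralPurposeAllocator(.{}) = .init;
    const allocator = gpa.allocator();
    defer _ = gpa.detectLeaks();

    var pool = MultiPool{ .allocator = allocator };
    defer pool.deinit();

    const user = try pool.create(User);
    user.* = .{ .id = 1, .name = "Leto" };
    pool.destroy(user);

    // re-uses the slot that previously held the user
    const tea = try pool.create(Tea);
    tea.* = .{ .price = 2492, .type = .black };
    pool.destroy(tea);
}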
www.openmymind.net
October 15, 2025 at 6:37 AM
Is Zig's New Writer Unsafe?
If we wanted to write a function that takes one of Zig's new `*std.Io.Reader` and writes it to stdout, we might start with something like: fn output(r: *std.Io.Reader) !void { const stdout = std.fs.File.stdout(); var buffer: [???]u8 = undefined; var writer = stdout.writer(&buffer); _ = try r.stream(&writer.interface, .unlimited); try writer.interface.flush(); } But what should the size of `buffer` be? If this was a one-and-done, maybe we'd leave it empty or put in some seemingly sensible default, like 1K or 4K. If it was a mission-critical piece of code, maybe we'd benchmark it or make it platform dependent. But unless I'm missing something, whatever size we use, this function's behavior is undefined. You see, the issue is that readers can require a specific buffer size on a writer (and writers can require a specific buffer size on a reader). For example, this code, with a small buffer of 64, fails an assertion in debug mode and falls into an endless loop in release mode: const std = @import("std"); pub fn main() !void { var fixed = std.Io.Reader.fixed(&.{ 40, 181, 47, 253, 36, 110, 149, 0, 0, 88, 111, 118, 101, 114, 32, 57, 48, 48, 48, 33, 10, 1, 0, 192, 105, 241, 2, 170, 69, 248, 150 }); var decompressor = std.compress.zstd.Decompress.init(&fixed, &.{}, .{}); try output(&decompressor.reader); } fn output(r: *std.Io.Reader) !void { const stdout = std.fs.File.stdout(); var buffer: [64]u8 = undefined; var writer = stdout.writer(&buffer); _ = try r.stream(&writer.interface, .unlimited); try writer.interface.flush(); } Some might argue that this is a documentation challenge. It's true that the documentation for `zstd.Decompress` mentions what a `Writer`'s buffer must be. **But this is not a documentation problem**. There are legitimate scenarios where the nature of a `Reader` is unknown (or, at least, difficult to figure out). The type of a reader could be conditional, say, based on an HTTP response header. A library developer might take a `Reader` as an input and present their own `Reader` as an output - what buffer requirement should they document? Worse, the failure can be conditional on the input. For example, if we change our source to: var fixed = std.Io.Reader.fixed(&.{ 40, 181, 47, 253, 36, 11, 89, 0, 0, 111, 118, 101, 114, 32, 57, 48, 48, 48, 33, 10, 112, 149, 178, 212, }); everything works, making this misconfiguration particularly hard to catch early. To me this seems almost impossible - like, I must be doing something wrong. And if I am, I'm sorry. But, if I'm not, this is a problem, right?
www.openmymind.net
September 20, 2025 at 2:37 AM
Allocator.resize
There are four important methods on Zig's `std.mem.Allocator` interface that Zig developers must be comfortable with: * `alloc(T, n)` - which creates an array of `n` items of type `T`, * `free(ptr)` - which frees memory allocated with `alloc` (although, this is implementation specific), * `create(T)` - which creates a single item of type `T`, and * `destroy(ptr)` - which destroys an item created with `create` While you might never need to use them, the `Allocator` interface has other methods which, if nothing else, can be useful to be aware of and informative to learn about. In particular, the `resize` method is used to try to resize an existing allocation to a larger (or smaller) size. The main promise of `resize` is that it's guaranteed _not_ to move the pointer. However, to satisfy that guarantee, `resize` is allowed to fail, in which case nothing changes. We can imagine a simple allocation: // var buf = try allocator.alloc(u8, 5); // buf[0] = 'h' 0x102e00000 ------------------------------- buf.ptr -> | h | | | | | -------------------------------- Now, if we were to call `allocator.resize(buf, 7)`, there are two possible outcomes. The first is that the call returns `false`, indicating that the resize operation failed, and thus nothing changed: 0x102e00000 ------------------------------- buf.ptr -> | h | | | | | -------------------------------- However, when `resize` succeeds and returns `true`, the allocated space has grown without having relocated (i.e. moved) the pointer: 0x102e00000 ------------------------------------------- buf.ptr -> | h | | | | | | | -------------------------------------------- Now, under what circumstances `resize` succeeds or fails is a black box. It depends on a lot of factors and is going to be allocator-specific. For example, for me, this code prints `false`, indicating that the resize failed: const std = @import("std"); pub fn main() !void { var gpa: std.heap.GeneralPurposeAllocator(.{}) = .init; const allocator = gpa.allocator(); defer _ = gpa.detectLeaks(); const buf = try allocator.alloc(usize, 10); std.debug.print("{any}\n", .{allocator.resize(buf, 20)}); allocator.free(buf); } Because we're using a `GeneralPurposeAllocator` (that name is deprecated in Zig 0.14 in favor of `DebugAllocator`), we could dive into its internals and try to leverage knowledge of its implementation to force a resize to succeed, but a simpler option is to resize our buffer to `0`: const std = @import("std"); pub fn main() !void { var gpa: std.heap.GeneralPurposeAllocator(.{}) = .init; const allocator = gpa.allocator(); defer _ = gpa.detectLeaks(); const buf = try allocator.alloc(usize, 10); // change 20 -> 0 std.debug.print("{any}\n", .{allocator.resize(buf, 0)}); allocator.free(buf); } Success, the code now prints `true`, indicating that the resize succeeded. However, I also get a **segfault**. Can you guess what we're doing wrong? In our above visualization, we saw how a successful resize does not move our pointer. We can confirm this by looking at the address of `buf.ptr` before and after our resize.
This code still segfaults, but it prints out the information first: const std = @import("std"); pub fn main() !void { var gpa: std.heap.GeneralPurposeAllocator(.{}) = .init; const allocator = gpa.allocator(); defer _ = gpa.detectLeaks(); const buf = try allocator.alloc(usize, 10); std.debug.print("address before resize: {*}\n", .{buf.ptr}); std.debug.print("resize succeeded: {any}\n", .{allocator.resize(buf, 0)}); std.debug.print("address after resize: {*}\n", .{buf.ptr}); allocator.free(buf); } So far, we've only considered the `ptr` of our slice, but, like the criminal justice system, a slice is represented by two separate yet equally important groups: a `ptr` and a `len`. If we change our code to also look at the `len` of `buf`, the issue might become more obvious: // change the 1st and 3rd line to also print buf.len: std.debug.print("address & len before resize: {*} {d}\n", .{buf.ptr, buf.len}); std.debug.print("resize succeeded: {any}\n", .{allocator.resize(buf, 0)}); std.debug.print("address & len after resize: {*} {d}\n", .{buf.ptr, buf.len}); This is what I get: address & len before resize: usize@100280000 10 resize succeeded: true address & len after resize: usize@100280000 10 Segmentation fault at address 0x100280000 While it isn't the cleanest output, notice that even after we successfully resize, the length remains unchanged (i.e. `10`). Herein lies our bug. `resize` updates the underlying memory; it doesn't update the length of the slice. That's something we need to take care of. Here's a non-crashing version: const std = @import("std"); pub fn main() !void { var gpa: std.heap.GeneralPurposeAllocator(.{}) = .init; const allocator = gpa.allocator(); defer _ = gpa.detectLeaks(); var buf = try allocator.alloc(usize, 10); if (allocator.resize(buf, 0)) { std.debug.print("resize succeeded!\n", .{}); buf.len = 0; } else { // we need to handle the case where resize doesn't succeed } allocator.free(buf); } What's left out of the above code is handling the case where `resize` fails. This becomes application-specific. In most cases, where we're likely resizing to a larger size, we'll generally need to fall back to calling `alloc` to create our larger buffer and then, most likely, `@memcpy` to copy data from the existing (now too small) buffer to the newly created larger one.
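What might that fallback look like? Here's a minimal sketch - the `grow` function is mine, not part of the standard library (note that `std.mem.Allocator` also has a `realloc` method which does roughly this for you):

const std = @import("std");
const Allocator = std.mem.Allocator;

// Try to grow buf in place; if resize fails, allocate a larger buffer,
// copy the old data over, and free the original.
fn grow(allocator: Allocator, buf: []usize, new_len: usize) ![]usize {
    std.debug.assert(new_len > buf.len);
    if (allocator.resize(buf, new_len)) {
        var grown = buf;
        grown.len = new_len; // resize doesn't update the slice's len
        return grown;
    }
    const larger = try allocator.alloc(usize, new_len);
    @memcpy(larger[0..buf.len], buf);
    allocator.free(buf);
    return larger;
}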
www.openmymind.net
September 19, 2025 at 12:34 AM
Switching on Strings in Zig
Newcomers to Zig will quickly learn that you can't switch on a string (i.e. `[]const u8`). The following code gives us the unambiguous error message _cannot switch on strings_: switch (color) { "red" => {}, "blue" => {}, "green" => {}, "pink" => {}, else => {}, } I've seen two explanations for why this isn't supported. The first is that there's ambiguity around string identity. Are two strings only considered equal if they point to the same address? Is a null-terminated string the same as its non-null-terminated counterpart? The other reason is that users of `switch` apparently expect certain optimizations which are not possible with strings (although, presumably, these same users would know that such optimizations aren't possible with strings). Instead, in Zig, there are two common methods for comparing strings. ### std.mem.eql The most common way to compare strings is using `std.mem.eql` with `if / else if / else`: if (std.mem.eql(u8, color, "red")) { } else if (std.mem.eql(u8, color, "blue")) { } else if (std.mem.eql(u8, color, "green")) { } else if (std.mem.eql(u8, color, "pink")) { } else { } The implementation of `std.mem.eql` depends on what's being compared. Specifically, it has an optimized code path when comparing strings. Although that's what we're interested in, let's look at the non-optimized version: pub fn eql(comptime T: type, a: []const T, b: []const T) bool { if (a.len != b.len) return false; if (a.len == 0 or a.ptr == b.ptr) return true; for (a, b) |a_elem, b_elem| { if (a_elem != b_elem) return false; } return true; } Whether we're dealing with slices of bytes or some other type, if they're of different lengths, they can't be equal. Once we know that they're the same length, if they point to the same memory, then they must be equal. I'm not a fan of this second check; it might be cheap, but I think it's quite uncommon. Once those initial checks are done, we compare each element (each byte of our string) one at a time. The optimized version, which _is_ used for strings, is much more involved. But it's fundamentally the same as the above, using SIMD to compare multiple bytes at once. The nature of string comparison means that real-world performance is dependent on the values being compared. We know that if we have 100 `if / else if` branches then, in the worst case, we'll need to call `std.mem.eql` 100 times. But comparing strings of different lengths, or strings which differ early, will be significantly faster. For example, consider these three cases: { const str1 = "a" ** 10_000 ++ "1"; const str2 = "a" ** 10_000 ++ "2"; _ = std.mem.eql(u8, str1, str2); } { const str1 = "1" ++ "a" ** 10_000; const str2 = "2" ++ "a" ** 10_000; _ = std.mem.eql(u8, str1, str2); } { const str1 = "a" ** 999_999; const str2 = "a" ** 1_000_000; _ = std.mem.eql(u8, str1, str2); } For me, the first comparison takes ~270ns, whereas the other two take ~20ns - despite the last one involving much larger strings. The second case is faster because the difference is early in the string, allowing the `for` loop to return after only one comparison. The third case is faster because the strings are of different lengths: `false` is returned by the initial `len` check. ### std.meta.stringToEnum `std.meta.stringToEnum` takes an enum type and a string value and returns the corresponding enum value or null.
This code prints "you picked: blue": const std = @import("std"); const Color = enum { red, blue, green, pink, }; pub fn main() !void { const color = std.meta.stringToEnum(Color, "blue") orelse { return error.InvalidChoice; }; switch (color) { .red => std.debug.print("you picked: red\n", .{}), .blue => std.debug.print("you picked: blue\n", .{}), .green => std.debug.print("you picked: green\n", .{}), .pink => std.debug.print("you picked: pink\n", .{}), } } If you don't need the enum type (i.e. `Color`) beyond this check, you can leverage Zig's anonymous types. This is equivalent: const std = @import("std"); pub fn main() !void { const color = std.meta.stringToEnum(enum { red, blue, green, pink, }, "blue") orelse return error.InvalidChoice; switch (color) { .red => std.debug.print("you picked: red\n", .{}), .blue => std.debug.print("you picked: blue\n", .{}), .green => std.debug.print("you picked: green\n", .{}), .pink => std.debug.print("you picked: pink\n", .{}), } } It's **not** obvious how this should perform versus the straightforward `if / else if` approach. Yes, we now have a `switch` statement that the compiler can [hopefully] optimize, but `std.meta.stringToEnum` still has to convert our input, `"blue"`, into an enum. The implementation of `std.meta.stringToEnum` depends on the number of possible values, i.e. the number of enum values. Currently, if there are more than 100 values, it'll fall back to using the same `if / else if` that we explored above. Thus, with more than 100 values, it does the `if / else if` check PLUS the switch. This should improve in the future. However, with 100 or fewer values, `std.meta.stringToEnum` creates a comptime `std.StaticStringMap` which can then be used to look up the value. `std.StaticStringMap` isn't something we've looked at before. It's a specialized map that buckets keys by their length. Its advantage over Zig's other hash maps is that it can be constructed at compile-time. For our `Color` enum, the internal state of a `StaticStringMap` would look something like: // keys are ordered by length keys: ["red", "blue", "pink", "green"]; // values[N] corresponds to keys[N] values: [.red, .blue, .pink, .green]; // What's this though? indexes: [0, 0, 0, 0, 1, 3]; It might not be obvious how `indexes` is used. Let's write our own `get` implementation, simulating the above `StaticStringMap` state: fn get(str: []const u8) ?Color { // Simulate the state of the StaticStringMap which // stringToEnum built at compile-time. const keys = [_][]const u8{"red", "blue", "pink", "green"}; const values = [_]Color{.red, .blue, .pink, .green}; const indexes = [_]usize{0, 0, 0, 0, 1, 3}; if (str.len >= indexes.len) { // our map has no strings of this length return null; } var index = indexes[str.len]; while (index < keys.len) { const key = keys[index]; if (key.len != str.len) { // we've gone into the next bucket, everything after // this is longer and thus can't be a match return null; } if (std.mem.eql(u8, key, str)) { return values[index]; } index += 1; } return null; } Take note that `keys` are ordered by length. As a naive implementation, we could iterate through the keys until we either find a match or find a key with a longer length. Once we find a key with a longer length, we can stop searching, as all remaining candidates won't match - they'll all be too long. `StaticStringMap` goes a step further and records the index within `keys` where entries of a specific length begin. `indexes[3]` tells us where to start looking for keys with a length of 3 (at index 0).
`indexes[5]` tells us where to start looking for keys with a length of 5 (at index 3). Above, we fall back to using `std.mem.eql` for any key which is the same length as our target string. `StaticStringMap` uses its own "optimized" version: pub fn defaultEql(a: []const u8, b: []const u8) bool { if (a.ptr == b.ptr) return true; for (a, b) |a_elem, b_elem| { if (a_elem != b_elem) return false; } return true; } This is the same as the simple `std.mem.eql` implementation, minus the length check. This is done because the `eql` within our `while` loop is only ever called for values with a matching length. On the flip side, `StaticStringMap`'s `eql` doesn't use SIMD, so it would be slower for large strings. `StaticStringMap` is a wrapper around `StaticStringMapWithEql`, which accepts a custom `eql` function, so if you _did_ want to use it for long strings or some other purpose, you have a reasonable amount of flexibility. You even have the option to use `std.static_string_map.eqlAsciiIgnoreCase` for ASCII-aware case-insensitive comparison. ### Conclusion In my own benchmarks, I've seen little difference between the two approaches, though `std.meta.stringToEnum` is generally as fast or faster. It also results in more concise code and is ideal if the resulting enum is useful beyond the comparison. You usually don't have long enum values, so the lack of SIMD optimization isn't a concern. However, if you're considering building your own `StaticStringMap` at compile time with long keys, you should benchmark with a custom `eql` function based on `std.mem.eql` - a sketch of that follows below. We could manually bucket those `if / else if` branches ourselves, similar to what the `StaticStringMap` does. Something like: switch (color.len) { 3 => { if (std.mem.eql(u8, color, "red")) { // ... return; } }, 4 => { if (std.mem.eql(u8, color, "blue")) { // ... return; } if (std.mem.eql(u8, color, "pink")) { // ... return; } }, 5 => { if (std.mem.eql(u8, color, "green")) { // ... return; } }, else => {}, } // not found Ughhh. This highlights the convenience of using `std.meta.stringToEnum` to generate similar code. Also, remember that `std.mem.eql` quickly discards strings of different lengths, which helps explain why both approaches generally perform similarly.
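As a concrete sketch of that custom-`eql` suggestion: `StaticStringMapWithEql` and `initComptime` are the standard library's, while the `simdEql` helper (which just delegates to the SIMD-optimized `std.mem.eql`) is invented for illustration:

const std = @import("std");

const Color = enum { red, blue, green, pink };

// StaticStringMap's default eql skips the length check but has no SIMD;
// for long keys, delegating to std.mem.eql may be the better trade-off.
fn simdEql(a: []const u8, b: []const u8) bool {
    return std.mem.eql(u8, a, b);
}

const color_map = std.StaticStringMapWithEql(Color, simdEql).initComptime(.{
    .{ "red", .red },
    .{ "blue", .blue },
    .{ "green", .green },
    .{ "pink", .pink },
});

pub fn main() !void {
    std.debug.print("{any}\n", .{color_map.get("green")});
}

The same shape works with `std.static_string_map.eqlAsciiIgnoreCase` if you want case-insensitive keys.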
www.openmymind.net
September 19, 2025 at 12:34 AM
GetOrPut With String Keys
I've previously blogged about how much I like Zig's `getOrPut` hashmap method. As a brief recap, we can visualize Zig's hashmap as two arrays: keys: values: -------- -------- | Paul | | 1234 | @mod(hash("Paul"), 5) == 0 -------- -------- | | | | -------- -------- | | | | -------- -------- | Goku | | 9001 | @mod(hash("Goku"), 5) == 3 -------- -------- | | | | -------- -------- When we call `get("Paul")`, we could think of this simplified implementation: fn get(map: *Self, key: K) ?V { const index = map.getIndexOf(key) orelse return null; return map.values[index]; } And, when we call `getPtr("Paul")`, we'd have this implementation: fn getPtr(map: *Self, key: K) ?*V { const index = map.getIndexOf(key) orelse return null; // notice the added '&' // we're taking the address of the array index return &map.values[index]; } By taking the address of the value directly from the hashmap's array, we avoid copying the entire value. That can have performance implications (though not for the integer value we're using here). It also allows us to directly manipulate that slot of the array: const value = map.getPtr("Paul") orelse return; value.* = 10; This is a powerful feature, but a dangerous one. If the underlying array changes, as can happen when items are added to the map, `value` would become invalid. So, while `getPtr` is useful, it requires mindfulness: try to minimize the scope of such references. Currently, Zig's HashMap doesn't shrink when items are removed, so, for now, removing items doesn't invalidate any pointers into the hashmap. But expect that to change at some point. ### GetOrPut `getOrPut` builds on the above concept. It returns a pointer to the value **and** the key, as well as creating the entry in the hashmap if necessary. For example, given that we already have an entry for "Paul", if we call `map.getOrPut("Paul")`, we'd get back a `value_ptr` that points to a slot in the hashmap's `values` array, as well as a `key_ptr` that points to a slot in the hashmap's `keys` array. If the requested key _doesn't_ exist, we get back the same two pointers, and it's our responsibility to set the value. If I asked you to increment counters inside of a hashmap, without `getOrPut`, you'd end up with two hash lookups: // Go count, exists := counters["hits"] if exists == false { counters["hits"] = 1 } else { counters["hits"] = count + 1 } With `getOrPut`, it's a single hash lookup: const gop = try counters.getOrPut("hits"); if (gop.found_existing) { gop.value_ptr.* += 1; } else { gop.value_ptr.* = 1; } ### getOrPut With String Keys It seems trivial, but the most important thing to understand about `getOrPut` is that it will set the key for you if the entry has to be created. In our last example, notice that even when `gop.found_existing == false`, we never set `key_ptr` - `getOrPut` automatically sets it to the key we pass in, i.e. `"hits"`. If we were to put a breakpoint after `getOrPut` returns but before we set the value, we'd see that our two arrays look something like: keys: values: -------- -------- | | | | -------- -------- | hits | | ???? | -------- -------- | | | | -------- -------- where the entry in the `keys` array is set, but the corresponding entry in `values` is left undefined. You'll note that `getOrPut` doesn't take a value. I assume this is because, in some cases, the value might be expensive to derive, so the current API lets us avoid calculating it when `gop.found_existing == true`. This is important for keys that need to be owned by the hashmap.
Most commonly strings, but this applies to any other key which we'll "manage". Taking a step back: if we wanted to track hits in a hashmap and, most likely, wanted the lifetime of the keys to be tied to the hashmap, we'd do something like: fn register(allocator: Allocator, map: *std.StringHashMap(u32), name: []const u8) !void { const owned = try allocator.dupe(u8, name); try map.put(owned, 0); } Creating your "owned" copy of `name` frees the caller from having to maintain `name` beyond the call to `register`. Now, if this key is removed, or the entire map cleaned up, we need to free the keys. That's why I like the name "owned"; it means the hash map "owns" the key (i.e. is responsible for freeing it): var it = map.keyIterator(); while (it.next()) |key_ptr| { allocator.free(key_ptr.*); } map.deinit(); The interaction between key ownership and `getOrPut` is worth thinking about. If we try to merge this ownership idea with our incrementing counter code, we might try: fn hit(allocator: Allocator, map: *std.StringHashMap(u32), name: []const u8) !void { const owned = try allocator.dupe(u8, name); const gop = try map.getOrPut(owned); if (gop.found_existing) { gop.value_ptr.* += 1; } else { gop.value_ptr.* = 1; } } But this code has a potential memory leak; can you spot it? (see Appendix A for a complete runnable example) When `gop.found_existing == true`, `owned` is never used and never freed. One bad option would be to free `owned` when the entry already exists: fn hit(allocator: Allocator, map: *std.StringHashMap(u32), name: []const u8) !void { const owned = try allocator.dupe(u8, name); const gop = try map.getOrPut(owned); if (gop.found_existing) { // This line was added. But this is a bad solution allocator.free(owned); gop.value_ptr.* += 1; } else { gop.value_ptr.* = 1; } } It works, but we needlessly `dupe` `name` if the entry already exists. Rather than prematurely duping the key in case the entry doesn't exist, we want to delay our `dupe` until we know it's needed. Here's a better option: fn hit(allocator: Allocator, map: *std.StringHashMap(u32), name: []const u8) !void { // we use `name` for the lookup. const gop = try map.getOrPut(name); if (gop.found_existing) { gop.value_ptr.* += 1; } else { // this line was added gop.key_ptr.* = try allocator.dupe(u8, name); gop.value_ptr.* = 1; } } It might seem reckless to pass `name` into `getOrPut`. We need the key to remain valid as long as the map entry exists. Aren't we undermining that requirement? Let's walk through the code. When `hit` is called for a new `name`, `gop.found_existing` will be false. `getOrPut` will insert `name` into our `keys` array. This is bad because we have no guarantee that `name` will be valid for as long as we need it to be. But the problem is immediately remedied when we overwrite `key_ptr.*`. On subsequent calls for an existing `name`, when `gop.found_existing == true`, the `name` is only used as a lookup. It's no different than doing a `getPtr`; `name` only has to be valid for the call to `getOrPut` because `getOrPut` doesn't keep a reference to it when an existing entry is found. ### Conclusion This post was a long way to say: don't be afraid to write to `key_ptr.*`. Of course, you can screw up your map this way. Consider this example: fn hit(allocator: Allocator, map: *std.StringHashMap(u32), name: []const u8) !void { // we use `name` for the lookup. const gop = try map.getOrPut(name); if (gop.found_existing) { gop.value_ptr.* += 1; } else { // what's this?
gop.key_ptr.* = "HELLO"; gop.value_ptr.* = 1; } } Because the key is used to organize the map - to find where items go and where they are - jamming random keys where they don't belong is going to cause issues. But it can also be used correctly and safely, as long as you understand the details. ### Appendix A - Memory Leak This code _should_ report a memory leak. const std = @import("std"); const Allocator = std.mem.Allocator; pub fn main() !void { var gpa = std.heap.GeneralPurposeAllocator(.{}){}; const allocator = gpa.allocator(); defer _ = gpa.detectLeaks(); // I'm using the Unmanaged variant because the Managed ones are likely to // be removed (which I think is a mistake). Using Unmanaged makes this // snippet more future-proof. I explain unmanaged here: // https://www.openmymind.net/Zigs-HashMap-Part-1/#Unmanaged var map: std.StringHashMapUnmanaged(u32) = .{}; try hit(allocator, &map, "teg"); try hit(allocator, &map, "teg"); var it = map.keyIterator(); while (it.next()) |key_ptr| { allocator.free(key_ptr.*); } map.deinit(allocator); } fn hit(allocator: Allocator, map: *std.StringHashMapUnmanaged(u32), name: []const u8) !void { const owned = try allocator.dupe(u8, name); const gop = try map.getOrPut(allocator, owned); if (gop.found_existing) { gop.value_ptr.* += 1; } else { gop.value_ptr.* = 1; } }
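For completeness, here's the leak-free version of the appendix's `hit`, applying the delayed-`dupe` fix from above (with the Unmanaged variant, `getOrPut` also takes the allocator). Drop it into the program above and the leak detector should stay quiet:

fn hit(allocator: Allocator, map: *std.StringHashMapUnmanaged(u32), name: []const u8) !void {
    // `name` is only used for the lookup; we don't dupe unless we keep it.
    const gop = try map.getOrPut(allocator, name);
    if (gop.found_existing) {
        gop.value_ptr.* += 1;
    } else {
        gop.key_ptr.* = try allocator.dupe(u8, name);
        gop.value_ptr.* = 1;
    }
}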
www.openmymind.net
September 19, 2025 at 12:34 AM
Comparing Strings as Integers with @bitCast
In the last blog posts, we looked at different ways to compare strings in Zig. A few posts back, we introduced Zig's `@bitCast`. As a quick recap, `@bitCast` lets us force a specific type onto a value. For example, the following prints 1067282596: const std = @import("std"); pub fn main() !void { const f: f32 = 1.23; const n: u32 = @bitCast(f); std.debug.print("{d}\n", .{n}); } What's happening here is that Zig represents the 32-bit float value of `1.23` as: `[4]u8{164, 112, 157, 63}`. This is also how Zig represents the 32-bit unsigned integer value of `1067282596`. Data is just bytes; it's the type system - the compiler's knowledge of what data is what type - that controls what and how that data is manipulated. It might seem like there's something special about bitcasting from a float to an integer; they're both numbers after all. But you can `@bitCast` from any two equivalently sized types. Can you guess what this prints?: const std = @import("std"); pub fn main() !void { const data = [_]u8{3, 0, 0, 0}; const x: i32 = @bitCast(data); std.debug.print("{d}\n", .{x}); } The answer is `3`. Think about the above snippet a bit more. We're taking an array of bytes and telling the compiler to treat it like an integer. If we made `data` equal to `[_]u8{'b', 'l', 'u', 'e'}`, it would still work (and print `1702194274`). We're slowly heading towards being able to compare strings as-if they were integers. If you're wondering why 3 is encoded as `4]u8{3, 0, 0, 0}` and not `[4]u8{0, 0, 0, 3}`, I talked about binary encoding in my [Learning TCP series. From the last post, we could use multiple `std.mem.eql` or, more simply, `std.meta.stringToEnum` to complete the following method: fn parseMethod(value: []const u8) ?Method { // ... } const Method = enum { get, put, post, head, }; We can also use `@bitCast`. Let's take it step-by-step. The first thing we'll need to do is switch on `value.len`. This is necessary because the three-byte "GET" will need to be `@bitCast` to a `u24`, whereas the four-byte "POST" needs to be `@bitCast` to a `u32`: fn parseMethod(value: []const u8) ?Method { switch (value.len) { 3 => switch (@as(u24, @bitCast(value[0..3]))) { // TODO else => {}, }, 4 => switch (@as(u32, @bitCast(value[0..4]))) { // TODO else => {}, }, else => {}, } return null; } If you try to run this code, you'll get a compilation error: _cannot @bitCast from '*const [3]u8'_. `@bitCast` works on actual bits, but when we slice our `[]const u8` with a compile-time known range (`[0..3]`), we get a pointer to an array. We can't `@bitCast` a pointer, we can only `@bitCast` actual bits of data. For this to work, we need to derefence the pointer, i.e. use: `value[0..3].*`. This will turn our `*const [3]u8` into a `const [3]u8`. fn parseMethod(value: []const u8) ?Method { switch (value.len) { // changed: we now derefernce the value (.*) 3 => switch (@as(u24, @bitCast(value[0..3].*))) { // TODO else => {}, }, // changed: we now dereference the value (.*) 4 => switch (@as(u32, @bitCast(value[0..4].*))) { // TODO else => {}, }, else => {}, } return null; } Also, you might have noticed the `@as(u24, ...)` and `@as(u32, ...)`. `@bitCast`, like most of Zig's builtin functions, infers its return type. When we're assiging the result of a `@bitCast` to a variable of a known type, i.e: `const x: i32 = @bitCast(data);`, the return type of `i32` is inferred. In the above `switch`, we aren't assigning the result to a varible. We have to use `@as(u24, ...)` in order for `@bitCast` to kknow what it should be casting to (i.e. 
what its return type should be). The last thing we need to do is fill our switch blocks. Hopefully it's obvious that we can't just do: 3 => switch (@as(u24, @bitCast(value[0..3].*))) { "GET" => return .get, "PUT" => return .put, else => {}, }, ... But you might be thinking that, while ugly, something like this might work: 3 => switch (@as(u24, @bitCast(value[0..3].*))) { @as(u24, @bitCast("GET".*)) => return .get, @as(u24, @bitCast("PUT".*)) => return .put, else => {}, }, ... Because `"GET"` and `"PUT"` are string literals, they're null terminated and of type `*const [3:0]u8`. When we dereference them, we get a `const [3:0]u8`. It's close, but it means that the value is 4 bytes (`[4]u8{'G', 'E', 'T', 0}`) and thus cannot be `@bitCast` into a `u24`. This is ugly, but it works: fn parseMethod(value: []const u8) ?Method { switch (value.len) { 3 => switch (@as(u24, @bitCast(value[0..3].*))) { @as(u24, @bitCast(@as([]const u8, "GET")[0..3].*)) => return .get, @as(u24, @bitCast(@as([]const u8, "PUT")[0..3].*)) => return .put, else => {}, }, 4 => switch (@as(u32, @bitCast(value[0..4].*))) { @as(u32, @bitCast(@as([]const u8, "HEAD")[0..4].*)) => return .head, @as(u32, @bitCast(@as([]const u8, "POST")[0..4].*)) => return .post, else => {}, }, else => {}, } return null; } That's a mouthful, so we can add a small function to help: fn parseMethod(value: []const u8) ?Method { switch (value.len) { 3 => switch (@as(u24, @bitCast(value[0..3].*))) { asUint(u24, "GET") => return .get, asUint(u24, "PUT") => return .put, else => {}, }, 4 => switch (@as(u32, @bitCast(value[0..4].*))) { asUint(u32, "HEAD") => return .head, asUint(u32, "POST") => return .post, else => {}, }, else => {}, } return null; } pub fn asUint(comptime T: type, comptime string: []const u8) T { return @bitCast(string[0..string.len].*); } Like the verbose version, the trick is to cast our null-terminated string literal into a string slice, `[]const u8`. By passing it through the `asUint` function, we get this without needing to add the explicit `@as([]const u8)`. There is a more advanced version of `asUint` which doesn't take the uint type parameter (`T`). If you think about it, the uint type can be inferred from the string's length: pub fn asUint(comptime string: []const u8) @Type(.{ .int = .{ // bits, not bytes, hence * 8 .bits = string.len * 8, .signedness = .unsigned, }, }) { return @bitCast(string[0..string.len].*); } Which allows us to call it with a single parameter: `asUint("GET")`. This might be your first time seeing such a return type. The `@Type` builtin is the opposite of `@typeInfo`. The latter takes a type and returns information on it in the shape of a `std.builtin.Type` union, whereas `@Type` takes a `std.builtin.Type` and returns an actual usable type. One of these days I'll find the courage to blog about `std.builtin.Type`! As a final note, some people dislike the look of this sort of return type and would rather encapsulate the logic in its own function. This is the same: pub fn asUint(comptime string: []const u8) AsUintReturn(string) { return @bitCast(string[0..string.len].*); } // Remember that, in Zig, by convention, a function should be // PascalCase if it returns a type (because types are PascalCase). fn AsUintReturn(comptime string: []const u8) type { return @Type(.{ .int = .{ // bits, not bytes, hence * 8 .bits = string.len * 8, .signedness = .unsigned, }, }); } ### Conclusion Of the three approaches, this is the least readable and least approachable. Is it worth it?
It depends on your input and the values you're comparing against. In my benchmarks, using `@bitCast` performs roughly the same as `std.meta.stringToEnum`. But there are some cases where `@bitCast` can outperform `std.meta.stringToEnum` by as much as 50%. Perhaps that's the real value of this approach: the performance is less dependent on the input or the values being matched against.
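For reference, here's everything assembled into a single runnable file. This is just a sketch: it uses the single-parameter `asUint` from above, and the test block is mine, not part of the original post.

const std = @import("std");

const Method = enum { get, put, post, head };

fn parseMethod(value: []const u8) ?Method {
    switch (value.len) {
        3 => switch (@as(u24, @bitCast(value[0..3].*))) {
            asUint("GET") => return .get,
            asUint("PUT") => return .put,
            else => {},
        },
        4 => switch (@as(u32, @bitCast(value[0..4].*))) {
            asUint("HEAD") => return .head,
            asUint("POST") => return .post,
            else => {},
        },
        else => {},
    }
    return null;
}

pub fn asUint(comptime string: []const u8) @Type(.{ .int = .{
    // bits, not bytes, hence * 8
    .bits = string.len * 8,
    .signedness = .unsigned,
} }) {
    return @bitCast(string[0..string.len].*);
}

test parseMethod {
    try std.testing.expectEqual(.get, parseMethod("GET"));
    try std.testing.expectEqual(.post, parseMethod("POST"));
    try std.testing.expectEqual(null, parseMethod("DELETE"));
}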
www.openmymind.net
September 19, 2025 at 12:34 AM
Zig's dot star syntax (value.*)
Maybe I'm the only one, but it always takes my little brain a split second to understand what's happening whenever I see, or have to write, something like `value.* = .{...}`. If we take a step back, a variable is just a convenient name for an address on the stack. When this function executes: fn isOver9000(power: i64) bool { return power > 9000; } Say, with a `power` of 593, we could visualize its stack as: power -> ------------- | 593 | ------------- If we changed our function to take a pointer to an integer: // i64 changed to *i64 fn isOver9000(power: *i64) bool { return power > 9000; } Our `power` argument would still be a label for a stack address, but instead of directly containing a number, the stack value would itself be an address. That's the _indirection_ of pointers: power -> ------------- | 1182145c0 |------------------------ ------------- | | ............. empty space | ............. or other data | | ------------- | | 593 | <---------------------- ------------- But this code doesn't work: it's trying to compare a `comptime_int` (`9000`) with an `*i64`. We need to make another change to the function: // i64 changed to *i64 fn isOver9000(power: *i64) bool { // power changed to power.* return power.* > 9000; } `power.*` is how we dereference a pointer. Dereferencing means to get the value pointed to by a pointer. From our above visualization, you could say that the `.*` follows the arrow to get the value, `593`. This same syntax works for writing as well. The following is valid: fn isOver9000(power: *i64) bool { power.* = 9001; return true; } Like before, the dereferencing operator (`.*`), "follows" the pointer, but now that it's on the receiving end of an assignment, we write the value into the pointed-at memory. This is all true for more complex types. Let's say we have a `User` struct with an `id` and a `name`: const User = struct { id: i32, name: []const u8, }; var user = User{ .id = 900, .name = "Teg" }; The `user` variable is a label for the location of the [start of] the user: user -> ------------- | 900 | ------------- | 3 | ------------- | 3c9414e99 | ----------------------- ------------- | | ............. empty space | ............. or other data | | ------------- | | T | <---------------------- ------------- | e | ------------- | g | ------------- A slice in Zig, like our `[]const u8`, is a length (`3`) and a pointer to the values. Now, if we were to take the address of `user`, via `&user`, we introduce a level of indirection. For example, imagine this code: const std = @import("std"); const User = struct { id: i32, name: []const u8, }; pub fn main() !void { var user = User{ .id = 900, .name = "Teg" }; updateUser(&user); std.debug.print("{d}\n", .{user.id}); } fn updateUser(user: *User) void { user.id += 100000; } The `user` parameter of our `updateUser` function is pointing to the `user` on `main`'s stack: updateUser user -> ------------- | 83abcc30 |------------------------ ------------- | | ............. empty space | ............. or other data | | main | user -> ------------- | | 900 | <---------------------- ------------- | 3 | ------------- | 3c9414e99 | ----------------------- ------------- | | ............. empty space | ............. or other data | | ------------- | | T | <---------------------- ------------- | e | ------------- | g | ------------- Because we're referencing `main`'s `user` (rather than a copy), any changes we make will be reflected in `main`. But we aren't limited to operating on fields of `user`; we can operate on its entire memory.
Of course, we can create a copy of the id field (assignments are always copies; it's just a matter of knowing _what_ we're copying): fn updateUser(user: *User) void { const id = user.id; // .... } And now the stack for our function looks like: user -> ------------- | 83abcc30 | id -> ------------- | 900 | ------------- But we can also copy the entire user: fn updateUser(user: *User) void { const copy = user.*; // .... } Which gives us something like: updateUser user -> ------------- | 83abcc30 |--------------------- copy -> ------------- | | 900 | | ------------- | | 3 | | ------------- | | 3c9414e99 | --------------------|-- ------------- | | | | ............. empty space | | ............. or other data | | | | main | | user -> ------------- | | | 900 | <------------------- | ------------- | | 3 | | ------------- | | 3c9414e99 | -----------------------| ------------- | | ............. empty space | ............. or other data | | ------------- | | T | <---------------------- ------------- | e | ------------- | g | ------------- Notice that it didn't create a copy of the value 'Teg'. You could call this copying "shallow": it copied the `900`, the `3` (name length) and the `3c9414e99` (address of the name pointer). Just like our simpler example above, we can also assign using the dereferencing operator: fn updateUser(user: *User) void { // using type inference // could be more explicit and do // user.* = User{....} user.* = .{ .id = 5, .name = "Paul", }; } This doesn't copy anything; it writes into the address that we were given, the address of `main`'s `user`: updateUser user -> ------------- | 83abcc30 |------------------------ ------------- | | ............. empty space | ............. or other data | | main | | user -> ------------- | | 5 | <---------------------- ------------- | 4 | ------------- | 9bf4a990 | ----------------------- ------------- | | ............. empty space | ............. or other data | | ------------- | | P | <---------------------- ------------- | a | ------------- | u | ------------- | l | ------------- If you're still not fully comfortable with this, and if you haven't done so already, you might be interested in the pointers and stack memory parts of my learning zig series.
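Here's a tiny runnable recap of the whole post; the values are arbitrary, and the comments spell out which rule each line demonstrates:

const std = @import("std");

const User = struct {
    id: i32,
    name: []const u8,
};

pub fn main() !void {
    var user = User{ .id = 900, .name = "Teg" };
    const ptr = &user;

    // dereference to read: a shallow copy of id, plus the
    // name's length and pointer (not the bytes 'Teg')
    const copy = ptr.*;

    // dereference to write: overwrites user in place
    ptr.* = .{ .id = 5, .name = "Paul" };

    std.debug.print("{d} {s}\n", .{ copy.id, copy.name }); // 900 Teg
    std.debug.print("{d} {s}\n", .{ user.id, user.name }); // 5 Paul
}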
www.openmymind.net
September 19, 2025 at 12:34 AM
Zig's new LinkedList API (it's time to learn @fieldParentPtr)
In a recent, post-Zig 0.14 commit, Zig's `SinglyLinkedList` and `DoublyLinkedList` saw significant changes. The previous version was generic and, with all the methods removed, looked like: pub fn SinglyLinkedList(comptime T: type) type { return struct { first: ?*Node = null, pub const Node = struct { next: ?*Node = null, data: T, }; }; } The new version isn't generic. Rather, you embed the linked list node with your data. This is known as an intrusive linked list and tends to perform better and require fewer allocations. Except in trivial examples, the data that we store in a linked list is typically stored on the heap. Because an intrusive linked list has the linked list node embedded in the data, it doesn't need its own allocation. Before we jump into an example, this is what the new structure looks like, again, with all methods removed: pub const SinglyLinkedList = struct { first: ?*Node = null, pub const Node = struct { next: ?*Node = null, }; }; Much simpler, and notice that this has no link or reference to any of our data. Here's a working example that shows how you'd use it: const std = @import("std"); const SinglyLinkedList = std.SinglyLinkedList; pub fn main() !void { // GeneralPurposeAllocator is being renamed // to DebugAllocator. Let's get used to that name var gpa: std.heap.DebugAllocator(.{}) = .init; const allocator = gpa.allocator(); var list: SinglyLinkedList = .{}; const user1 = try allocator.create(User); defer allocator.destroy(user1); user1.* = .{ .id = 1, .power = 9000, .node = .{}, }; list.prepend(&user1.node); const user2 = try allocator.create(User); defer allocator.destroy(user2); user2.* = .{ .id = 2, .power = 9001, .node = .{}, }; list.prepend(&user2.node); var node = list.first; while (node) |n| { std.debug.print("{any}\n", .{n}); node = n.next; } } const User = struct { id: i64, power: u32, node: SinglyLinkedList.Node, }; To run this code, you'll need a nightly release from within the last week. What do you think the output will be? You should see something like: SinglyLinkedList.Node{ .next = SinglyLinkedList.Node{ .next = null } } SinglyLinkedList.Node{ .next = null } We're only getting the nodes, and, as we can see here and from the above skeleton structure of the new `SinglyLinkedList`, there's nothing about our users. Users have nodes, but there's seemingly nothing that links a node back to its containing user. Or is there? In the past, we've described how the compiler uses the type information to figure out how to access fields. For example, when we execute `user1.power`, the compiler knows that: 1. `id` is +0 bytes from the start of the structure, 2. `power` is +8 bytes from the start of the structure (because id is an i64), and 3. `power` is a u32 With this information, the compiler knows how to access `power` from `user1` (i.e. jump forward 8 bytes, read 4 bytes and treat it as a u32). But if you think about it, that logic is simple to reverse. If we know the address of `power`, then the address of `user` has to be `address_of_power - 8`.
We can prove this: const std = @import("std"); pub fn main() !void { var user = User{ .id = 1, .power = 9000, }; std.debug.print("address of user: {*}\n", .{&user}); const address_of_power = &user.power; std.debug.print("address of power: {*}\n", .{address_of_power}); const power_offset = 8; const also_user: *User = @ptrFromInt(@intFromPtr(address_of_power) - power_offset); std.debug.print("address of also_user: {*}\n", .{also_user}); std.debug.print("also_user: {}\n", .{also_user}); } const User = struct { id: i64, power: u32, }; The magic happens here: const power_offset = 8; const also_user: *User = @ptrFromInt(@intFromPtr(address_of_power) - power_offset); We're turning the address of our user's power field, `&user.power`, into an integer, subtracting 8 (8 bytes, 64 bits), and telling the compiler that it should treat that memory as a `*User`. This code will _probably_ work for you, but it isn't safe. Specifically, unless we're using a packed or extern struct, Zig makes no guarantees about the layout of a structure. It could put `power` BEFORE `id`, in which case our `power_offset` should be 0. It could add padding after every field. It can do anything it wants. To make this code safer, we use the `@offsetOf` builtin to get the actual byte-offset of a field with respect to its struct: const power_offset = @offsetOf(User, "power"); Back to our linked list, given that we have the address of a `node` and we know that it is part of the `User` structure, we _are_ able to get the `User` from a node. Rather than use the above code though, we'll use the _slightly_ friendlier `@fieldParentPtr` builtin. Our `while` loop changes to: while (node) |n| { const user: *User = @fieldParentPtr("node", n); std.debug.print("{any}\n", .{user}); node = n.next; } We give `@fieldParentPtr` the name of the field and a pointer to that field, as well as a return type (which is inferred above by the assignment to a `*User` variable), and it gives us back the instance that contains that field. Performance aside, I have mixed feelings about the new API. My initial reaction is that I dislike exposing, what I consider, a complicated builtin like `@fieldParentPtr` for something as trivial as using a linked list. However, while `@fieldParentPtr` seems esoteric, it's quite useful and developers should be familiar with it because it can help solve problems which are otherwise difficult to solve.
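If you want to convince yourself of the round-trip without a nightly build, here's a dependency-free sketch using our own node type; the struct and field names are just placeholders:

const std = @import("std");

const Node = struct {
    next: ?*Node = null,
};

const User = struct {
    id: i64,
    power: u32,
    node: Node = .{},
};

pub fn main() !void {
    var user = User{ .id = 1, .power = 9000 };

    // all we have is a pointer to the embedded node...
    const node_ptr = &user.node;

    // ...but @fieldParentPtr recovers the containing User
    const recovered: *User = @fieldParentPtr("node", node_ptr);
    std.debug.assert(recovered == &user);
    std.debug.print("id: {d}, power: {d}\n", .{ recovered.id, recovered.power });
}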
www.openmymind.net
September 19, 2025 at 12:34 AM
ArenaAllocator.free and Nested Arenas
What happens when you `free` with an ArenaAllocator? You might be tempted to look at the documentation for std.mem.Allocator.free which says "Free an array allocated with alloc". But this is the one thing we're sure it _won't_ do. In its current implementation, calling `free` usually does nothing: the freed memory isn't made available for subsequent allocations by the arena, and it certainly isn't released back to the operating system. However, under specific conditions `free` will make the memory re-usable by the arena. The only way to really "free" the memory is to call `deinit`. The only case when we're guaranteed that the memory will be reusable by the arena is when it was the last allocation made: const str1 = try arena.dupe(u8, "Over 9000!!!"); arena.free(str1); Above, whatever memory was allocated to duplicate our string will be available for subsequent allocations made with `arena`. In the following case, the two calls to `arena.free` do nothing: const str1 = try arena.dupe(u8, "ab"); const str2 = try arena.dupe(u8, "12"); arena.free(str1); arena.free(str2); In order to "fix" this code, we'd need to reverse the order of the two frees: const str1 = try arena.dupe(u8, "ab"); const str2 = try arena.dupe(u8, "12"); arena.free(str2); //swapped this line with the next arena.free(str1); Now, when we call `arena.free(str2)`, the memory allocated for `str2` will be available to subsequent allocations. But what happens when we call `arena.free(str1)`? The answer, again, is: _it depends_. It has to do with the internal state of the arena. Simplistically, an `ArenaAllocator` keeps a linked list of memory buffers. Imagine something like: buffer_list.head -> ------------ | next | -> null | ---- | | | | | | | | | | | ------------ Our linked list has a single node along with 5 bytes of available space. After we allocate `str1`, it looks like: buffer_list.head -> ------------ | next | -> null | ---- | str1 -> | a | | b | | | | | | | ------------ Then, when we allocate `str2`, it looks like: buffer_list.head -> ------------ | next | -> null | ---- | str1 -> | a | | b | str2 -> | 1 | | 2 | | | ------------ When we free `str2`, it goes back to how it was before: buffer_list.head -> ------------ | next | -> null | ---- | str1 -> | a | | b | | | | | | | ------------ Which means that when we `arena.free(str1)`, it **will** make that memory available again. However, if instead of allocating two strings, we allocate three: const str1 = try arena.dupe(u8, "ab"); const str2 = try arena.dupe(u8, "12"); const str3 = try arena.dupe(u8, "()"); arena.free(str3); arena.free(str2); arena.free(str1); Our first buffer doesn't have enough space for the new string, so a new node is prepended to our linked list: buffer_list.head -> ------------ ------------ | next | -> | next | -> null | ---- | | ---- | str3 -> | ( | | a | <- str1 | ) | | b | | | | 1 | <- str2 | | | 2 | | | | | ------------ ------------ When we call `arena.free(str3)`, the memory for that allocation will be made available, but subsequent frees, even if they're in the correct order (i.e. freeing `str2` then `str1`) will be noops. The ArenaAllocator only ever acts on the head of our linked list; it can't go back to earlier nodes, even when the head is empty. In short, when we `free` the last allocation, that memory will _always_ be made available. But subsequent `frees` only behave this way if (a) they're also in order and (b) the allocations happen to fall within the same internal memory node. ### Nested Arenas Zig's allocators are said to be composable.
When we create an `ArenaAllocator`, we pass a single parameter: an allocator. That parent allocator (1) can be any other type of allocator. You can, for example, create an `ArenaAllocator` on top of a `FixedBufferAllocator`. You can also create an `ArenaAllocator` on top of another `ArenaAllocator`. (1) Zig calls this the "child allocator", but that doesn't make any sense to me. This kind of thing often happens within libraries, where an API takes an `std.mem.Allocator` and the library creates an `ArenaAllocator`. And what happens when the provided allocator was already an arena? Libraries aside, I mean something like: var parent_arena = ArenaAllocator.init(gpa_allocator); const parent_allocator = parent_arena.allocator(); var inner_arena = ArenaAllocator.init(parent_allocator); const inner_allocator = inner_arena.allocator(); _ = try inner_allocator.dupe(u8, "Over "); _ = try inner_allocator.dupe(u8, "9000!"); inner_arena.deinit(); It does work, but at best, when `deinit` is called, the memory will be made available to be re-used by `parent_arena`. Except in simple cases, allocations made by `inner_arena` are likely to span multiple buffers of `parent_arena`, and of course you can still make allocations directly in `parent_arena` which can generate its own new buffers or simply make the ordering requirement impossible to fulfill. For example, if we make an allocation in `parent_arena` before `inner_arena.deinit();` is called: _ = try parent_allocator.dupe(u8, "!!!"); inner_arena.deinit(); Then the `deinit` does nothing. So while nesting ArenaAllocators works, I don't think there's any advantage over using a single Arena. And, I think in many cases where you have an "inner_arena", like in a library, it's better if the caller provides a non-Arena parent allocator so that all the memory is really freed when the library is done with it. Of course, there's a transparency issue here. Unless the library documents exactly how it's using your provided allocator, or unless you explore the code - and assuming the implementation doesn't change - it's hard to know what you should use.
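To make the "last allocation" behavior concrete, here's a small check. Whether the address actually gets reused is an implementation detail of today's ArenaAllocator, so treat this as a sketch, not a guarantee:

const std = @import("std");

pub fn main() !void {
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();
    const allocator = arena.allocator();

    const str1 = try allocator.dupe(u8, "Over 9000!!!");
    allocator.free(str1); // the last allocation: its memory becomes reusable

    // with the current implementation, str2 typically lands exactly
    // where str1 was, showing the space really was reclaimed
    const str2 = try allocator.dupe(u8, "Over 9000!!!");
    std.debug.print("same address: {}\n", .{str1.ptr == str2.ptr});
}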
www.openmymind.net
September 19, 2025 at 12:34 AM
I'm too dumb for Zig's new IO interface
You might have heard that Zig 0.15 introduces a new IO interface, with the focus for this release being the new std.Io.Reader and std.Io.Writer types. The old "interfaces" had problems. Like this performance issue that I opened. And it relied on a mix of types, which always confused me, and a lot of `anytype` - which is generally great, but a poor foundation to build an interface on. I've been slowly upgrading my libraries, and I ran into changes to the `tls.Client` client used by my smtp library. For the life of me, I just don't understand how it works. Zig has never been known for its documentation, but if we look at the documentation for `tls.Client.init`, we'll find: pub fn init(input: *std.Io.Reader, output: *std.Io.Writer, options: Options) InitError!Client Initiates a TLS handshake and establishes a TLSv1.2 or TLSv1.3 session. So it takes one of these new Readers and a new Writer, along with some options (sneak peek: they aren't all optional). It doesn't look like you can just give it a `net.Stream`, but `net.Stream` does expose `reader()` and `writer()` methods, so that's probably a good place to start: const stream = try std.net.tcpConnectToHost(allocator, "www.openmymind.net", 443); defer stream.close(); var writer = stream.writer(&.{}); var reader = stream.reader(&.{}); var tls_client = try std.crypto.tls.Client.init( reader.interface(), &writer.interface, .{}, // options TODO ); Note that `stream.writer()` returns a `Stream.Writer` and `stream.reader()` returns a `Stream.Reader` - those aren't the types our `tls.Client` expects. To convert the `Stream.Reader` to an `*std.Io.Reader`, we need to call its `interface()` method. To get a `*std.Io.Writer` from a `Stream.Writer`, we need the address of its `interface` field. This doesn't seem particularly consistent. Don't forget that the `writer` and `reader` need a stable address. Because I'm trying to get the simplest example working, this isn't an issue - everything will live on the stack of `main`. In a real-world example, I think it means that I'll always have to wrap the `tls.Client` in my own heap-allocated type, giving the writer and reader a cozy, stable home. Speaking of allocations, you might have noticed that `stream.writer` and `stream.reader` take a parameter. It's the buffer they should use. Buffering is a first-class citizen of the new Io interface - who needs composition? The documentation **does** tell me these need to be at least `std.crypto.tls.max_ciphertext_record_len` large, so we need to fix things a bit: var write_buf: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined; var writer = stream.writer(&write_buf); var read_buf: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined; var reader = stream.reader(&read_buf); Here's where the code stands: const std = @import("std"); pub fn main() !void { var gpa: std.heap.DebugAllocator(.{}) = .init; const allocator = gpa.allocator(); const stream = try std.net.tcpConnectToHost(allocator, "www.openmymind.net", 443); defer stream.close(); var write_buf: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined; var writer = stream.writer(&write_buf); var read_buf: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined; var reader = stream.reader(&read_buf); var tls_client = try std.crypto.tls.Client.init( reader.interface(), &writer.interface, .{ }, ); defer tls_client.end() catch {}; } But if you try to run it, you'll get a compilation error. Turns out we have to provide 4 options: the ca_bundle, a host, a `write_buffer` and a `read_buffer`.
Normally I'd expect the options parameter to be for optional parameters; I don't understand why some parameters (input and output) are passed one way while `write_buffer` and `read_buffer` are passed another. Let's give it what it wants AND send some data: // existing setup... var bundle = std.crypto.Certificate.Bundle{}; try bundle.rescan(allocator); defer bundle.deinit(allocator); var tls_client = try std.crypto.tls.Client.init( reader.interface(), &writer.interface, .{ .ca = .{.bundle = bundle}, .host = .{ .explicit = "www.openmymind.net" } , .read_buffer = &.{}, .write_buffer = &.{}, }, ); defer tls_client.end() catch {}; try tls_client.writer.writeAll("GET / HTTP/1.1\r\n\r\n"); Now, if I try to run it, the program just hangs. I don't know what `write_buffer` is, but I know Zig now loves buffers, so let's try to give it something: // existing setup... // I don't know what size this should/has to be?? var write_buf2: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined; var tls_client = try std.crypto.tls.Client.init( reader.interface(), &writer.interface, .{ .ca = .{.bundle = bundle}, .host = .{ .explicit = "www.openmymind.net" } , .read_buffer = &.{}, .write_buffer = &write_buf2, }, ); defer tls_client.end() catch {}; try tls_client.writer.writeAll("GET / HTTP/1.1\r\n\r\n"); Great, now the code doesn't hang; all we need to do is read the response. `tls.Client` exposes a `reader: *std.Io.Reader` field which is "Decrypted stream from the server to the client." That sounds like what we want, but believe it or not `std.Io.Reader` doesn't have a `read` method. It has a `peek`, a `takeByteSigned`, a `readSliceShort` (which seems close, but it blocks until the provided buffer is full), a `peekArray` and a lot more, but nothing like the `read` I'd expect. The closest I can find, which I think does what I want, is to stream it to a writer: var buf: [1024]u8 = undefined; var w: std.Io.Writer = .fixed(&buf); const n = try tls_client.reader.stream(&w, .limited(buf.len)); std.debug.print("read: {d} - {s}\n", .{n, buf[0..n]}); If we try to run the code now, it crashes. We've apparently failed an assertion regarding the length of a buffer. So it seems like we also _have_ to provide a `read_buffer`.
Here's my current version (it doesn't work, but it doesn't crash!): const std = @import("std"); pub fn main() !void { var gpa: std.heap.DebugAllocator(.{}) = .init; const allocator = gpa.allocator(); const stream = try std.net.tcpConnectToHost(allocator, "www.openmymind.net", 443); defer stream.close(); var write_buf: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined; var writer = stream.writer(&write_buf); var read_buf: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined; var reader = stream.reader(&read_buf); var bundle = std.crypto.Certificate.Bundle{}; try bundle.rescan(allocator); defer bundle.deinit(allocator); var write_buf2: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined; var read_buf2: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined; var tls_client = try std.crypto.tls.Client.init( reader.interface(), &writer.interface, .{ .ca = .{.bundle = bundle}, .host = .{ .explicit = "www.openmymind.net" } , .read_buffer = &read_buf2, .write_buffer = &write_buf2, }, ); defer tls_client.end() catch {}; try tls_client.writer.writeAll("GET / HTTP/1.1\r\n\r\n"); var buf: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined; var w: std.Io.Writer = .fixed(&buf); const n = try tls_client.reader.stream(&w, .limited(buf.len)); std.debug.print("read: {d} - {s}\n", .{n, buf[0..n]}); } When I looked through Zig's source code, there's only one place using `tls.Client`. It helped get me to where I am. I couldn't find any tests. I'll admit that during this migration, I've missed some basic things. For example, someone had to help me find `std.fmt.printInt` - the renamed version of `std.fmt.formatIntBuf`. Maybe there's a helper like: `tls.Client.init(allocator, stream)` somewhere. And maybe it makes sense that we do `reader.interface()` but `&writer.interface` - I'm reminded of Go's `*http.Request` and `http.ResponseWriter`. And maybe Zig has some consistent rule for what parameters belong in options. And I know nothing about TLS, so maybe it makes complete sense to need 4 buffers. I feel a bit more confident about the weirdness of not having a `read(buf: []u8) !usize` function on `Reader`, but at this point I wouldn't bet on me.
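For what it's worth, the `read` I was looking for can be approximated with a tiny wrapper around `stream`. This is a sketch against the API as I currently understand it (the same `fixed` and `limited` calls used above), not a vetted helper:

const std = @import("std");

// a hypothetical read-like helper: reads at most buf.len bytes from
// the reader, returning however many bytes were actually transferred
fn readOnce(reader: *std.Io.Reader, buf: []u8) !usize {
    var w: std.Io.Writer = .fixed(buf);
    return reader.stream(&w, .limited(buf.len));
}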
www.openmymind.net
September 19, 2025 at 12:33 AM
Zig's new Writer
As you might have heard, Zig's `Io` namespace is being reworked. Eventually, this will mean the re-introduction of async. As a first step though, the Writer and Reader interfaces and some of the related code have been revamped. > This post is written based on a mid-July 2025 development release of Zig. It doesn't apply to Zig 0.14.x (or any previous version) and is likely to be outdated as more of the Io namespace is reworked. Not long ago, I wrote a blog post which tried to explain Zig's Writers. At best, I'd describe the current state as "confusing" with two writer interfaces while often dealing with `anytype`. And while `anytype` is convenient, it lacks developer ergonomics. Furthermore, the current design has significant performance issues for some common cases. ### Drain The new `Writer` interface is `std.Io.Writer`. At a minimum, implementations have to provide a `drain` function. Its signature looks like: fn drain(w: *Writer, data: []const []const u8, splat: usize) Error!usize You might be surprised that this is the method a custom writer needs to implement. Not only does it take an array of strings, but what's that `splat` parameter? Like me, you might have expected a simpler `write` method: fn write(w: *Writer, data: []const u8) Error!usize It turns out that `std.Io.Writer` has buffering built-in. For example, if we want a `Writer` for an `std.fs.File`, we need to provide the buffer: var buffer: [1024]u8 = undefined; var writer = my_file.writer(&buffer); Of course, if we don't want buffering, we can always pass an empty buffer: var writer = my_file.writer(&.{}); This explains why custom writers need to implement a `drain` method, and not something simpler like `write`. The simplest way to implement `drain`, and what a lot of the Zig standard library has been upgraded to while this larger overhaul takes place, is: fn drain(io_w: *std.Io.Writer, data: []const []const u8, splat: usize) !usize { _ = splat; const self: *@This() = @fieldParentPtr("interface", io_w); return self.writeAll(data[0]) catch return error.WriteFailed; } We ignore the `splat` parameter, and just write the first value in `data` (`data.len > 0` is guaranteed to be true). This turns `drain` into what a simpler `write` method would look like. Because we return the length of bytes written, `std.Io.Writer` will know that we potentially didn't write all the data and call `drain` again, if necessary, with the rest of the data. > If you're confused by the call to `@fieldParentPtr`, check out my post on the upcoming linked list changes. The actual implementation of `drain` for the `File` is a non-trivial ~150 lines of code. It has platform-specific code and leverages vectored I/O where possible. There's obviously flexibility to provide a simple implementation or a more optimized one. ### The Interface Much like the current state, when you do `file.writer(&buffer)`, you don't get an `std.Io.Writer`. Instead, you get a `File.Writer`. To get an actual `std.Io.Writer`, you need to access the `interface` field. This is merely a convention, but expect it to be used throughout the standard, and third-party, library. Get ready to see a lot of `&xyz.interface` calls!
This simplification of `File` shows the relationship between the three types: pub const File = struct { pub fn writer(self: *File, buffer: []u8) Writer { return .{ .file = self, .interface = std.Io.Writer{ .buffer = buffer, .vtable = .{.drain = Writer.drain}, } }; } pub const Writer = struct { file: *File, interface: std.Io.Writer, // this has a bunch of other fields fn drain(io_w: *std.Io.Writer, data: []const []const u8, splat: usize) !usize { const self: *Writer = @fieldParentPtr("interface", io_w); // .... } } } The instance of `File.Writer` needs to exist somewhere (e.g. on the stack) since that's where the `std.Io.Writer` interface exists. It's possible that `File` could directly have a `writer_interface: std.Io.Writer` field, but that would limit you to one writer per file and would bloat the `File` structure. We can see from the above that, while we call `Writer` an "interface", it's just a normal struct. It has a few fields beyond `buffer` and `vtable.drain`, but these are the only two with non-default values; we have to provide them. The `Writer` interface implements a lot of typical "writer" behavior, such as `writeAll` and `print` (for formatted writing). It also has a number of methods which only a `Writer` implementation would likely care about. For example, `File.Writer.drain` has to call `consume` so that the writer's internal state can be updated. Having all of these functions listed side-by-side in the documentation confused me at first. Hopefully it's something the documentation generation will one day be able to help disentangle. ### Migrating The new `Writer` has taken over a number of methods. For example, `std.fmt.formatIntBuf` no longer exists. The replacement is the `printInt` method of `Writer`. But this requires a `Writer` instance rather than the simple `[]u8` previously required. It's easy to miss, but the `Writer.fixed([]u8) Writer` function is what you're looking for. You'll use this for any function that was migrated to `Writer` and used to work on a `buffer: []u8`. While migrating, you might run into the following error: _no field or member function named 'adaptToNewApi' in '...'_. You can see why this happens by looking at the updated implementation of `std.fmt.format`: pub fn format(writer: anytype, comptime fmt: []const u8, args: anytype) !void { var adapter = writer.adaptToNewApi(); return adapter.new_interface.print(fmt, args) catch |err| switch (err) { error.WriteFailed => return adapter.err.?, }; } Because this functionality was moved to `std.Io.Writer`, any `writer` passed into `format` has to be able to upgrade itself to the new interface. This is done, again only by convention, by having the "old" writer expose an `adaptToNewApi` method which returns a type that exposes a `new_interface: std.Io.Writer` field. This is pretty easy to implement using the basic `drain` implementation, and you can find a handful of examples in the standard library, but it's of little help if you don't control the legacy writer. ### Conclusion I'm hesitant to provide an opinion on this change. I don't understand language design. However, while I think this is an improvement over the current API, I keep thinking that adding buffering directly to the `Writer` isn't ideal. I believe that most languages deal with buffering via composition. You take a reader/writer and wrap it in a BufferedReader or BufferedWriter. This approach seems both simple to understand and implement while being powerful. It can be applied to things beyond buffering and IO.
Zig seems to struggle with this model. Rather than provide a cohesive and generic approach for such problems, one specific feature (buffering) for one specific API (IO) was baked into the standard library. Maybe I'm too dense to understand or maybe future changes will address this more holistically.
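To close the loop on the migration note above, here's what formatting into a stack buffer looks like now. This is a sketch based on a mid-2025 nightly, so the details (in particular `buffered`, which I believe returns the bytes written so far) may shift:

const std = @import("std");

pub fn main() !void {
    var buf: [32]u8 = undefined;

    // Writer.fixed wraps a plain []u8, replacing the old
    // buffer-taking helpers like std.fmt.formatIntBuf
    var w: std.Io.Writer = .fixed(&buf);
    try w.print("over {d}!", .{9000});

    std.debug.print("{s}\n", .{w.buffered()});
}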
www.openmymind.net
September 19, 2025 at 12:33 AM
Switching on Strings in Zig
Newcomers to Zig will quickly learn that you can't switch on a string (i.e. `[]const u8`). The following code gives us the unambiguous error message _cannot switch on strings_ : switch (color) { "red" => {}, "blue" => {}, "green" => {}, "pink" => {}, else => {}, } I've seen two explanations for why this isn't supported. The first is that there's ambiguity around string identity. Are two strings only considered equal if they point to the same address? Is a null-terminated string the same as its non-null-terminated counterpart? The other reason is that users of `switch` apparently expect certain optimizations which are not possible with strings (although, presumably, these same users would know that such optimizations aren't possible with strings). Instead, in Zig, there are two common methods for comparing strings. ### std.mem.eql The most common way to compare strings is using `std.mem.eql` with `if / else if / else`: if (std.mem.eql(u8, color, "red") == true) { } else if (std.mem.eql(u8, color, "blue") == true) { } else if (std.mem.eql(u8, color, "green") == true) { } else if (std.mem.eql(u8, color, "pink") == true) { } else { } The implementation for `std.mem.eql` depends on what's being compared. Specifically, it has an optimized code path when comparing strings. Although that's what we're interested in, let's look at the non-optimized version: pub fn eql(comptime T: type, a: []const T, b: []const T) bool { if (a.len != b.len) return false; if (a.len == 0 or a.ptr == b.ptr) return true; for (a, b) |a_elem, b_elem| { if (a_elem != b_elem) return false; } return true; } Whether we're dealing with slices of bytes or some other type, if they're of different length, they can't be equal. Once we know that they're the same length, if they point to the same memory, then they must be equal. I'm not a fan of this second check; it might be cheap, but I think it's quite uncommon. Once those initial checks are done, we compare each element (each byte of our string) one at a time. The optimized version, which _is_ used for strings, is much more involved. But it's fundamentally the same as the above with SIMD to compare multiple bytes at once. The nature of string comparison means that real-world performance is dependent on the values being compared. We know that if we have 100 `if / else if` branches then, in the worst case, we'll need to call `std.mem.eql` 100 times. But comparing strings of different lengths or strings which differ early will be significantly faster. For example, consider these three cases: { const str1 = "a" ** 10_000 ++ "1"; const str2 = "a" ** 10_000 ++ "2"; _ = std.mem.eql(u8, str1, str2); } { const str1 = "1" ++ "a" ** 10_000; const str2 = "2" ++ "a" ** 10_000; _ = std.mem.eql(u8, str1, str2); } { const str1 = "a" ** 999_999; const str2 = "a" ** 1_000_000; _ = std.mem.eql(u8, str1, str2); } For me, the first comparison takes ~270ns, whereas the other two take ~20ns - despite the last one involving much larger strings. The second case is faster because the difference is early in the string, allowing the `for` loop to return after only one comparison. The third case is faster because the strings are of a different length: `false` is returned by the initial `len` check. ### std.meta.stringToEnum `std.meta.stringToEnum` takes an enum type and a string value and returns the corresponding enum value or null.
This code prints "you picked: blue": const std = @import("std"); const Color = enum { red, blue, green, pink, }; pub fn main() !void { const color = std.meta.stringToEnum(Color, "blue") orelse { return error.InvalidChoice; }; switch (color) { .red => std.debug.print("you picked: red\n", .{}), .blue => std.debug.print("you picked: blue\n", .{}), .green => std.debug.print("you picked: green\n", .{}), .pink => std.debug.print("you picked: pink\n", .{}), } } If you don't need the enum type (i.e. `Color`) beyond this check, you can leverage Zig's anonymous types. This is equivalent: const std = @import("std"); pub fn main() !void { const color = std.meta.stringToEnum(enum { red, blue, green, pink, }, "blue") orelse return error.InvalidChoice; switch (color) { .red => std.debug.print("you picked: red\n", .{}), .blue => std.debug.print("you picked: blue\n", .{}), .green => std.debug.print("you picked: green\n", .{}), .pink => std.debug.print("you picked: pink\n", .{}), } } It's **not** obvious how this should perform versus the straightforward `if / else if` approach. Yes, we now have a `switch` statement that the compiler can [hopefully] optimize, but `std.meta.stringToEnum` still has to convert our input, `"blue"`, into an enum. The implementation of `std.meta.stringToEnum` depends on the number of possible values, i.e. the number of enum values. Currently, if there are more than 100 values, it'll fall back to using the same `if / else if` that we explored above. Thus, with more than 100 values it does the `if / else if` check PLUS the switch. This should improve in the future. However, with 100 or fewer values, `std.meta.stringToEnum` creates a comptime `std.StaticStringMap` which can then be used to look up the value. `std.StaticStringMap` isn't something we've looked at before. It's a specialized map that buckets keys by their length. Its advantage over Zig's other hash maps is that it can be constructed at compile-time. For our `Color` enum, the internal state of a `StaticStringMap` would look something like: // keys are ordered by length keys: ["red", "blue", "pink", "green"]; // values[N] corresponds to keys[N] values: [.red, .blue, .pink, .green]; // What's this though? indexes: [0, 0, 0, 0, 1, 3]; It might not be obvious how `indexes` is used. Let's write our own `get` implementation, simulating the above `StaticStringMap` state: fn get(str: []const u8) ?Color { // Simulate the state of the StaticStringMap which // stringToEnum built at compile-time. const keys = [_][]const u8{"red", "blue", "pink", "green"}; const values = [_]Color{.red, .blue, .pink, .green}; const indexes = [_]usize{0, 0, 0, 0, 1, 3}; if (str.len >= indexes.len) { // our map has no strings of this length return null; } var index = indexes[str.len]; while (index < keys.len) { const key = keys[index]; if (key.len != str.len) { // we've gone into the next bucket, everything after // this is longer and thus can't be a match return null; } if (std.mem.eql(u8, key, str)) { return values[index]; } index += 1; } return null; } Take note that `keys` are ordered by length. As a naive implementation, we could iterate through the keys until we either find a match or find a key with a longer length. Once we find a key with a longer length, we can stop searching, as all remaining candidates won't match - they'll all be too long. `StaticStringMap` goes a step further and records the index within `keys` where entries of a specific length begin. `indexes[3]` tells us where to start looking for keys with a length of 3 (at index 0).
`indexes[5]` tells us where to start looking for keys with a length of 5 (at index 3). Above, we fall back to using `std.mem.eql` for any key which is the same length as our target string. `StaticStringMap` uses its own "optimized" version: pub fn defaultEql(a: []const u8, b: []const u8) bool { if (a.ptr == b.ptr) return true; for (a, b) |a_elem, b_elem| { if (a_elem != b_elem) return false; } return true; } This is the same as the simple `std.mem.eql` implementation, minus the length check. This is done because the `eql` within our `while` loop is only ever called for values with matching length. On the flip side, `StaticStringMap`'s `eql` doesn't use SIMD, so it would be slower for large strings. `StaticStringMap` is a wrapper around `StaticStringMapWithEql`, which accepts a custom `eql` function, so if you _did_ want to use it for long strings or some other purposes, you have a reasonable amount of flexibility. You even have the option to use `std.static_string_map.eqlAsciiIgnoreCase` for ASCII-aware case-insensitive comparison. ### Conclusion In my own benchmarks, in general, I've seen little difference between the two approaches. It does seem like `std.meta.stringToEnum` is generally as fast or faster. It also results in more concise code and is ideal if the resulting enum is useful beyond the comparison. You usually don't have long enum values, so the lack of SIMD-optimization isn't a concern. However, if you're considering building your own `StaticStringMap` at compile time with long keys, you should benchmark with a custom `eql` function based on `std.mem.eql`. We could manually bucket those `if / else if` branches ourselves, similar to what the `StaticStringMap` does. Something like: switch (color.len) { 3 => { if (std.mem.eql(u8, color, "red") == true) { // ... return; } }, 4 => { if (std.mem.eql(u8, color, "blue") == true) { // ... return; } if (std.mem.eql(u8, color, "pink") == true) { // ... return; } }, 5 => { if (std.mem.eql(u8, color, "green") == true) { // ... return; } }, else => {}, } // not found Ughhh. This highlights the convenience of using `std.meta.stringToEnum` to generate similar code. Also, do remember that `std.mem.eql` quickly discards strings of different lengths, which helps to explain why both approaches generally perform similarly.
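And if you ever want the `StaticStringMap` behavior directly, e.g. when the values aren't enum members, the compile-time construction looks like this; a minimal sketch, with the colors picked to mirror the post:

const std = @import("std");

const Color = enum { red, blue, green, pink };

// built entirely at compile time; keys are bucketed by length internally
const color_map = std.StaticStringMap(Color).initComptime(.{
    .{ "red", .red },
    .{ "blue", .blue },
    .{ "green", .green },
    .{ "pink", .pink },
});

pub fn main() !void {
    const color = color_map.get("blue") orelse return error.InvalidChoice;
    std.debug.print("you picked: {s}\n", .{@tagName(color)});
}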
www.openmymind.net
September 7, 2025 at 6:17 PM
GetOrPut With String Keys
I've previously blogged about how much I like Zig's `getOrPut` hashmap method. As a brief recap, we can visualize Zig's hashmap as two arrays: keys: values: -------- -------- | Paul | | 1234 | @mod(hash("Paul"), 5) == 0 -------- -------- | | | | -------- -------- | | | | -------- -------- | Goku | | 9001 | @mod(hash("Goku"), 5) == 3 -------- -------- | | | | -------- -------- When we call `get("Paul")`, we could think of this simplified implementation: fn get(map: *Self, key: K) ?V { const index = map.getIndexOf(key) orelse return null; return map.values[index]; } And, when we call `getPtr("Paul")`, we'd have this implementation: fn getPtr(map: *Self, key: K) ?*V { const index = map.getIndexOf(key) orelse return null; // notice the added '&' // we're taking the address of the array index return &map.values[index]; } By taking the address of the value directly from the hashmap's array, we avoid copying the entire value. That can have performance implications (though not for the integer value we're using here). It also allows us to directly manipulate that slot of the array: const value = map.getPtr("Paul") orelse return; value.* = 10; This is a powerful feature, but a dangerous one. If the underlying array changes, as can happen when items are added to the map, `value` would become invalid. So, while `getPtr` is useful, it requires mindfulness: try to minimize the scope of such references. Currently, Zig's HashMap doesn't shrink when items are removed, so, for now, removing items doesn't invalidate any pointers into the hashmap. But expect that to change at some point. ### GetOrPut `getOrPut` builds on the above concept. It returns a pointer to the value **and** the key, as well as creating the entry in the hashmap if necessary. For example, given that we already have an entry for "Paul", if we call `map.getOrPut("Paul")`, we'd get back a `value_ptr` that points to a slot in the hashmap's `values` array, as well as a `key_ptr` that points to a slot in the hashmap's `keys` array. If the requested key _doesn't_ exist, we get back the same two pointers, and it's our responsibility to set the value. If I asked you to increment counters inside of a hashmap, without `getOrPut`, you'd end up with two hash lookups: // Go count, exists := counters["hits"] if exists == false { counters["hits"] = 1 } else { counters["hits"] = count + 1 } With `getOrPut`, it's a single hash lookup: const gop = try counters.getOrPut("hits"); if (gop.found_existing) { gop.value_ptr.* += 1; } else { gop.value_ptr.* = 1; } ### getOrPut With String Keys It seems trivial, but the most important thing to understand about `getOrPut` is that it will set the key for you if the entry has to be created. In our last example, notice that even when `gop.found_existing == false`, we never set `key_ptr` - `getOrPut` automatically sets it to the key we pass in, i.e. `"hits"`. If we were to put a breakpoint after `getOrPut` returns but before we set the value, we'd see that our two arrays look something like: keys: values: -------- -------- | | | | -------- -------- | hits | | ???? | -------- -------- | | | | -------- -------- Where the entry in the `keys` array is set, but the corresponding entry in `values` is left undefined. You'll note that `getOrPut` doesn't take a value. I assume this is because, in some cases, the value might be expensive to derive, so the current API lets us avoid calculating it when `gop.found_existing == true`. This is important for keys that need to be owned by the hashmap.
Most commonly strings, but this applies to any other key which we'll "manage". Taking a step back, if we wanted to track hits in a hashmap, and, most likely, we wanted the lifetime of the keys to be tied to the hashmap, you'd do something like: fn register(allocator: Allocator, map: *std.StringHashMap(u32), name: []const u8) !void { const owned = try allocator.dupe(u8, name); try map.put(owned, 0); } Creating your "owned" copy of `name` frees the caller from having to maintain `name` beyond the call to `register`. Now, if this key is removed, or the entire map cleaned up, we need to free the keys. That's why I like the name "owned": it means the hash map "owns" the key (i.e. is responsible for freeing it): var it = map.keyIterator(); while (it.next()) |key_ptr| { allocator.free(key_ptr.*); } map.deinit(); The interaction between key ownership and `getOrPut` is worth thinking about. If we try to merge this ownership idea with our incrementing counter code, we might try: fn hit(allocator: Allocator, map: *std.StringHashMap(u32), name: []const u8) !void { const owned = try allocator.dupe(u8, name); const gop = try map.getOrPut(owned); if (gop.found_existing) { gop.value_ptr.* += 1; } else { gop.value_ptr.* = 1; } } But this code has a potential memory leak; can you spot it? (see Appendix A for a complete runnable example) When `gop.found_existing == true`, `owned` is never used and never freed. One bad option would be to free `owned` when the entry already exists: fn hit(allocator: Allocator, map: *std.StringHashMap(u32), name: []const u8) !void { const owned = try allocator.dupe(u8, name); const gop = try map.getOrPut(owned); if (gop.found_existing) { // This line was added. But this is a bad solution allocator.free(owned); gop.value_ptr.* += 1; } else { gop.value_ptr.* = 1; } } It works, but we needlessly `dupe` `name` if the entry already exists. Rather than prematurely duping the key in case the entry doesn't exist, we want to delay our `dupe` until we know it's needed. Here's a better option: fn hit(allocator: Allocator, map: *std.StringHashMap(u32), name: []const u8) !void { // we use `name` for the lookup. const gop = try map.getOrPut(name); if (gop.found_existing) { gop.value_ptr.* += 1; } else { // this line was added gop.key_ptr.* = try allocator.dupe(u8, name); gop.value_ptr.* = 1; } } It might seem reckless to pass `name` into `getOrPut`. We need the key to remain valid as long as the map entry exists. Aren't we undermining that requirement? Let's walk through the code. When `hit` is called for a new `name`, `gop.found_existing` will be false. `getOrPut` will insert `name` in our `keys` array. This is bad because we have no guarantee that `name` will be valid for as long as we need it to be. But the problem is immediately remedied when we overwrite `key_ptr.*`. On subsequent calls for an existing `name`, when `gop.found_existing == true`, the `name` is only used as a lookup. It's no different than doing a `getPtr`; `name` only has to be valid for the call to `getOrPut` because `getOrPut` doesn't keep a reference to it when an existing entry is found. ### Conclusion This post was a long way to say: don't be afraid to write to `key_ptr.*`. Of course you can screw up your map this way. Consider this example: fn hit(allocator: Allocator, map: *std.StringHashMap(u32), name: []const u8) !void { // we use `name` for the lookup. const gop = try map.getOrPut(name); if (gop.found_existing) { gop.value_ptr.* += 1; } else { // what's this?
```zig
fn hit(allocator: Allocator, map: *std.StringHashMap(u32), name: []const u8) !void {
    // we use `name` for the lookup
    const gop = try map.getOrPut(name);
    if (gop.found_existing) {
        gop.value_ptr.* += 1;
    } else {
        // what's this?
        gop.key_ptr.* = "HELLO";
        gop.value_ptr.* = 1;
    }
}
```

Because the key is used to organize the map - to find where items go and where they are - jamming random keys where they don't belong is going to cause issues. But it can also be used correctly and safely, as long as you understand the details.

### Appendix A - Memory Leak

This code _should_ report a memory leak.

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    const allocator = gpa.allocator();
    defer _ = gpa.detectLeaks();

    // I'm using the Unmanaged variant because the Managed ones are likely to
    // be removed (which I think is a mistake). Using Unmanaged makes this
    // snippet more future-proof. I explain unmanaged here:
    // https://www.openmymind.net/Zigs-HashMap-Part-1/#Unmanaged
    var map: std.StringHashMapUnmanaged(u32) = .{};

    try hit(allocator, &map, "teg");
    try hit(allocator, &map, "teg");

    var it = map.keyIterator();
    while (it.next()) |key_ptr| {
        allocator.free(key_ptr.*);
    }
    map.deinit(allocator);
}

fn hit(allocator: Allocator, map: *std.StringHashMapUnmanaged(u32), name: []const u8) !void {
    const owned = try allocator.dupe(u8, name);
    const gop = try map.getOrPut(allocator, owned);
    if (gop.found_existing) {
        gop.value_ptr.* += 1;
    } else {
        gop.value_ptr.* = 1;
    }
}
```
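For completeness, here's a sketch of a leak-free `hit`, applying the same delayed-`dupe` fix from above to the Unmanaged API (a drop-in replacement for the `hit` in the program above; note that the Unmanaged variant takes the allocator):

```zig
fn hit(allocator: Allocator, map: *std.StringHashMapUnmanaged(u32), name: []const u8) !void {
    // `name` is only used for the lookup
    const gop = try map.getOrPut(allocator, name);
    if (gop.found_existing) {
        gop.value_ptr.* += 1;
    } else {
        // dupe only once we know the entry is new, taking ownership of the key
        gop.key_ptr.* = try allocator.dupe(u8, name);
        gop.value_ptr.* = 1;
    }
}
```

With this version, `gpa.detectLeaks()` should report nothing.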
Zig's dot star syntax (value.*)
Maybe I'm the only one, but it always takes my little brain a split second to understand what's happening whenever I see, or have to write, something like `value.* = .{...}`. If we take a step back, a variable is just a convenient name for an address on the stack. When this function executes:

```zig
fn isOver9000(power: i64) bool {
    return power > 9000;
}
```

Say, with a `power` of 593, we could visualize its stack as:

```
power ->  -------------
          |    593    |
          -------------
```

If we changed our function to take a pointer to an integer:

```zig
// i64 changed to *i64
fn isOver9000(power: *i64) bool {
    return power > 9000;
}
```

Our `power` argument would still be a label for a stack address, but instead of directly containing a number, the stack value would itself be an address. That's the _indirection_ of pointers:

```
power ->  -------------
          | 1182145c0 |------------------------
          -------------                        |
                                               |
          .............   empty space          |
          .............   or other data        |
                                               |
          -------------                        |
          |    593    | <-----------------------
          -------------
```

But this code doesn't work: it's trying to compare a `comptime_int` (`9000`) with an `*i64`. We need to make another change to the function:

```zig
// i64 changed to *i64
fn isOver9000(power: *i64) bool {
    // power changed to power.*
    return power.* > 9000;
}
```

`power.*` is how we dereference a pointer. Dereferencing means getting the value pointed to by a pointer. From our above visualization, you could say that the `.*` follows the arrow to get the value, `593`. This same syntax works for writing as well. The following is valid:

```zig
fn isOver9000(power: *i64) bool {
    power.* = 9001;
    return true;
}
```

Like before, the dereferencing operator (`.*`) "follows" the pointer, but now that it's on the receiving end of an assignment, we write the value into the pointed-at memory.

This is all true for more complex types. Let's say we have a `User` struct with an `id` and a `name`:

```zig
const User = struct {
    id: i32,
    name: []const u8,
};

var user = User{ .id = 900, .name = "Teg" };
```

The `user` variable is a label for the location of [the start of] the user:

```
user ->   -------------
          |    900    |
          -------------
          |     3     |
          -------------
          | 3c9414e99 | -----------------------
          -------------                        |
                                               |
          .............   empty space          |
          .............   or other data        |
                                               |
          -------------                        |
          |     T     | <-----------------------
          -------------
          |     e     |
          -------------
          |     g     |
          -------------
```

A slice in Zig, like our `[]const u8`, is a length (`3`) and a pointer to the values. Now, if we take the address of `user`, via `&user`, we introduce a level of indirection. For example, imagine this code:

```zig
const std = @import("std");

const User = struct {
    id: i32,
    name: []const u8,
};

pub fn main() !void {
    var user = User{ .id = 900, .name = "Teg" };
    updateUser(&user);
    std.debug.print("{d}\n", .{user.id});
}

fn updateUser(user: *User) void {
    user.id += 100000;
}
```

The `user` parameter of our `updateUser` function is pointing to the `user` on `main`'s stack:

```
updateUser
user ->   -------------
          |  83abcc30 |------------------------
          -------------                        |
                                               |
          .............   empty space          |
          .............   or other data        |
                                               |
main                                           |
user ->   -------------                        |
          |    900    | <-----------------------
          -------------
          |     3     |
          -------------
          | 3c9414e99 | -----------------------
          -------------                        |
                                               |
          .............   empty space          |
          .............   or other data        |
                                               |
          -------------                        |
          |     T     | <-----------------------
          -------------
          |     e     |
          -------------
          |     g     |
          -------------
```

Because we're referencing `main`'s `user` (rather than a copy), any changes we make will be reflected in `main`. But we aren't limited to operating on fields of `user`; we can operate on its entire memory.
Of course, we can create a copy of the `id` field (assignments are always copies; it's just a matter of knowing _what_ we're copying):

```zig
fn updateUser(user: *User) void {
    const id = user.id;
    // ....
}
```

And now the stack for our function looks like:

```
user ->   -------------
          |  83abcc30 |
          -------------
id ->     -------------
          |    900    |
          -------------
```

But we can also copy the entire user:

```zig
fn updateUser(user: *User) void {
    const copy = user.*;
    // ....
}
```

Which gives us something like:

```
updateUser
user ->   -------------
          |  83abcc30 |---------------------
          -------------                     |
copy ->   -------------                     |
          |    900    |                     |
          -------------                     |
          |     3     |                     |
          -------------                     |
          | 3c9414e99 | --------------------|--
          -------------                     |  |
                                            |  |
          .............   empty space       |  |
          .............   or other data     |  |
                                            |  |
main                                        |  |
user ->   -------------                     |  |
          |    900    | <--------------------  |
          -------------                        |
          |     3     |                        |
          -------------                        |
          | 3c9414e99 | ------------------------
          -------------                        |
                                               |
          .............   empty space          |
          .............   or other data        |
                                               |
          -------------                        |
          |     T     | <-----------------------
          -------------
          |     e     |
          -------------
          |     g     |
          -------------
```

Notice that it didn't create a copy of the value "Teg". You could call this copying "shallow": it copied the `900`, the `3` (the name's length) and the `3c9414e99` (the name's pointer). Just like our simpler example above, we can also assign using the dereferencing operator:

```zig
fn updateUser(user: *User) void {
    // using type inference
    // could be more explicit and do
    // user.* = User{....}
    user.* = .{
        .id = 5,
        .name = "Paul",
    };
}
```

This doesn't copy anything; it writes into the address that we were given, the address of `main`'s `user`:

```
updateUser
user ->   -------------
          |  83abcc30 |------------------------
          -------------                        |
                                               |
          .............   empty space          |
          .............   or other data        |
                                               |
main                                           |
user ->   -------------                        |
          |     5     | <-----------------------
          -------------
          |     4     |
          -------------
          |  9bf4a990 | -----------------------
          -------------                        |
                                               |
          .............   empty space          |
          .............   or other data        |
                                               |
          -------------                        |
          |     P     | <-----------------------
          -------------
          |     a     |
          -------------
          |     u     |
          -------------
          |     l     |
          -------------
```

If you're still not fully comfortable with this, and if you haven't done so already, you might be interested in the pointers and stack memory parts of my Learning Zig series.
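Putting the pieces together, here's a small runnable sketch (`replaceUser` is a name made up for this example). It shows that a shallow copy keeps the old values, while the `user.* = .{...}` assignment overwrites the caller's memory:

```zig
const std = @import("std");

const User = struct {
    id: i32,
    name: []const u8,
};

pub fn main() !void {
    var user = User{ .id = 900, .name = "Teg" };

    // a shallow copy: copies the id and the name's length + pointer
    const copy = user;

    replaceUser(&user);

    // the original was overwritten through the pointer...
    std.debug.print("{d} {s}\n", .{ user.id, user.name }); // 5 Paul
    // ...but the copy still holds the old values
    std.debug.print("{d} {s}\n", .{ copy.id, copy.name }); // 900 Teg
}

fn replaceUser(user: *User) void {
    // write an entirely new User into the caller's memory
    user.* = .{ .id = 5, .name = "Paul" };
}
```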