Zaid Humayun
redixhumayun.bsky.social
Engineer @conviva | Figuring out databases
Seems like it’s all in preparation for an IPO
May 6, 2025 at 8:01 AM
I’m fairly certain Amazon also does an LLM chatbot thing where it’s pretending to be a person on chat.

The formulaic responses give it away.
February 4, 2025 at 10:34 AM
Isn’t this the exact same drama that happened with Ted Ts’o?
February 4, 2025 at 7:52 AM
If you found this interesting, give the full article a read; it goes into much more depth and links to more interesting tidbits about Rust compiler optimizations.

redixhumayun.github.io/performance/...
Cache Conscious Hash Maps
Here’s a link to the code on GitHub
January 27, 2025 at 2:44 PM
Now, we see much better performance when comparing cache stats for the three implementations.

It isn't enough to have contiguous memory access; you also need to be aware of the cache hardware and how many entries fit per cache line.
January 27, 2025 at 2:44 PM
Doing all of the above requires you to know some information about your hardware, like the size of your cache and the size of a cache line.
January 27, 2025 at 2:44 PM
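On Linux, both numbers can be queried from the shell; the `getconf` variable names and the sysfs path below are Linux/glibc-specific, and other platforms expose this differently:

```shell
# Report L1 data cache geometry (values in bytes).
getconf LEVEL1_DCACHE_SIZE       # total L1 data cache size
getconf LEVEL1_DCACHE_LINESIZE   # cache line size, typically 64

# Or read sysfs directly:
cat /sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size
```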
We can use a more compact open addressing scheme that fits 4 entries per cache line.

Also, using u64 keys instead of Strings improves performance, since a String's contents are always heap allocated.
January 27, 2025 at 2:44 PM
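One way to get down to four entries per cache line is a bare 16-byte slot with occupancy tracked out of band. This is an assumed scheme (e.g. a reserved sentinel key, or a separate metadata array), not necessarily how the article does it:

```rust
use std::mem::size_of;

// A bare (u64, u64) slot is 16 bytes, so four fit in a 64-byte
// cache line. Occupancy would be tracked by a sentinel key value
// or a side metadata array (assumption, not the article's code).
#[derive(Clone, Copy)]
struct Slot {
    key: u64,   // a reserved value (e.g. u64::MAX) could mark "empty"
    value: u64,
}

fn main() {
    assert_eq!(size_of::<Slot>(), 16);
    assert_eq!(64 / size_of::<Slot>(), 4); // four entries per cache line

    // A String, by contrast, is a 24-byte handle (ptr + len + capacity)
    // whose character data always lives on the heap, so probing with
    // String keys would chase pointers.
    assert_eq!(size_of::<String>(), 24);
    println!("ok");
}
```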
The problem is the memory layout of the structs used for chaining and open addressing.

With chaining, each slot is just an 8-byte pointer, so 8 fit in a 64-byte cache line.

With open addressing, each entry is 24 bytes, so only 2 fit per cache line once padding is accounted for.
January 27, 2025 at 2:44 PM
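The sizes in this post can be checked directly with `std::mem::size_of`. The slot types below are my guesses at representative layouts, not the article's actual structs:

```rust
use std::mem::size_of;

fn main() {
    // Chaining: each slot is just a pointer to a heap node.
    // Option<Box<T>> is pointer-sized thanks to the null-pointer niche.
    type ChainSlot = Option<Box<(u64, u64)>>;

    // Naive open addressing: Option<(u64, u64)> needs a discriminant
    // on top of the 16-byte payload, padded out to 24 bytes total.
    type OpenSlot = Option<(u64, u64)>;

    assert_eq!(size_of::<ChainSlot>(), 8);
    assert_eq!(size_of::<OpenSlot>(), 24);

    const CACHE_LINE: usize = 64;
    println!("chaining: {} slots per line", CACHE_LINE / size_of::<ChainSlot>());
    println!("open addressing: {} slots per line", CACHE_LINE / size_of::<OpenSlot>());
}
```

With a 64-byte line that works out to 8 pointer slots versus 2 padded entries, matching the numbers above.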
Profiling, however, shows very different results: chaining and open addressing display roughly the same cache performance.

Open addressing still ends up faster, but only because it executes fewer instructions overall.
January 27, 2025 at 2:44 PM
You can handle collisions via open addressing (linear probing), where you scan through each subsequent entry until you find a slot for the key-value pair.

This should be more cache friendly, since the entire table is contiguously allocated.
January 27, 2025 at 2:44 PM
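A minimal linear-probing sketch of the idea above (illustrative names, not the article's code; there is no resizing, so the table must never fill up):

```rust
// Open addressing: one contiguous Vec of slots. On a collision we
// scan forward (wrapping around) until we hit the key or an empty slot.
struct OpenMap {
    slots: Vec<Option<(u64, u64)>>,
}

impl OpenMap {
    fn new(capacity: usize) -> Self {
        Self { slots: vec![None; capacity] }
    }

    fn insert(&mut self, key: u64, value: u64) {
        let mut idx = (key as usize) % self.slots.len();
        loop {
            let slot = self.slots[idx];
            match slot {
                // Occupied by a different key: probe the next slot.
                Some((k, _)) if k != key => idx = (idx + 1) % self.slots.len(),
                // Empty, or same key: write here.
                _ => {
                    self.slots[idx] = Some((key, value));
                    return;
                }
            }
        }
    }

    fn get(&self, key: u64) -> Option<u64> {
        let mut idx = (key as usize) % self.slots.len();
        loop {
            match self.slots[idx] {
                Some((k, v)) if k == key => return Some(v),
                Some(_) => idx = (idx + 1) % self.slots.len(),
                // An empty slot ends the probe sequence: key is absent.
                None => return None,
            }
        }
    }
}

fn main() {
    let mut m = OpenMap::new(8);
    m.insert(1, 100);
    m.insert(9, 900); // collides with key 1 in an 8-slot table, probes forward
    assert_eq!(m.get(1), Some(100));
    assert_eq!(m.get(9), Some(900));
    assert_eq!(m.get(2), None);
    println!("ok");
}
```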
You can handle collisions via chaining where each entry becomes a linked list. This is simple but tends to lead to fragmented memory since each node is allocated separately.

Fragmented memory should theoretically lead to poorer cache performance.
January 27, 2025 at 2:44 PM
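A minimal sketch of the chaining scheme described above (names and hashing are illustrative, not the article's actual code). Every node is a separate `Box` allocation, which is exactly where the fragmentation comes from:

```rust
// Chaining: each bucket heads a singly linked list of nodes.
struct Node {
    key: u64,
    value: u64,
    next: Option<Box<Node>>,
}

struct ChainedMap {
    buckets: Vec<Option<Box<Node>>>,
}

impl ChainedMap {
    fn new(capacity: usize) -> Self {
        Self { buckets: (0..capacity).map(|_| None).collect() }
    }

    fn insert(&mut self, key: u64, value: u64) {
        let idx = (key as usize) % self.buckets.len();
        // Push a new node at the head of the chain: a fresh heap
        // allocation every time, at an address unrelated to its
        // neighbours in the chain.
        let head = self.buckets[idx].take();
        self.buckets[idx] = Some(Box::new(Node { key, value, next: head }));
    }

    fn get(&self, key: u64) -> Option<u64> {
        let idx = (key as usize) % self.buckets.len();
        let mut cur = self.buckets[idx].as_deref();
        // Walking the chain hops between scattered heap addresses.
        while let Some(node) = cur {
            if node.key == key {
                return Some(node.value);
            }
            cur = node.next.as_deref();
        }
        None
    }
}

fn main() {
    let mut m = ChainedMap::new(4);
    for i in 0..16 {
        m.insert(i, i * 10);
    }
    assert_eq!(m.get(7), Some(70));
    assert_eq!(m.get(99), None);
    println!("ok");
}
```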
Building a hash map is easy enough: a few public methods, plus a couple of internal methods to track when to resize.
January 27, 2025 at 2:44 PM
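The shape described here might look something like this minimal sketch (illustrative, not the article's implementation): a couple of public methods, plus internal load-factor tracking that decides when to resize.

```rust
const LOAD_FACTOR: f64 = 0.75;

struct HashMapSketch {
    buckets: Vec<Vec<(u64, u64)>>, // Vec-of-Vec buckets, for simplicity
    len: usize,
}

impl HashMapSketch {
    fn new() -> Self {
        Self { buckets: vec![Vec::new(); 8], len: 0 }
    }

    // Internal: map a key to a bucket (any decent hash works here).
    fn bucket_index(&self, key: u64) -> usize {
        (key.wrapping_mul(0x9E37_79B9_7F4A_7C15) as usize) % self.buckets.len()
    }

    pub fn insert(&mut self, key: u64, value: u64) {
        // Internal bookkeeping: grow once the load factor is exceeded.
        if (self.len as f64) / (self.buckets.len() as f64) > LOAD_FACTOR {
            self.resize();
        }
        let idx = self.bucket_index(key);
        for entry in &mut self.buckets[idx] {
            if entry.0 == key {
                entry.1 = value; // update existing key
                return;
            }
        }
        self.buckets[idx].push((key, value));
        self.len += 1;
    }

    pub fn get(&self, key: u64) -> Option<u64> {
        let idx = self.bucket_index(key);
        self.buckets[idx].iter().find(|e| e.0 == key).map(|e| e.1)
    }

    // Internal: double the bucket count and rehash every entry.
    fn resize(&mut self) {
        let entries: Vec<(u64, u64)> = self.buckets.drain(..).flatten().collect();
        self.buckets = vec![Vec::new(); entries.len().max(8) * 2];
        self.len = 0;
        for (k, v) in entries {
            self.insert(k, v);
        }
    }
}

fn main() {
    let mut m = HashMapSketch::new();
    for i in 0..100 {
        m.insert(i, i * 2);
    }
    assert_eq!(m.get(42), Some(84));
    assert_eq!(m.get(1000), None);
    println!("ok");
}
```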