Zaid Humayun
redixhumayun.bsky.social
Engineer @conviva | Figuring out databases
Seems like it’s all in preparation for an IPO
May 6, 2025 at 8:01 AM
I’m fairly certain Amazon also does an LLM chatbot thing where it’s pretending to be a person on chat.

The formulaic responses give it away.
February 4, 2025 at 10:34 AM
Isn’t this the exact same drama that happened with Ted Ts’o?
February 4, 2025 at 7:52 AM
If you found this interesting, give the full article a read; it goes into much more depth and links to more interesting tidbits about Rust compiler optimizations.

redixhumayun.github.io/performance/...
Cache Conscious Hash Maps
Here’s a link to the code on GitHub
January 27, 2025 at 2:44 PM
Now, we see much better performance when comparing cache stats for the three implementations.

It isn't enough to have contiguous memory access; you also need to be aware of the cache hardware and how many entries fit per cache line.
January 27, 2025 at 2:44 PM
Doing all of the above requires you to know some information about your hardware, like the size of your cache and the size of a cache line.
January 27, 2025 at 2:44 PM
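On Linux, both numbers can be queried from the shell; the `getconf` variable names and the sysfs path below are Linux/glibc-specific, and other platforms expose this differently:

```shell
# Report L1 data cache geometry (values in bytes).
getconf LEVEL1_DCACHE_SIZE       # total L1 data cache size
getconf LEVEL1_DCACHE_LINESIZE   # cache line size, typically 64

# Or read sysfs directly:
cat /sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size
```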
We can use a more compact open addressing scheme that fits 4 entries per cache line.

Also, using u64 keys instead of Strings improves performance, since a String's contents are always heap allocated.
January 27, 2025 at 2:44 PM
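One way to get down to four entries per cache line is a bare 16-byte slot with occupancy tracked out of band. This is an assumed scheme (e.g. a reserved sentinel key, or a separate metadata array), not necessarily how the article does it:

```rust
use std::mem::size_of;

// A bare (u64, u64) slot is 16 bytes, so four fit in a 64-byte
// cache line. Occupancy would be tracked by a sentinel key value
// or a side metadata array (assumption, not the article's code).
#[derive(Clone, Copy)]
struct Slot {
    key: u64,   // a reserved value (e.g. u64::MAX) could mark "empty"
    value: u64,
}

fn main() {
    assert_eq!(size_of::<Slot>(), 16);
    assert_eq!(64 / size_of::<Slot>(), 4); // four entries per cache line

    // A String, by contrast, is a 24-byte handle (ptr + len + capacity)
    // whose character data always lives on the heap, so probing with
    // String keys would chase pointers.
    assert_eq!(size_of::<String>(), 24);
    println!("ok");
}
```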
The problem is the memory layout of the structs used for chaining and open addressing.

With chaining, each slot is just an 8-byte pointer, so 8 fit in a 64-byte cache line.

With open addressing, each entry is 24 bytes, so only 2 fit per cache line once padding is accounted for.
January 27, 2025 at 2:44 PM
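The sizes in this post can be checked directly with `std::mem::size_of`. The slot types below are my guesses at representative layouts, not the article's actual structs:

```rust
use std::mem::size_of;

fn main() {
    // Chaining: each slot is just a pointer to a heap node.
    // Option<Box<T>> is pointer-sized thanks to the null-pointer niche.
    type ChainSlot = Option<Box<(u64, u64)>>;

    // Naive open addressing: Option<(u64, u64)> needs a discriminant
    // on top of the 16-byte payload, padded out to 24 bytes total.
    type OpenSlot = Option<(u64, u64)>;

    assert_eq!(size_of::<ChainSlot>(), 8);
    assert_eq!(size_of::<OpenSlot>(), 24);

    const CACHE_LINE: usize = 64;
    println!("chaining: {} slots per line", CACHE_LINE / size_of::<ChainSlot>());
    println!("open addressing: {} slots per line", CACHE_LINE / size_of::<OpenSlot>());
}
```

With a 64-byte line that works out to 8 pointer slots versus 2 padded entries, matching the numbers above.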
Profiling, however, shows very different results: chaining and open addressing display roughly the same cache performance.

Open addressing still ends up faster, but only because it executes fewer instructions overall.
January 27, 2025 at 2:44 PM
You can handle collisions via open addressing (linear probing), where you scan through each subsequent entry until you find a slot for the key-value pair.

This should be more cache friendly, since the entire table is contiguously allocated.
January 27, 2025 at 2:44 PM
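A minimal linear-probing sketch of the idea above (illustrative names, not the article's code; there is no resizing, so the table must never fill up):

```rust
// Open addressing: one contiguous Vec of slots. On a collision we
// scan forward (wrapping around) until we hit the key or an empty slot.
struct OpenMap {
    slots: Vec<Option<(u64, u64)>>,
}

impl OpenMap {
    fn new(capacity: usize) -> Self {
        Self { slots: vec![None; capacity] }
    }

    fn insert(&mut self, key: u64, value: u64) {
        let mut idx = (key as usize) % self.slots.len();
        loop {
            let slot = self.slots[idx];
            match slot {
                // Occupied by a different key: probe the next slot.
                Some((k, _)) if k != key => idx = (idx + 1) % self.slots.len(),
                // Empty, or same key: write here.
                _ => {
                    self.slots[idx] = Some((key, value));
                    return;
                }
            }
        }
    }

    fn get(&self, key: u64) -> Option<u64> {
        let mut idx = (key as usize) % self.slots.len();
        loop {
            match self.slots[idx] {
                Some((k, v)) if k == key => return Some(v),
                Some(_) => idx = (idx + 1) % self.slots.len(),
                // An empty slot ends the probe sequence: key is absent.
                None => return None,
            }
        }
    }
}

fn main() {
    let mut m = OpenMap::new(8);
    m.insert(1, 100);
    m.insert(9, 900); // collides with key 1 in an 8-slot table, probes forward
    assert_eq!(m.get(1), Some(100));
    assert_eq!(m.get(9), Some(900));
    assert_eq!(m.get(2), None);
    println!("ok");
}
```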
You can handle collisions via chaining where each entry becomes a linked list. This is simple but tends to lead to fragmented memory since each node is allocated separately.

Fragmented memory should theoretically lead to poorer cache performance.
January 27, 2025 at 2:44 PM
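A minimal sketch of the chaining scheme described above (names and hashing are illustrative, not the article's actual code). Every node is a separate `Box` allocation, which is exactly where the fragmentation comes from:

```rust
// Chaining: each bucket heads a singly linked list of nodes.
struct Node {
    key: u64,
    value: u64,
    next: Option<Box<Node>>,
}

struct ChainedMap {
    buckets: Vec<Option<Box<Node>>>,
}

impl ChainedMap {
    fn new(capacity: usize) -> Self {
        Self { buckets: (0..capacity).map(|_| None).collect() }
    }

    fn insert(&mut self, key: u64, value: u64) {
        let idx = (key as usize) % self.buckets.len();
        // Push a new node at the head of the chain: a fresh heap
        // allocation every time, at an address unrelated to its
        // neighbours in the chain.
        let head = self.buckets[idx].take();
        self.buckets[idx] = Some(Box::new(Node { key, value, next: head }));
    }

    fn get(&self, key: u64) -> Option<u64> {
        let idx = (key as usize) % self.buckets.len();
        let mut cur = self.buckets[idx].as_deref();
        // Walking the chain hops between scattered heap addresses.
        while let Some(node) = cur {
            if node.key == key {
                return Some(node.value);
            }
            cur = node.next.as_deref();
        }
        None
    }
}

fn main() {
    let mut m = ChainedMap::new(4);
    for i in 0..16 {
        m.insert(i, i * 10);
    }
    assert_eq!(m.get(7), Some(70));
    assert_eq!(m.get(99), None);
    println!("ok");
}
```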
Building a hash map is easy enough: a few public methods, plus a couple of internal methods to track when to resize.
January 27, 2025 at 2:44 PM
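The shape described here might look something like this minimal sketch (illustrative, not the article's implementation): a couple of public methods, plus internal load-factor tracking that decides when to resize.

```rust
const LOAD_FACTOR: f64 = 0.75;

struct HashMapSketch {
    buckets: Vec<Vec<(u64, u64)>>, // Vec-of-Vec buckets, for simplicity
    len: usize,
}

impl HashMapSketch {
    fn new() -> Self {
        Self { buckets: vec![Vec::new(); 8], len: 0 }
    }

    // Internal: map a key to a bucket (any decent hash works here).
    fn bucket_index(&self, key: u64) -> usize {
        (key.wrapping_mul(0x9E37_79B9_7F4A_7C15) as usize) % self.buckets.len()
    }

    pub fn insert(&mut self, key: u64, value: u64) {
        // Internal bookkeeping: grow once the load factor is exceeded.
        if (self.len as f64) / (self.buckets.len() as f64) > LOAD_FACTOR {
            self.resize();
        }
        let idx = self.bucket_index(key);
        for entry in &mut self.buckets[idx] {
            if entry.0 == key {
                entry.1 = value; // update existing key
                return;
            }
        }
        self.buckets[idx].push((key, value));
        self.len += 1;
    }

    pub fn get(&self, key: u64) -> Option<u64> {
        let idx = self.bucket_index(key);
        self.buckets[idx].iter().find(|e| e.0 == key).map(|e| e.1)
    }

    // Internal: double the bucket count and rehash every entry.
    fn resize(&mut self) {
        let entries: Vec<(u64, u64)> = self.buckets.drain(..).flatten().collect();
        self.buckets = vec![Vec::new(); entries.len().max(8) * 2];
        self.len = 0;
        for (k, v) in entries {
            self.insert(k, v);
        }
    }
}

fn main() {
    let mut m = HashMapSketch::new();
    for i in 0..100 {
        m.insert(i, i * 2);
    }
    assert_eq!(m.get(42), Some(84));
    assert_eq!(m.get(1000), None);
    println!("ok");
}
```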