Lightnews — Scholar-powered news

AJ Stuyvenberg

@ajs.bsky.social

I use AWS a ton but Lambda still astounds me. Throw some code in a function, send 1m requests as fast as you can.

It ate up all available file descriptors on my little t3 box and still ran 18k RPS with a p99 of 0.3479s. Not many services can go from 0 to 18k RPS instantaneously with this p99.

September 18, 2025 at 1:03 PM

AJ Stuyvenberg

@ajs.bsky.social

Lambda now charges for init time, so it's useful to count sandboxes which are proactively initialized but never receive a request.

Here's what happens after a 10k request burst. Hundreds of sandbox shutdowns, along with 22 sandboxes which were spun up but never received a request.

August 5, 2025 at 5:51 PM

AJ Stuyvenberg

@ajs.bsky.social

Happy Lambda Init Billing day to those who celebrate. Fix your cold starts!

August 1, 2025 at 3:04 PM

AJ Stuyvenberg

@ajs.bsky.social

NEW: Lambda can now send payloads of up to 200 MB using response streaming! I assume this is mostly directed at LLM inference workloads, where chatbots can stream large amounts of data over the wire as it becomes available.

August 1, 2025 at 1:10 AM

AJ Stuyvenberg

@ajs.bsky.social

I've long been an advocate for the Lambda Web Adapter project which lets anyone pretty easily ship an app to Lambda without learning about the event model/API.

Honestly AWS should simply support this natively.

July 30, 2025 at 5:43 PM

AJ Stuyvenberg

@ajs.bsky.social

Here's another 33% cold start reduction, which comes from deferring expensive decryption calls made to AWS Secrets Manager until the secret is actually needed.

Lazy loading is great!
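
A minimal sketch of the lazy-loading idea, not the actual extension code: `fetch_secret_from_store` is a hypothetical stand-in for the slow Secrets Manager call, and `get_secret`, `handler`, and the `needs_secret` event field are names made up for illustration. Nothing is fetched at import (init) time, so the cold start doesn't pay for it.

```python
import functools

# Records every (simulated) remote fetch so the laziness is observable.
fetch_log = []


def fetch_secret_from_store(name: str) -> str:
    """Hypothetical stand-in for a slow Secrets Manager decryption call."""
    fetch_log.append(name)
    return f"value-of-{name}"


@functools.lru_cache(maxsize=None)
def get_secret(name: str) -> str:
    # The first call pays the fetch cost; later calls in the same
    # sandbox are served from the in-memory cache.
    return fetch_secret_from_store(name)


def handler(event, context=None):
    # Only requests that actually need the secret trigger the fetch.
    if event.get("needs_secret"):
        return {"token": get_secret("api-key")}
    return {"token": None}
```

The tradeoff is that the first request needing the secret absorbs the latency instead of the init phase.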

July 22, 2025 at 3:31 PM

AJ Stuyvenberg

@ajs.bsky.social

Here's how to visualize a 100% memory allocation improvement!

A recent stress test revealed that malloc calls bottlenecked when sending > 100k spans through the API and aggregator pipelines in Lambda.

July 18, 2025 at 3:07 PM

AJ Stuyvenberg

@ajs.bsky.social

NEW: A recent blog post about a silent crash in AWS Lambda's Node.js runtime went viral in the AWS ecosystem.

Today I'll step you through the actual Lambda runtime code which causes this confusing issue, and walk you through how to safely perform async work in Lambda:

July 17, 2025 at 3:29 PM

AJ Stuyvenberg

@ajs.bsky.social

Lambda's fleet management shutdown algorithm is learning faster!

I'm calling this function every 8 or so minutes. At first the gap from invocation to shutdown is about 5-6 minutes, which was the fastest I've observed during previous experiments.

July 14, 2025 at 4:07 PM

AJ Stuyvenberg

@ajs.bsky.social

NEW: AWS is rolling out a new free tier beginning July 15th!!

New accounts get $100 in credits to start and can earn $100 exploring AWS resources. You can now explore AWS without worrying about incurring a huge bill. This is great!

docs.aws.amazon.com/awsaccountbi...

July 11, 2025 at 3:23 PM

AJ Stuyvenberg

@ajs.bsky.social

You should care about your p99! By improving the function cold start time, the service on the left performed:
2x faster in RPS, and thus in duration.
p99 from 1.52s -> 0.949s

The code and functionality are identical, but improving the cold start from 816ms to 301ms made all the difference.
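
To see why a small share of cold starts dominates the p99, here's a rough sketch using a nearest-rank percentile. Only the 816ms and 301ms cold start figures come from the post; the 50ms warm latency, the 1,000 requests, and the 2% cold start rate are assumptions picked for illustration, so this won't reproduce the measured numbers above.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def simulate(cold_start_ms, warm_ms=50.0, total=1000, cold=20):
    """Assumed workload: `cold` of `total` requests pay the cold start on top of warm latency."""
    return [warm_ms] * (total - cold) + [warm_ms + cold_start_ms] * cold


slow = simulate(816)  # cold start before the fix
fast = simulate(301)  # cold start after the fix

print(percentile(slow, 50), percentile(slow, 99))  # median is unaffected by cold starts
print(percentile(fast, 50), percentile(fast, 99))
```

Under these assumptions the median is identical in both runs, while the p99 tracks the cold start time almost one-to-one.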

July 10, 2025 at 3:36 PM

AJ Stuyvenberg

@ajs.bsky.social

Quick PSA to make sure you're using a DLQ and setting a max receive count for SQS, otherwise you may find yourself looking at a flamegraph like this.

Hundreds of attempts, multiple messages stuck in the queue and not burning down, and the average message age ticking up! Seems like common knowledge, but...
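
For reference, a sketch of the queue attributes involved. `RedrivePolicy`, `deadLetterTargetArn`, and `maxReceiveCount` are the real SQS attribute names; the helper name, the default of 5 receives, and the ARN in the usage note are placeholders, not recommendations.

```python
import json


def queue_attributes_with_dlq(dlq_arn: str, max_receive_count: int = 5) -> dict:
    """Build the SQS queue attributes that route repeatedly failing messages to a DLQ."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    return {
        # SQS expects the redrive policy as a JSON-encoded string.
        "RedrivePolicy": json.dumps(
            {
                "deadLetterTargetArn": dlq_arn,
                "maxReceiveCount": max_receive_count,
            }
        )
    }
```

With boto3 this dict would be passed as `Attributes` to `set_queue_attributes` (or `create_queue`), e.g. `queue_attributes_with_dlq("arn:aws:sqs:us-east-1:123456789012:my-dlq")` with your own DLQ ARN. After `maxReceiveCount` failed receives, a message moves to the DLQ instead of being retried forever.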

July 8, 2025 at 3:55 PM

AJ Stuyvenberg

@ajs.bsky.social

Monitoring Lambda sandbox shutdowns reveals an interesting scaling behavior

After a request spike, Lambda waits ~10m before reaping 2/3rds of sandboxes. 5m later it begins reaping the rest.

Presumably this helps smooth latency during retry storms, or if traffic returns!

May 29, 2025 at 3:55 PM

AJ Stuyvenberg

@ajs.bsky.social

So the strategy looks something like this:

May 21, 2025 at 5:04 PM

AJ Stuyvenberg

@ajs.bsky.social

The hard part is figuring out what to do when function invocation rates slow down and you need to adapt back to flushing data and blocking /next until it completes, because otherwise you'll drop data.

Lambda functions can't coordinate so you have to figure this out another way

May 21, 2025 at 5:04 PM

AJ Stuyvenberg

@ajs.bsky.social

How? In short, I cheated.

If the invocations are frequent enough (more frequent than the TCP Keep Alive/server timeout), you can drive one flush request asynchronously across multiple invocations and it costs you $0 in billed time.

That's not the hard part though.

May 21, 2025 at 5:03 PM

AJ Stuyvenberg

@ajs.bsky.social

Here's another huge p99 cliff!

One unreasonably effective way to lower the average latency of a service is to minimize the causes of p99 events.

Here, we've managed to absolutely crush the Max Post Runtime Duration from ~80ms to 500µs!

May 21, 2025 at 4:34 PM

AJ Stuyvenberg

@ajs.bsky.social

Truly incredible to learn that Epic Games isn't self hosted on a VPS.

May 6, 2025 at 3:55 PM

AJ Stuyvenberg

@ajs.bsky.social

PRICE CUT: Lambda slashes the price of CloudWatch Logs at high volume, from $0.50/GB down to $0.05/GB after 50TB.

You've gotta be spending a decent chunk of $$$ on CloudWatch Logs for this to help, but still – a price cut is a price cut!

May 1, 2025 at 7:55 PM

AJ Stuyvenberg

@ajs.bsky.social

Combining profiler data with Claude Code feels super powerful.

Just drop a pprof file into a project, explain the dimensions of the profile, then let the LLM make suggestions to solve the hot spots.

Instant performance boost

May 1, 2025 at 4:29 PM

AJ Stuyvenberg

@ajs.bsky.social

NEW: AWS Lambda will now begin billing for the INIT phase for all runtimes (not just custom/container runtimes). Your cold starts cost money now!

My hottest take is that AWS should have done this years ago.

April 29, 2025 at 7:49 PM

AJ Stuyvenberg

@ajs.bsky.social

The evolution of my program when each branch is benchmarked & profiled before merging

April 23, 2025 at 4:52 PM

AJ Stuyvenberg

@ajs.bsky.social

Check out that p99 cliff!

I spent 2024 shipping our next-generation Lambda Extension, which offers an 82% improvement in cold start time, better aggregation and flushing options, and lower overall overhead – all in a substantially smaller binary package.

More: www.datadoghq.com/blog/enginee...

April 10, 2025 at 4:21 PM

AJ Stuyvenberg

@ajs.bsky.social

Cloudflare isn't pulling any punches with this Twitter outage. Absolutely brutal:

March 10, 2025 at 5:10 PM

AJ Stuyvenberg

@ajs.bsky.social

🚨 NEW VIDEO: Good clocks could be huge for us devs – so now I've finally made a video about databases, how they are distributed, and why precise clocks can change things.

This can also go horribly wrong, so we'll cover that too.

Check it out – https://buff.ly/4hhkVtw

January 22, 2025 at 7:21 PM
