Lightnews — Scholar-powered news

P-Y

@p-y.wtf

TIL that you can easily slice a pizza or pie in 5 pieces: make a paper pentagon, place it at the center, use corners as slice starting points.

Paper pentagon? Tie a simple overhand knot with a long strip of paper, keeping it flat, cut off the ends.

Wow. Life changing trick 😊

November 8, 2025 at 2:51 PM

P-Y

@p-y.wtf

Waiting for the build

A laptop building an Android app, with the Mediterranean sea in the background

October 30, 2025 at 12:25 PM

P-Y

@p-y.wtf

That's where ChatGTP pointed me to the Hodges–Lehmann estimator. It seems to be a fairly simple concept (see screenshots)

October 24, 2025 at 1:13 PM

P-Y

@p-y.wtf

In my experience, in a perf context (benchmarks or prod data), if you manage to isolate dimensions, you often end up with overlapping normal distributions (e.g. in the previous example 3 gaussians)

I need to look into this clustering method called Gaussian Mixture Models

October 24, 2025 at 1:13 PM

P-Y

@p-y.wtf

Even with a test that does ONLY scrolling frames, the distribution different peaks/modes (e.g. fast, less fast, slow)

Why? Some frames just move things already drawn by a few px, others have to redraw a whole new item.

Saying "the P50 shifted from 6 ms to 4.6 ms" hides these clusters.

October 24, 2025 at 1:13 PM

P-Y

@p-y.wtf

Frames durations are ANYTHING BUT normally distributed.

First, maybe your test sucks.

Here, both "before" & "after" show a slow 1st (starting activity) & 40th frame (load images)

That test isn't correctly focused on scrolling frames, it includes initial loading frames.

October 24, 2025 at 1:13 PM

P-Y

@p-y.wtf

Anyway, thanks ChatGPT, it does look like the Hodges–Lehmann estimator is what I was looking for to compute the P50 with CI shift in frame duration between a before & after benchmark of 100 iterations each (where each iteration has N frame and computes a P50 frame time)

cc @rahulrav.com

October 24, 2025 at 12:02 PM

P-Y

@p-y.wtf

rods... WORDS, I MEANT WORDS godamnit

October 24, 2025 at 11:50 AM

P-Y

@p-y.wtf

Basically I can stop being a lazy ass asking the LLMs to write the code and instead be a lazy ass that uses your code

October 22, 2025 at 9:08 PM

P-Y

@p-y.wtf

Sometimes it's also helpful to visualize the values in order instead of as a distribution. For example, in this blog post I'm writing, you can see how the first frame is super slow, and then some other frame later is slow, in both before & after benchmark runs.

October 22, 2025 at 8:52 PM

P-Y

@p-y.wtf

- This would be most useful if you could load TWO json files, the before & after, and visually compare the distributions / see shifts
- I'm guess you picked 100 bins as default, and (max-min)/ bin_count as bucket width. Not always good. Try other things, e.g. en.wikipedia.org/wiki/Freedma... ?

October 22, 2025 at 8:49 PM

P-Y

@p-y.wtf

Neat! I gave it a quick try, some feedback

- The Drag&Drop box should be the entire screen (if you miss the rectangle, Chrome just opens your json file)
- I loaded a file, it showed a nice chart, but then bottom seems to have something else trying to load (I see a loader) and nothing is loading.

October 22, 2025 at 8:45 PM

P-Y

@p-y.wtf

Great crowd with the Paris Android User Group at Radio France

Showing how frame metrocs & percentiles are computed in Macrobenchmark, then I went on to explain how to draw a distribution, compute confidence intervals, and do bootstrapping. That was a little ambitious, but tons of fun

#AndroidFun

October 21, 2025 at 10:51 PM

P-Y

@p-y.wtf

Thanks! One cool trick is you can buy a letter set then imprint messages for extra sparkles

October 19, 2025 at 5:54 PM

P-Y

@p-y.wtf

Neat! I like to make 2mm thick cookies (using a special rolling pin) and cook them in between 2 sylpain mats for half the bake, keeps them flat + adds a neat detailed pattern

October 19, 2025 at 4:16 PM

P-Y

@p-y.wtf

This is in less than an hour!

I'll be talking about Benchmarking with confidence.. wish me luck, live demos incoming

www.meetup.com/new-york-and...

#AndroidDev

October 16, 2025 at 8:09 PM

P-Y

@p-y.wtf

[French]

Mardi 21 Octobre (dans 2 semaines!) à 19h, le Paris Android User Group organise un meetup chez Radio France. Il paraît que le lieu est magnifique!

Je serai là en 2e partie pour vous parler de benchmarks dignes de confiance.

www.meetup.com/android-pari...

#AndroidDev #paris

Talk 2. Des benchmarks dignes de confiance
👱‍♂️ Pierre-Yves Ricau

Jetpack Macrobenchmark, à première vue ça paraît assez simple: j'écris un scénario de test, je fais tourner le benchmark avant et après une optimisation, et hop, 20% d'amélioration de la médiane, je prends la confiance et je file demander une augmentation ! Hélas, un collègue a la mauvaise idée de relancer le même benchmark et obtient seulement 15% d'amélioration. Qui a raison, quel résultat communiquer?

L'objectif de cette présentation est de vous apprendre à creuser au-delà du résultat d'un benchmark, obtenir une meilleure appréhension de ce que vous avez vraiment mesuré, et saupoudrer le tout d'un peu de stats pour avoir l'air intelligent en soirée.

October 7, 2025 at 8:59 AM

P-Y

@p-y.wtf

@swank.ca and I casually sharing war stories over coffee.. in front of 3000 Block engineers

September 25, 2025 at 5:52 AM

P-Y

@p-y.wtf

Packed schedule:

- Last week, Block offsite in California with 8000 employees... Sleeping from midnight (party!) to 4am (jetlag!) every night.. + 2 talk engs talks to give
- then Octoberfest in Munich this weekend
- then Droidcon Berlin next week, with 5mn in keynote + 40mm talk

🫩

#AndroidDev

September 21, 2025 at 2:13 PM

P-Y

@p-y.wtf

Ah, so that's what that was!

www.anthropic.com/engineering/...

2. Output corruption
On August 25, we deployed a misconfiguration to the Claude API TPU servers that caused an error during token generation. An issue caused by a runtime performance optimization occasionally assigned a high probability to tokens that should rarely be produced given the context, for example producing Thai or Chinese characters in response to English prompts, or producing obvious syntax errors in code. A small subset of users that asked a question in English might have seen "สวัสดี" in the middle of the response, for example.

September 20, 2025 at 2:42 PM

P-Y

@p-y.wtf

LOL. Right. If you're going to scam or phish, do a little better..

September 3, 2025 at 6:52 PM

P-Y

@p-y.wtf

This isn't a valid way to present improvements.

Let's say we run a benchmark again with no change and get a slight different aggregate. E.g. for "before" P50 -5 instead of -4.2. Suddenly the improvement vs "after" looks less good!

What now? Which one do you pick?

This is why we need stats.

August 31, 2025 at 7:19 AM

P-Y

@p-y.wtf

Android devices have all sorts of screen sizes, so you can't really predict how many items will be on / off screen. Feels like something Compose could help with, dynamically switching Lazy off in those cases? idk

August 31, 2025 at 6:28 AM

P-Y

@p-y.wtf

Ahem. This can't be serious advice, how could one audit a large codebase for this, when it's not even documented which APIs trigger binder calls?

AOSP should document APIs that make binder calls + have custom trace sections for all binder calls.

August 31, 2025 at 6:22 AM

P-Y

@p-y.wtf

I love this simplified example, but the colors are too vibrant, Perfetto could never produce this 😉. Trivia: span color in Perfetto is actually a hash of the trace section string

August 31, 2025 at 6:17 AM

Add to Home Screen

Light up
your news

Add to Home Screen

Light upyour news

Sign in to Lightnews

Sign up to start reading

Connect Bluesky

Connect with Bluesky

Light up
your news