Falvyu
falvyu.bsky.social
PhD | French | Hardware-aware Algorithm design | Image Processing | HPC

SIMD-friends: #SSE, #AVX512, #NEON, #RVV
(Opinions are my own)
I don't know anything about Raku, but these ads/posters are quite cool. #FOSDEM
February 1, 2025 at 5:16 PM
Hello Brussels! #FOSDEM
January 31, 2025 at 2:06 PM
Some neat tricks for computing bit-wise prefix-or and segmented-prefix-or within scalar registers.
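For readers unfamiliar with the term, here is a minimal sketch of a bit-wise prefix-or on a 64-bit scalar, assuming the scan runs from the least significant bit upwards (output bit i is the OR of input bits 0..i). The function names are hypothetical and this is not necessarily the trick shown in the post. The first version is the usual log-step scan; the second relies on the fact that `v | -v` sets every bit at or above the lowest set bit.

```cpp
#include <cstdint>

// Log-step inclusive prefix-or, scanning from bit 0 towards bit 63.
// After the step with shift k, bit i holds the OR of input bits i-2k+1..i.
std::uint64_t prefix_or_logstep(std::uint64_t v) {
    v |= v << 1;
    v |= v << 2;
    v |= v << 4;
    v |= v << 8;
    v |= v << 16;
    v |= v << 32;
    return v;
}

// Same result in two operations: the two's complement negation of v keeps the
// lowest set bit and inverts every bit above it, so OR-ing it with v fills
// everything from the lowest set bit up to bit 63.
std::uint64_t prefix_or_neg(std::uint64_t v) {
    return v | (0 - v);
}
```

Both functions return 0 for a zero input, since there is no set bit to propagate.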
December 29, 2024 at 1:57 AM
It turns out that the 'unrolled loop' for the 64-bit bitwise-or segmented scan can actually be reduced to a few instructions.
(not sure if that's a known trick)

Note: compilers optimize `((~mreset) | v)` so that it is only computed once.
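For context, the 'unrolled loop' baseline can be sketched as below. This is only the log-step form, not the reduced version the post refers to, and it rests on assumptions: the scan runs from bit 0 upwards, and a set bit in `mreset` marks the first bit of a new segment (that bit does not receive anything from the bit below it). The function name is hypothetical.

```cpp
#include <cstdint>

// Log-step segmented prefix-or (Kogge-Stone style fill).
// k is the propagate mask: before the step with shift d, bit i of k is set
// only if bit i may receive a value from bit i-d without crossing a reset.
std::uint64_t segmented_prefix_or_logstep(std::uint64_t v, std::uint64_t mreset) {
    std::uint64_t k = ~mreset;  // bit i set: bit i may receive from bit i-1
    v |= k & (v << 1);   k &= k << 1;
    v |= k & (v << 2);   k &= k << 2;
    v |= k & (v << 4);   k &= k << 4;
    v |= k & (v << 8);   k &= k << 8;
    v |= k & (v << 16);  k &= k << 16;
    v |= k & (v << 32);  // last step, k is no longer needed
    return v;
}
```

With `mreset == 0` this degenerates to a plain prefix-or, and with every bit of `mreset` set it returns `v` unchanged.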
December 28, 2024 at 11:26 PM
*screams internally*
December 26, 2024 at 1:51 AM
I have the RLE bandwidth (in gigapixels per second) and the cycles per pixel (cpp).
Here, the cpp is derived from the duration and the clock frequency (i.e. it's meant to be a rough approximation of 'hardware cycles').
I do have 'proper cycles' measurements, but not at this level of granularity.
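As an illustration of that derivation, here is a minimal sketch, assuming a single fixed clock frequency for the whole run (in practice the clock can vary with boost or throttling, which is why the result is only an approximation). The function and parameter names are hypothetical.

```cpp
// Approximate cycles per pixel from wall-clock duration and a nominal frequency.
double cycles_per_pixel(double duration_s, double frequency_hz, double pixel_count) {
    return duration_s * frequency_hz / pixel_count;
}

// Throughput in gigapixels per second over the same run.
double gigapixels_per_second(double duration_s, double pixel_count) {
    return pixel_count / duration_s / 1e9;
}
```

For example, 1e9 pixels processed in 0.5 s on a 2 GHz core gives about 1 cycle per pixel and 2 gigapixels per second.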
December 6, 2024 at 11:38 PM
A closer look at one of the 'main' SIMD parts, a Run Length Encoding algorithm (8-bit pixels => 16-bit segments), shows promising results for RVV (even if AVX512 and Neon on the M1 provide a larger speedup).
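To make the '8-bit pixels => 16-bit segments' mapping concrete, here is a scalar sketch of one possible formulation. It is an assumption, not the implementation measured in the post: pixels are treated as binary (zero is background, non-zero is foreground), and each run of foreground pixels is emitted as a half-open pair of 16-bit column indices [begin, end). The name `rle_row` is hypothetical, and the row width must fit in 16 bits.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar RLE of one image row: emits begin/end column indices of each
// foreground run, so the output always has an even number of entries.
// Assumes width <= 65535 so that every index fits in a uint16_t.
std::vector<std::uint16_t> rle_row(const std::uint8_t* pixels, std::size_t width) {
    std::vector<std::uint16_t> bounds;
    bool prev = false;  // background is assumed to the left of the row
    for (std::size_t x = 0; x < width; ++x) {
        const bool cur = pixels[x] != 0;
        if (cur != prev) {
            // A transition marks either the start or the end of a run.
            bounds.push_back(static_cast<std::uint16_t>(x));
        }
        prev = cur;
    }
    if (prev) {
        // Close a run that touches the right border.
        bounds.push_back(static_cast<std::uint16_t>(width));
    }
    return bounds;
}
```

The per-pixel branch on transitions is the kind of data-dependent control flow that a SIMD version would need to replace with mask and compress operations.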
December 6, 2024 at 1:48 AM
A 'rough' determination of the efficiency of each SIMD implementation can be done by measuring them against their scalar versions.
It is worth noting that only portions of the algorithm have been vectorized, and the performance of non-SIMD parts (which can be branch-heavy) will also vary.
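The effect of partial vectorization on the overall speedup can be illustrated with Amdahl's law. This is a textbook model rather than anything measured in the thread, and the function name and parameters are hypothetical: `simd_fraction` is the share of scalar run time spent in the vectorized portion, and `simd_speedup` is the speedup of that portion alone.

```cpp
// Amdahl's law: overall speedup when only a fraction of the run time is accelerated.
double overall_speedup(double simd_fraction, double simd_speedup) {
    return 1.0 / ((1.0 - simd_fraction) + simd_fraction / simd_speedup);
}
```

For instance, if 75% of the scalar run time is vectorized with a 3x speedup, the whole algorithm only gets about 2x faster.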
December 6, 2024 at 1:48 AM
A performance comparison against SotA algorithms shows good results (note: not *just* because of SIMD, other transformations have also been proposed).
December 6, 2024 at 1:48 AM
I defended my PhD last Friday (design of efficient data-dependent image processing algorithms).

I'd like to thank everyone who has been involved in it (jury members, advisors, colleagues, and of course my family).
This has been a long adventure, and I'm now looking forward to the next thing.
December 3, 2024 at 6:31 PM
MFW
November 27, 2024 at 8:03 PM