Rudolf Arseni Braun
fasttosmile.bsky.social
Here for AI stuff. Currently ASR@AWS

Sometimes write on rudolfarseni.me
Maybe it's good to play games because it's a way to get a verifiable reward signal for one's thoughts.
March 19, 2025 at 6:21 PM
Reposted by Rudolf Arseni Braun
I just learned that Torch ctc_loss calculates the wrong gradient (but when there was log_softmax before, it does not matter).

For the grad ctc_loss w.r.t. log_probs, it calculates exp(log_probs) - y, but correct would be -y. Some workaround: github.com/pytorch/pyto...

PS: First Bluesky post.
CTCLoss gradient is incorrect · Issue #52241 · pytorch/pytorch
🐛 Bug Hi, While working on some CTC extensions, I noticed that torch's CTCLoss was computing incorrect gradient. At least when using CPU (I have not tested on GPU yet). I observed this problem on b...
github.com
November 26, 2024 at 11:16 PM
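A minimal sketch of why the discrepancy described in the post above would not matter after log_softmax, assuming the per-frame target term y sums to 1 over classes (true for CTC occupation posteriors). It uses plain Python rather than Torch; the values in `logits` and `y` and the helper names are hypothetical and chosen only for illustration. Backpropagating through log_softmax maps an upstream gradient g to g_j - p_j * sum(g), so both -y and exp(log_probs) - y end up as the same gradient on the logits.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax_backward(upstream, probs):
    """Backprop through log_softmax: dL/dz_j = g_j - p_j * sum_k g_k."""
    g_sum = sum(upstream)
    return [g - p * g_sum for g, p in zip(upstream, probs)]


# Hypothetical single frame with four classes.
logits = [0.3, -1.2, 2.0, 0.5]
probs = softmax(logits)  # equals exp(log_probs)

# Hypothetical per-frame posteriors; assumed to sum to 1.
y = [0.1, 0.6, 0.05, 0.25]

# Gradient w.r.t. log_probs that the post calls correct: -y.
grad_expected = [-yk for yk in y]
# Gradient w.r.t. log_probs that the post attributes to Torch: exp(log_probs) - y.
grad_torch_style = [p - yk for p, yk in zip(probs, y)]

# After log_softmax, both reduce to probs - y on the logits.
via_expected = log_softmax_backward(grad_expected, probs)
via_torch_style = log_softmax_backward(grad_torch_style, probs)

max_diff = max(abs(a - b) for a, b in zip(via_expected, via_torch_style))
print("gradients on logits agree:", max_diff < 1e-12)
```

The two gradients differ at the log_probs level but agree on the logits, because the extra exp(log_probs) term sums to 1 and is cancelled by the log_softmax Jacobian. If log_probs were produced some other way (not normalized by log_softmax), this cancellation would not apply.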
Reposted by Rudolf Arseni Braun
Observing the responses here and on twitter made me reflect, realize and act. There is a real difference, and I describe it (as well as some of the non-differences) here:

gist.github.com/yoavg/9142e5...
November 23, 2024 at 10:35 PM