Akhil
akhilakella.bsky.social
I ♥️ GNNs, LLMs, non-Euclidean geometry, uncertainty quantification, and Science of Science. Research Scientist at @KelloggCSSI. Opinions = own, RT/Like != endorsement.
I just asked "what is the last word in this sentence?". Someone should adjust the training mix to support diverse length rewards, I guess.
June 5, 2025 at 2:41 PM
From an RL experiment on a small dataset:
1. yellow (no explicit instructions) never gained any reward from learning.
2. red (added one extra sentence) started to improve and explore the reward.
3. green (added a whole sentence) had the best performance.

Prompting matters, I guess.
April 9, 2025 at 12:06 AM
In case you're wondering what the end output looks like...
April 2, 2025 at 4:35 PM
It was wonderful giving a guest lecture in Prof. Alhoori's (NIU CS) class on developing reasoning models using GRPO for scientific texts. Here are a couple of cool slides from the presentation...
April 2, 2025 at 4:34 PM