shane caldwell
shanecaldwell.bsky.social
synthetic data, RL, hackbots - writing at https://hackbot.dad/
The ease with which you can change environment or action space representations is legitimately magic. Using a judge as a reward function just feels like cheating.

I'd recommend anyone interested try out @openpipe.bsky.social's ART and @willccbb.bsky.social's Verifiers.
August 25, 2025 at 2:16 PM