Jason
@jas-ho.bsky.social
Co-Director Apart Research | apartresearch.com
Aligning AIs is hard, and even knowing what to aim for is non-trivial. Very excited about the work by @jacyanthis.bsky.social and team on this important problem, and very proud that @apartresearch.bsky.social was able to support this project!
LLM agents are optimized for thumbs-up instant gratification. RLHF -> sycophancy.

We propose human agency as a new alignment target in HumanAgencyBench, made possible by AI simulation and evals. We find, e.g., that Claude most supports agency but also most tries to steer user values 👇 arxiv.org/abs/2509.08494
September 16, 2025 at 10:49 AM
Reposted by Jason
If you wanted to see how little attention folks are paying to the possibility of AGI (however defined), no matter how much the labs publicly discuss it: here is an official course from Google DeepMind whose first session is "we are on a path to superhuman capabilities".

It has fewer than 1,000 views.
April 3, 2025 at 3:05 PM