Zygi
nonagon.bsky.social
Making computers solve problems we can't. Occasional cypherpunk. DMs open
Examples:
- GCG, PGD and other token-space optimization algorithms (optimize discrete F: token-sequence -> R)
- SAEs (why does a simple sparseness hack work so well?)
November 25, 2024 at 7:53 AM
so any environment where there's a reliable "action done" event that you can catch programmatically works much much better
November 24, 2024 at 9:52 PM
example trace from one of my experiments:
- claude clicks "expand"
- expand takes 2.5 seconds, so the next screenshot sent to claude (after ~2s) has no changes
- claude clicks again, but by that point expansion already happened, so the menu gets collapsed
- claude gets very confused and gives up
November 24, 2024 at 9:51 PM
but also, some manual integration remains necessary. claude doesn't see a video, it sees screenshots with long gaps in between (the reference impl waits about 2 seconds). how does claude know whether the application finished responding to the click, is still loading, or the click didn't go through? it's hard
November 24, 2024 at 9:48 PM
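a rough sketch of one way to approximate an "action done" signal when the environment doesn't emit one: poll screenshots after the click and wait until the frame has changed and then stopped changing. everything here is hypothetical (the `wait_until_settled` name, the `capture` callback, the status strings, the default timings); it's not taken from the reference implementation, and comparing raw bytes assumes the screen has no animations or blinking cursors.

```python
import time
from typing import Callable


def wait_until_settled(
    capture: Callable[[], bytes],   # hypothetical: returns the current screenshot as bytes
    before: bytes,                  # screenshot taken just before the click
    timeout: float = 10.0,
    interval: float = 0.25,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll screenshots after an action and classify what happened.

    Returns:
      "settled"   - the screen changed and two consecutive frames then matched
      "changing"  - the screen changed but was still changing at the timeout
      "no_change" - nothing changed (click may not have gone through, or the
                    app is slower than the timeout)
    """
    deadline = clock() + timeout
    previous = before
    changed = False
    while clock() < deadline:
        sleep(interval)
        frame = capture()
        if frame != before:
            changed = True
        # Only report "settled" once a change was seen and the frame repeated.
        if changed and frame == previous:
            return "settled"
        previous = frame
    return "changing" if changed else "no_change"
```

in the trace above, the second click would only be sent after a "settled" or "no_change" result, instead of after a fixed ~2s delay. "no_change" is still ambiguous between a slow app and a missed click, which is why a real event from the environment is better when one exists.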
sorry i meant that desktop has those automations already, so the marginal improvement is smaller than on mobile.
November 24, 2024 at 9:46 PM
but the killer use case will prob be smartphone app automation, it's something that's kinda hard to automate nowadays through other means. on desktop it's easier with things like applescript or windows desktop automate
November 24, 2024 at 8:51 PM
imo it's still a little too dumb to be useful. we might need 6 months.
November 24, 2024 at 8:41 PM