earlence.bsky.social
@earlence.bsky.social
(Assistant) Professor at @UCSanDiego. I hacked a Stop sign once, and it is now in a museum. Also hacked a bicycle. I mostly spend my time building stuff though.
Building a more robust model definitely helps. But it cannot be the only line of defense. You have to sandbox the model, just like we sandbox OS processes to contain the damage of a memory corruption vuln.
May 6, 2025 at 4:56 AM
FEEL THE AGI!
December 30, 2024 at 3:33 PM
Most work has focused on privesc to get at some "forbidden knowledge", and IMO this has muddied jailbreaking (JB) a LOT. If you ignore the "make me a bomb" type issues, you will realize there's a lot more that can be done with JB attacks.
December 14, 2024 at 5:41 PM
It's got that 70s look
December 3, 2024 at 4:58 PM
@mattburgess1.bsky.social has covered AI security stuff.
November 20, 2024 at 4:54 PM