Lightnews — Scholar-powered news

Simon Lermen

@simonlermen.bsky.social

130 followers 330 following 25 posts

I work on AI safety and AI in cybersecurity


Pinned

Simon Lermen @simonlermen.bsky.social · Jan 4

I published a human study with @fredheiding.bsky.social.
We use AI agents built from GPT-4o and Claude 3.5 Sonnet to search the web for available information on a target and use it to write highly personalized phishing messages. These messages achieved click-through rates above 50%.
www.lesswrong.com/posts/GCHyDK...

Human study on AI spear phishing campaigns — LessWrong

TL;DR: We ran a human subject study on whether language models can successfully spear-phish people. We use AI agents built from GPT-4o and Claude 3.5…

www.lesswrong.com

Simon Lermen

@simonlermen.bsky.social

Our paper on AI-powered spear phishing, co-authored with @fredheiding.bsky.social, has been accepted at the ICML 2025 Workshop on Reliable and Responsible Foundation Models!
openreview.net/pdf?id=f0uFp...

openreview.net

July 4, 2025 at 10:49 PM

Simon Lermen

@simonlermen.bsky.social

Grok's DeepSearch was launched with zero safety features; you can ask it about assassinations or drugs. It has been online for a few days now with no changes.

February 25, 2025 at 1:38 PM

Simon Lermen

@simonlermen.bsky.social

Human study on AI spear phishing campaigns — LessWrong

TL;DR: We ran a human subject study on whether language models can successfully spear-phish people. We use AI agents built from GPT-4o and Claude 3.5…

www.lesswrong.com

January 4, 2025 at 1:48 PM

Simon Lermen

@simonlermen.bsky.social

I'll be at the SafeGenAI workshop on Sunday presenting on research I did on safety in AI agents.
I will talk about results from these two blog posts:
www.lesswrong.com/posts/ZoFxTq...
And:
www.lesswrong.com/posts/Lgq2Dc...

Current safety training techniques do not fully transfer to the agent setting — LessWrong

TL;DR: We are presenting three recent papers which all share a similar finding, i.e. the safety training techniques for chat models don’t transfer we…

www.lesswrong.com

December 13, 2024 at 6:56 PM

Reposted by Simon Lermen

Arthur Conmy

@arthurconmy.bsky.social

I'm very bullish on automated research engineering soon, but even I was surprised that AI agents are twice as good as humans with 5+ years of experience or from a top AGI or safety lab at doing tasks in 2 hours. Paper: metr.org/AI_R_D_Evalu...

metr.org

November 22, 2024 at 10:21 PM
