Lightnews — Scholar-powered news

Artificial Intelligence Security

@aisecurity.bsky.social

270 followers 7 following 9 posts

I do AI Security.
I work in AI Security.
I advocate AI Security.
👉 www.arewesafeyet.com

Artificial Intelligence Security

@aisecurity.bsky.social

Researchers showed that Anthropic's new "Agent Skills" feature can be hijacked with almost laughable ease. Security-by-design still hasn't made it onto the AI industry's to-do list.

www.arewesafeyet.com/when-ai-brea...

November 5, 2025 at 10:35 PM

Artificial Intelligence Security

@aisecurity.bsky.social

The AI systems we increasingly depend on are fundamentally vulnerable. NIST’s latest report makes that reality plain, exposing the limits of today’s AI security measures and highlighting a growing disconnect between how AI is deployed and how it’s defended.

www.arewesafeyet.com/adversarial-...

April 24, 2025 at 10:53 AM

Artificial Intelligence Security

@aisecurity.bsky.social

A new paper reveals that fine-tuning large language models on a seemingly narrow task – like writing insecure code – can trigger broad and deeply harmful behaviors. These include promoting violence, expressing authoritarian ideology, and encouraging self-harm.

www.arewesafeyet.com/emergent-mis...

April 3, 2025 at 9:52 AM

Artificial Intelligence Security

@aisecurity.bsky.social

The UK realized AI might do more harm as a weapon than as an insensitive chatbot. They’ve rebranded their AI ‘Safety’ Institute to ‘Security’ Institute to focus on actual threats like cyberattacks. And yet, geopolitics pushed this change more than common sense.

www.arewesafeyet.com/safety-is-de...

February 26, 2025 at 4:03 PM

Artificial Intelligence Security

@aisecurity.bsky.social

A new research paper introduces Indiana Jones, a highly effective method for jailbreaking large language models. It uses dialogues between multiple specialized AI systems and historically framed prompts to achieve high success rates.

www.arewesafeyet.com/indiana-jone...

February 22, 2025 at 1:34 PM

Artificial Intelligence Security

@aisecurity.bsky.social

According to Penn researchers, AI robots are fantastic at following orders.

The problem? They don’t care if those orders come from you or a hacker.

Safety features? Working on it.

www.arewesafeyet.com/ai-robots-ar...

October 23, 2024 at 8:45 AM
