earlence.bsky.social
@earlence.bsky.social
(Assistant) Professor at @UCSanDiego. I hacked a Stop sign once, and it is now in a museum. Also hacked a bicycle. I mostly spend my time building stuff though.
Prompt injection attacks are the AI version of stack smashing from the 90s. Yet most efforts are trying to defend against this by hoping to build more robust models (aka, computer programs). Do you see the issue here?
May 6, 2025 at 4:56 AM
Reposted
SAGAI'25 will investigate the safety, security, and privacy of GenAI agents from a system design perspective. We are experimenting with a new "Dagstuhl"-like seminar with invited speakers and discussion. Really excited about this workshop at the IEEE Security and Privacy Symposium.
SAGAI'25 @ IEEE S&P
Goal The workshop will investigate the safety, security, and privacy of GenAI agents from a system design perspective. We believe that this new category of important and critical system components req...
sites.google.com
March 31, 2025 at 7:32 PM
We found a way to compute optimization-based LLM prompt injections on proprietary models by misusing the fine-tuning interface. Set the learning rate to near zero and you get loss values on candidate attack tokens without really changing the base model. Tested on Gemini. A minimal sketch of the idea is below.
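A minimal sketch of the loss-oracle trick, assuming the provider's fine-tuning API accepts a per-job learning rate and reports per-example training losses. `finetune_loss` here is a hypothetical stand-in for that API, not a real client call:

```python
# Sketch only: `finetune_loss` is a hypothetical stand-in for a provider
# fine-tuning API that accepts a learning rate and reports training loss.
from typing import Callable, Sequence

def best_suffix(
    candidates: Sequence[str],
    prompt: str,
    target: str,
    finetune_loss: Callable[[str, str, float], float],
) -> str:
    """One greedy step of an optimization-based prompt injection.

    Each candidate suffix is scored by submitting (prompt + suffix, target)
    as a fine-tuning example with a near-zero learning rate: the reported
    training loss acts as a likelihood oracle on the closed-weights model,
    while the tiny learning rate leaves the base model effectively unchanged.
    """
    lr = 1e-15  # near zero: read off the loss without meaningfully updating weights
    scored = [(finetune_loss(prompt + c, target, lr), c) for c in candidates]
    return min(scored)[1]  # lower loss = target completion more likely
```

In the full attack this step would be iterated, mutating the best suffix each round to generate fresh candidates, with the fine-tuning job serving as the only feedback channel into the closed model.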

arxiv.org/abs/2501.09798
Computing Optimization-Based Prompt Injections Against Closed-Weights Models By Misusing a Fine-Tuning API
We surface a new threat to closed-weight Large Language Models (LLMs) that enables an attacker to compute optimization-based prompt injections. Specifically, we characterize how an attacker can levera...
arxiv.org
January 21, 2025 at 3:26 PM
I'm teaching a grad course on LLM Security at UCSD. In addition to academic papers, I've included material from the broader community.

I'm looking for 1 good article on LLM agent security. Send me recs!

cseweb.ucsd.edu/~efernandes/...
CSE 291: LLM Security
cseweb.ucsd.edu
January 2, 2025 at 4:11 PM
2024 is ending and it marks just over 2 years at UCSD. Here is a short summary of things we've been doing.

www.linkedin.com/pulse/some-s...
Some stuff we've been doing at UCSD
It has been little more than two years since I moved to sunny San Diego and restarted my research group (now with Luoxi Meng, Nishit Pandya, Andrey Labunets and Xiaohan Fu, and occasionally Ashish Hoo...
www.linkedin.com
December 28, 2024 at 5:14 PM
Is there a GenAI service (or services) that will allow me to upload an image and then specify some text that modifies the image, and get back a new image with those modifications? E.g., say I upload a picture of Spider-Man in a seated position with the text "convert this Spider-Man into a standing position"
December 25, 2024 at 3:14 PM
I think I've finally come to a reasonable definition of GenAI jailbreaking. A jailbreak is a privilege escalation. It allows the attacker to force the model to execute arbitrary instructions, regardless of whatever safeguards might be in place.
December 14, 2024 at 5:41 PM
I will go one step further. To become a bike lane/traffic planner, you have to ride the bike lane yourself.
Adding stock-image children to a bike lane design should be a necessary planning step.
Now if you really want to highlight the absurdity of this design, photoshop a couple of three-year-olds into it.
December 9, 2024 at 11:56 PM
Reposted
NEW: For the last few months, officials at Britain’s NCA have explained to me how they discovered and disrupted two massive Russian money laundering rings.

The networks have moved billions each year and—unusually—have been caught swapping cash for crypto with drugs gangs

🧵 A wild thread...
She Was a Russian Socialite and Influencer. Cops Say She’s a Crypto Laundering Kingpin
Western authorities say they’ve identified a network that found a new way to clean drug gangs’ dirty cash. WIRED gained exclusive access to the investigation.
www.wired.com
December 4, 2024 at 3:48 PM
A good explainer on the security pitfalls of "AI Agents"

spectrum.ieee.org/ai-agents
Explainer: What Are AI Agents?
Here's how AI agents work, why people are jazzed about them, and what risks they hold
spectrum.ieee.org
November 26, 2024 at 4:28 PM
Reposted
📢 Our latest report reveals that the US storefront of Amazon uses a system to restrict shipments of certain products. We found 17k+ products that were restricted from being shipped to specific regions, with the most common type of product being books 📚.
citizenlab.ca/2024/11/anal...
Banned Books: Analysis of Censorship on Amazon.com - The Citizen Lab
We analyze the system Amazon deploys on the US “amazon.com” storefront to restrict shipments of certain products to specific regions. We found 17,050 products that Amazon restricted from being shipped...
citizenlab.ca
November 25, 2024 at 8:37 PM
My Christmas break plan is to learn Rust. Any pointers to resources that you found particularly useful?
November 21, 2024 at 9:29 PM
Reposted
STORY with @lhn.bsky.social: Meta is speaking out about pig butchering scams for the first time—it says it has removed 2 million pig butchering accounts this year.

In one instance, OpenAI alerted Meta to criminals using ChatGPT to generate comments used in scams
Meta Finally Breaks Its Silence on Pig Butchering
The company gave details for the first time on its approach to combating organized criminal networks behind the devastating scams.
www.wired.com
November 21, 2024 at 6:21 PM
I will be adopting this terminology as well.
On another note, I will permanently be pronouncing this app in the same way as “brewski”
November 18, 2024 at 7:47 PM
Reposted
My postdoc Charlie Murphy is on the academic job market this fall. He's doing really hard technical work on building constraint solvers and synthesis engines. You should interview him
pages.cs.wisc.edu/~tcmurphy4/
Charlie Murphy
pages.cs.wisc.edu
November 16, 2024 at 7:50 AM
New idea for Anthropic's computer use agent. Task it with going through my Twitter, finding those folks here, and following them.
November 16, 2024 at 2:49 AM
First thing I did after joining this new Twitter was follow a bunch of PL folks. And some security folks.
November 16, 2024 at 2:46 AM