Leon Derczynski
@leonderczynski.bsky.social
LLM Security at NVIDIA

Prof in CS/NLP at IT University of Copenhagen

garak guy, garak.ai

"berømt skikkelse"
"like a gazelle"

Copenhagen/Seattle
Pinned
further recommendations welcome!

go.bsky.app/5PrJYrj
😂 arXiv is cute when it pretends to have standards!
November 3, 2025 at 5:53 AM
Come to LLMSEC at ACL & hear Niloofar's keynote

"What does it mean for agentic AI to preserve privacy?" - Niloofar Mireshghallah, Meta/CMU

(Friday 1st Aug, 11.00; Austria Center Vienna Hall B)

See you there!

#acl2025 #acl2025nlp
July 28, 2025 at 3:19 PM
Reposted by Leon Derczynski
logging on
July 10, 2025 at 11:35 AM
new garak, llm vuln scanner rls (v0.12.0)

* Audio attacks, for multimodal models
* More training data membership inference attacks
* Multilingual attacks can now also use GCP
* Detailed eval summary in one JSONL row/object

+more :)

details: github.com/NVIDIA/garak...
Release v0.12.0 · NVIDIA/garak
What's Changed New plugins Add audio NIM model and audio probes by @erickgalinkin in #1163 Leakreplay refactor by @dchiitmalla in #1264 probes: refactor fact snippet mixin by @leondz in #1187 New...
github.com
July 2, 2025 at 3:32 PM
the dying but clinging on battery in the bathroom's Frozen-branded soap dispenser reminds me that it's only 4-5 months til Bublé & Let It Go season. aren't you looking forward
June 27, 2025 at 5:01 AM
why do academics send and expect so much weekend email and work. not healthy
June 22, 2025 at 6:10 AM
computer scientists encountering the concept of "desirable difficulty"
New research from MIT found that those who used ChatGPT can’t remember any of the content of their essays.

Key takeaway: the product doesn’t suffer, but the process does. And when it comes to essays, the process *is* how they learn.

arxiv.org/pdf/2506.088...
June 19, 2025 at 4:15 PM
remembering the time i checked in to my reasonably classy russian business hotel late with my wife, and the staff said "sir, this... girl.. not allowed"

she's a serious professor

we went through to the room, opened the balcony door, and buried a bottle of champagne in the metre of snow

good times
June 18, 2025 at 6:56 AM
@jjvincent.bsky.social woah ur really famous! love this attack also. I automate and run it for a living

www.instagram.com/reel/DKz9ezj...
June 14, 2025 at 6:06 AM
Reposted by Leon Derczynski
Great to see our work uncovering dangerous issues in commercial LLM "therapists" getting some coverage: futurism.com/stanford-the...
June 14, 2025 at 4:01 AM
"natwirkung"

"wirk smorter nat horder"

accents dreamed up by the utterly deranged

(what is going on with that 🇺🇸 vowel sheft)
June 8, 2025 at 10:05 AM
i need you to understand that "alternate uses" is a terrible test/definition of creativity and has been for some time. it's extremely narrow, very shallow, and misses almost everything we know about creativity
June 3, 2025 at 5:09 AM
3² + 4² = 5² ? big if true
May 21, 2025 at 10:24 AM
if overleaf being down slows "ai progress", i'm not sure "ai progress" is particularly well defined
May 15, 2025 at 5:17 AM
is a dropped copula a dropula
May 14, 2025 at 8:29 AM
Here's my "Most Inappropriate Demo" trophy at NVIDIA, 2024. For garak's "atkgen.Tox" probe, an unfettered LLM used to goad other LLMs into being toxic.
March 19, 2025 at 1:30 PM
Reposted by Leon Derczynski
“If she wants to know something specific, but doesn’t want people to notice her asking questions, she should simply make incorrect statements while in the company of experts. Her companions will correct her, especially if they're men.”

- Advice for female agents in WW2, provided during SOE training
March 17, 2025 at 11:52 AM
Reposted by Leon Derczynski
it's amazing how chatgpt knows everything about subjects I know nothing about, but is wrong like 40% of the time in things I'm an expert on. not going to think about this any further
March 8, 2025 at 12:13 AM
was about to dump all my practical knowledge and "I've been thinking about" crap on agent security into a blog post but i do not think the web can take yet another one of those. drank wine instead
February 21, 2025 at 8:22 PM
Reposted by Leon Derczynski
they are openly advocating for the use of physiognomy in recruitment

make it stop
February 21, 2025 at 5:50 PM
things i'm genuinely enjoying rn:

* successfully not reading any news
* getting to do 50h of work in one week (it was enjoyable, usual caveats apply)
* finally a largely healthy family
February 21, 2025 at 6:43 AM
it's a weekday where I don't have to take pacific time calls
February 17, 2025 at 5:35 PM
my aunt in law has a shetland pony in her freezer for the dogs
February 16, 2025 at 11:43 AM
you know the field has changed when the foreign event you were speaking at is on the tv news on the bus home
February 13, 2025 at 9:18 AM
Will be representing NVIDIA at the EU AI Summit in Paris. I'll be talking about how we build & help others build safe, secure AI systems.

On 11.2 (11 February) you can see me at:

* AI Assurance and Testing: Global Perspectives

* Building trustworthy AI: balancing innovation, responsibility, and democratization
February 8, 2025 at 11:35 AM