Emilio Ferrara
@emilioferrara.bsky.social
Prof of Computer Science at USC
AI, social media, society, networks, data, and HUMANS LABS http://www.emilio.ferrara.name
We created a framework for auditing and characterizing the undesirable effects of alignment safeguards in LLMs, which can result in censorship or information suppression. And we tested DeepSeek against potentially sensitive topics!
October 16, 2025 at 10:44 PM
We uncovered different overt and covert information suppression dynamics, as well as even more subtle ways in which DeepSeek's answers are internally moderated, selectively presented, and at times even framed with ideological alignment to state-sponsored propaganda narratives.

arxiv.org/abs/2506.12349
Information Suppression in Large Language Models: Auditing, Quantifying, and Characterizing Censorship in DeepSeek
This study examines information suppression mechanisms in DeepSeek, an open-source large language model (LLM) developed in China. We propose an auditing framework and use it to analyze the model's res...
arxiv.org
June 22, 2025 at 11:52 PM
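Not from the paper's pipeline, but a minimal sketch of the kind of repeated-prompt audit described in this thread: send the same sensitive prompt several times and tally how often a canned refusal comes back. The `ask(prompt)` callable is a placeholder you would wire to whatever DeepSeek client you use; the refusal phrases are taken from the message quoted later in the thread.

```python
# Minimal auditing sketch (illustrative, not the paper's framework):
# query each prompt several times and flag refusal-style answers.
from collections import Counter
from typing import Callable

REFUSAL_MARKERS = [
    # Canned message observed in the thread below.
    "Sorry, that's beyond my current scope",
    "Let's talk about something else",
]

def looks_suppressed(answer: str) -> bool:
    """Crude check: empty answers or canned refusal phrases."""
    if not answer.strip():
        return True
    return any(marker.lower() in answer.lower() for marker in REFUSAL_MARKERS)

def audit(ask: Callable[[str], str], prompts: list[str], trials: int = 4) -> Counter:
    """Ask each prompt `trials` times and tally suppressed vs. answered replies."""
    tally = Counter()
    for prompt in prompts:
        for _ in range(trials):
            answer = ask(prompt)
            tally["suppressed" if looks_suppressed(answer) else "answered"] += 1
    return tally
```

The default of four trials mirrors the "asked exactly the same thing four times in a row" experiment mentioned below, where one attempt eventually got through unredacted.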
Thx! Very useful!
June 22, 2025 at 11:47 PM
wait until they hear matplotlib...
April 8, 2025 at 3:58 PM
lol insisting is indeed one possible strategy; maybe the screenshots are not that clear, but I asked exactly the same thing four times in a row and once I got an unredacted answer!
January 31, 2025 at 1:15 AM
Once the response has been fully composed, however, the entire answer is deleted and replaced by the famous error message "Sorry, that's beyond my current scope. Let's talk about something else."

Up to us, as researchers, to decide what kind of model alignments we find acceptable.
January 30, 2025 at 6:09 PM
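For illustration only: one way to catch the compose-then-delete behavior described above would be to keep the streamed text as it arrives and compare it with the final message shown to the user. The helpers below are hypothetical, not part of the study; the caller supplies the streamed chunks and the final displayed answer from whatever client or UI hook they have.

```python
# Sketch of detecting post-hoc redaction: a substantive answer is streamed,
# then replaced by the canned refusal message quoted above.
from typing import Iterable

CANNED_REFUSAL = "Sorry, that's beyond my current scope. Let's talk about something else."

def capture_stream(chunks: Iterable[str]) -> str:
    """Concatenate streamed chunks into the answer as it was being composed."""
    return "".join(chunks)

def redacted_after_streaming(streamed: str, final: str) -> bool:
    """True if a longer answer was streamed but the final message is the canned refusal."""
    return final.strip() == CANNED_REFUSAL and len(streamed.strip()) > len(final)
```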