Emilio Ferrara
@emilioferrara.bsky.social
Prof of Computer Science at USC
AI, social media, society, networks, data, and
HUMANS LABS http://www.emilio.ferrara.name
We created a framework for auditing and characterizing the undesirable effects of alignment safeguards in LLMs that can result in censorship or information suppression. And we tested DeepSeek against potentially sensitive topics!
October 16, 2025 at 10:44 PM
We uncovered different overt and covert information suppression dynamics, as well as even more subtle ways DeepSeek answers are internally moderated, selectively presented, and at times even framed with ideological alignment to state-sponsored propaganda narratives.
arxiv.org/abs/2506.12349
Information Suppression in Large Language Models: Auditing, Quantifying, and Characterizing Censorship in DeepSeek
This study examines information suppression mechanisms in DeepSeek, an open-source large language model (LLM) developed in China. We propose an auditing framework and use it to analyze the model's res...
arxiv.org
June 22, 2025 at 11:52 PM
Thx! Very useful!
June 22, 2025 at 11:47 PM
Reposted by Emilio Ferrara
Paper here:
arxiv.org/abs/2505.21729
Bridging the Narrative Divide: Cross-Platform Discourse Networks in Fragmented Ecosystems
Political discourse has grown increasingly fragmented across different social platforms, making it challenging to trace how narratives spread and evolve within such a fragmented information ecosystem....
arxiv.org
June 22, 2025 at 6:45 PM
wait until they hear matplotlib...
April 8, 2025 at 3:58 PM
Reposted by Emilio Ferrara
🤩Cool collaboration w/ @jinyiye.bsky.social @emilioferrara.bsky.social @luceriluc.bsky.social
🔍Read more: arxiv.org/abs/2502.11248
📊Resources available: github.com/angelayejiny...
Prevalence, Sharing Patterns, and Spreaders of Multimodal AI-Generated Content on X during the 2024 U.S. Presidential Election
While concerns about the risks of AI-generated content (AIGC) to the integrity of social media discussions have been raised, little is known about its scale and the actors responsible for its dissemin...
arxiv.org
February 21, 2025 at 6:27 AM
lol insisting is indeed one possible strategy; the screenshots are maybe not that clear, but I asked exactly the same thing four times in a row and once I got a non-redacted answer!
January 31, 2025 at 1:15 AM
Once the response composition is completed, however, the entire answer is deleted and replaced by the famous error message “Sorry, that’s beyond my current scope. Let’s talk about something else.”
Up to us, as researchers, to decide what kind of model alignments we find acceptable.
January 30, 2025 at 6:09 PM